"A tree-of-objects representation is more difficult to publish in a human-readable form. The objects that comprise the tree tend to all be very different, and thus it is tricky to come up with a consistent and simple table representation with which to display the objects. Any any such table representation that you do come up with would almost certainly have more than six columns, probably many more. The problem of rendering a tree-of-objects as a table is sufficiently difficult that nobody does it"
To further elaborate on this important point.
There is an 'impedance mismatch' (conceptual difficulty mapping between the two logic models) between the tree abstract data type and the table abstract data type. Specifically, there are four key differences between the simple table data structure and the more complex tree data structure that makes mapping between them a non-trivial operation.
Hierarchical: A table has a flat representation; a tree has a hierarchical
representation.
Order: The order of the rows in a table typically do not matter (they may have a unique rowid). The order of branches (nodes) and leaves in a tree is important, and the ordering itself in an encoding of valuable information.
Semi-structured: A table has a fixed structure (rows multiplied by columns). A
tree has a flexible structure - an arbitrary combination of branches (internal nodes) and leaves (terminal nodes). Semi-structured data has a structure that may not necessarily be known in advance, the tree has irregular and variable formation; the tree may have branches with missing or supplementary nodes.
Meta-data: The information describing the meaning of the data in a table
is typically stored separately from the table - consequently a schema is often mandatory. A schema is optional for a tree abstract data type.
As an aside, I have been visiting hacker news almost daily since 2010. This is my first comment on hacker news. I want to say thank you to the community for the many valuable insights over this years.
What I want to do is represent my tree-structure as a table in a relational database and then be able to efficiently get the tree structure back by transforming the table-representation back into a tree.
Further I would like to do that in plain standard SQL. This must be a common problem, any documented solutions out there?
Great question - you have touched on the key difference between a labeling scheme and an encoding scheme for tree data structures.
As mentioned previously, the tree is an abstract data type, that is to say, a conceptual model that defines the nodes in a tree, and their relationships.
To be able to evaluate a expression that processes a tree, one needs a labeling scheme. The purpose of a labeling scheme is to assign unique labels to each node in the tree and these labels must facilitate node ordering, and often (but not always) a labeling scheme will permit the reconstruction of the tree structure.
However, no labeling scheme captures the node type, names or the content stored at the nodes. For that we need an encoding scheme. An encoding scheme is constructed upon a labeling scheme and augments it with the information necessary to fully represent the tree in a table-like data structure. In answer to your question, it also permits the full transformation from the table representation to the original tree structure.
Thus, it sounds like what you are looking for is an encoding scheme.
There are many different labeling schemes out there for tree structure data, and virtually all of them can be augmented with additional information to construct a complete encoding scheme. Concerning documented solutions - I have not been active in this space for a number of years, so off the bat - I don't have a recommended documented solution to point you too.
But to help, I will put a link to my PhD thesis [1] which gives a more in-depth understanding of labeling schemes and encoding schemes for tree structured data with an example of a simple implementation of an encoding scheme enabling the full transformation from the table representation to the original tree structure (pages 5-9) and a survey of the advantages and disadvantages of existing labeling schemes concerning their usefulness to be part of an encoding scheme you could use in your solution (see chapter 2)
Caveat 1: My thesis was written in the context of updating dynamic (XML) trees but it addresses the transformation between tree and table data structures.
Caveat 2: The thesis was written 11 years ago, but every now and then I have kept in touch with the latest developments in the area, and to my knowledge, there have been no major developments since.
"A tree-of-objects representation is more difficult to publish in a human-readable form. The objects that comprise the tree tend to all be very different, and thus it is tricky to come up with a consistent and simple table representation with which to display the objects. Any any such table representation that you do come up with would almost certainly have more than six columns, probably many more. The problem of rendering a tree-of-objects as a table is sufficiently difficult that nobody does it"
To further elaborate on this important point.
There is an 'impedance mismatch' (conceptual difficulty mapping between the two logic models) between the tree abstract data type and the table abstract data type. Specifically, there are four key differences between the simple table data structure and the more complex tree data structure that makes mapping between them a non-trivial operation.
Hierarchical: A table has a flat representation; a tree has a hierarchical representation.
Order: The order of the rows in a table typically do not matter (they may have a unique rowid). The order of branches (nodes) and leaves in a tree is important, and the ordering itself in an encoding of valuable information.
Semi-structured: A table has a fixed structure (rows multiplied by columns). A tree has a flexible structure - an arbitrary combination of branches (internal nodes) and leaves (terminal nodes). Semi-structured data has a structure that may not necessarily be known in advance, the tree has irregular and variable formation; the tree may have branches with missing or supplementary nodes.
Meta-data: The information describing the meaning of the data in a table is typically stored separately from the table - consequently a schema is often mandatory. A schema is optional for a tree abstract data type.
As an aside, I have been visiting hacker news almost daily since 2010. This is my first comment on hacker news. I want to say thank you to the community for the many valuable insights over this years.