How do I implement threaded comments?

Niyaz picture Niyaz · Feb 28, 2009 · Viewed 12k times · Source

I am developing a web application that can support threaded comments. I need the ability to rearrange the comments based on the number of votes received. (Identical to how threaded comments work in reddit)

I would love to hear the inputs from the SO community on how to do it.

How should I design the comments table? Here is the structure I am using now:

Comment
    id
    parent_post
    parent_comment
    author
    points

What changes should be done to this structure?

How should I get the details from this table to display them in the correct manner? (Implementation in any language is welcome. I just want to know how to do it in the best possible manner)

What are the stuff I need to take care while implementing this feature so that there is less load on the CPU/Database?

Thanks in advance.

Answer

Frans Bouma picture Frans Bouma · Feb 28, 2009

Storing trees in a database is a subject which has many different solutions. It depends on if you want to retrieve a subhierarchy as well (so all children of item X) or if you just want to grab the entire set of hierarchies and build the tree in an O(n) way in memory using a dictionary.

Your table has the advantage that you can fetch all comments on a post in 1 go, by filtering on the parentpost. As you've defined the comment's parent in the textbook/naive way, you have to build the tree in memory (see below). If you want to obtain the tree from the DB, you need a different way to store a tree: See my description of a pre-calc based approach here: http://www.llblgen.com/tinyforum/GotoMessage.aspx?MessageID=17746&ThreadID=3208 or by using balanced trees described by CELKO here:

or yet another approach: http://www.sqlteam.com/article/more-trees-hierarchies-in-sql

If you fetch everything in a hierarchy in memory and build the tree there, it can be more efficient due to the fact that the query is pretty simple: select .. from Comment where ParentPost = @id ORDER BY ParentComment ASC

After that query, you build the tree in memory with just 1 dictionary which keeps track of the tuple CommentID - Comment. You now walk through the resultset and build the tree on the fly: every comment you run into, you can lookup its parentcomment in the dictionary and then store the comment currently processed also in that dictionary.