I am creating a threaded message system much like gmail and facebook where the inbox lists the most recent threads displaying the subject, the sender's name, and time-stamp of the most recent message.
Here's How my tables are set up:
users:
user_id
user_name
thread:
thread_id
title
to_id
to_keep
to_read
from_id
from_keep
date
message:
message_id
thread_id
to_id
from_id
message_text
date
What I'm doing right now is when a user creates a new message, it creates a new thread in the thread table and then a new message in the message table and if a user responds to a thread, it duplicates the current thread in the thread table except it swaps the to_id
and from_id
and then creates a new message based on that.
Also, for the inbox view, I'm able to just query all threads based on a user_id
. so something like SELECT * FROM thread WHERE to_id = 2 and to_keep = TRUE ORDER BY date DESC
or if I want to view messages in the outbox it would be something like SELECT * FROM thread WHERE from_id = 2 and from_keep = TRUE ORDER BY date DESC
.
If a user opens a thread when there's a new message, then to_read is updated to true UPDATE thread SET to_read = TRUE WHERE thread_id = 4
.
I feel like I'm over complicating this process and that there should be a better way to do this.
Any help or ideas would be appreciated.
This way let's me just select everything from the thread table and then do a join with the user table to display everything I need. However I feel like there should be a better way to do this.
Why don't you separate out message relationships from user's view of each message?
I would do the threading by a self-referencing relationship on message. In other words, the message has a "responding_to_message_id" column.
I'm not sure I understand why you have a "to_id". Are messages directed to individual users? This seems very limited. I would think that you would either have no recipient (i.e. the recipient is a message board that anyone can read) or you would have the ability to specify multiple recipients, just like with an e-mail. Perhaps you can explain more about how the system is to be used.
Assuming (for simplicity) that you are posting to a board, so only the "from" is important, then you have your message table, with self-referencing relationship for threading, a user table, and then an intersection table between user and message that stores which messages have been read by each user.
That way, if you want to know if a user has read a message or not, just attempt to read the user ID in the intersection table for the given message. If it isn't there, then that message is unread by that user.
Note that if you want to have single recipients this design holds and if you want to have multiple recipients you can use an intersection table to hold the list of recipients for each message. If you do have a recipient intersection table, it can do double-duty as your read status table.
EDIT: ERD Sketch:
Here is a quick sketch of what I'm talking about...
Whether or not the sender has chosen to keep the message is flagged on the message itself. If the message is the start of a new thread, the reply_to_message_id column is NULL, otherwise it is the message_id of the parent message. There can be mulitple recipients, each of which have their own ability to keep the message or not, as well as the ability to track the date and time when the recipient reads the message.
EDIT 2: Alternate ERD and Querying for Most Recent Message
@OP asked how to query for the most recent message in a thread. The answer depends on the form of the thread. You can either have a flat thread where every message goes to the end of the linear stream of messages or you can have a tree-shaped thread where each message has a specific parent, unless it's the root of the thread. In the ERD above, the reply_to_message_id field could be used either way. If the thread is flat, then the FK is always to the root MESSAGE. If the thread is tree-shaped, then the FK is to the immediate parent of the reply MESSAGE.
If a typical query you want to run is "what is the most recent message in a thread?" and your threads are flat, then you can use SQL like this:
select top 1
M.message_id
, M.sent_datetime
, M.title
, M.message_text
, S.user_id
, S.user_name
-- and anything else you want...
from MESSAGE M inner join USER S
on M.sender_user_id = U.user_id
where M.reply_to_message_id = @ThreadRootMessageID
order by
M.sent_datetime desc
If, on the other hand, your threads are tree-shaped and this is a query you want to be able to run quickly and easily, then the schema in the ERD above is not very easy to work with. SQL is not good at trees. You can solve the problem with a little bit of denormalization. See the ERD below:
Note that there is now one FK to show the immediate parent and one FK to show the root. Since threads aren't subject to editing - at least to edits where the root of a message is changed to point at a different thread, the denormalization that this entails does not imply risk of update anomallies so the redundancy is not too problematic.
If you use this ERD then the query for "most recent message in thread X" is the same as above, but with M.thread_root_message_id in the where clause instead of M.reply_to_message_id.