. 55
( 132 .)


replies to an existing topic, or replies to replies, or replies to replies to replies, and
so on. Put another way, the board must be able to deal with an indefinite level of
depth. The script must be able to react appropriately, whether the discussion goes
one level deep, five levels deep, or ten levels deep, which requires some new tech-
niques, both in your data design and in your scripts.

What do you need?
You need only two files to generate all the views needed for this application. But
these two files can have very different looks, depending on the information that is
The first file displays topics and their replies. The first time users come to the
message board they will not know what threads they wish to read. Therefore, a list
of topics will be displayed. Figure 10-1 shows the list of top-level topics.
Once a user chooses a topic the page lists all the posts within that topic. As you
can see in Figure 10-2, the top of the page shows the text and subject of the post
being read. Below that, immediate replies to that post are indicated with a colored
border, and the text of the immediate replies is also printed. Figure 10-2 also shows
that the application provides a subject, a name, and a link to posts that are more
than one level deep in the thread. You can see that it is rather easy to tell who has
replied to what.
This same page provides another view. If a user clicks through to a post that does
not start a topic, the page shows all threads beneath that post. At the top of the
page the script will print the top-level post (or root) and the post immediately prior
to the one being viewed (or parent). Figure 10-3 shows an example of this view.
Chapter 10: Threaded Discussion 313

Figure 10-1: List of top-level topics

Figure 10-2: Display of a thread
314 Part IV: Not So Simple Applications

Figure 10-3: View further down a thread

Everything you saw in the previous figures was handled by one page. The sec-
ond page posts threads to the board. This posting requires only a simple form that
contains form elements for a subject, a name, and room for the comment. The form
needs to be aware of where in the thread the message belongs. For new top-level
topics a form without any context is fine (see Figure 10-4), but for replies within an
existing thread some context is helpful (see Figure 10-5).

What do you need to prevent?
As you™ve seen in previous chapters, you need to spend quite a bit of time making
sure things work properly. Unless every post is reviewed before it becomes available
on the site, there is no good way of preventing users from posting nonsense and
then replying to their own meaningless posts. This kind of thing can get pretty dis-
tracting ” and again, no foolproof way of preventing it exists. However, you can
make it a bit more obvious to other users who is making the nefarious postings. For
that reason, this application uses the IP of origin to generate a unique ID number,
which can make it more plain who is posting what. This strategy isn™t great protec-
tion, but it is better than nothing.
Chapter 10: Threaded Discussion 315

Figure 10-4: Form for posting a top-level topic

Figure 10-5: Form for posting a lower-level topic
316 Part IV: Not So Simple Applications

The Data
Of all the applications discussed in this book, this one has perhaps the most unex-
pected data structure.
We™ll take a moment right here to tell you a little secret about database develop-
ment: Though you can usually figure out the structure of a database by going
through the normalization process, sometimes you™re better off concentrating more
on the hoped-for end result. You™ll see what we mean as you read the rest of this
But before we show you what we created and why it works so well, let us show
you an example of what you might have expected ” and why it would have been so
problematic. You might think that this application would start with a table looking
something like Table 10-1.


root_ root_ root_ root_ root_
topic_id topic_date topic_name topic_subject topic_text

1 08/20/2003 Jack Snacks Rule I love em.
2 08/20/2003 Edith More Cheetos I want my fingers orange.
3 9/1/2003 Archie M&Ms Mmmmore.

This table, as you can probably guess, would list the root topics. A simple
SELECT * FROM root_topics returns a record set of all the root topics. This table
doesn™t allow for any data below the root level. To take care of this, you might
envision a structure in which each root_topic_id is associated with another table.
Whenever you inserted a row into the root_topics table, you™d also run a CREATE
TABLE statement to make a table that would store the replies to the root topic.
For example, all the replies to the “Snacks Rule” post are stored in a table that
looks like Table 10-2. This arrangement works. A one-to-many relationship between
the tables exists, and information is available pretty readily. But now consider what
happens when somebody wants to reply to one of these posts. You have to create
yet another table. And what if you were to go another level or two deeper? It™s easy
to see that before long this would get completely out of control. With just a couple
of active threads you could end up with dozens of tables that need to be managed
and joined ” no fun at all.
Chapter 10: Threaded Discussion 317


topic_id topic_date topic_author topic_subject topic_text

1 08/20/2003 Ellen Re: Snacks Rule You betcha
2 08/20/2003 Erners Re: Snacks Rule Indeed

Now we move away from this ill-considered idea and move toward a more
sound plan. Think about what information needs to be stored for each post to the
mailing list. Start with the obvious stuff. You need a column that stores the subject
of the thread (for example, “Nachos, food of the gods”), one that stores the author™s
name, and one that records the date the item was posted. So the table starts with
these columns ” we™ve thrown in some sample information in Table 10-3 and an
auto_increment primary key just to keep it clear.


topic_id subject author date

1 Nachos rule Jay 3/12/2003
2 Cheetos are the best Brad 3/12/2003

But of course this isn™t enough. Somehow you need a way to track the ancestry
and lineage of any specific topic. (Look again at Figure 10-1 if you are not sure
what we mean.) So how are you going to do this? If you are looking to track the
ancestry of any particular thread, it probably makes sense to add a field that indi-
cates the topic that started the thread ” the root topic.
Take a close look at Table 10-4. Start with the first row. Here the root_id is the
same as the topic_id. Now look at the third row. Here the root_id (1) matches the
topic_id of the first row. So you know that the thread to which row 3 belongs
started with topic_id 1 ” “Nachos rule.” Similarly, row 6 must be a reply to row 2.
Now look at rows 1, 2, and 5. Notice that in these rows the topic_id and the
root_id are identical. At this point you can probably guess that whenever these
two are the same, it indicates a root-level topic. Easy enough, right? The following
SQL statement retrieves all the root-level topics:

select * from topics where root_id=topic_id.
318 Part IV: Not So Simple Applications


topic_id root_id subject author date

1 1 Nachos rule Jay 3/12/2003
2 2 Cheetos are the best Ed 3/12/2003
3 1 Re: Nachos rule Don 3/12/2003
4 1 Re: Nachos rule Bill 3/13/2003
5 5 What about cookies Evany 3/14/2003
6 2 Re: Cheetos are the best Ed 3/13/2003

Now that you™ve added a root_id field to the table, you should know the begin-
ning of a thread. But how can you get all the entries that came between the origi-
nal topic and the one you™re interested in? Initially you might think it would be
prudent to add a column that lists the ancestors. You could call the column ances-
tors and in it you™d have a listing of topic_ids. It might contain a string like 1,
6, 9, 12. Taking this approach would be a very, very bad idea. Why, you ask?
Well, the most important reason worth mentioning is that you should never put
multiple values in a single field ” you™ll open yourself up to all kinds of hassles.

MySQL does have a column type that takes multiple values. It is called set. It
is not used anywhere in this book because Dr. Codd would not approve. Do
you remember Dr. Codd from Chapter 1? He™s the guy who originally devel-
oped relational-database theory in the first place. Generally, it™s a bad idea to
put multiple values in a single field because, except in cases in which the
multiple values are always used together (in which case they™re not really
multiple values), you invariably end up parsing the group to use the values
separately.That™s extra work you don™t need.

So what options are you left with? Create another table to keep track of a topic™s
lineage? That isn™t necessary. The easiest thing to do is add to the previous table a
single column that tracks the parent of the current topic, as shown in Table 10-5.
Chapter 10: Threaded Discussion 319


topic_id root_id parent_id subject author date

1 1 0 Nachos rule Jay 3/12/2003
2 2 0 Cheetos are the best Ed 3/12/2003
3 1 1 Re: Nachos rule Don 3/12/2003
4 1 3 Re: Nachos rule Bill 3/13/2003
5 5 0 What about cookies Evany 3/13/2003
6 2 2 Re: Cheetos are the best Ed 3/14/2003
7 1 4 Cheetos, are you kidding Jeff 3/15/2003
8 5 5 Re: What about cookies Jay 3/15/2003

When you look at the first couple of rows in Table 10-5, you might see little dif-
ference between the fields. And that sort of makes sense: If the topic_id and the
parent_id are the same, you already know that it™s a root level and that therefore
the parent is irrelevant. Move your attention to row 7. Here you can see that root is
row 1, “Nachos rule.” That™s easy enough. Now look at the parent_id, which is row
4. If you look at the parent of row 4, you will find that it™s row 3 ” and further that
the parent of that row is row 1, which is also the root. So with just this information
you can follow a thread to its origin. A very simple script that traces a topic to its
origin looks something like this:

Select all fields from current topic
If parent_id is not equal to 0 and parent_id does not equal root_id
Make parent ID current topic
Go to line 1

So that will about do it. Using this data structure, you can get all the information
you are going to need. Throw in a couple of timestamps for safekeeping, and you™re
all set. Listing 10-1 shows the SQL statement that will create the table (using name
for the subject of the topic, create_dt for the date, and description for the text):

Listing 10-1: create table Statement for Threaded Discussion
create table topics (
topic_id integer not null auto_increment
, parent_id integer default 0

320 Part IV: Not So Simple Applications

Listing 10-1 (Continued)
, root_id integer default 0
, name varchar(255)
, description text
, create_dt timestamp(14)
, modify_dt timestamp(14)
, author varchar(255)
, author_addr_id int
, primary key (topic_id)

One other table exists in the database, to create the author-ID values we men-
tioned earlier:

create table author_addrs (
author_addr_id integer default 0 not null auto_increment
, author_addr varchar(255)
, entry_dt timestamp


. 55
( 132 .)