EIDETIC: JUNCTION TABLES

By Bryce Harrington

This document describes the design for the Eidetic junction table system. It explains what junction tables are, describes what they're used for, and tells how to set them up in Eidetic. Junction tables are a very powerful capability of Eidetic and make good leverage of the relational aspects of Relational Databases.

Intro to Junction Tables

Databases store information in an organized fashion. Relational databases (such as Oracle, MySQL, Postgresql, Sybase, et al.) are also able to handle "relations" between data in separate tables. Separating data out into different tables is useful because it helps eliminate redundancy, which saves storage space and maintenance work. In relational database theory parlance, this process is called "normalization".

As an example, say you wished to create a database for your musical CD collection. Each CD is produced by a particular artist. You could simply list the full name of the artist in each CD, but if several of your CD's are by the same artist, you'd need to list the artist's name over and over again. If you wished to provide additional information about the artist - such as a photo and description - the amount of redundant information could get out of hand quick!

So instead, the database can be normalized by introducing a second table specifically for artists. Each record in the artist table has an ID number, and each record in the CD table is given a artist_id number instead of a artist name. Now the amount of redundant information in the CD table is minimized to just an ID number.

In the above example we have what is called a "many to one" relationship. Artists can have many CD's, but each CD can have only one artist.

Or can it?

If we think about movie soundtracks, for example, we know that some CD's can have multiple artists - a different artist for each song. So the "many to one" relationship is an imperfect fit for this situation. We actually need a "many to many" relationship.

Unfortunately, relational databases cannot model "many to many" relationships directly. And this is where junction tables come into play.

Many-to-many relationships can be turned into two "many to one" relationships if we can introduce a "middle table" to map ID's from each of the two "outer" tables together. There are many different terms used to describe these middle tables, but one term - and the term we'll use in Eidetic - is a "junction table".

It will probably help at this point to list some sample contents of our imaginary CD database, to help visualize what a junction table is:

Artist Table   
--------------
ID  Descriptor
--------------
1   Aerosmith
2   Jon Bon Jovi
3   ZZ Top

Artist to CD Junction
---------------------
artist_id       cd_id
---------------------
1                  11
2                  11
3                  11
1                  12
2                  13
3                  14

CD Table
--------------
ID  Descriptor
--------------
11  Armageddon (Soundtrack)
12  Pump
13  Slippery When Wet
14  Fandango

CD # 11, the Armageddon soundtrack, is a collection of songs by a bunch of artists, including the three we've listed in our Artists table. Each of those artists, of course, have produced other albums (CD's 12, 13, and 14). The Junction table shows how the information is managed - there are three entries for CD #11, one each for the three artists.

Uses of Junction Tables

By now, the reader certainly can think of a whole host of ways to apply the notion of junction tables to various data representations. They crop up most frequently in situations where a given object can contain a multiplicity of items, each of which can in turn be held by any number of owners. For example:

In a recipe system, each recipe has a list of ingredients. The recipe table will have the name, description, and instructions for the dish, the ingredients table will hold information about the food item (such as calories per gram), and the recipe to ingredient junction table will hold the ID's of recipes and ingredients, and a quantity amount to show how much of that ingredient to use in the particular recipe.

In a roleplaying game, a character will have a set of particular skills, but the descriptions of those skills are shared by many characters. The junction table will hold a character ID, a skill type ID, and a score that the character has in that particular skill.

In an automated software testing environment, such as at the Open Source Development Labs, each type of host machine will have a set of tests that have been configured to run on it; each test will run on a number of different types of machines. The host type to test junction table will map the host type ID's to the tests that have been configured to run on them.

Of course, just knowing how to represent the data is only part of the battle. We've swept under the carpet exactly how all these relationships are set up and managed. Without a general purpose tool such as Eidetic to help take care of these interrelationships, the developer could face a nasty amount of coding.

Junction table support has been added to Eidetic only recently (as of 2.0) so particularly sophisticated relationships involving junction tables might be difficult to set up, but basic three-table many to many relationships such as those listed above can be done. The user will be able to generate detail webpages for an entry (like a recipe) listing all of the subparts associated with it (like the recipe's ingredients). It will automate the generation of checkbox lists to manage the relationships (and can do drop down tables or radio button selection lists for one to many relationships, as well.)

Setting up Junction Tables

Several steps are required to put a junction table into Eidetic:

First we look at how to design a tables in SQL that can be managed by Eidetic. The "outside" tables are handled much like regular Eidetic tables. The table can be named however desired, but it must have the following fields: uid (*), rsf, descriptor, created_on, created_by, last_updated_by, and last_updated_on. These are the same required fields as for regular tables.

The junction table, however, is exempted from the usual table schema requirements, but it has some unique ones of its own. First, due to the peculiarities of how Eidetic (currently) figures out junction relationships, the junction table must be named in the form "firstTable_to_secondTable_junction". It is not important which table is first and which is second, but the "_to_" and "_junction" elements must be in the name, so that Eidetic can parse correctly. The junction table requires at least three fields: "rsf", which is used similarly to regular tables, "firstTable_uid", and "secondTable_uid" (*) - with the names of your tables instead of "firstTable" and "secondTable", obviously. These names must be consistent with the junction table's name, else Eidetic will not be able to determine what the keys are.

Second we turn our attention to the HTML templates for displaying the data. In the 'display.single.tmpl' file for the firstTable module, indicate where you would like the junction data inserted via the following command:

   [% junction firstTable_to_secondTable_junction secondTable %]

Eidetic will then generate a list of elements from secondTable that have the same ID # as the currently displayed item from firstTable. Similarly, in the 'display.single.tmpl' file for the secondTable module, you would do the reverse:

   [% junction firstTable_to_secondTable_junction firstTable %]

Right now, Eidetic isn't terribly fancy in its display of the mapped-in data. In later versions there will be added functionality for pulling in info from extra fields, and to control the formatting in various ways.

Finally, we turn attention to the editing forms. In the 'edit.firstTable.form.tmpl' template you would specify where Eidetic should insert the checkbox list using the following:

   [% checkbox-list firstTable_to_secondTable_junction secondTable %]

and in the 'edit.secondTable.form.tmpl':

   [% checkbox-list firstTable_to_secondTable_junction firstTable %]

In the Eidetic distribution, have a look at the host_types and tests modules for a specific example of how this works.

Setting up Other Relationships

Eidetic supports one-to-many relationships, in addition to the many-to-many relationships via checkbox lists described above. One-to-many relationships differ in that the ID # of the item in the second table is stored directly in the first table. As well, instead of using checkboxes, one-to-many relationships are modeled as either dropdown selection boxes or by radio button lists. Here is how to implement each of these approaches:

In either case, the firstTable must have a field to hold the ID value of the secondTable, and it must be named using either of the following styles: "firstTable_uid" or "my_custom_name_uid_firstTable". firstTable needs to be part of the field's name in order to make it possible for Eidetic to parse out which table to look for the mapped information.

Also, both firstTable and secondTable must include the usual Eidetic fields: uid, descriptor, created_on, created_by, last_updated_on, last_updated_by, and rsf.

Now, for all generic, autogenerated HTML forms, Eidetic will automatically create and manage the data using dropdown boxes. So if the generic forms are okay by you, you're done. If, on the other hand, you wish to create custom forms, read on. (Currently there is no mechanism in place for extracting items from a second table for display on a custom page, but this will likely be added in an upcoming release.)

Dropdowns are specified in an edit form using the following syntax:

 [% dropdown('secondTable', 'd_secondTable_uid', '1') %]

The first parameter is the name of the table to pull the items from (it will use the contents of the 'descriptor' field as the items in the dropdown box). The second is the field in firstTable to store the data. The third parameter is the default item in the dropdown to select. (Currently, it is not possible to set the dropdown to the current value stored in the database - but this will be remedied in an upcoming release.)

If you'd like to have a set of radio buttons that are all on the same line, use this syntax:

   [% radio('secondTable', 'd_secondTable_uid', '1') %]

Radio buttons in vertical lists are created analogously:

   [% radio-list('secondTable', 'd_secondTable_uid', '1') %]

And an even more sophisticated option is to use a 'radio-table', which requires a forth parameter:

   [% radio-table('secondTable', 'd_secondTable_uid', '1', 'fieldA|fieldB|fieldC) %]

This will create a table with radio buttons down the left and data items taken from the secondTable matching the given fields in the columns on the right.

Radio tables will even work with junction tables; simply specify a junction table name in place of secondTable, and Eidetic will take care of the rest:

   [% radio-table('firstTable_to_secondTable_junction', 'd_secondTable_uid', '1', 'fieldA|fieldB|fieldC) %]

Conclusion

Many to many relationships are a very powerful relational database concept, and junction tables the standard, accepted way to implement them. Hand coding the algorithms to manage these relationships can be quite a challenge, but Eidetic provides a very easy to use mechanism that makes implementation of these sorts of systems quick to set up and easy to manage.

Now, while Eidetic has basic junction table support in it now, the story is not yet complete. Expect to see many further enhancements and revisions to come in future releases of this application, and if you discover any glitches as you make use of junction tables, please send me your patches! :-)

(*) Note: All instances of 'uid' may be changed to 'id' or similar in the future, and you can already select an alternate name by specifying it in the $idField variable in Eidetic.pm.