Relationships
Self-Referencing relationships
1Also known as recursive relationships, these are unary type relationships. These type of relationships occur when an entity can reference itself.
Recursive relationships occur often in hierarchies, and also in objects that are part of a whole. Below are some scenarios that could be modeled using recursive relationships:
- A course at a school or university can have 1 or more prerequisites.
- Organizations have people hierarchies (CEO,VP, Director, Managers, Employees).
- A finished product can be made of many sub products or parts.
- Also biological hierarchies(Dad, mom, siblings, son, etc..)
Let’s look at it in action, grab some Java mug, just an advice
We are going to solve this with the four steps as follows:
- Get a visual of the hierarchy (Extremely important).
- Create the database model.
- Fill it with some data.
- Write some queries.
Step #1: Get a visual of the hierarchy
The University in Illinois has one of the most reputable computer science programs in the world, so I have decided to model a very tiny section of their curriculum to illustrate self-referencing relationships. The following mind map represents two master level computer science classes and its prerequisites:
Step #2: Create the database model
The database table that will support this solution looks as follows:
Alert: Notice how the relationship line is optional at each end, you must do this otherwise the recursion would go in an infinite loop.
Step #3: Fill it with some data
The following screen shot shows all the data in the COURSE table that I have manually entered.
Step #4: Write some queries
You can write the queries in two styles:
1. By doing a query for each level of the hierarchy and combining them together by doing union all.
2. By writing a more generic query and providing an in clause. This step is not recommended if you have many rows in the table as it will drive you more insane than the first step.
Style #1: Query by “union all”:
We are going to do the first query technique to demonstrate what the data looks like for the master level course CS 414 (Refer to the course on the left in the mind map).
select c.id, c.course_code, c.course_name, c.course_category, c.next_course from course c inner join course d on c.next_course = d.id where c.id in (1) union all select c.id, c.course_code, c.course_name, c.course_category, c.next_course from course c inner join course d on c.next_course = d.id where c.id in (2) union all select c.id, c.course_code, c.course_name, c.course_category, c.next_course from course c inner join course d on c.next_course = d.id where c.id in (3) union all select c.id, c.course_code, c.course_name, c.course_category, c.next_course from course c inner join course d on c.next_course = d.id where c.id in (4)
The query above produces the resultset as follows:
Style #2: Query by “in” clause
Now we are going to write a shorter query that will demonstrate what the data looks like for the master level course CS 421 (Refer to the courses hanging to the right of CS 421 in the mind map).
To obtain the dependencies hanging to the left of CS 421 we could write the following query:
select c.id,
c.course_code,
c.course_name,
c.course_category,
c.next_course
from course c
inner join course d
on c.next_course = d.id
where c.id in (5,6,8,9)
This query will produce the resultset as follows:
To obtain the dependencies hanging to the right of CS 421 we could write the following query:
select c.id,
c.course_code,
c.course_name,
c.course_category,
c.next_course
from course c
inner join course d
on c.next_course = d.id
where c.id in (5,7,10,11,12,13)
This query will produce the resultset as follows:
Improvements
Now that I have finished writing the post i realize that I could improve the queries by replacing the next_course column with a next_course_name column. If you want to do that comment out the c.next_course column from the select clause and replace it with d.course_name as next_course_name.
Best Practices
The best practice is to have a collection of views that give you all the data representation that you need to have for each hierarchy. That way you only have to pull your hair once and hope that no reorganization is going to happen (EVIL LAUGH HERE) . If it does, then you will have to go through a similar painful exercise again.
Conclusion: Recursive relationships can be very tedious and difficult to implement. Hence, the bullet points below:
- documentation to understand the recursion is very important.
- A Mind Map is the best way to document a recursive relationship.
- A business group should be responsible for owning that document and passing it along to IT staff. This document should be revised at least once a year (or more) to ensure the hierarchy has not changed.
Without the mind map I would have not had the right state of mind to solve this problem. The tool I use for mind mapping is called Novamind (it is not free though) but very good for learning, brainstorming, and other purposes.
Many-to-Many Relationships
1This topic deserves special attention because many people from database designers to application developers (sometimes one person does both) do not get it right. A Many-to-Many relationship occurs when each instance from the entity on the left can have many instances from the entity on the right, and vice versa.
Some examples of many-to-many relationships are as follows:
- Each order can contain many items. Each item can also belong to many orders as it can be purchased by different customers.
- Each student can be signed up for many coursers. Each course can be assigned to multiple students.
Incorrect implementation of many-to-many relationships
We see this problem when people are trying to use foreign keys to solve this problem. Let’s illustrate it with the second example listed above as follows:
What we are trying to say here is that each student can take many courses and that each course can be taken by many students. So to implement this we have put a FK to the student table which will point to the course table.
Do you see any problems with this?
- Yes!!! we will have repeated student rows for each different course a student is taking.
- At the moment you do this your database would no longer be in 3NF because we would have repeated groups.
- This issue above can potentially yield to update anomalies.
- This relationship is said to be unresolved.
Correct implementation of many-to-many relationships
In order to correct the many-to-many implementation above we have to resolve relationship. This is done as follows:
- By adding another table between the two entities; this table is called associative entity or intersection table.
- By making the PK of the associative entity a composite key that consist of each of the parent’s PK (We are using Identifying relationship).
- By naming the intersection table as follows (not required by DBMS, but suggested): ENTITYA_ENTITYB. In our case STUDENT_COURSE or STUDENT_COURSE_REL.
- By Changing the cardinality of the parent tables to be One-to-Many to the associative entity.
Let’s see it in action:
What improvements do you see?
- The student is defined once so we avoid duplicate data.
- The courses are defined once and we avoid the same problem as if we would have put the FK in the course table in scenario #1.
- We do not run the risk of update anomalies.
- We can add meaningful fields in the associative entity. Those are called the Fixed Intersection data.
Alert: The FID will become your FACT tables and the parent entities would become your DIMENSION tables in OLAP database structures. Dimensional modeling is a different modeling technique which uses a lot of denormalization so that you can yield better query performance as data is only 1 join away at most.
One-to-Many Relationships
0A One-to-Many relationship occurs when an entity can have multiple instances of another entity in a relationship. The cardinality of the relationship between the two entities will dictate the number of those instances: 0, 1, or more. The entity on the many side must reference one and only one record in the parent entity.
Implementation
You have to bring the primary key of the parent entity (the one side of a relationship) as a foreign key to the child entity (the many side of the relationship). If your primary key on the parent entity is composed of more than one attribute then the foreign key of the child entity should also have those attributes.
There are two ways of implementing One-to-Many relationships. They are as follows:
- By using Non-Identifying Foreign key.
- By using Identifying Foreign key.
Approach #1: Using Non-Identifying Foreign key
A non-identifying relation happens when the foreign key in the dependent entity is not part of its own primary key; this is the most common approach. In the example below we have a school/student relationship with the following details:
- Business Rules: Each school can have zero or more students.
Each student must attend one school only. - Parent entity: SCHOOL
- Child or Dependent entity: STUDENT
- 1 to 1 side: SCHOOL
- many side: STUDENT
- Parent to child phrase: Each school can have 1 or more students
- Child to parent phrase: Each student must belong to a school
Approach #2: Using Identifying Foreign key
An identifying foreign key relation happens when the foreign key in the dependent entity is part of its own primary key. This approach is less popular, but perfectly valid. Let’s see in action:
- Business Rules: Each student can take many different tests during the school year.
Each test can only be taken once during a school year, and each test must belong to 1 student only. - Parent entity: STUDENT
- Child or Dependent entity: TESTS
- 1 to 1 side: STUDENT
- many side: TESTS
- Parent to child phrase: Each student can take 1 or more students
- Child to parent phrase: Each test must belong to a student
As you see in this approach the FK is part of the primary key in the dependent entity. Also by using identifying relationship here we ensure that one particular test can only be taken once by a given student each school year. So if the student was allowed to take the same test more than once in a given school year we would have to follow the non identifying foreign key approach.
Note: The students/tests relationship could have been better implemented as a many-to-many relationship since many students can take the same test. This style was just done to illustrate the identifying foreign key concept.
One-to-One Relationships
0A One-to-One relationship is a binary relationship. It occurs when one record in one entity references one record in the second entity.
Some modelers suggest that One-to-One relationships should be combined into a single entity, and that is usually considered best practice.
Implementation
The implementation is a little tricky; in order to enforce the semantics of a One-to-One relationship and avoid implementing a One-to-Many relationship we will have to put a unique index constraint in the foreign key.
In the diagram above, the STUDENT_ID foreign key in the STUDENT_DETAILS entity has a unique constraint which forces the DBMS to have only one row with that value. Without the unique constraint we would be allowing the database to have repeated STUDENT_ID values in the STUDENT_DETAILS entity and the relationship line would be out of synch with the actual implementation.
Alert:
When not to combine both entities:
- If you want better separation of information.
- If the two entities are managed by different owners.
- If the information in the referenced entity is optional and it can be null.
- If you anticipate the One-to-One relationship potentially becoming a One-to-Many.
Alert:
When to combine both entities:
- If you do not want to have an extra join in your queries.
- If the two entities are managed by the same group.
- If the information in the referenced entity is optional and it can be null (You would have come up with default values or allow for nulls).
- If you are completely sure that it will never become a One-to-Many.
Anatomy of a database relationship
0Every relationship is said to have cardinality. The cardinality is the degree of the relationship. There are two types of degrees:
- Optionality: Is the minimum degree of the relationship and it can be of two types: Optional (circle) or Mandatory (perpendicular line).
- Cardinality: Is the maximum degree of the relationship and it can be represented with a crow’s feed for (many) or omitted for one.
Also every relationship has two sentences (or at least you should make an effort of typing them for readability and understanding purposes). The relationship phrases are as follows:
- Parent to child phrase.
- Child to parent phrase.
The following diagram illustrates the anatomy of a database relationship.
As you can see it is very important to determine the cardinality of the relationships in your data model as that will dictate your business rules. One last note, the dependent entity is called the child, and the referenced table is called the parent.
Database Relationships
5In the database world the lines that connect the different entities (tables) together are called relationships; these are the relationships that occur in real life between entities as part of a process. Relationships in the database are very important because can play a significant role in the areas as follows:
- Data quality
- Non repeating groups
- business rules enforcement
- Communication on how the business works (or better said how it has been implemented) to IT staff and business users with some modeling skills.
Every RDBMS out there supports four types of relationships. They are as follows:
- One-to-One relationships.
- One-to-Many relationships.
- Many-to-Many relationships.
- Self referencing relationships (aka recursive relationships)
Relationships can be between 1 or more tables. Here is some nomenclature:
| Number of tables in relationship | Tecnical term | Friendly term |
|---|---|---|
| 1 | Unary | Recursive relationship |
| 2 | Binary | Two table relationship |
| 3 | Ternary | Three table relationship |
| 4 or more | N-ary | Four or more tables relationship |














