Paradigms
Update Anomalies and Normalization
3In the database world “update” is a very famous word, one that can make people swallow some saliva in some instances (evil laugh here) . Update anomalies is one of the many computer science paradigms, just so that you can get a feel for the magnitude of the word “update” it also means “insert” and “delete”. Update anomalies in the database space occur as a side effect of having redundant data and a poor normalized data model. The “update” anomalies are as follows:
- Insert anomalies
- Update anomalies
- Delete anomalies
To illustrate these issues we are going to use the database model and its data shown below:
STUDENT_ID STUDENT_FIRST_NAME STUDENT_LAST_NAME
1 Dario Pardo
3 Zinedine Zidane
4 Ronaldinho Gaucho
STUDENT_ID COURSE_CODE COURSE_NAME INSTRUCTOR COURSE_CREDITS
1 1 Data Modeling Graeme Simsion 4
1 2 Analysis Algorithms Weiss 4
3 2 Analysis Algorithms Weiss 4
4 2 Analysis Algorithms Weiss 4
Insert anomalies
What problems do you see?
- Repeated information (course code, course name, instructor, credits).
- Because of the first problem we cannot identify a primary key since every piece of data is repeated across multiple rows.
- Because of the second problem, we have create a composite key that consists of the student id and the course code (This will yield to the insertion anomaly).
The anomaly
The insert anomaly happens when we want to create a course but we do not have a student assigned to it yet. As you can see because of this horrible design we cannot add a new course unless we create a dummy record (in this case a dummy record in the student table) that would be assigned to every new course not available to valid students yet.
Do you see how ugly this solution is?
Do you see how this might be complicated to understand by other developers and business users?
Can you also imagine the ugly code that you will have to do at the application level to filter in/out courses not available yet?
Solution
Normalization. Redesign the database model at least to Third Normal Form.
Update anomalies
What problems do you see?
- Repeated groups of data (Course related and instructor related)
The anomaly
The update anomaly here would happen when we try to update any course related information or any instructor related information information. Say for example that we change the course name “Analysis Algorithms” to “Fundamentals of Algorithms” and that the update only happens on one row instead of on every row possible, we would have left the database in an inconsistent state.
Solution
Normalization. Redesign the database model at least to Third Normal Form.
Delete anomalies
What problems do you see?
- Repeated groups of data (Course related and instructor related)
The anomaly
The delete anomaly here would happen when we try to delete a student . It makes sense that if we delete a student, we also delete any associated rows in the course assignment for that student no? Let’s see what happens if we delete Dario Pardo which is assigned to Data Modeling class among others. Since nobody else is assigned to the Data Modeling class and that was the only row that had information about that class as well as the instructor we would have lost all the data about the course and the instructor. Now…depending on how critical and what type of information that was you or other people above you could loose their job over this; I am not kidding (no evil laugh here)
Solution
Normalization. Redesign the database model at least to Third Normal Form.
A normalized data model
This is a mini solution (it definitely could be much more elaborated, e.g self referencing relationship on courses to handle prerequisites) but for now it will get us away from the update anomalies problem by redesigning the database towards third normal form. Student, Staff, and Courses have been separated out into its own entities and have been connected through associative entities to resolve the many to many relationships to avoid redundant data. With this new design we can do as follows:
- Define new courses without creating a dummy record like we explained in the insertion anomaly.
- Updating course or teacher information now only happens once and touching 1 row only thanks to referential integrity. So we do not have to worry on any possible left outs like explained in the update anomaly.
- Deleting a student and its associated rows in the student_schedule table will not actually delete the course information like explained in the deletion anomaly.





