Entity-relationship model
Introduction
So far in the course, we have studied a lot about extracting useful information from an already-existing
- database. However, we have only barely brushed up against getting data into a database. The next
segment of this course will focus on this latter problem: how to design a useful database schema, and how to load (and alter) data in a database. One of the most important steps in creating a database happens before ever touching a database engine: how to model the slice of the real world that the database will eventually represent. A good database design balances detail and simplicity, providing sufficient detail to answer the kinds of questions that will be asked while providing a simplified view of the world that makes those questions reasonably easy to ask and the answers reasonably easy to interpret. To give a concrete example, suppose we had been tasked to create a database back-end for a CAD (computer-aided design) tool that specializes in the automotive industry. Clearly, the kinds of “car database” examples we’ve used in class would be woefully inadequate: a CAD engineer will not care in the slightest about VIN numbers and license plates (other than where they will be found in the final product); the engineer will care very much about the engine and wheels, but will demand far more detail than anything you would find in a car lot advertisement. At the other extreme, a molecule- accurate model of the car is also not useful to a car designer: nobody wants to manipulate individual molecules and atoms when designing something you can see with the naked eye! Instead, a useful database would probably store a collection of polygons in 3-D space, with adequate support to find and group polygons in various ways (by the part they correspond to, the location relative to a virtual camera’s field of view, etc.). Since very few database designers are CAD experts, any real database design for our imaginary application would have to start by exploring with tool developers what, precisely, they need the database to provide for them; similarly, since the CAD tool developers are not database experts, they would also need some guidance in expressing what they want in terms of what a database engine can possibly do (we really can’t do 3-D rendering, for example). In practice, this early design process is iterative, and proposed designs can change rapidly as people explore different options. Some form of visual representation, or model, of the problem is extremely useful to facilitate the design process and allow participants to express their thoughts clearly; we will focus on a particular model known as the Entity-Relationship Model (ERM).