Saturday, January 24, 2004

Data Modeling

Database Systems 5e; Rob Cornel

Unit2


Chapter 2
  1. A Logical view of data
    1. Entities and Attributes
      Entity something that we need to collect data on.
      Entity Set named collection of entities
      Attributes characteristics of an entity
    2. Tables and Their characteristics
      Table a two dimensional structure composed of rows and columns, a group of related entities. Tables are the easiest way for us to visualize the data. Important characteristics of a relational table
      1. perceived as two dimensional structure composed of rows and columns
      2. Each table row (tuple) represents a single entity within the entity set
      3. Each table column represents an attribute, and each column has a distinct name
      4. Each table must have an attribute or combination of attributes that uniquely identifies each row
      5. All values in a column must conform to the same data format. Example: attribute is assigned integer format, all values must be integers
      6. Each column has a specific range of values known as the attribute domain
      7. The order of rows and columns is immaterial to the DBMS
      The various data types break down as follows:
      • Numeric: strictly numbers that can be manipulated by mathematics
      • character: text or numbers not for mathematics computations.
      • Date: Specific day, usually in the Julian calendar format
      • Logical: true or false, yes or no, 1 or 0
      Each table must have a primary key, something that uniquely identifies a row in the table. In can be one or a combination of attributes. A range of permissible values for an attribute is the domain.
  2. Keys
    A key is one or more attribute that uniquely identify an entity. There are other keys besides the primary key talked about already. Keys are used to find determination, this means that knowing the value of one item we can look up or determine the value of another attribute. It is usually written in the format A -> B. If multiples it would be A -> B,C,D. Related to this is functional dependence, where the value of B is functionally dependant of A if A determines or returns only one value of B. If it takes more that one key to show determination, then we have a composite key. Any part of this composite is a key attribute. A superkey is any key that determines each entity uniquely. A candidate key is the same but without redundancies. A superkey can be multiple keys even if one could stand by itself. the candidate key means it can stand by itself to give desired results. Null values, or values that have no value, are not allowed in a key value. A foreign key represents an attribute whose value matches the primary key in a related table. referential integrity is maintained if the foreign key contains a value that returns a valid row (tuple). A secondary key is one for retrieval purposes only. It helps in narrowing down a search if one does not know the primary key.
  3. Integrity Rules Revisited
    Database's should have entity integrity each entity is represented by a primary key. It should also have referential integrity, where a foreign key returns a unique entity. In some cases a dummy variable may be necessary for this, or a null value. For integrity it is better to use a dummy or flag variable as nulls cause problems with data integrity.
  4. Relational Database Operators
    Relational Algebra manipulating table contents using the eight relational operators.
    1. Union - combines all rows from two tables. (columns and domains must be identical or union compatible)
    2. Intersect - only the rows that appear in both tables. Must be union compatible
    3. Difference - all rows in one table that are not found in another table. Must be union compatible
    4. Product - all possible pairs of rows from two tables.
    5. Select - returns values for rows that mach specifications, or if none given will return all
    6. Project - returns all values for a selected attribute. Attributes can be combined.
    7. Join - Combines information in two tables that have a relationship. Multiple types of joins    
      1. Natural Join - it is broken into three steps
        1. a product of the two tables.
        2. This product fed through select so that only related columns displayed - this returns joined columns
        3.  
        4. A project is then done to return only non-redundant information
         
      2. equi Join - tables linked on basis of equality if not equal it is referred to as theta Join
      3. outer Join - matched pairs retained and others left null. Broken down into left and right
    8. Divide
  5. The Data Dictionary and the System Catalog
    The data dictionary is the information about the database. It contains at a minimum the all the attribute names and characteristics for each table in a system. This is defined as the metadata or data about the data. The system catalog would contain more detailed information about the database including the file structure, creator and other important data. It will check for homonyms in the database, same attribute names in different tables that mean different things. It will also check for synonyms, different names to describe the same attribute.
  6. Relationships Within the Relational Database
    To explain the relationships in a database we use entity relationship (E-R) models. There are two types, 1) Crow's feet and 2) Chen. Chen is better used for conceptual material, Crow's feet better for implementation. They have been explained in notes for last chapter and do not bear repeating here.
  7. Data redundancy Revisited
    Foreign keys will minimize data redundancy, but will not eliminate it altogether. Though we should strive to eliminate data redundancy there may be times when it is necessary evil. In a retail business it is possible that a sales invoice would result in certain price, but that price for a product may change the next day. Changing the price in the database would change the reflections of sales for that day. So even though line_price in the invoice may be the same as product_price in the inventory, it is best to separate them (though the relationship could be used to populate the fields upon initial use.
  8. Indexes
    An index is used to locate information in the database quicker. It wold be the equivalent of using the index in a book to find data rather than having to read the whole book to get the same data.

[Listening to: BVOV Radio 128 kbps Stereo Stream - Kenneth Copeland Ministries - (00:00)]

No comments:

Post a Comment