Web Techniques

Sunday, April 11, 2010

Codds commandments/rules

The Relational Model was propounded by Dr. E.F. Codd of IBM in 1972. For a package to qualify as an RDBMS, compliance with the 12 rules is required. In reality, all the 12 rules don’t carry the same degree of importance and some very good RDBMS products which exist today cannot even claim to obey more than 8 or so of these rules.

RULE 0

Any truly relational database must be manageable entirely through its relational capabilities

This is a single overall rule which tries to cover all others in a single sentence. It simply means that a RDBMS must be Relational, Wholly Relational and nothing but Relational.

Rule 1 : The Information Rule

All information is explicitly and logically represented in exactly one way – by data values in tables

It means that if an item of data does not reside somewhere in a table in the database, then it does not exist and this should be extended even to information such as tables, columns, views, column names, constraints and all other objects should be contained in a table-form

Rule 2: The rule of guaranteed access

Every item of data must be logically addressable by restoring to a combination of table name, primary key value and column name.

Though it is possible to retrieve individual data item in many different ways in a Relational/SQL environment, this rule means that any item can be retrieved by supplying the table name, the primary key values of the rows to easily search the row and the column name which is to be found or propogated.

Rule 3 : The systematic treatment of null values

It means that NULL values are representation of missing and inapplicable information. This support for null values must be consistent throughout the RDBMS and it should be independent of data type( a null value in a CHAR field must mean the same as null in an integer field)

Rule 4 : The database description rule

The description of the database is held and maintained using the same logical structures used to define the data, thus allowing users with appropriate authority to query such information in the same way and using the same language as they would query any other data in the database.

Rule 5 : The comprehensive sub-language rule

This means that the RDBMS must be completely manageable though it’s own sub-categories of SQL-language commands for enabling

Data-Definition(DDL)

Data-Manipulation(DML)

Data Authorisation(DCL)

Transaction boundaries(TCL)

Retrieval of data(DQL)

View definition(Describe)

Integrity constraints(Data Definition/Data Alteration)

Rule 6: The view updating rule

All views that can be updated in theory, can also be updated by the system

It simply means that only simple views are updatable, Complex views are non-updatable,(but can be made updateable through the use of INSTEAD OF TRIGGERS in ORACLE). A simple view is a view which is based on a single table, has non calculated columns,no group function, no group by clause, no distinct, no rownum, no rowid or any such pseudocolumns. Simple view are again updatable subject to non-violation of constraints at the base table level.

Rule 7 : The Insert and Update rule

A RDBMS has to be capable of Inserting, Updating and Deleting data as a relational set(that is more than one row in a single go). Many RDBMS fail to achieve this and hence fall-back to a single-record-at-a-time procedural technique when it comes to data manipulation.

Rule 8 : The Physical independence rule

Users access to the database, via application programs, must remain logically consistent whenever changes to the storage representation or access methods to the data, are changed.

Example, if an index is built and destroyed by the DBA on a table, any user should still retrieve the same data from that table, although a bit slowly.

Rule 9 : The Logical data independence rule

Application programs must be independent of changes made to the base tables, with minor changes in the corresponding queries or use of views(makes changes in the query of the view)

TAB1 FRAG1 FRAG2

A B C D A B A C D

1 A C E 1 A 1 C E

4 A C F 4 A 4 C F

6 B D G 6 B 6 D G

2 B D H 2 B 2 D H

It should be possible to split a table vertically into more than one fragment as long as such splitting preserves all the original data(is non loss) and maintain the primary key in each and every segment

FRAG1 FRAG2 TAB1

A B A C D A B C D

1 A 1 C E 1 A C E

4 A 4 C D 4 A C D

6 B 6 D G 6 B D G

2 B 2 D H 2 B D H

It should be possible to combine base tables into one by way of a non-loss join

Rule 10 : INTEGRITY RULES

The relational model includes two general integrity rules

Integrity Rule 1 (Entity Integrity):

If the attribute A of a relation R is a primary attribute of R, then A cannot accept null values

Integrity Rule 1 (Referential Integrity):

It is concerned with foreign key, that is, with attributes of a relation having domains that are those of the primary key of another/same relation.

Rule 11 : Distribution rule

A RDBMS must have distribution independence

This is one of the attractive aspects of RDBMS packages. It simply means that databases built on relational model are suited to client-server architecture.

Rule 12 : No Subversion rule :

If an RDBMS supports a lower level language that permits for example, row-at-a-time processing, then this language must not be able to bypass any integrity rules or constraints of the relational language.

Basic RDBMS TERMS

1) RELATION
A relationship is an association among several entities
eg:- A cusst_Acct relationship associates a customer with each account that she or he has.
Also, a table or an entity set is sometimes called a relation.
In both cases, a relation is a rectangular output consisting of row(tuples or records) and columns(attributes of fields)

2) REDUNDANCY
If same piece of information is stored in database for number of times the database is said to be redundant. We should check our database should not be redundant as it wastes disk space, reduced efficiency of database, require more processing time, and their are chances of inconsistency due to it in our database.
eg:-If we have to tables emp_details (contains details of employee) and Payroll(contains Payment details to employee), than if we include details of employee in payroll table, than it is said to be redundancy as same piece of information is repeated.

3) INCONSISTENCY
Inconsistency is various copies of the same data which is not consistent. Inconsistency occurs due to redundancy, so redundancy should be reduced.
eg:- If we have details of employee stored in emp_details and payroll table than while updating information we should check that both tables are updated or not, if we update the address of one employee in emp_details and same details is not updated in payroll table, than database is said to be in inconsistent state.

4) PROPAGATING UPDATES
Propagating updates ensures that changes made to records/data of one relation or tables, are automatically made to other tables or relations. This process is known as Propagating updates. Where the term "Updates" is used to cover all the operations of insertion, deletion and modification.
We can avoid inconsistency by using propagating update technique.

5)INSTANCES
The collection of information stored in the database at a particular moment in time is called an instances of the database.

6)SCHEMES
The overall design of the database is called the database scheme. Schemes are changed infrequently, if at all.

7)USERS
There are broadly three different types of database system users.

End Users
They are users who interact with the system by invoking one of the application programs. Thus, they are persons who use the information generated by a computer based system. Retrieval is the most common function for this class of user. Generally application user, data entry operators and report generators are come under this category.

Application Programmers
They are who prepare or code the application. Application programs operates on the data in all the usual ways: retrieving information, inserting new information, deleting or changing existing information.

Database Administrator
These are the highly skilled users responsible for the overall performance and maintenance of the database. They design the database schema, create the database, create users, grant and revoke rights from users and manage the database. Their responsibility includes recovery of database in case of failure and tuning the database for optimum efficiency.

D)KEYS concept in DBMS
A Key is a single attribute or combination of two or more attributes of an
entity set that is used to identify one or more instances of the set.

PRIMARY KEY:-A primary key is a field that uniquely identifies each record in a table. As it uniquely identify each entity, it cannot contain null value and duplicate value.

SUPER KEY :- If we add additional attributes to a primary key, the resulting combination would still uniquely identify an instance of the entity set. Such augmented keys are called superkey.
A primary key is therefore a minimum superkey.

CANDIDATE KEY:-A nominee's for primary key field are known as candidate key.

ALTERNATE KEY:-A candidate key that is not the primary key is called an Alternate key.

COMPOSITE KEY:- Creating a primary key are jointly from more than one attribute is known as composite key.

FOREIGN KEY:- Foreign key is a primary key of master table, which is reference in the current table, so it is known as foreign key in the current table. A foreign key is one or more columns whose value must exist in the primary key of another table or same table

Domain

A domain defines the possible values of an attribute. Domain Integrity rules govern these values. In a database system, the domain integrity is defined by:
The datatype and the length
The NULL value acceptance
The allowable values, through techniques like constraints or rules
The default value

For example, if you define the attribute of Age, of an Employee entity, is an integer, the value of every instance of that attribute must always be numeric and an integer. If you also define that this attribute must always be positive, the a negative value is forbidden. The value of this attribute being compulsory indicates that the attribute cannot be NULL. All of these characteristics form the domain integrity of this attribute.

Integrity
Attribute Constraints
There are two major Integrity rules/constraints
Entity Integrity Constraint: States that “Primary key cannot have NULL value”

Referential Integrity Constraint: States that “Foreign Key can be either a NULL value or should be Primary Key value of some relation.

Also
Semantic Integrity Constraint :General restrictions on data and changes to it. Example Salary must be greater than 20000

Domain Integrity Constraint : Each column contains same type of data thus when you select a data type for a particular domain then DBMS will not accept any value of other data type.

Database Constraints
They are constraints on the database that require relations to satisfy certain properties. Relations that satisfy all such constraints are legal relations.

The basic concept in a relational model is that of a relation(table or output of a Query). A relation can be viewed as a table which has the following properties :

1)It is column homogeneous. That is, in any given column of a table all items are of the same kind.
2)Each item is a simple number or a character string. That is, a table must be in the First Normal Form.
3)All the rows of a table are distinct
4)The ordering of rows within a table is immaterial
5)The columns in a table are assigned distinct names and the ordering of these columns is immaterial

Database Management System

It is a collection of programs that enables user to create and maintain a database. In other words it is general-purpose software that provides the users with the processes of defining, constructing and manipulating the database for various applications.

Database management systems frees the programmer from the need to worry about the organization and location of data. The primary goal of a DBMS is to provide an environment which is both convenient and efficient to use in retrieving and storing information.

DATABASE STORAGE HIERARCHY

1) DATABASE

A collection of data files integrated and organised into a simple comprehensive file system, which is arranged to minimize duplication of data and to provide convenient access to information within that system to satisfy a wide variety of user needs.

2) DATA FILES

Datafile is the file which actually physically keeps the data in secondary storage device. It represents the physical representation of the database.

3) DATA OBJECTS(Tables)

Objects such as tables, procedures, functions, etc, especially TABLES logically keep the data stored together. These objects may me spread across one or more physical data files. Tables or Relations are Entity Sets.

4) RECORDS

A collection of related items of data that are treated as a unit. It is the information about an entity.

eg:- An employee record would be collection of all fields of one employee.
Record is sometimes referred as tuple.

5) FIELD

Individual element of data is called Field.
eg:- Bank cheque consist of following field cheque no, date, payee, numeric amt, signature, bank,etc.

Field is sometimes referred as Data item Or column or attribute.

Introducing Relational Databases

Relational Model Paradigm

1) In a DBMS, relation between two files is created and maintained programmatically. Hence the relation is temporary.

In a RDBMS, relation between the two tables can be created at the time of table creation or later on, and hence it is permanent

2) In a DBMS, we have the concept of Fields, Records and Database Files

In a RDBMS, we have the concept of Column(attribute), Row(Tuple or Entity) and Table(Relation or Entity Class/Set or Applet)

3) IN a DBMS, Network traffic is high, since most of the processing is done at the client or requestors side.

In a RDBMS, network traffic is low, since most of the processing is done at the Server side

4) DBMS does not support the Client-Server architecture. It is a point-to-point communication between two machines

Most of the RDBMSs support the Client-Server Architecture

Example MS Access is a RDBMS, which does not support the Client-Server Architecture.

5) In a DBMS, we have file –level locking.

In a RDBMS, we have table, as well as Row-level locking

6) In a DBMS, there is no support for Distributed-Database Concept

In a RDBMS, there is support for a Distributed-Database Concept

7) In a DBMS. There is no Security of Data.

In a RDBMS, we have multiple levels of Data-Security like i)Logging in Security ii) Command Level Security iii)Object Level Security

8) In a RDBMS, we have the concept of Data-Dictionary(or Meta-Data or Data about Data) implemented in the form of System Tables and Views.

In a DBMS, there is no Data-Dictionary concept.

Object Oriented DBMS Model

IRIS is an Object Oriented DBMS under development at Hewlett Packard. It supports object model and normalized data.

A class is called a type and represents a collection of objects that share the same properties. A method is called a function and objects belonging to the same type share common functions. Objects respond only to their functions. Objects can be arranged in a hierarchy and inherit properties and functions

Object-Relational Model

Object-relational database management systems (ORDBMSs) add new object storage capabilities to the relational systems at the core of modern information systems. By encapsulating methods with data structures, an ORDBMS server can execute complex analytical and data manipulation operations to search and transform multimedia and other complex objects.

Relational Algebra

Relational Algebra is a procedural language which specifies how to get the required information and how to build a relation(set) from one or more other relations(sets). It specifies operations to be performed on existing relations(sets). The basic operations are the traditional set operations : UNION, DIFFERENCE, INTERSECTION and CARTESIAN PRODUCT. The first three operations, except Cartesian product, require that the operands(sets or relations or tables) be compatible(structurally same number, order and data-type of attributes) for the operation.

Consider the following two Operands(sets or relations or tables) which are Union compatible.

P Q

ID NAME ID NAME

101 Jones 103 Smith

103 Smith 104 Loly

104 Loly 106 Byron

107 Even 110 Drew

110 Drew

112 Smith

UNION operation

Hence R=P U(Union) Q

ID Name

101 Jones

103 Smith

104 Loly

107 Even

110 Drew

112 Smith

106 Byron

DIFFERENCE operation

Hence R=P-Q

ID Name

101 Jones

107 Even

112 Smith

R=Q-P

ID Name

106 Byron

INTERSECTION operation

R=P Q

ID Name

103 Smith

104 Loly

110 Drew

CARTESIAN PRODUCT

P Q

ID NAME DEPTNAME

101 Jones Accounts

103 Smith Medical

104 Loly IT

107 Even

R=P X Q

ID NAME DEPTNAME

101 Jones Accounts

101 Jones Medical

101 Jones IT

103 Smith Accounts

103 Smith Medical

103 Smith IT

104 Loly Accounts

104 Loly Medical

104 Loly IT

107 Even Accounts

107 Even Medical

107 Even IT

Additional Relational Algebraic Operations

Besides the above 4 basic set operations, PROJECTION, SELECTION, JOIN and DIVISION.

Projection(Л)

The projection of a relation(table or set) is defined as a projection of all its tuples(rows) over some set of attributes, i.e., yields a vertical subset of the relation. The projection operation is used to either reduce the number of attributes in the relation under consideration or to reorder its attributes

PERSONAL Projection

ID NAME AGE ID AGE

1 A 23 1 23

2 B 56 2 56

3 C 34 3 34

4 D 87 4 87

5 E 25 5 27

The projection operation can also be used to reduce the cardinality of the resultant relation, that is, due to deletion of duplicate rows, if any.

Selection(σ)

The selection operation yields a horizontal subset of a given relation, that is, it selects only some tuples(rows) of a given relation under consideration. To have a tuple(row) included in the resultant relation, the specified selection conditions or predicates must be satisfied.

PERSONAL Selection(Age<=50)

ID NAME AGE ID NAME AGE

1 A 23 1 A 23

2 B 56 3 C 34

3 C 34 5 E 25

4 D 87

5 E 25

Join(∞)

It allows combining of two or more relations to form a single new relation. The tuples(rows) from the operand relations that participate in the join operation and contribute to the resultant relation are related. Thus, the join operation allows the processing of relationships existing between the operand relations.

DEPT EMP

Deptno Dname Empno Ename Age Deptno

10 Accounts 1 A 23 20

20 Medical 2 B 43 20

30 IT 3 C 65 30

4 D 42 10

5 E 45 30

6 F 62 10

Considering the above two operand relations(tables), suppose we want to respond to the query “Get the Empno of all employees whose department name is Accounts”.

This requires first computing the Cartesian product of the DEPT and EMP relations. Let us name this Cartesian product as TEMP. This is followed by selecting those tuples(rows) of TEMP where the attribute Dname has the value “Accounts” and the value of the attribute Deptno of DEPT relation is equal to the value of the attribute Deptno of EMP relation. The required result is obtained by projecting these tuples on the attribute Empno. The operations are shown below :

Cartesian Product : TEMP=(DEPT X EMP)

Projection :Л_Empno(σ Dname=’Accounts’ ^ DEPT.deptno=EMP.deptno(TEMP))

Division(+)

Consider the following relation(table) P:

A B

a1 b1

a1 b2

a2 b1

a3 b1

a4 b2

a5 b1

a5 b2

Scenario 1

Now, if the relation Q is

Hence, R= P ÷ Q

Scenario 2

Now, if the relation Q is

Hence, R= P ÷ Q

Scenario 3

Now, if the relation Q is

Hence, R= P ÷ Q

Empty

Scenario 4

Now, if the relation Q is

Empty

Hence, R= P ÷ Q

In scenario 1, the result of dividing P by Q is the relation R and it has two tuples. For each tuple in R, its product with the tuples of Q must be in P. In our scenario, (a1,b1), (a1,b2),(a5,b1) and (a5,b2) must all be tuples in P.

In scenario 2, the Cartesian product of Q and R gives a resulting relation which is again a subset of P.

In scenario 3, since there are no tuples in P with a value b3 for the attribute B(that is selection_B=b3(P)=0), we have an empty relation R, which has a cardinality of zero

In scenario 4, the relation Q is empty. It is unusual to allow division by an empty relation. The resultant relation is projection of P on the attributes in P-Q.

Consider Q as set of properties. Each tuple in P represents an object with some given property. The resultant relation R is a set of entities/tuples that posses all of the properties specified in Q. The two entities a1 and a5 posses all the properties, b1 and b2. The other entities/tuples in P, a2, a3 and a4 only posses one, not both, of the properties.

The Division operation is useful when a query involves the phrase “for all the objects/tuples having all the specified properties”

Relational calculus simply provides a definition of a relation in terms of one or more other relations. In other words, it says what data is required, but not how it is actually retrieved. Hence its is non-procedural, as it is a query system wherein queries are expressed as variables and formulas on these variables.

Relation Operators

In a relational operator is a construct that tests some kind of relation between two entities . These include numerical equality (e.g., 5=5) and inequality (e.g., 4≥3). In programming languages that include a distinct boolean type in their type system, like java, these operators return true or false, depending on whether the conditional relationship between the two entities holds or not. In other languages such as C, relational operators return the integers 0 or 1.

Relational Operator Symbol

EQ Equal to =

NE Not equal to <> or # or !=

GT Greater than >

GE Greater than or equal to >=

LT Less than <

LE Less than or equal to <=

An expression created using a relational operator forms what is known as a relational expression or a condition.

Database Design / Application Lifecycle

Before a database design is set into motion, planning is an essential stage. This is typically the responsibility of management within the organisation. Good planning will enable the design and implementation to be successfully completed with the desired results. A general definition of the desired system should also be drawn up at this stage. This would include information such as what the database application should be able to do, to what areas it will be applied, and who will be using it.

Three main factors that should be analysed in the planning are

What work will need to be done

The resources needed to complete the work

How much it will all cost

Requirements Analysis and Collection

Information can be gathered using various techniques. Some of these are listed below:

Interviewing individuals

Observation

Examining documents

Using questionnaires

Using expert knowledge and experience from other design work

Interviewing many individuals can be time consuming but can result in quality feedback which can be used in the system design. Using questionnaires is a quicker and easier way to gain feedback from relevant people. It is hard to ensure that the answers people give are accurate though. Looking at the current documents and paperwork in circulation can be useful in determining Database Design

The database design stage will result in a database model that will hopefully support the organization’s goals. The two design approaches that can be used are Bottom Up or Top Down. The Top Down approach starts with a very general overview of the system and details are added in an iterative manner. Bottom Up design involves getting all the details first then constructing the overall design from the smaller more detailed parts.

The aim of a database design should be to represent all relevant data and the relationships that exist between data items. The database model should also allow for all necessary transactions on the data to be possible.

The following stages follow the database design stage:
DBMS Selection

Application Design

Prototyping

Implementation

Data Conversion and Loading (if data from an existing system needs to be put into the new database)

Testing

Operational Maintenance (follows the installation of the system)

Sunday, April 11, 2010

Codds commandments/rules

Basic RDBMS TERMS

Database Management System

Object Oriented DBMS Model

Web Techniques

Followers

Blog Archive

About Me