IS0: Relational Languages (gerp/IS0/sheets/IS0_Relationele_Algebra_SQL2.pdf)

Page 1

IS0: Relational Languages

Relational Languages

- Relational Algebra

- SQL (Structured Query Language) I (’86) and II (’92) (we look only at a few ‘extra’ aspects of SQL ’92)

Literature: Halpin's book, chapter 11. Lecture notes: several sections on SQL (I and II)

Page 2

Relational Algebra

Studying this algebra first clarifies the basic query operations without getting distracted by the specific syntax of commercial query languages.

In the relational model of data, all facts are stored in tables (or relations).

New tables may be formed from existing tables by applying operations in the relational algebra.

The original relational algebra defined by Codd contained eight relational operators: four based on traditional set operations (union, intersection, difference, and Cartesian product) and four special operations (selection, projection, join, and division).

Each of these eight relational operators is a table-forming operator on tables.

Relational algebra includes six comparison operators (=, <>, <, >, <=, >=). These are proposition-forming operators on terms. For example, x <> 0 asserts that x is not equal to zero.

It also includes three logical operators (and, or, not). These are proposition-forming operators on propositions (e.g., x > 0 and x < 8).

Since the algebra does not include arithmetic operators (e.g., +) or functions (e.g., count), it is less expressive than SQL.

Page 3

Union (“A ∪ B” or “A union B”)

Two tables are union-compatible if and only if they have the same number of columns, and their corresponding columns are based on the same domain (the columns may have different names).

The union of tables A and B is the set of all rows belonging to A or B (or both).

Page 4

Intersection (“A ∩ B” or “A intersect B”) and Difference (“A - B” or “A minus B” or “A except B”)

The intersection of tables A and B is the set of rows common to A and B.

The difference between tables A and B is the set of rows belonging to A but not B.
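The three set operations can be tried out with a small sketch in Python, modelling a table as a set of row tuples (so duplicate rows cannot occur). The two tables and their contents are hypothetical; both have the same number of columns over the same domains, so they are union-compatible.

```python
# Tables modelled as sets of rows (tuples). The data is hypothetical.
A = {(1, "Ann"), (2, "Bob"), (3, "Cas")}
B = {(2, "Bob"), (3, "Cas"), (4, "Dee")}

union = A | B          # rows belonging to A or B (or both)
intersection = A & B   # rows common to A and B
difference = A - B     # rows belonging to A but not B

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```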

Page 5

Cartesian Product (Unrestricted Join) (“A × B” or “A times B”)

If A and B are tables, A × B is formed by pairing each row of A with each row of B. (Here “pairing” means “prepending”.)

The number of rows in A × B is the product of the number of rows in A and B.

The number of columns in A × B is the sum of the number of columns in A and B.

The operations of union, intersection and Cartesian product are associative (e.g., A × (B × C) = (A × B) × C). Difference is not associative (e.g., A - (B - C) is not equal to (A - B) - C).
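As a minimal sketch of these counting rules, the following Python fragment builds a Cartesian product of two hypothetical tables (sets of tuples) and checks the row and column counts. It also gives one counterexample showing that difference is not associative; the sets P, Q and R are made up for the purpose.

```python
from itertools import product

# Hypothetical tables: A has 3 rows and 1 column, B has 2 rows and 2 columns.
A = {(1,), (2,), (3,)}
B = {("x", "y"), ("u", "v")}

# Pair each row of A with each row of B by prepending the A row to the B row.
AxB = {a + b for a, b in product(A, B)}

rows_ok = len(AxB) == len(A) * len(B)        # row count is the product
cols_ok = all(len(r) == 1 + 2 for r in AxB)  # column count is the sum

# Difference is not associative: one counterexample is enough.
P, Q, R = {1, 2, 3}, {2, 3}, {3}
left = P - (Q - R)
right = (P - Q) - R
print(rows_ok, cols_ok, left == right)
```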

Page 6

What is wrong with this Cartesian Product?

Table aliases are needed to multiply a table by itself:

Page 7

Codd's table operations: a) Relational Selection

The selection operation chooses just those rows that satisfy a specified condition.

The selection operation may be specified using an expression of the form T where c. T denotes a table expression (i.e., an expression whose value is a table) and c denotes a condition.

The “where c” part is called a where clause.

The alternative notation σ_c(T) is often used in academic journals. The “σ” is sigma, the Greek s (which is the first letter of “selection”).

Page 8

Examples

Unless parentheses are used, and is evaluated before or.

Two equivalent queries; if in doubt, include parentheses.

Three equivalent queries.

De Morgan’s laws:
not (p and q) ≡ not p or not q
not (p or q) ≡ not p and not q
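The precedence rule and De Morgan's laws can be checked mechanically. The sketch below models selection ("T where c") as a filter over a hypothetical Player table; the table, its columns and the helper name `where` are assumptions made for illustration. Python happens to share the rule that `and` binds more tightly than `or`.

```python
from itertools import product

# Hypothetical table: each row is a dict keyed by column name.
Player = [
    {"name": "Ann", "sex": "F", "height": 170},
    {"name": "Bob", "sex": "M", "height": 182},
    {"name": "Cas", "sex": "M", "height": 168},
]

def where(table, cond):
    """Selection: keep just those rows that satisfy the condition."""
    return [row for row in table if cond(row)]

# Without parentheses, 'and' is evaluated before 'or' ...
q1 = where(Player, lambda r: r["sex"] == "F" or r["sex"] == "M" and r["height"] > 180)
# ... so this parenthesized form is equivalent.
q2 = where(Player, lambda r: r["sex"] == "F" or (r["sex"] == "M" and r["height"] > 180))

# De Morgan's laws, checked over all four truth assignments.
de_morgan_holds = all(
    (not (p and q)) == ((not p) or (not q))
    and (not (p or q)) == ((not p) and (not q))
    for p, q in product([False, True], repeat=2)
)
print([r["name"] for r in q1], q1 == q2, de_morgan_holds)
```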

Page 9

Codd's table operations: b) Relational Projection

Relational projection involves choosing one or more columns from a table, and then eliminating any duplicate rows that might result.

We may represent the projection operation as T [ a, b, ... ] where T is a table expression and a, b, ... are the names of the required columns.

The alternative notation π_{a,b,...}(T) is common in academic journals. The “π” is pi, the Greek p (the first letter of “projection”).

Projection involves picking the columns and removing any duplicates.
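A minimal sketch of projection in Python, assuming a table is a list of row dicts. The Person table and the helper name `project` are hypothetical. The point to notice is the second step: after the columns are picked, duplicate rows are removed.

```python
# Hypothetical table with some repeated (sex, starsign) combinations.
Person = [
    {"name": "Ann", "sex": "F", "starsign": "Aquarius"},
    {"name": "Bea", "sex": "F", "starsign": "Aquarius"},
    {"name": "Cas", "sex": "M", "starsign": "Gemini"},
]

def project(table, *cols):
    """Projection T[a, b, ...]: pick the named columns, then drop duplicate rows."""
    seen, result = set(), []
    for row in table:
        key = tuple(row[c] for c in cols)
        if key not in seen:          # keep only the first copy of each row
            seen.add(key)
            result.append(dict(zip(cols, key)))
    return result

by_sex_sign = project(Person, "sex", "starsign")
print(by_sex_sign)
```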

Page 10

Codd's table operations: c) Relational Joins

We now consider the relational join operator between two tables, which compares attribute values from the tables, using the comparison operators (=, <, >, <>, <=, >=).

There are several kinds of join operations, and we discuss only some of these here.

Let θ (theta) denote any comparison operator (=, <, etc.). Then the θ-join of tables A and B on attributes a of A and b of B equals the Cartesian product A × B, restricted to those rows where A.a θ B.b. We write this as shown below; an academic notation is shown in braces.

A × B where c { or A ⋈_c B }

The condition c used to express this comparison of attributes between tables is called the join condition. The join condition may be composite (e.g., a1 < b1 and a2 < b2 ).

With most joins, the comparison operator used is = . The θ-join is then called an equijoin.

Examples: A × B where A.a = B.b

or: A × B where a = b { if a and b each occur in only one of A, B }
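The definition of the θ-join translates directly into code: form the Cartesian product, then keep the rows that pass the comparison. The following Python sketch takes the comparison operator as a parameter. The helper name `theta_join`, the "A."/"B." column prefixes and the sample rows are assumptions made for illustration.

```python
import operator
from itertools import product

# Hypothetical tables as lists of row dicts.
A = [{"a": 1, "x": "p"}, {"a": 2, "x": "q"}]
B = [{"b": 2, "y": "r"}, {"b": 3, "y": "s"}]

def theta_join(A, B, a, b, theta=operator.eq):
    """Rows of A x B where A.a theta B.b; result columns are qualified."""
    return [
        {**{f"A.{k}": v for k, v in ra.items()},
         **{f"B.{k}": v for k, v in rb.items()}}
        for ra, rb in product(A, B)      # Cartesian product
        if theta(ra[a], rb[b])           # join condition
    ]

equi = theta_join(A, B, "a", "b")               # equijoin: A.a = B.b
less = theta_join(A, B, "a", "b", operator.lt)  # theta-join with <
print(equi)
print(len(less))
```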

Page 11

Base for the following examples

The ORM schema (a) maps to the relational schema (b).

An example population for the schema.

Page 12

Example: an equijoin

If the matching columns actually refer to the same thing in the UoD (and they typically do), then one of these columns is redundant. In this case we lose no information if we delete one of these matching columns (by performing a projection on all but the column to delete).

Page 13

Example: from equijoin to natural join (a)

If the columns used for joining have the same name in both tables, then the unqualified name is used in the join result.

The resulting table is then said to be the natural inner join of the original tables.

Since ‘inner’ is assumed by default, the natural inner join may be expressed simply as “natural join”. This is by far the most common join operation in practice.

It may be written as: A ⋈ B, or in words: “A natural join B”

To compute A ⋈ B:
Form A × B
For each column name c that occurs in both A and B:
    Apply the restriction A.c = B.c
    Remove B.c
    Rename A.c to c

Note that “⋈” looks like a cross “×” with two vertical lines added, suggesting that a natural join is a Cartesian product plus two other operations (selection of rows with equal values for the common columns, followed by projection to delete redundant columns).
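The recipe for computing a natural join can be sketched in a few lines of Python. Tables are lists of row dicts; the helper name `natural_join` and the sample rows are hypothetical (the table and column names are borrowed from the Employee/Drives example used later in these slides).

```python
from itertools import product

# Hypothetical populations; empNr is the only common column.
Employee = [
    {"empNr": 1, "empName": "Ann"},
    {"empNr": 2, "empName": "Bob"},
    {"empNr": 3, "empName": "Cas"},
]
Drives = [
    {"empNr": 1, "carRegNr": "ABC123"},
    {"empNr": 1, "carRegNr": "XYZ789"},
    {"empNr": 3, "carRegNr": "DEF456"},
]

def natural_join(A, B):
    """Natural join of two tables given as lists of row dicts."""
    if not A or not B:
        return []
    common = [c for c in A[0] if c in B[0]]
    result = []
    for ra, rb in product(A, B):                  # form A x B
        if all(ra[c] == rb[c] for c in common):   # restriction A.c = B.c
            row = dict(ra)                        # A.c is kept under the plain name c
            row.update({k: v for k, v in rb.items() if k not in common})  # B.c removed
            result.append(row)
    return result

result = natural_join(Employee, Drives)
print(result)
```

If the two tables have no common column, the restriction is vacuous and this sketch returns the full Cartesian product.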

Page 14

Example: from equijoin to natural join (b)

The duplicate column is removed by projection on the equijoin.

A natural join.

Page 15

Example: with natural joins

Tables being joined may have zero, one, or more common columns.

Examples:

Account has a composite identification scheme.

Page 16

Example: with natural joins (cont.)

Sample population for the bank schema:

Two queries using natural joins:

Page 17

Other join types

In rare cases, comparison operators other than equality are used in joins.

As a simple example: Suppose we want a list of drinker-smoker pairs, where the drinker and smoker are distinct persons.

Other kinds of joins can be defined. For example, left, right, and full outer joins are used to include various cases with null values. An outer join is basically an inner join, with extra rows padded with nulls when the join condition is not satisfied. For example, Client left outer join AcUser includes a row to indicate that client 8005 has the name “Shankara, TA” but uses no account (branchNr and accountNr are assigned null values on this row).
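The padding behaviour of a left outer join can be sketched as follows. Only the row for client 8005 comes from the text; the other client, the AcUser row and the helper name `left_outer_join` are hypothetical. Python's `None` stands in for null.

```python
Client = [
    {"clientNr": 1001, "clientName": "Example, A"},    # hypothetical row
    {"clientNr": 8005, "clientName": "Shankara, TA"},  # uses no account
]
AcUser = [
    {"branchNr": 10, "accountNr": 54, "clientNr": 1001},  # hypothetical row
]

def left_outer_join(A, B, col):
    """Join A and B on the common column col, keeping unmatched rows of A.

    Assumes B is non-empty, so that its column names can be read off B[0].
    """
    b_only = [k for k in B[0] if k != col]
    result = []
    for ra in A:
        matches = [rb for rb in B if rb[col] == ra[col]]
        if matches:   # the inner-join part
            result += [{**ra, **{k: rb[k] for k in b_only}} for rb in matches]
        else:         # no match: pad B's columns with nulls
            result.append({**ra, **{k: None for k in b_only}})
    return result

joined = left_outer_join(Client, AcUser, "clientNr")
print(joined[-1])
```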

Page 18

Codd's table operations: d) Relational Division

A table A is divisible by another table B only if A has more columns. Let B have n columns.

The operation A ÷ B (A divide-by B) is defined if and only if the domains of the last n columns of A match the domains of the columns in B (in order). Examples:
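As a hedged illustration of what division computes: A ÷ B returns each combination of values for the leading columns of A that is paired in A with every row of B. The sketch below uses sets of tuples; the Speaks population and the helper name `divide` are hypothetical.

```python
# Hypothetical population of Speaks(country, language).
Speaks = {
    ("Australia", "English"),
    ("Belgium", "Dutch"), ("Belgium", "French"),
    ("Canada", "English"), ("Canada", "French"),
    ("Dominica", "English"), ("Dominica", "French"),
}
Wanted = {("English",), ("French",)}   # the divisor: one column, two rows

def divide(A, B):
    """A / B for tables as sets of tuples; B's n columns match A's last n columns.

    Assumes B is non-empty.
    """
    n = len(next(iter(B)))
    heads = {row[:-n] for row in A}   # candidate results: A's leading columns
    return {h for h in heads if all(h + b in A for b in B)}

result = divide(Speaks, Wanted)
print(sorted(result))
```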

Page 19

Query strategies (in relational algebra . . . )

• Phrase the query in natural language, and understand what it means.
• Which tables hold the information?
• If you have table data, answer the query yourself, then retrace your mental steps.
• Divide the query up into steps or subproblems.
• If the columns to be listed are in different tables, declare joins between the tables.
• If you need to relate two or more rows of the same table, use an alias to perform a self-join.

Example: how can the next query be formulated in relational algebra?

Page 20

Solution: query in relational algebra

(Account where balance > 700 ) [ branchNr, accountNr ]

⋈ AcUser

⋈ Client [ clientNr, clientName ]

This is not the only way we could express the query.

For instance, using a top-down approach, both joins could have been done before selecting or projecting, thus:

( Account ⋈ AcUser ⋈ Client )

where balance > 700

[ clientNr, clientName ]

Although these two queries are logically equivalent, if executed in the evaluation order shown, the second query is less efficient because it involves a larger join.

Relational algebra can be used to specify transformation rules between equivalent queries to obtain an optimally efficient, or at least more efficient, formulation.

SQL database systems include a query optimizer to translate queries into an optimized form before executing them.
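To make the comparison of the two formulations concrete, here is a small Python sketch that evaluates both on a made-up population and compares the size of the first join each one performs. All data and the helper names are hypothetical, and the sketch reads the final projection [ clientNr, clientName ] as applying to the whole joined result.

```python
def where(T, cond):
    return [r for r in T if cond(r)]

def project(T, *cols):
    seen = {}
    for r in T:
        seen.setdefault(tuple(r[c] for c in cols), None)  # dedupe, keep order
    return [dict(zip(cols, key)) for key in seen]

def njoin(A, B):
    common = [c for c in A[0] if c in B[0]] if A and B else []
    return [{**ra, **rb} for ra in A for rb in B
            if all(ra[c] == rb[c] for c in common)]

# Hypothetical populations for the bank schema.
Account = [{"branchNr": 10, "accountNr": 54, "balance": 3000.00},
           {"branchNr": 10, "accountNr": 77, "balance": 500.55},
           {"branchNr": 23, "accountNr": 54, "balance": 1000.00}]
AcUser = [{"branchNr": 10, "accountNr": 54, "clientNr": 1001},
          {"branchNr": 10, "accountNr": 54, "clientNr": 1002},
          {"branchNr": 10, "accountNr": 77, "clientNr": 2013},
          {"branchNr": 23, "accountNr": 54, "clientNr": 7654}]
Client = [{"clientNr": n, "clientName": f"Client {n}"}
          for n in (1001, 1002, 2013, 7654, 8005)]

# First formulation: select and project before joining.
rich = project(where(Account, lambda r: r["balance"] > 700), "branchNr", "accountNr")
small_join = njoin(rich, AcUser)
q1 = project(njoin(small_join, Client), "clientNr", "clientName")

# Second formulation: join everything first, then select and project.
big_join = njoin(Account, AcUser)
q2 = project(where(njoin(big_join, Client), lambda r: r["balance"] > 700),
             "clientNr", "clientName")

print(q1 == q2, len(small_join), len(big_join))
```

On this population both formulations return the same clients, but the first join handles fewer and narrower rows when the selection is applied early.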

Page 21

Relational Database Systems / SQL

We may now define a relational DBMS as a DBMS that has the relational table as its only essential data structure, and supports the selection, projection and join operations without needing specification of physical access paths.

A relational system that supports all eight table operations of the relational algebra is said to be relationally complete. This doesn’t entail eight distinct operators for these tasks; rather, the eight operations must be expressible in terms of the table operations provided by the system.

For version 1 of the relational model, Codd proposed 12 basic rules to be satisfied by a relational DBMS.

Version 2 of the relational model as proposed by Codd (1990) includes 333 rules …

In 1992 the SQL standard was substantially improved …

The latest standard SQL:1999 ….

Widely used RDBMSs: Oracle, DB2, Microsoft SQL Server, Ingres/Postgres, MS Access, FoxPro, Paradox, ….

Page 22

SQL: Choosing Columns, Rows, and Order [where ...]

Equivalent formulations in relational algebra and SQL:

Relational algebra:
    T where c
    [ a, b, … ]
SQL:
    select distinct a, b, … from T
    where c

Relational algebra:
    T where c
SQL:
    select * from T
    where c

A where clause is used to select just the Aquarians.
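The role of `distinct` can be seen by running the SQL forms against a small in-memory SQLite database from Python. This is only a sketch: the Person table, its columns and its rows are hypothetical, and SQLite stands in for whichever SQL system is actually used.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table Person (firstname text, sex text, starsign text)")
con.executemany(
    "insert into Person values (?, ?, ?)",
    [("Fred", "M", "Gemini"), ("Bob", "M", "Aquarius"),
     ("Sue", "F", "Aquarius"), ("Tom", "M", "Aquarius")],  # hypothetical rows
)

# T where c: all columns of the selected rows.
aquarians = con.execute(
    "select * from Person where starsign = 'Aquarius' order by firstname"
).fetchall()

# Without distinct, SQL keeps duplicate rows after picking columns ...
with_dups = con.execute(
    "select sex from Person where starsign = 'Aquarius'"
).fetchall()

# ... so distinct is needed to match the algebra's projection T where c [sex].
no_dups = con.execute(
    "select distinct sex from Person where starsign = 'Aquarius' order by sex"
).fetchall()

print(aquarians, len(with_dups), no_dups)
```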

Page 23

SQL: Joins (a)

A cross join (Cartesian product) of tables pairs all the rows of one with all the rows of the other.

In SQL-89, a cross join of tables is specified by listing the tables in the from clause, using a comma to denote the Cartesian product operator.

A conditional join (θ-join) selects only those rows from the Cartesian product that satisfy a specified condition. In SQL-89, the condition is specified in the where clause.

SQL-92 (and SQL:1999) uses special syntax for various kinds of joins. In addition to supporting the SQL-89 syntax, these newer standards include special notations for the following types of joins (any text after two hyphens “--” is a comment):

cross join
qualified join:
    conditional join -- on clause
    column-list join -- using clause
natural join
union join

Qualified and natural joins may be further classified into the following types:

inner -- this is the default
outer:
    left
    right
    full

Page 24

SQL: Joins (b) [syntax 89/92/1999]

Join type: cross
    SQL-89 syntax:
        select *
        from A, B
    New syntax in SQL-92, SQL:1999:
        select *
        from A cross join B

Join type: conditional
    SQL-89 syntax:
        select *
        from A, B
        where condition
    New syntax in SQL-92, SQL:1999:
        select *
        from A join B
            on condition

Join type: column-list
    SQL-89 syntax:
        select A.c1, … , … -- omit B.c1, …
        from A, B
        where A.c1 = B.c1 and …
        -- join columns are qualified
    New syntax in SQL-92, SQL:1999:
        select *
        from A join B
            using (c1, …)
        -- c1, … are unqualified

Join type: natural inner
    SQL-89 syntax:
        select A.c, … , … -- omit B.c, …
        from A, B
        where A.c = B.c
        -- join column in result is A.c
    New syntax in SQL-92, SQL:1999:
        select *
        from A natural [inner] join B
        -- join column in result is c

Join type: left outer
    SQL-89 syntax:
        select A.c, … { omit B.c }
        from A, B
        where A.c = B.c
        union all
        select c, …, ‘?’ , ..
        from A
        where c not in (select c from B)
        -- for composite c use exists with correlated subquery…
    New syntax in SQL-92, SQL:1999:
        select *
        from A natural left [outer] join B
        -- join column is unqualified
        -- nulls generated for nonmatches . .

Join type: etc.
    SQL-89 syntax: . . .
    New syntax in SQL-92, SQL:1999: . . .

Page 25

SQL: Joins (c) [cross join // conditional join ]

Listing all possible male-female pairs.

A conditional join specifies the join condition in an on clause.

Page 26

SQL: Joins (d) [ natural join ]

Currently, most SQL dialects (including SQL Server) do not support the natural join. Instead, the conditional join is often used to handle natural joins. For example, the query ‘above’ may be formulated thus:

select Employee.empNr, empName, carModel
from Employee join Drives
        on Employee.empNr = Drives.empNr
    join Car
        on Drives.carRegNr = Car.carRegNr

Alternatively, SQL-89 syntax can be used as follows:

select Employee.empNr, empName, carModel
from Employee, Drives, Car
where Employee.empNr = Drives.empNr
    and Drives.carRegNr = Car.carRegNr
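The equivalence of the two formulations can be checked on a small in-memory SQLite database from Python. The table and column names follow the queries above, but the column types and all rows are hypothetical. SQLite is one dialect that does accept `natural join`, so that form is included for comparison.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table Employee (empNr integer primary key, empName text);
    create table Car (carRegNr text primary key, carModel text);
    create table Drives (empNr integer, carRegNr text);
    insert into Employee values (1, 'Ann'), (2, 'Bob'), (3, 'Cas');
    insert into Car values ('ABC123', 'Ford'), ('XYZ789', 'Mazda'),
                           ('DEF456', 'Ford'), ('GHI000', 'Volvo');
    insert into Drives values (1, 'ABC123'), (1, 'XYZ789'), (3, 'DEF456');
""")

sql92 = """
    select Employee.empNr, empName, carModel
    from Employee join Drives on Employee.empNr = Drives.empNr
         join Car on Drives.carRegNr = Car.carRegNr
"""
sql89 = """
    select Employee.empNr, empName, carModel
    from Employee, Drives, Car
    where Employee.empNr = Drives.empNr and Drives.carRegNr = Car.carRegNr
"""
natural = """
    select empNr, empName, carModel
    from Employee natural join Drives natural join Car
"""

# Row order is unspecified without 'order by', so compare sorted results.
r92, r89, rnat = (sorted(con.execute(q).fetchall()) for q in (sql92, sql89, natural))
print(r92 == r89 == rnat, r92)
```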

Page 27

SQL: Joins (e) [ natural left outer join ]

Two equivalent formulations of a natural left outer join.

Page 28

SQL: Joins (f) [ self-join ]

A self-join is used to list pairs of scientists of opposite sex.

Page 29

SQL: Grouping (a)

Partitioning rows into groups with the same value for column a.

Basic syntax:

select group-property1, ...
from ...
[ where row-criterion(s) ]
group by group-criterion1, ...

Example:

select sex, count(*), avg(iq)
from Pupils
where iq is not null
group by sex
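The example query can be run as is on an in-memory SQLite database. In this sketch the Pupils rows (and the extra name column) are hypothetical. Note that the where clause removes rows with a null iq before the groups are formed, so they are not counted.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table Pupils (name text, sex text, iq integer)")
con.executemany(
    "insert into Pupils values (?, ?, ?)",
    [("Ann", "F", 120), ("Bea", "F", 110), ("Cas", "M", 100),
     ("Dan", "M", None), ("Eve", "F", None)],  # hypothetical rows
)

rows = sorted(con.execute("""
    select sex, count(*), avg(iq)
    from Pupils
    where iq is not null
    group by sex
""").fetchall())

for sex, n, mean_iq in rows:
    print(sex, n, mean_iq)
```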

Page 30

SQL: Grouping (b) [ ‘division’ ]

We have seen the relational algebra ‘division’ example:

Speaks ÷ ( Speaks where country = ‘Canada’ [language] )

Sometimes, simple cases of relational division can be handled by grouping.

N.B. The column name “language” is double-quoted because it is a reserved word.
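One way such a division could be handled by grouping is sketched below, using SQLite from Python. This is an assumption about the approach, not necessarily the query shown on the slide. It keeps only the rows whose language is one of Canada's, groups them by country, and retains the groups that contain as many languages as Canada has. The population is hypothetical, and the primary key guarantees that no (country, language) pair is counted twice.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    create table Speaks (
        country text not null,
        "language" text not null,
        primary key (country, "language")
    )
""")
con.executemany(
    "insert into Speaks values (?, ?)",
    [("Australia", "English"), ("Belgium", "Dutch"), ("Belgium", "French"),
     ("Canada", "English"), ("Canada", "French"),
     ("Dominica", "English"), ("Dominica", "French")],  # hypothetical rows
)

rows = con.execute("""
    select country
    from Speaks
    where "language" in
        (select "language" from Speaks where country = 'Canada')
    group by country
    having count(*) =
        (select count(*) from Speaks where country = 'Canada')
    order by country
""").fetchall()

print(rows)
```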

Page 31

SQL: Correlated and Existential Subqueries

Example: (members have a composite reference scheme)

Consider: who is not ranked in judo?

select surname, firstname, sex
from Member
where not exists
    ( select * from Ranked
      where surname = Member.surname
        and firstname = Member.firstname
        and art = ‘judo’ )
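The correlated subquery can be tried on an in-memory SQLite database. In this sketch the table layouts are reduced to the columns the query mentions, and every row is hypothetical. Inside the subquery, the unqualified names surname and firstname refer to Ranked, while Member.surname and Member.firstname refer to the current row of the outer query.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table Member (surname text, firstname text, sex text,
                         primary key (surname, firstname));
    create table Ranked (surname text, firstname text, art text);
    insert into Member values ('Adams', 'Ann', 'F'), ('Adams', 'Bob', 'M'),
                              ('Brown', 'Ann', 'F');
    insert into Ranked values ('Adams', 'Ann', 'judo'),
                              ('Adams', 'Bob', 'judo'),
                              ('Brown', 'Ann', 'karate');
""")

not_in_judo = con.execute("""
    select surname, firstname, sex
    from Member
    where not exists
        (select * from Ranked
         where surname = Member.surname
           and firstname = Member.firstname
           and art = 'judo')
""").fetchall()

print(not_in_judo)
```

Matching on surname alone would wrongly exclude Brown, Ann if another Brown were ranked in judo, which is why both parts of the composite reference scheme appear in the subquery.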

Page 32

SQL: Data Definition [ create table / view ]

create table tablename (
    colname data-type [not null]
        [...| ..| .., .. , ...]
    [, ...]
    [, primary key (col-list) ]
    [, unique (col-list) ]
    [, foreign key (col-list) references tablename (unique-collist) ] [, ...]
    [, check (table-condition-on-same-row) [, ...] ] )

Views, or “virtual tables”, are basically named, derived tables. Their definition is stored, but their contents are not.

create view viewname [ (col-list) ] as
select-query
[ with check option ]

Also: drop table / drop view / alter table .. add ...

Page 33

SQL: Updating Table Populations

insert into tablename [ (col-list) ]
values ( constant-list )

commit [work] / rollback [work]

delete from tablename
[ where condition ]

update tablename
set colname = expression [, ... ]
[ where condition ]

insert into Employee
values ( 715, ‘Jones’, ‘Eve’, ‘F’, null, null )

or:
insert into Employee (empNr, surname, firstname, sex)
values ( 715, ‘Jones’, ‘Eve’, ‘F’ )

delete from Item -- deletes all rows from the table

delete from Item
where category = ‘DB’

update Employee
set salary = salary * 1.05
where job = ‘Modeler’
    and salary < 50000
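The interplay of update, commit and rollback can be sketched with Python's sqlite3 module, which opens a transaction implicitly before a data-changing statement. The Employee layout is an assumption: the six-value insert above implies two nullable columns after sex, taken here to be job and salary. Employee 716 is a hypothetical extra row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    create table Employee (
        empNr integer primary key,
        surname text not null, firstname text not null, sex text not null,
        job text, salary real   -- assumed names for the two nullable columns
    )
""")
con.execute("insert into Employee values (715, 'Jones', 'Eve', 'F', null, null)")
con.execute("insert into Employee values (716, 'Smith', 'Tom', 'M', 'Modeler', 40000)")
con.commit()      # make the inserts permanent

con.execute("""
    update Employee
    set salary = salary * 1.05
    where job = 'Modeler' and salary < 50000
""")
raised = con.execute("select salary from Employee where empNr = 716").fetchone()[0]

con.rollback()    # undo the uncommitted update
restored = con.execute("select salary from Employee where empNr = 716").fetchone()[0]

print(raised, restored)
```

Employee 715 is untouched by the update: her job is null, so the condition job = 'Modeler' is not satisfied for that row.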

Page 34

SQL: Security and Metadata

Security: A database is secure if and only if operations on it can be performed only by users authorized to do so.

SQL provides a grant statement for granting various kinds of privileges to users. The SQL-92 syntax is:

grant all privileges | select | insert [(col)] | update [(col)] | delete | usage | ....
on object
to user-list [ with grant option ]

Privileges may be removed with the revoke statement:

revoke [ grant option for ] privilege-list
on object
from user-list [ restrict | cascade ]

Metadata (data about data): SQL systems automatically maintain a set of system tables holding information about the database schema itself (e.g., base tables, views, domains, constraints, and privileges). Users with access to the system tables may query them in SQL (just like the application tables).