Post on 14-Apr-2018
7/29/2019 DW Lecture 10
Lecture 10
Tue, April 14, 2009 1800 : 2100
FAST NU, Karachi
Physical Database Design
Logical database design
  What are the facts and dimensions?
  Goals: Simplicity, Expressiveness
    Make the database easy to understand
    Make queries easy to ask
Physical database design
  How should the data be arranged on disk?
  Goal: Performance
    Manageability is an important secondary concern
    Make queries run fast
Load vs. Query Trade-off
Trade-off between query performance and load performance
To make queries run fast
  Precompute as much as possible
  Build lots of data structures
    Indexes
    Materialized views
    Cube structures (MOLAP/ROLAP)
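
As a rough illustration of this trade-off, the Python sketch below does extra work while loading (it builds an index and a precomputed total) so that a query becomes a lookup instead of a scan. The rows and the names `load`, `index_by_product` and `total_by_product` are made up for the example and do not come from the lecture.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, store, amount).
ROWS = [
    ("pen", "north", 5),
    ("pen", "south", 7),
    ("ink", "north", 20),
]


def load(rows):
    """Load the rows and build two extra structures while doing so."""
    table = []
    index_by_product = defaultdict(list)   # index: product -> row positions
    total_by_product = defaultdict(int)    # precomputed aggregate
    for row in rows:
        product, _store, amount = row
        index_by_product[product].append(len(table))
        total_by_product[product] += amount
        table.append(row)
    return table, index_by_product, total_by_product


def total_by_scan(table, product):
    """Query without any precomputation: scan every row."""
    return sum(amount for p, _s, amount in table if p == product)


table, index_by_product, total_by_product = load(ROWS)

# Both answers agree; the precomputed one needs no scan at query time.
assert total_by_scan(table, "pen") == total_by_product["pen"] == 12
```

Every structure added in `load` makes loading slower and larger, which is the price paid for the faster query.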
A Lesson in Data Warehouse Evolution
ROLAP, MOLAP, MATERIALIZED VIEWS, AGGREGATE TABLES: What's the difference?
While Star Schema Data Warehouses originally gained high performance from well-designed database indexes, both MOLAP and ROLAP took the approach of aggregating data grouped by dimensional hierarchies to speed up queries by orders of magnitude. When this came to the attention of the database research community, it became clear that MOLAP/ROLAP efficiency, leaving specialized semantics apart, could be traced to an approach of pre-computing query results, or in database terms materializing views, and a tremendous outpouring of papers on materialized views followed.
Materialized Views (MVs)
View
  A derived relation (or a function) defined in terms of base (stored) relations
  Typically recomputed every time the view is referenced
Materialized view
  A view can be materialized by storing the tuples of the view in the database
  Works like a cache
Why use materialized views
  Provides fast access to data
  Critical in applications with high query rate and complex views
    Not possible to re-compute the view for every query
  Many DBMSs support materialized views
Goal: faster response for related queries
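
The difference between a view and a materialized view can be sketched in a few lines of Python. This is only an illustration: the relation `sales` and the names `region_totals` and `region_totals_mv` are assumptions made for the example, and a real DBMS stores and refreshes the materialization itself.

```python
# Hypothetical base relation: (region, amount).
sales = [("north", 10), ("south", 4), ("north", 6)]


def region_totals():
    """Ordinary view: recomputed from the base relation on every reference."""
    totals = {}
    for region, amount in sales:
        totals[region] = totals.get(region, 0) + amount
    return totals


# Materialized view: the view's tuples are computed once and stored.
region_totals_mv = region_totals()
assert region_totals_mv == {"north": 16, "south": 4}

# Like a cache, the stored copy is not updated automatically.
sales.append(("south", 5))
assert region_totals() == {"north": 16, "south": 9}    # recomputed view
assert region_totals_mv == {"north": 16, "south": 4}   # stored copy, now stale
```

Reading `region_totals_mv` costs a lookup, while calling `region_totals()` costs a pass over the base relation each time.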
View Maintenance
The process of updating a materialized view in response to changes to the underlying data
MV gets dirty whenever the underlying base relations are modified
Incremental View Maintenance
  It is wasteful to maintain a view by re-computing it from scratch
  Often it is cheaper to use the heuristic of inertia (only a part of the view changes in response to changes in the base relations)
  Compute only the changes in the view to update its materialization
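
A minimal sketch of the idea, assuming a view that keeps a running total per region and a modification that consists only of insertions. The relation and the function names `recompute` and `apply_inserts` are hypothetical; deletions and updates would need their own delta rules.

```python
def recompute(sales):
    """Full refresh: rebuild the view from the whole base relation."""
    totals = {}
    for region, amount in sales:
        totals[region] = totals.get(region, 0) + amount
    return totals


def apply_inserts(view, inserted):
    """Incremental refresh: touch only the groups named in the delta."""
    for region, amount in inserted:
        view[region] = view.get(region, 0) + amount
    return view


sales = [("north", 10), ("south", 4), ("north", 6)]
view = recompute(sales)          # initial materialization

delta = [("south", 5)]           # tuples inserted into the base relation
sales.extend(delta)
apply_inserts(view, delta)       # only the "south" group is touched

# The incrementally maintained view matches a full recomputation.
assert view == recompute(sales) == {"north": 16, "south": 9}
```

The incremental path reads one tuple, whereas the full refresh reads the entire base relation again.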
Classification of the View Maintenance Problem
Four dimensions along which the view maintenance problem can be studied
  Information Dimension
  Modification Dimension
  Language Dimension
  Instance Dimension
Information Dimension
The amount of information available for view maintenance
Information may include base relations, the materialized view itself, and knowledge of constraints and keys
Information Dimension (Example)
Consider relation part (part_num, part_cost, contract) and the view
  expensive_parts (part_num) = π_{part_num} (σ_{part_cost > 1000} (part))
Consider maintaining the view when a tuple p1 with part_cost > 1000 is inserted into relation part (a tuple costing 1000 or less leaves the view unchanged)
Different view maintenance algorithms can be designed depending upon the information available
Consider the following cases
Information Dimension (Example)
CASE 1: The materialized view alone is available
  Use the old materialized view to determine if part_num is already present in the view
  If so, there is no change to the materialization; else insert part p1 into the materialization
CASE 2: The base relation part alone is available
  Use relation part to check if an existing tuple in the relation has the same part_num but greater or equal cost
  If such a tuple exists then the inserted tuple does not contribute to the view
CASE 3: It is known that part_num is the key
  Infer that part_num cannot already be in the view, so it must be inserted
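
The three cases can be sketched in Python as follows. The function names, the sample tuple `("p1", 5000, "c15")` and the other data are assumptions made for illustration; the view is modelled as a set of part numbers and the base relation as a list of (part_num, part_cost, contract) tuples.

```python
THRESHOLD = 1000  # the view keeps parts with part_cost > 1000


def case1_view_only(view, new_tuple):
    """CASE 1: only the materialized view is available."""
    part_num, part_cost, _contract = new_tuple
    if part_cost > THRESHOLD and part_num not in view:
        view.add(part_num)
    return view


def case2_base_only(part, new_tuple):
    """CASE 2: only the old base relation is available.

    Returns True when an existing tuple has the same part_num and a
    greater or equal cost, i.e. the inserted tuple does not contribute.
    """
    part_num, part_cost, _contract = new_tuple
    return any(n == part_num and c >= part_cost for n, c, _ in part)


def case3_key_known(view, new_tuple):
    """CASE 3: part_num is known to be a key of part.

    No earlier tuple can carry this part_num, so it cannot be in the
    view yet and is inserted without any lookup.
    """
    part_num, part_cost, _contract = new_tuple
    if part_cost > THRESHOLD:
        view.add(part_num)
    return view


p1 = ("p1", 5000, "c15")

assert case1_view_only({"p7"}, p1) == {"p7", "p1"}     # p1 was missing: insert
assert case1_view_only({"p1"}, p1) == {"p1"}           # already present: no change
assert case2_base_only([("p1", 6000, "c03")], p1)      # existing costlier tuple
assert not case2_base_only([("p2", 6000, "c03")], p1)  # no tuple for p1 yet
assert case3_key_known(set(), p1) == {"p1"}            # key: insert directly
```

Each function reads only the information its case assumes to be available, which is why the three algorithms differ even though the view definition is the same.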
Modification Dimension
What modifications can the view maintenance algorithm handle?
Modifications may include insertion and deletion of tuples to base relations, direct updates, deletions followed by insertions, changes to the view definition, etc.
Language Dimension
Is the view expressed as relational algebra (SQL or a subset of SQL)?
Can it have duplicates?
Can it use aggregation?
Instance Dimension
Database instance: Does the view maintenance algorithm work for all instances of the database?
Modification instance: Does it work for all instances of the modification?
Instance Dimension (Example)
Extend the previous example with a new view
  View supplier_parts as the equijoin between relations supplier (supplier_num, part_num, price) and part on part_num
  The view contains the distinct part numbers that are supplied by at least one supplier
Maintenance
  Use of a join makes it impossible, in general, to maintain supplier_parts in response to insertions to part when using only the view
  With the view alone, supplier_parts is maintainable if the view already contains part_num p1, but not otherwise
  Maintainability of a view therefore depends also on the particular instances of the database and the modification
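
A small Python sketch of why the instance matters, under the assumption that the view is a set of part numbers and that only insertions into part are considered. The function names and the sample tuples are hypothetical.

```python
def maintain_with_view_only(supplier_parts, inserted_part_num):
    """Try to maintain supplier_parts after an insertion into part,
    looking only at the materialized view.

    Returns the (unchanged) view, or None when the view alone cannot decide.
    """
    if inserted_part_num in supplier_parts:
        # The part number is already in the view: no change.
        return supplier_parts
    # Unknown: a supplier tuple for this part may or may not exist.
    return None


def recompute(supplier, part):
    """Reference answer: distinct part numbers in the join of both relations."""
    part_nums = {p[0] for p in part}
    return {s[1] for s in supplier if s[1] in part_nums}


# Two database instances with the same (empty) view ...
part_old = []
supplier_a = [("s1", "p1", 40)]   # a supplier already offers p1
supplier_b = [("s1", "p2", 40)]   # nobody offers p1
assert recompute(supplier_a, part_old) == recompute(supplier_b, part_old) == set()

# ... give different views after part p1 is inserted.
part_new = [("p1", 5000, "c15")]
assert recompute(supplier_a, part_new) == {"p1"}
assert recompute(supplier_b, part_new) == set()

# So the view alone decides only when p1 is already present.
assert maintain_with_view_only(set(), "p1") is None
assert maintain_with_view_only({"p1"}, "p1") == {"p1"}
```

The two instances are indistinguishable from the view, yet require different updates, so no view-only algorithm can be correct for both.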