Updates in MonetDB/XQuery
Database Techniek: XML Lecture (Part 2)
Peter Boncz (CWI)
Sjoerd Mullender: update actions
Jens Teubner: XQUF parsing
Niels Nes: logging
Stefan Manegold: the rest
everything you always wanted to know about
Updates in MonetDB/XQuery but were afraid to ask

Overview
• XQuery Update Facility (XQUF)
  • semantics & the update tape
• Updatable XML storage in BATs
  • maintaining order in an array without O(N) cost
• Snapshot Isolation
  • why we want it, how we got it
• Concurrency Control
  • optimistic, with “abort convoys”
• Durability
  • physical logging
• Conclusion & Future Challenges

XQuery Update Facility (XQUF)
January 2006, first proposal
Internal primitives:
• upd:insertBefore
• upd:insertAfter
• upd:insertInto
• upd:insertIntoAsLast
• upd:insertAttributes
• upd:delete
• upd:replaceValue
• upd:rename
Pending update list concept
upd:applyUpdates

Example
insert
  <item id="{id}">
    <location>Brazil</location>
    <quantity>200</quantity>
    <name>XML in a nutshell</name>
    <payment>Credit Card, Personal check</payment>
    <shipping>Will ship internationally</shipping>
    <incategory category="category1"/>
  </item>
as last into
  fn:doc("xmark.xml")/site/regions/samerica

Semantics
let $root := doc("foo.xml")
for $i in (1,2,3)
return
  (do insert <x>{$i}</x> as first into $root,
   do insert <y>{$i}</y> as first into $root)
We need to
• define an execution order, and
• enforce it

The Update Tape
update = sequence (int, node, node/str, node/str)
fn:delete()       (DELETE, node, nil, nil)
fn:insert_*()     (INSERT, tgt-node, tgt-level, expr-node)
fn:set-attr()     (ATTR, node, qn, val)
fn:unset-attr()   (ATTR, node, qn, nil)
fn:set-text()     (TEXT, node, val, nil)
fn:set-pi()       (PI, node, ins-val, arg-val)
fn:set-comment()  (COMMENT, node, val, nil)
Element construction that combines updates will enforce the correct order of the update tape.
The Pathfinder compiler automatically inserts a call to
fn:update(item*)
on the result of every update query.
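The tape records above can be modelled as plain 4-tuples that are replayed strictly in order. The following is a minimal Python sketch, not MonetDB/XQuery code: the `TapeEntry` type, the integer node ids and the `apply_tape` helper are assumptions made for illustration, and only the ATTR records are interpreted.

```python
from typing import NamedTuple, Optional, Union

class TapeEntry(NamedTuple):
    """One update-tape record: (kind, node, arg, arg), as on the slide."""
    kind: str                                   # DELETE, INSERT, ATTR, TEXT, PI, COMMENT
    node: int                                   # target node (hypothetical: a plain int id)
    arg1: Optional[Union[int, str]] = None      # e.g. the attribute's qualified name
    arg2: Optional[Union[int, str]] = None      # e.g. the new value, or None for nil

def apply_tape(attrs, tape):
    """Replay ATTR records in tape order; a later record overrides an earlier one."""
    for entry in tape:
        if entry.kind != "ATTR":
            continue                            # other kinds are out of scope for this sketch
        key = (entry.node, entry.arg1)
        if entry.arg2 is None:                  # (ATTR, node, qn, nil): fn:unset-attr()
            attrs.pop(key, None)
        else:                                   # (ATTR, node, qn, val): fn:set-attr()
            attrs[key] = entry.arg2
    return attrs

tape = [
    TapeEntry("ATTR", 7, "id", "item42"),
    TapeEntry("ATTR", 7, "id", None),
    TapeEntry("ATTR", 7, "id", "item43"),
]
in_order = apply_tape({}, tape)
reversed_order = apply_tape({}, list(reversed(tape)))
```

Replaying the same three records in reverse leaves a different attribute value behind, which is why the order of the tape has to be defined and enforced.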

XPath Accelerator [SIGMOD02]
Node-based relational encoding of XQuery's data model
<a> <b> <c> <d/> <e/> </c> </b> <f> <g/> <h> <i/> <j/> </h> </f> </a>
node  pre  post
a     0    9
b     1    3
c     2    2
d     3    0
e     4    1
f     5    8
g     6    4
h     7    7
i     8    5
j     9    6
Figure: the pre/post plane, split around a context node into four regions: descendant, ancestor, following, preceding.

XML Storage Revisited
node  pre  size  level  post
a     0    9     0      9
b     1    3     1      3
c     2    2     2      2
d     3    0     3      0
e     4    0     3      1
f     5    4     1      8
g     6    0     2      4
h     7    2     2      7
i     8    0     3      5
j     9    0     3      6
post = pre + size - level
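The relation post = pre + size - level can be checked mechanically on the example document. The sketch below uses Python's standard `xml.etree.ElementTree`; the `encode` helper is an illustrative assumption (it relies on every tag in the document being unique) and is not part of MonetDB/XQuery.

```python
import xml.etree.ElementTree as ET

def encode(xml_text):
    """Return {tag: (pre, size, level, post)} for a document with unique tags."""
    root = ET.fromstring(xml_text)
    table = {}
    pre_counter = 0
    post_counter = 0

    def visit(elem, level):
        nonlocal pre_counter, post_counter
        pre = pre_counter                 # rank in document (preorder) order
        pre_counter += 1
        for child in elem:
            visit(child, level + 1)
        size = pre_counter - pre - 1      # number of descendants
        post = post_counter               # rank in postorder
        post_counter += 1
        table[elem.tag] = (pre, size, level, post)

    visit(root, 0)
    return table

DOC = "<a><b><c><d/><e/></c></b><f><g/><h><i/><j/></h></f></a>"
table = encode(DOC)
formula_holds = all(post == pre + size - level
                    for pre, size, level, post in table.values())
```

Because post can be derived this way, storing size and level instead of post loses no information.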

Updates: Mission Impossible?
INSERT SUBTREE I:
• ancestor nodes: SIZE + |I|
• following nodes: PRE + |I|
size(following) = O(N): killer (?)
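To make the cost concrete, here is a small Python sketch of a naive insert into a dense pre/size/level table. The function name and the list-of-tuples layout are assumptions for illustration; the list index plays the role of pre.

```python
def insert_as_last_child(table, parent_pre, subtree):
    """Insert `subtree` as last child of the node at `parent_pre`.

    `table` is a list of (size, level) rows indexed by pre; `subtree` holds
    (size, level) rows with levels relative to its own root.
    Returns (new_table, shifted, resized).
    """
    parent_size, parent_level = table[parent_pre]
    at = parent_pre + parent_size + 1            # first pre after the parent's subtree
    k = len(subtree)                             # |I|
    new_rows = [(size, parent_level + 1 + level) for size, level in subtree]
    new_table = table[:at] + new_rows + table[at:]   # every row at or after `at` gets pre + |I|
    shifted = len(table) - at
    resized = 0
    for pre in range(parent_pre + 1):
        size, level = table[pre]
        if pre + size >= parent_pre:             # the parent itself or one of its ancestors
            new_table[pre] = (size + k, level)   # size + |I|
            resized += 1
    return new_table, shifted, resized

# pre/size/level table of the example document (nodes a..j).
TABLE = [(9, 0), (3, 1), (2, 2), (0, 3), (0, 3),
         (4, 1), (0, 2), (2, 2), (0, 3), (0, 3)]
new_table, shifted, resized = insert_as_last_child(TABLE, 2, [(0, 0)])
```

Inserting a single node below c touches only three size values (a, b, c) but moves all five following rows (f to j), so the shift of the following nodes is the part that grows with document size.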

XML Storage Revisited
Allow holes; define logical pages.
rid  size  level  nid   |  pre  size  level
0    11    0      N0    |  0    11    0
1    5     1      N1    |  1    5     1
2    -1    null   null  |  2    -1    null
3    0     null   null  |  3    null  null
4    2     2      N2    |  4    2     2
5    0     3      N3    |  5    0     3
6    0     3      N4    |  6    0     3
7    4     1      N5    |  7    4     1
8    0     2      N6    |  8    0     2
9    2     2      N7    |  9    2     2
10   0     3      N8    |  10   0     3
11   0     3      N9    |  11   0     3

XML Storage Revisited
Allow holes; define logical pages.
rid  size  level  nid   |  pre  size  level
0    11    0      N0    |  0    11    0
1    5     1      N1    |  1    5     1
2    -1    null   null  |  2    -1    null
3    0     null   null  |  3    null  null
4    0     2      N6    |  4    2     2
5    2     2      N7    |  5    0     3
6    0     3      N8    |  6    0     3
7    0     3      N9    |  7    4     1
8    2     2      N2    |  8    0     2
9    0     3      N3    |  9    2     2
10   0     3      N4    |  10   0     3
11   4     1      N5    |  11   0     3
page map (logical page, physical page):
0  0
1  2
2  1
rid = pre.swizzle( )
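The pre-to-rid translation through the page map can be sketched in a few lines of Python. The page size of 4 tuples is an assumption read off the 12-row example with its three pages, and this `swizzle` function is an illustration, not the MIL primitive itself.

```python
PAGE_SIZE = 4            # tuples per logical page (assumed from the 12-row example)
PAGE_MAP = [0, 2, 1]     # logical page number -> physical page number

def swizzle(pre, page_map=PAGE_MAP, page_size=PAGE_SIZE):
    """Translate a logical position (pre) into a physical position (rid)."""
    return page_map[pre // page_size] * page_size + pre % page_size

# nid column of the physical rid table above (None marks an unused tuple).
RID_NID = ["N0", "N1", None, None,
           "N6", "N7", "N8", "N9",
           "N2", "N3", "N4", "N5"]

# Reading the rid table through the page map yields document order again.
doc_order = [RID_NID[swizzle(pre)] for pre in range(len(RID_NID))]
```

Only the three-entry page map has to change when a page is spliced into the document, which is what avoids rewriting every following tuple.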

XML Storage Revisited
Update-friendly:
• rid-table is append-only
• rid-tuples may be unused
• rid = autoincrement column
MonetDB:
• rid not stored but computed (virtual oid)
• allows positional lookup/join
• not stored, so no need to update it either
Updatable document collection:
- pf:add-doc(URI, docname, perc>0)
- pf:add-doc(URI, docname, collname, perc>0)
pre := nid.leftfetchjoin(nid_rid).swizzle(map_pid)
Read-only document collection:
- pf:add-doc(URI, docname, 0)
- pf:add-doc(URI, docname, collname, 0)
NID = RID = PRE, so
pre := nid.leftfetchjoin(nid_rid).swizzle(map_pid) = FREE!!
pre  size  level  nid
0    11    0      N0
1    5     1      N1
2    -1    null   null
3    0     null   null
4    2     2      N2
5    0     3      N3
6    0     3      N4
7    4     1      N5
8    0     2      N6
9    2     2      N7
10   0     3      N8
11   0     3      N9

Snapshot Isolation
Versus 2-phase locking (2PL) == full serializability
Why not 2PL for XML:
• lock semantics much more complex than in the relational case (order matters!!)
• node-level locking in staircase join?? (now 10 cycles/node…)
Why Snapshot Isolation:
• great for read-queries, great for ll_scj (runs unmodified)
• quite strong: better than repeatable read; Oracle/Postgres do it
Problem with Snapshot Isolation:
• in XQuery, it is unknown at compile-time what to snapshot (fn:doc(..))

Snapshot Isolation
Read Query 1, Read Query 2 and the Update Query are each shown with their own copy of the rid table:
rid  size  level  nid
0    11    0      N0
1    5     1      N1
2    -1    null   null
3    0     null   null
4    0     2      N6
5    2     2      N7
6    0     3      N8
7    0     3      N9
Isolation By Shadow Paging (copy-on-write mmap)
• rid/pre delete/insert + attr-replace
  Touch one byte per physical page: *addr = *addr;
  MMU traps, OS replaces page by a copy
• we would like to replace the master copy once, not all client copies
(Animation steps shown on this slide: Isolate-page, then Master-update.)
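The copy-on-write behaviour behind shadow paging can be tried out with Python's standard `mmap` module, whose `ACCESS_COPY` mode gives a private mapping. This is only a sketch of the operating-system mechanism: the temporary file stands in for a memory-mapped BAT, and the `snapshot_demo` function is a made-up name.

```python
import mmap
import os
import tempfile

def snapshot_demo():
    """Write through a private mapping; return (writer view, reader view, file content)."""
    fd, path = tempfile.mkstemp()                # stand-in for a master table file
    try:
        os.write(fd, b"A" * (2 * mmap.PAGESIZE))
        writer = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)   # private copy-on-write view
        reader = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)   # view of the shared master
        writer[0:1] = b"B"                       # first write makes the OS copy the page
        views = (writer[0:1], reader[0:1])
        writer.close()
        reader.close()
        with open(path, "rb") as f:
            on_disk = f.read(1)
        return views + (on_disk,)
    finally:
        os.close(fd)
        os.remove(path)

result = snapshot_demo()
```

The writer sees its own change while the reader and the file keep the old byte, which is the isolation property the slide relies on; propagating the change to the master is a separate step.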

Durability
Masters become dirty
• no time to flush them during query
• log all changes to a WAL
  = log all tuples that changed = entire pages
Recovery:
• after a crash, we do not know whether dirty pages got saved
• solution: overwrite tables with values from the WAL
Checkpointing Thread:
• every 5 minutes, if ‘many’ changes occurred, checkpoint
• memory-mapped BATs are sync()-ed; only dirty pages get written
• checkpoint locks collection, halts query processing
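Physical logging with recovery by overwriting can be sketched as follows. This is a toy in-memory model in Python: the page size, the list-based table and the names `log_pages` and `recover` are illustrative assumptions, not the MonetDB logger.

```python
PAGE_SIZE = 4   # tuples per page (assumed for this toy example)

def log_pages(wal, table, dirty_rids):
    """Append the full image of every page that contains a changed tuple."""
    for page_no in sorted({rid // PAGE_SIZE for rid in dirty_rids}):
        start = page_no * PAGE_SIZE
        wal.append((page_no, list(table[start:start + PAGE_SIZE])))

def recover(table_on_disk, wal):
    """Overwrite pages with their logged images, oldest record first."""
    table = list(table_on_disk)
    for page_no, image in wal:
        start = page_no * PAGE_SIZE
        table[start:start + PAGE_SIZE] = image
    return table

master = [0, 1, 2, 3, 4, 5, 6, 7]      # state at the last checkpoint
updated = list(master)
updated[5] = 50                        # a committed change to rid 5
wal = []
log_pages(wal, updated, [5])

# After a crash the file may hold the old or the new page; replay fixes both.
from_old = recover(master, wal)
from_new = recover(updated, wal)
```

Because whole page images are replayed, recovery does not need to know whether a dirty page reached disk before the crash: the result is the same either way.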

The Update Sequence
Execute Query
• build update tape
• queries get isolated copies of a document (VM copy-on-write mmap)
Prepare Intensional Updates
• execute update tape
• does not modify masters (except append-only tables)
Commit Phase (locked phase – per doc-collection)
• precommit
  • detect conflicts (not the size-ancestors)
• write WAL (globally locked)
  • read master-size-ancestors, use delta, log result
• update master tables
  • isolate first! Only then update masters.
• update index structures
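One way to read "read master-size-ancestors, use delta, log result" is that a transaction carries size deltas for ancestors instead of absolute sizes, so changes from different transactions to a shared ancestor simply add up. The Python sketch below illustrates that reading only; the dictionary layout and the `commit_size_deltas` helper are assumptions.

```python
def commit_size_deltas(master_size, deltas):
    """Apply per-node size deltas to the current master values.

    Returns the resulting absolute sizes, i.e. what would be written to the log.
    """
    logged = {}
    for node, delta in deltas.items():
        master_size[node] += delta         # read the master value, add the delta
        logged[node] = master_size[node]   # log the result, not the delta
    return logged

master = {"a": 9, "b": 3, "f": 4}
t1 = {"a": +1, "b": +1}    # T1 inserted one node below b
t2 = {"a": +2, "f": +2}    # T2 inserted two nodes below f

log1 = commit_size_deltas(master, t1)
log2 = commit_size_deltas(master, t2)
```

Both transactions change the size of the root a, yet neither overwrites the other's contribution, which is consistent with ancestor sizes being excluded from conflict detection.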

Many more Issues Solved
Conflicting Updates
• detect conflicting queries:
  • look at RID page numbers and attr-IDs
• reacting to conflicts:
  • abort query + automatic restart
  • run CONVOY of 5 next update queries serially
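Conflict detection on RID page numbers and attr-IDs amounts to intersecting write sets. A minimal Python sketch, with a hypothetical `WriteSet` type and `conflicts` function; restart and convoy handling are left out:

```python
from dataclasses import dataclass, field

@dataclass
class WriteSet:
    """What an update query touched: rid page numbers and attribute ids."""
    rid_pages: set = field(default_factory=set)
    attr_ids: set = field(default_factory=set)

def conflicts(committing, committed_since_snapshot):
    """True if the committing query overlaps any query committed after its snapshot."""
    for other in committed_since_snapshot:
        if committing.rid_pages & other.rid_pages:
            return True
        if committing.attr_ids & other.attr_ids:
            return True
    return False

t1 = WriteSet(rid_pages={2}, attr_ids={17})
t2 = WriteSet(rid_pages={5})
t3 = WriteSet(rid_pages={2, 9})

t2_conflicts = conflicts(t2, [t1])        # disjoint pages and attributes
t3_conflicts = conflicts(t3, [t1, t2])    # shares rid page 2 with t1
```

Checking at page granularity is coarse: two queries that touch different tuples on the same page are still reported as conflicting, and the loser is then aborted and restarted as described above.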
Indexing and Updates
• Runtime QN → NID mapping, with hash table
  • read-only: not a hash, but keep sorted & persistent
  • keep INS + DEL deltas to commit without changing the hash table
• Runtime NID → ATTR hash table
  • isolation loses you MonetDB dynamic hash table reuse
  • share an old copy, exploit append-mostly
ACID properties on the Meta Level
• Shredding a new doc into a collection vs. Query
• Shredding a new doc into a collection vs. Update
• Using a collection vs. Deleting/adding documents
• Meta Querying vs. Deleting/adding documents
Concurrency
• Updates vs. Checkpoint
• Shredding vs. Query
• Shredding vs. Updates
Allocating New Pages and NIDs
• Offload shredding interference with freelist
• Unlocked access to private pages

Snapshot Isolation
Versus 2-phase locking (2PL) == full serializability
2PL (++): 375 transactions / 5 minutes
= 1.25 transactions/sec

Conclusions
It works! Reasonable/good performance!
• transaction mgmt as a module extension outside a kernel works
• identified VM primitives that databases really need
Future work:
• Test on XML update benchmark TPoX (DB2: 700 trans/second)
• Packed Memory Arrays: alternative for page remapping?
  • page remapping is technically O(N)
• Engineering:
  • support for value-indexing (does PF support it already?)
  • asynchronous WAL writing to boost throughput
  • port MIL to C primitives; port C primitives to Monet5