
Miscellanea DB comparisons Origins and history Data model CRUDy stuff Assignment #3 Conclusion References

CS-695 NoSQL Database HBase (part 1 of 2)

Dr. Chuck Cartledge

24 Sept. 2015


Table of contents I

1 Miscellanea

2 DB comparisons

3 Origins and history

4 Data model

5 CRUDy stuff

6 Assignment #3

7 Conclusion

8 References


Corrections and additions since last lecture.

Problems with assignment #02

Problems with cs695-nosql

Assignment #03 is available.


How different DBs compare to a RDBMS

We have some terms to compare now [5]:

RDBMS        K/V        Columnar   Doc.
DB instance  cluster    cluster    instance
database     -          namespace  -
table        bucket     table      collection
row          key-value  row        document
rowid        key        -          id
col.         -          col. fam.  -
schema       -          -          database
join         -          -          DBRef


Where it came from and why.

HBase is new

Based on Google’s 2008 paper, Bigtable: A Distributed Storage System for Structured Data [2]

“Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.”

Chang et al. [2]


Where it came from and why.

HBase underpinnings

HBase lives on top of a Hadoop file system

The master ensures writes are persisted in order

ZooKeeper ensures that reads are consistent across all copies

Think quorum reads and writes. Image from [3].
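
The quorum analogy can be made concrete with the usual overlap condition: with N copies, a read of R copies is guaranteed to see the latest write to W copies when R + W > N. The sketch below is a generic illustration of that rule only. It is not a description of HBase internals, and the function names are hypothetical.

```python
def quorum_overlaps(n_replicas: int, read_quorum: int, write_quorum: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_quorum + write_quorum > n_replicas


def majority(n_replicas: int) -> int:
    """Smallest quorum size that is a strict majority of the replicas."""
    return n_replicas // 2 + 1


# Three copies: majority reads and writes overlap, single-copy ones do not.
three_copy_majority_ok = quorum_overlaps(3, majority(3), majority(3))
three_copy_single_ok = quorum_overlaps(3, 1, 1)
```

With three copies, reading two and writing two always overlap in at least one copy, while reading one and writing one may miss each other.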


What is in the backend?

Persistent maps

Sparse, distributed, persistent multidimensional sorted map

No security
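
A minimal sketch of what "sparse" and "sorted" mean for such a map, assuming an in-memory toy model. The class `SortedSparseMap` and its methods are hypothetical names for illustration and are not part of any HBase API. Rows are kept in byte order of their keys, and a row stores only the columns that were actually written.

```python
import bisect


class SortedSparseMap:
    """Toy stand-in for a sorted, sparse map keyed by row key (not HBase code)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted byte order
        self._rows = {}   # row key -> {column: value}; unwritten columns take no space

    def put(self, row_key: bytes, column: bytes, value: bytes) -> None:
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def scan(self, start: bytes, stop: bytes):
        """Yield (row_key, columns) for start <= row_key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


table = SortedSparseMap()
table.put(b"row2", b"fam:age", b"41")
table.put(b"row1", b"fam:age", b"PlainText")
table.put(b"row10", b"fam:name", b"x")

# Keys compare as bytes, so b"row10" sorts before b"row2".
scanned_keys = [key for key, _ in table.scan(b"row1", b"row2")]
```

Because the keys are sorted as bytes rather than as numbers, a range scan from `row1` up to `row2` returns `row1` and `row10`, which is worth keeping in mind when designing row keys.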


What is in the backend?

One mental data model


What is in the backend?

Another mental data model

Image from [3].


What is in the backend?

Take away mental data model

The database consists of columns

Each column has qualifiers

Each qualifier has a value and a timestamp

A value can be any series of bytes (a string)

Each row has a key

A row-key, column qualifier, timestamp tuple accesses a value

The same row can span multiple columns

Not all column qualifiers have values

Image from [4].
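
The take-away model can be sketched as a lookup keyed by (row key, column qualifier, timestamp). This is a toy in-memory model under stated assumptions: `VersionedTable` is a hypothetical class, timestamps are plain integers, and a lookup without a timestamp returns the newest version.

```python
import time


class VersionedTable:
    """Toy model of cells addressed by row key, column qualifier, and timestamp."""

    def __init__(self):
        # (row_key, "family:qualifier") -> {timestamp: value}
        self._cells = {}

    def put(self, row_key, column, value, timestamp=None):
        # Assumption: default to the current time in milliseconds.
        ts = timestamp if timestamp is not None else int(time.time() * 1000)
        self._cells.setdefault((row_key, column), {})[ts] = value

    def get(self, row_key, column, timestamp=None):
        versions = self._cells.get((row_key, column))
        if not versions:
            return None  # this qualifier has no value for this row
        if timestamp is None:
            timestamp = max(versions)  # newest version by default
        return versions.get(timestamp)


t = VersionedTable()
t.put("row1", "fam:age", b"41", timestamp=1)
t.put("row1", "fam:age", b"42", timestamp=2)

latest = t.get("row1", "fam:age")
older = t.get("row1", "fam:age", timestamp=1)
missing = t.get("row1", "fam:name")
```

Here `latest` is the value written at timestamp 2, `older` is the value written at timestamp 1, and `missing` is `None` because that qualifier was never written for the row.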


Pre-CRUDy stuff

Things to be aware of:

HBase is supported in many languages: Ruby, Python, C#, R, Java

HBase can be accessed via curl

HBase shell supports HBase script files

No security.

https://wiki.apache.org/hadoop/Hbase/Stargate


Pre-CRUDy stuff

How to access things:

HBase shell (footnote 1):
/opt/hbase/bin/hbase shell <script>

curl commands (footnote 2):
curl http://localhost:8080/version

Footnote 1: https://wiki.apache.org/hadoop/Hbase/Shell
Footnote 2: https://wiki.apache.org/hadoop/Hbase/Stargate


Pre-CRUDy stuff

Creating a database

The database always exists. The user creates a table in the database, adds one or more column families, and adds rows with values in specific column qualifiers.

A row-key, column qualifier, timestamp tuple accesses a value. The timestamp attribute defaults to the latest value.

Image from [1].

Recommend that you create an HBase namespace using $USER


CRUDy nuts and bolts

Creating table with curl

curl (footnote 3):

# Base URL of the HBase REST server, without a trailing slash
urlToHBase=http://localhost:8080

# Create table ccartled:tab with one column family, fam
curl -v -H "Content-Type: application/json" \
  -d '{"name":"ccartled:tab","ColumnSchema":[{"name":"fam"}]}' \
  -X PUT "$urlToHBase/ccartled%3Atab/schema"

# Query the server root
curl -H "Accept: application/json" "$urlToHBase/"

# Store one cell; row key, column, and value are base64-encoded
curl -v -H "Content-Type: text/xml" \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet><Row key="cm93Mg=="><Cell column="ZmFtOmFnZQ==">UGxhaW5UZXh0</Cell></Row></CellSet>' \
  -X POST "$urlToHBase/ccartled%3Atab/false-row-key"

# Show the table schema
curl -v "$urlToHBase/ccartled%3Atab/schema"

Footnote 3: https://wiki.apache.org/hadoop/Hbase/Stargate
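
The request bodies used with curl can be generated rather than typed by hand. The sketch below builds the JSON table schema and the XML CellSet, base64-encoding the row key, column, and value as the REST interface expects. The helper names (`b64`, `table_schema_json`, `cell_set_xml`) are hypothetical, and the code only builds strings; it does not contact a server.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a string as UTF-8 bytes, returning ASCII text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def table_schema_json(table: str, families: list) -> str:
    """JSON body describing a table and its column families."""
    return json.dumps({"name": table, "ColumnSchema": [{"name": f} for f in families]})


def cell_set_xml(row_key: str, column: str, value: str) -> str:
    """XML body holding one cell; base64 output needs no XML escaping."""
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<CellSet><Row key="%s"><Cell column="%s">%s</Cell></Row></CellSet>'
        % (b64(row_key), b64(column), b64(value))
    )


schema_body = table_schema_json("ccartled:tab", ["fam"])
cell_body = cell_set_xml("row2", "fam:age", "PlainText")

# A stray trailing newline changes the encoded column name.
clean_column = b64("fam:age")
newline_column = b64("fam:age\n")
```

Encoding by hand with `echo` appends a newline unless `-n` is given, so `fam:age` becomes `ZmFtOmFnZQo=` instead of `ZmFtOmFnZQ==`, and the cell ends up under a column qualifier that ends in a newline.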


CRUDy nuts and bolts

Creating table with HBase

HBase shell (footnote 4):

create_namespace 'ccartled'
create 'ccartled:tab', 'fam'
list 'ccartled:tab'
put 'ccartled:tab', 'row1', 'fam:age', 'PlainText'
alter 'ccartled:tab', NAME=>'friends'

Footnote 4: http://hbase.apache.org/book.html


CRUDy nuts and bolts

Reporting a cell value

curl (notice the URL encoding):

curl -H "Accept: application/json" -X GET "$urlToHBase/ccartled%3Atab/row1/fam:a"

HBase shell:

get 'ccartled:tab', 'row1', 'fam:a'
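
The URL encoding noted above replaces the colon in the namespace:table name with %3A. A minimal sketch using Python's standard library follows; `table_path` and `cell_url` are hypothetical helper names, the base URL is the localhost address used on these slides, and the column is left unencoded as in the curl command.

```python
from urllib.parse import quote, unquote


def table_path(namespace: str, table: str) -> str:
    """Percent-encode 'namespace:table' so the colon is safe in a URL path."""
    return quote(f"{namespace}:{table}", safe="")


def cell_url(base: str, namespace: str, table: str, row: str, column: str) -> str:
    """Build the URL for one cell under the REST interface."""
    return f"{base.rstrip('/')}/{table_path(namespace, table)}/{row}/{column}"


encoded = table_path("ccartled", "tab")
url = cell_url("http://localhost:8080/", "ccartled", "tab", "row1", "fam:a")
decoded = unquote(encoded)
```

Stripping the trailing slash from the base avoids a double slash in the final URL.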


CRUDy nuts and bolts

Updating and deleting a cell value

curl:

curl -H "Accept: application/json" -X DELETE "$urlToHBase/ccartled%3Atab/row1/fam:a"
Followed by a curl PUT directive

HBase shell:

delete 'ccartled:tab', 'row1', 'fam:a'
Followed by a shell put directive


CRUDy nuts and bolts

Deleting a database

curl:

curl -v -X DELETE "$urlToHBase/ccartled%3Atab/schema"

HBase shell:

disable 'ccartled:tab'
drop 'ccartled:tab'


Words of explanation.

The full text is available at:
http://www.cs.odu.edu/~ccartled/Teaching/2015-Fall/NoSQL/Assignments/03/

In general terms:

1 Parse data

2 Create columnar database

3 Alter the database for new data

4 Query database

5 Create histograms


What have we covered?

Reviewed assignment #03

Covered HBase CRUDy stuff

Next time: continued CRUDy exploration


References I

[1] Douglas Adams, The ultimate hitchhiker’s guide, Wings Books, 1996.

[2] Fay Chang and Jeffrey Dean, Bigtable: A distributed storage system for structured data, ACM Transactions on Computer Systems (TOCS) 26 (2008), no. 2, 4.

[3] Nick Dimiduk, Apache HBase 1.0 release, http://www.slideshare.net/xefyr/apache-hbase-10-release, 2015.

[4] Mark Hicks, Mental breakthrough, http://school.discoveryeducation.com/clipart/clip/click.html, 2015.

[5] Eric Redmond and Jim R. Wilson, Seven databases in seven weeks, Pragmatic Bookshelf, 2012.