CS-695 NoSQL Database
MongoDB (part 2 of 2)

Dr. Chuck Cartledge

15 Oct. 2015


Table of contents I

1 Miscellanea

2 Assignment #4

3 DB comparisons

4 Extensions

5 Summary

6 Midterm

7 Conclusion

8 References


Corrections and additions since last lecture.

Assignment #04 is available

Corrected typos in lecture #007


Words of explanation.

The full text is available at:
http://www.cs.odu.edu/~ccartled/Teaching/2015-Fall/NoSQL/Assignments/04/

In general terms:

1 Parse data

2 Create document database

3 Update documents based on number of movies filmed in each city

4 Query database

5 Create list of 10 most used locations

6 Solve TSP with 10 cities

7 Plot route on a map

8 List locations and movies

9 What about when a popular location doesn’t have a city?
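Step 5 of the list above (the most used locations) can be sketched in plain JavaScript, independent of any database. This is only an illustration under stated assumptions: the record shape, the field names `movie` and `location`, and the sample values are hypothetical and are not taken from the assignment data.

```javascript
// Hypothetical records: one entry per (movie, filming location) pair.
const filmings = [
  { movie: "Movie A", location: "Harbor" },
  { movie: "Movie B", location: "Old Town" },
  { movie: "Movie C", location: "Harbor" },
  { movie: "Movie D", location: "Airport" },
  { movie: "Movie E", location: "Old Town" },
  { movie: "Movie F", location: "Harbor" },
];

// Count how many filmings each location has.
function countByLocation(records) {
  const counts = new Map();
  for (const { location } of records) {
    counts.set(location, (counts.get(location) || 0) + 1);
  }
  return counts;
}

// Return the n most used locations, most used first.
// Ties are broken alphabetically so the result is deterministic.
function topLocations(records, n) {
  return [...countByLocation(records)]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n)
    .map(([location, count]) => ({ location, count }));
}

console.log(topLocations(filmings, 2));
```

For the assignment itself, `topLocations(records, 10)` would give the candidate list for the TSP step, provided each record carries a usable location.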


How different DBs compare to an RDBMS

We have some terms to compare now [2]

RDBMS         K/V         Columnar     Doc.
DB. instance  cluster     cluster      instance
database      -           namespace    -
table         bucket      table        collection
row           key-value   row          document
rowid         key         -            _id
col.          -           col. fam.    -
schema        -           -            database
join          -           -            DBRef


Aggregations

“Aggregation pipeline gives you a way to transform and combine documents in your collection. You do it by passing the documents through a pipeline that’s somewhat analogous to the Unix pipe where you send output from one command to another to a third, etc.”

Karl Seguin [3]

General format of the aggregate function:

db.collection.aggregate([<stage>, ...])


Aggregations

A partial list of aggregation operators (see footnote 1):

$match: filters the stream
$limit: passes the first n docs
$skip: skips the first n docs
$group: groups docs based on _id and applies accumulator(s)
$geoNear: returns docs close to a location
$and, $or, $not: typical Boolean operators for arrays of comparison operators
$eq, $gt, $gte, $lt, $lte: usual math comparisons
$add, $subtract, $multiply, $divide, $mod: math operators
$cmp: compares two values and returns -1, 0, or 1
$ne: returns true when two values are not equal

Note: see how all operators start with a $.

1 http://docs.mongodb.org/manual/reference/operator/aggregation/
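The Unix-pipe analogy can be made concrete with a small plain-JavaScript sketch in which each stage is a function from an array of documents to a new array. This is not MongoDB's implementation or API: the helper names (`match`, `skip`, `limit`, `groupSum`, `aggregate`) and the `orders` documents are hypothetical, chosen only to mirror the roles of $match, $skip, $limit, and $group with a sum accumulator.

```javascript
// Each stage takes an array of documents and returns a new array,
// so a pipeline is just the stages applied left to right.
const match = (predicate) => (docs) => docs.filter(predicate);
const skip = (n) => (docs) => docs.slice(n);
const limit = (n) => (docs) => docs.slice(0, n);

// Group by a key function and sum one numeric field per group.
const groupSum = (keyOf, field) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    totals.set(key, (totals.get(key) || 0) + doc[field]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

// Feed the output of each stage into the next one.
const aggregate = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Hypothetical documents.
const orders = [
  { customer: "ann", status: "paid", amount: 10 },
  { customer: "bob", status: "paid", amount: 5 },
  { customer: "ann", status: "open", amount: 99 },
  { customer: "ann", status: "paid", amount: 7 },
];

// Paid totals per customer: ann totals 17, bob totals 5.
const totalsByCustomer = aggregate(orders, [
  match((d) => d.status === "paid"),
  groupSum((d) => d.customer, "amount"),
  limit(10),
]);
console.log(totalsByCustomer);

// Skip the first document, then pass the next two.
const window = aggregate(orders, [skip(1), limit(2)]);
console.log(window.map((d) => d.amount));
```

Because every stage has the same shape, stages can be reordered or dropped without touching the others, which is the property the pipe analogy is pointing at.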


Aggregations

Examples [1]:

// Total pay: salary plus bonus, minus the 401k contribution.
db.employees.aggregate([
    {
        "$project" : {
            "totalPay" : {
                "$subtract" : [{"$add" : ["$salary", "$bonus"]}, "$401k"]
            }
        }
    }
])

// Email address: first initial, a dot, the last name, and the domain.
db.employees.aggregate([
    {
        "$project" : {
            "email" : {
                "$concat" : [
                    {"$substr" : ["$firstName", 0, 1]},
                    ".",
                    "$lastName",
                    "@example.com"
                ]
            }
        }
    }
])
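To see what the two projections above compute for a single document, here is a plain-JavaScript equivalent that needs no database. The sample employee and its values are hypothetical; only the field names (`firstName`, `lastName`, `salary`, `bonus`, `401k`) come from the examples.

```javascript
// Hypothetical employee document using the field names from the examples.
const employee = {
  firstName: "Jane",
  lastName: "Doe",
  salary: 1000,
  bonus: 200,
  "401k": 150,
};

// Mirrors {"$subtract": [{"$add": ["$salary", "$bonus"]}, "$401k"]}.
const totalPay = (e) => (e.salary + e.bonus) - e["401k"];

// Mirrors {"$concat": [{"$substr": ["$firstName", 0, 1]}, ".", "$lastName", "@example.com"]}.
const email = (e) => e.firstName.slice(0, 1) + "." + e.lastName + "@example.com";

console.log(totalPay(employee)); // 1050
console.log(email(employee));    // J.Doe@example.com
```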


Indices

MongoDB has indices.

They can be created:
    db.unicorns.ensureIndex({name: 1});
Note: the 1 means ascending, -1 is descending.
Note: since MongoDB 3.0, ensureIndex() is a deprecated alias for createIndex().

They can be dropped:
    db.unicorns.dropIndex({name: 1});

They can be unique:
    db.unicorns.ensureIndex({name: 1}, {unique: true});

They can be on embedded fields, arrays, and compounded:
    db.unicorns.ensureIndex({name: 1, vampires: -1});
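The ordering implied by a compound key specification such as {name: 1, vampires: -1} can be illustrated with an ordinary sort comparator. This sketch shows only the ordering, not how MongoDB stores or searches an index; `comparatorFromSpec` and the sample documents are hypothetical.

```javascript
// Build a comparator from an index-style key specification such as
// {name: 1, vampires: -1}: 1 sorts ascending, -1 sorts descending.
function comparatorFromSpec(spec) {
  const keys = Object.entries(spec); // [[field, direction], ...] in declared order
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -1 * direction;
      if (a[field] > b[field]) return 1 * direction;
    }
    return 0; // equal on every indexed field
  };
}

// Hypothetical sample documents.
const unicorns = [
  { name: "Pilot", vampires: 54 },
  { name: "Aurora", vampires: 43 },
  { name: "Pilot", vampires: 99 },
  { name: "Dunx", vampires: 165 },
];

// Sort a copy by name ascending, then by vampires descending.
const ordered = [...unicorns].sort(comparatorFromSpec({ name: 1, vampires: -1 }));
console.log(ordered.map((u) => `${u.name}:${u.vampires}`).join(" "));
// prints "Aurora:43 Dunx:165 Pilot:99 Pilot:54"
```

The second key only matters among documents that tie on the first, which is why the two "Pilot" entries appear with the larger vampires value first.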


Indices

Special indices[5]:

Capped collections: collections act like circular queues

TTL indices for caches: documents older than TTL are purged (automatically)

Full-text indices: all strings are split, stemmed, and stored (heavyweight operations)

Geospatial (2d and 2dsphere): supports inclusion, intersection, and proximity

GridFS for large files: virtual file system, supports sharding, large files (2G block size)

Image from [4].
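The "circular queue" behavior of a capped collection can be sketched as a fixed-size ring buffer. This is a toy model under stated assumptions: the class name `CappedList` and its methods are hypothetical, and it caps by document count, whereas a real capped collection is bounded by a size in bytes with an optional document limit.

```javascript
// Toy ring buffer: once full, each insert overwrites the oldest document.
class CappedList {
  constructor(maxDocs) {
    if (!Number.isInteger(maxDocs) || maxDocs < 1) {
      throw new RangeError("maxDocs must be a positive integer");
    }
    this.maxDocs = maxDocs;
    this.slots = new Array(maxDocs);
    this.next = 0;  // slot that the next insert writes to
    this.count = 0; // number of documents currently stored
  }

  insert(doc) {
    this.slots[this.next] = doc;
    this.next = (this.next + 1) % this.maxDocs;
    this.count = Math.min(this.count + 1, this.maxDocs);
  }

  // Documents in insertion order, oldest first.
  find() {
    const start = this.count < this.maxDocs ? 0 : this.next;
    return Array.from(
      { length: this.count },
      (_, i) => this.slots[(start + i) % this.maxDocs]
    );
  }
}

const log = new CappedList(3);
for (const n of [1, 2, 3, 4, 5]) log.insert({ n });
console.log(log.find().map((d) => d.n)); // the two oldest entries were overwritten
```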


Indices

Classic (a.k.a. Hadoop) MapReduce

The classic MapReduce ecosystem operations:

1 Split the input file into chunks

2 Present each split to its own map function

3 Map function emits 0 or more <key, value> sets

4 Ecosystem collects all <key, value> sets

5 Ecosystem merges <key, values> sets

6 Ecosystem presents <key, values> sets to reducers

7 Ecosystem collects output

Image from [6].
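The seven steps above can be walked through with a single-process word count in plain JavaScript. It is a sketch of the data flow only: there is no distribution or parallelism here, and the function names (`split`, `mapFn`, `shuffle`, `reduceFn`, `mapReduce`) and the input text are hypothetical rather than part of Hadoop or MongoDB.

```javascript
// Step 1. Split the input into chunks (here: one chunk per non-empty line).
const split = (text) => text.split("\n").filter((line) => line.length > 0);

// Steps 2-3. Map: each chunk emits zero or more [key, value] pairs.
const mapFn = (chunk) =>
  chunk
    .toLowerCase()
    .split(/\W+/)
    .filter(Boolean)
    .map((word) => [word, 1]);

// Steps 4-5. Collect all pairs and merge the values that share a key.
function shuffle(pairs) {
  const merged = new Map();
  for (const [key, value] of pairs) {
    if (!merged.has(key)) merged.set(key, []);
    merged.get(key).push(value);
  }
  return merged;
}

// Step 6. Reduce: fold the list of values for one key into one result.
const reduceFn = (key, values) => [key, values.reduce((a, b) => a + b, 0)];

// Step 7. Driver: run the stages in order and collect the output.
function mapReduce(text) {
  const pairs = split(text).flatMap(mapFn);
  return [...shuffle(pairs)].map(([key, values]) => reduceFn(key, values));
}

const counts = new Map(mapReduce("to be or not\nto be"));
console.log(counts.get("to"), counts.get("be"), counts.get("or"), counts.get("not"));
// prints "2 2 1 1"
```

In a real cluster the map calls run on different nodes and the merge step moves data across the network; the sketch keeps the same stages but runs them in sequence.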


Indices

MongoDB MapReduce:


Strengths and weaknesses

Good and not so good

Strengths:

Handling huge amounts of data by replication and horizontal scaling

Very flexible data model

Ease of use (object oriented bent)

Weaknesses:

Discourages normalization

Items can be inserted anywhere (lack of schema)

May require large infrastructure


Applicabilities

Good for, and not so good for

Good fit:

Event logging, especially where the data changes rapidly

Anything that can communicate via JSON documents (CRM, publishing websites, user comments, profiles, web-facing documents, . . . )

Web analytics detailing transactions and changes

Intermediate database when transitioning from one DB to another

Not so good fit:

Anything requiring complex transactions spanning multiple documents

Anything that requires normalized data


General notes and ideas

The midterm will be about things we covered in class, including:

Information and ideas

Various lectures/presentations

Different database technologies

Open book, open notes (not open neighbor)

It will be about Sarah and Hasta. Sarah will have an idea for Hasta. Your task will be to discuss which DB technology to use and why.


What have we covered?

Reviewed assignment #04

Covered some of the MongoDB “querying” capabilities

Remember Assignment #04 due before next class

Next time: CouchDB


References I

[1] Kristina Chodorow, MongoDB: The Definitive Guide, O’Reilly Media, Inc., 2013.

[2] Eric Redmond and Jim R. Wilson, Seven Databases in Seven Weeks, Pragmatic Bookshelf, 2012.

[3] Karl Seguin, The Little MongoDB Book, http://github.com/karlseguin/the-little-mongodb-book, 2015.

[4] Alessandro Siletto, A quick start with MongoDB geospatial queries, http://www.siletto.it/blog/alessandro/2013/03/19/quick-start-mongodb-, 2013.

[5] MongoDB Staff, MongoDB Documentation, MongoDB Documentation Project, 2015.

[6] Tom White, Hadoop: The Lay of the Land, http://www.drdobbs.com/database/hadoop-the-lay-of-the-land/240150854, 2013.