15 - Sun - Ouwens

31
1 1 BIG DATA The Petabyte Age: More Isn't Just More — More Is Different wired magazine Issue 16/.07 Martien Ouwens Datacenter Solutions Architect Sun Microsystems

Transcript of 15 - Sun - Ouwens

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 1/31

1

1

BIG DATA

The Petabyte Age:More Isn't Just More —More Is Differentwired magazine Issue 16/.07

Martien OuwensDatacenter Solutions Architect

Sun Microsystems

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 2/31

Sun Confidential: Internal Only 2

Next Generation

Big Data, The Petabyte Age:

Relational

The biggest challenge of the PetabyteAge won't be storing all that data, it'll be

figuring out how to make sense of itwired magazine Issue 16/.07

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 3/31

Sun Confidential: Internal Only 3

Next Generation

Big Data Key Trends

Relational

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 4/31

Sun Confidential: Internal Only 4

“Hitting walls” in Processor DesignClock frequency

− frequency increases tapering off, in newsemiconductor processes

− high frequencies => power issuesMemory latency (not instruction execution speed)dominating most application timesProcessor designs for high single-threadperformance are becoming highly complex,therefore:

− expense and/or time-to-market suffer− verification increasingly difficult− more complexity => more circuitry => increased

power ... for diminishing performance returns

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 5/31

Sun Confidential: Internal Only 5

Memory BottleneckRelativePerformance

10000

1

1990 199520051980

1000

100

10

1985 2000

Gap

CPU FrequencyDRAM Speeds

C P U - - 2

x E v e r y 2 Y

e a r s

D RA M - - 2 x E v e r y 6 - 8 Y e a r s

Source: Sun World Wide Analyst Conference Feb. 25, 2003

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 6/31

Sun Confidential: Internal Only 6

How We Mask Memory Latency Today

L2 CacheL2 Cache(Left)(Left)

L2 TagsL2 TagsM C U

M C U

S I U S I U

SIUSIU

L2/L3 CntlL2/L3 Cntl

L1 IL1 I--CacheCache L1L1DD--CacheCache

D T L B

D T L B

ITLBITLB

P$P$

W$W$

Core 0Core 0

L2 CacheL2 Cache(Right)(Right)

L3 TagsL3 Tags

L1 IL1 I--CacheCache ITLBITLB

P$P$

W$W$

D T L B

D T L B

• of many different kinds• that only mask memory

latency• and execute no code• accessed one cache

line at a time• but require power and

cooling to all lines allthe time

Core 1Core 1

L1L1DD--CacheCache

Great Big CachesGreat Big Caches

Cache Logic Accounts for About 75% of the Chip AreaCache Logic Accounts for About 75% of the Chip Area

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 7/31

Sun Confidential: Internal Only 7

“Power Wall + Memory Wall + ILP Wall = BrickWall”

* “The Landscape of Parallel Computing Research: A View from Berkeley”, December 2006 (emphasis added)http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html

““In 2006, performance is a factor of three below the traditionalIn 2006, performance is a factor of three below the traditionaldoubling every 18 monthsdoubling every 18 monthsthat we enjoyed between 1986 and 2002.that we enjoyed between 1986 and 2002.

The doubling of uniprocessor performance may now takeThe doubling of uniprocessor performance may now take5 years5 years ..””**

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 8/31

Sun Confidential: Internal Only 8

Single ThreadedPerformance

Single Threading

Thread

Memory LatencyCompute

Time

C C CM M M

Up to 85% Cycles Spent Waiting for Memory

SPARC US-IV+ 25% 75%

Intel 15% 85%

Hurry Upand

WAIT

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 9/31

Sun Confidential: Internal Only 9

Comparing Modern CPU DesignTechniques

1 GHz

Time

ILP Offers Limited HeadroomTLP Provides Greater Performance Efficiency

CC MM CC MM CC MM CC MM ComputeMemory Latency

CC MM CC MM CC MM CC MM2 GHz

TLPCC MM

CC MMCC MM

CC MM

Time Saved

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 10/31

Sun Confidential: Internal Only 10

Core-1

Core-2

Core-3

Core-4

Core-5

Core-6

Core-7

Core-8Time

Chip Multi-Threading (CMT) Utilization: Up to 85%*

Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1Thread 4Thread 3Thread 2Thread 1

Latency

UltraSPARC-T1 85% 15%

Compute

Chip Multi-threaded

(CMT)Performance

* based on example of UltraSPARC T1

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 11/31

Sun Confidential: Internal Only 11

Enterprise Flash – Relative Perf

1 sec

16,7 min

11,5 dag

31,7 jr

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 12/31

Sun Confidential: Internal Only 12

Where to Store Data?Optimization Trade-Off

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 13/31

Sun Confidential: Internal Only 13

Data Retention Comparison

DRAM SSD HDD

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 14/31

Sun Confidential: Internal Only 14

ZFS Turbo Charges ApplicationsHybrid Storage Pool Data Management on Solaris Systems, OpenStorage

•ZFS automatically:•Writes new data to very fast SSDpool (ZIL)

•Determines data access patternsand stores frequently accesseddata in the L2ARC

•Bundles IO into sequential lazywrites for more efficient use of lowcost mechanical disks

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 15/31

Sun Confidential: Internal Only 15

Unique Capabilities from SunSun Storage 7000 Unified Storage Systems

• Radical simplification> Ready-to-go appliance – installs in minutes> Extensive analytics and drill-down for rapid

management of performance challenges

• Improved latency and performance> Integrated flash drives> Hybrid Storage Pools automates data

placement on the best media

• Compelling value through standards> Industry standard hardware and open source

software vs. proprietary solution

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 16/31

Sun Confidential: Internal Only 16

Sun Oracle Database Machine• Hardware by Sun, software by Oracle – Exadata Version 2

• Enhancements to Exadata Version 1 offering extreme performance> Faster servers, storage, and network interconnects> Flash acceleration> Speedup at Oracle level of 20 times processing IOPS> Extends Exadata to provide acceleration for OLTP

• Database aware storage> Smart scans> Storage indexes – eliminates a good deal of I/Os

• Massively parallel architecture• Dynamic linear scaling

• Offering mission critical availability and protection of data

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 17/31

Sun Confidential: Internal Only 17

• Current database deployments often have bottlenecks limiting the movement of data from

disks to servers> Storage Array internal bottlenecks on processors and Fibre Channel Architecture> Limited Fibre Channel host bus adapters in servers> Under configured and complex SANs

• Pipes between disks and servers are 10x to 100x too slow for data size

What Does Sun Oracle Database MachineAddress?

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 18/31

Sun Confidential: Internal Only 18

• Adding more pipes – Massively parallel architecture

• Make the pipes wider – 5X faster than conventional storage• Ship less data through the pipes – Process data in storage

• Simplify the storage offering – eliminate the need of a SAN

• Allow for Ease of Growth• Provide a Flash Cache for Rapid Response

Sun Oracle Database MachineSolves the Bottleneck by

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 19/31

Sun Confidential: Internal Only 19

HPC Market Trends• It’s a growing market,outpacing growthof many others

> $10B in 2008

> CAGR of 3.1%for 2007 to 2012*

• Clustering is nowthe dominantarchitecture• HPC technologiesnow within reachof all organizations

HPC is Underserved by Moore’s Law

* Source IDC (January 2009)

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 20/31

Sun Confidential: Internal Only 20

Sun Constellation SystemOpen Petascale Architecture

NetworkingCompute Storage Software

Ultra-DenseBlade Platform

• Fastest processors:AMD Opteron,Intel Xeon

• Highest computedensity

• Fastest host

channel adaptor

Ultra-DenseSwitch Solution

• 648 port InfiniBandswitch

• Unrivaled cablesimplification

• Most economicalInfiniBand cost/port

ComprehensiveSoftware Stack

• Integrated developertools

• Integrated Grid Engineinfrastructure

• Provisioning, monitoring,patching

• Simplified inventorymanagement

DEVELOPER TOOLS

GRID ENGINEPROVISIONING

Ultra-DenseStorage Solution

• Most economical andscalable parallel filesystem building block

• Up to 48 TB in 4RU• Up to 2TB of SSD• Direct cabling to

IB switch

Eco-Efficient Building Blocks

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 21/31

Sun Confidential: Internal Only 21

Sun Constellation SystemOpen Petascale Architecture

CompetitorsCablingInfrastructure

CompetitorsCompute

Clusters

ReducedCabling

R a c k s

C a b l i n

g

C o r e S w i t c h e s

L e a f S w i t c h e s

Radical Simplicity, Faster Time to Deployment

R a c k sSun

ConstellationSystemCluster

C a b l i n

g

•Alternative Open Standards Fabric• 300 switching elements• 6912 cables

• 92 racks

• 1 switching element300:1 reduction• 1152 cables6:1 reduction• 74 racks20% smaller footprint

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 22/31

Sun Confidential: Internal Only 22

Massive Low Latency Engine

• 3,936 Sun four-socket,quad-core nodes

• 15,744 AMD Opteron processors• Quad-core, four flops/cycle

(dual pipelines)

Compute Power —579 TFLOPS Aggregate Peak

Memory

• 72 Sun x4500 “Thumper”I/O servers, 24TB each

• 1.7 Petabyte total raw storage• Aggregate bandwidth ~32 GB/sec

Disk Subsystem

InfiniBand Interconnect

• 2 GB/core, 32 GB/node, 125 TB total

• 132 GB/s aggregate bandwidth

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 23/31

Sun Confidential: Internal Only 23

The HPC Data I/O Challenge

• Compute power is scaling rapidly> Larger compute clusters> More sockets per server > More cores per CPU

• Compute advances far outpace those in I/O> Storage latency 12 orders of magnitude higher

than compute

> Compute Side developed Parallel Approachessooner

• Need a better approach to feed compute

> Cluster performance is often limited by data I/O> Need a parallel I/O approach to help balance things

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 24/31

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 25/31

Sun Confidential: Internal Only 25

Unique Capabilities from SunSun Storage Cluster

• High performance and broad scaling> From a few to over 100 GB/second> From dozens to thousands of nodes

• Breakthrough economics, up to 50% savings> Built with standardized hardware and opensource software vs. proprietary products

• Simplifying Lustre for the mainstream> Pre-defined modules speed deployment ofparallel file system, available factory integrated

> Ease of use improvements – patchless clients,

integrated driver stack, LNET self-test, mount-based cluster configuration, many more

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 26/31

Sun Confidential: Internal Only 26

Sandia Red Storm340 TB Storage, 50GB/s I/O throughput,

25,000 clients

German Climate ResearchData Centre (DKRZ)

272 TB, Lustre and Storage Archive Manager,256 nodes with 1024 cores

Framestore“The Tale of Despereaux”

200TB Lustre file system – averaged 1.2GB/s in sustained reads,

peaks of 3GB/s, 5TB data generated per night from clusterof 4,000 cores interfacing with Lustre file system

TACC Ranger 1.73 PB storage, 35GB/s I/O throughput,

3,936 quad-core clients

Lustre in Action

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 27/31

Sun Confidential: Internal Only 27

Next GenerationBig Data Key Trends

Distributed data and processing(Dist. FS, MapReduce, Hadoop...)

RelationalCommoditisation(Open source, COTS)

Proprietary, dedicateddatawarehouse

OLTP is thedatawarehouse

Open source, generalpurpose datawarehouse

Object Store

Distributed FS Federated/Shared

Master/Master

Master/Slave

Unstructured Data StructuredData

Semi-structuredDatabase

ScaleDB, Big Table,SimpleDB hBase

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 28/31

Sun Confidential: Internal Only 28

Big DataMartien OuwensDatacenter Solutions [email protected]

Sun Microsystems Nederland B.V.

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 29/31

Slide 29

Typical BI/DW Components

Load

Transactionalsystems

Line ofbusinesssystems

S

our ce

s

Pentaho Pentaho

Sun Servers Sun Servers

ODS

Repository

Reporting

Data Warehouse

Infobright

Sun Storage

Analytics

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 30/31

Sun Confidential: Internal Only 3030

Smarter architectureLoad data and goNo indexes or partitionsto build and maintain, no tuning

Super-compact data foot- printleverages off-the-shelfhardwareMySQL wrapper connects to BIand ETL tools

Data Packs – data storedin manageably sized, highlycompressed data packs

Knowledge Grid – statistics andmetadata “describing” the super-compressed data which isautomatically updated as data packsare created or updated

Column Orientation

Infobright Optimizer iterates over theKnowledge Grid; only the data packsneeded to resolve the query aredecompressed

Column-OrientationExemplified by InfoBright's approach

8/13/2019 15 - Sun - Ouwens

http://slidepdf.com/reader/full/15-sun-ouwens 31/31

Sun Confidential: Internal Only 31

`employees` records stored in DataPacks of 64k elements each

SELECT COUNT(*) FROM employees

WHERE salary > 50000

AND age < 65

AND job = ‘Shipping’

AND city = ‘TORONTO’;

Using the Knowledge Grid, we verify, which packs are

• irrelevant (no values found that fit SELECT criteria)

• relevant (all values fit)

• suspect (some values fit)

Any row containing 1 or more irrelevant DP’s is ignored.

Since we are looking for COUNT(*) in this case we do notneed decompress relevant packs. COUNT information is keptin the Knowledge Grid in the Data Pack Nodes.

The suspect packs must be decompressed for detailedevaluation.

salary age job city

Relevant packs

Only this pack will bedecompressed

All packsignored

Rows 1 to65,536

65,537 to131,072

131,073 to …

All packs

ignored

All packsignored

InfoBright Query Execution