Grids – Achtergronden en praktijk in het EU Data Grid (Grids: Background and Practice in the EU DataGrid)

DataGrid is a project funded by the European Union. ICT KennisCongres 2003. Grids – Achtergronden en praktijk in het EU Data Grid. David Groep, NIKHEF, [email protected]. http://www.dutchgrid.nl/ http://www.eu-datagrid.org/ http://www.edg.org/ DutchGrid

Transcript of Grids – Achtergronden en praktijk in het EU Data Grid

Page 1: Grids –  Achtergronden en praktijk in het EU Data Grid

DataGrid is a project funded by the European Union ICT KennisCongres 2003

Grids – Achtergronden en praktijk in het EU Data Grid

David Groep, NIKHEF, [email protected]

http://www.dutchgrid.nl/

http://www.eu-datagrid.org/

http://www.edg.org/

DutchGrid

Page 2: Grids –  Achtergronden en praktijk in het EU Data Grid

Grid – a vision

The GRID: networked data processing centres and “middleware” software as the “glue” of resources.

Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data.

Scientific instruments and experiments provide huge amounts of data.

next: beyond distributed computing

Page 3: Grids –  Achtergronden en praktijk in het EU Data Grid

Beyond distributed computing

A grid integrates resources that

• are not owned or administered by one single organisation

• speak a common, open protocol that is generic

• work as a coordinated, transparent system

and that can be used by many people from multiple organisations, working together in one Virtual Organisation.

Checklist items based on: Ian Foster, “What is the Grid?”, July 2002

next: virtual organisations

Page 4: Grids –  Achtergronden en praktijk in het EU Data Grid

Virtual Organisations

A VO is a temporary alliance of stakeholders:

• Users

• Service providers

• Information providers

A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.

Viewgraph: Foster, Kesselman, Tuecke, the Globus Project

next: common and open protocols

Page 5: Grids –  Achtergronden en praktijk in het EU Data Grid

Common and open protocols

[Layered architecture diagram, top to bottom]

Applications

Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G

Grid Services: GRAM, GridFTP, Information, Replica, DBs

Grid Security Infrastructure (GSI)

Grid Fabric: farms, supers, desktops, TCP/IP, apparatus

• Resources must talk standard protocols …

• … for interoperability of application toolkits

next: protocol standards

Page 6: Grids –  Achtergronden en praktijk in het EU Data Grid

Standard protocols

New Grid protocols based on popular Web Services

Open Grid Services Architecture

service discovery

many different bindings

easily integrated in hosting environments (Java, WebSphere, .NET)

is entirely generic

adds: transient services, stateful services

Global Grid Forum (GGF) promotes the open standards process

next: access in a coordinated way

Page 7: Grids –  Achtergronden en praktijk in het EU Data Grid

Access in a coordinated way

New ‘qualities-of-service’

Transparent crossing of domain boundaries, satisfying constraints of

site autonomy

authenticity, integrity, confidentiality

single sign-on to all services

ways to address services collectively

preferably via portals and visual programming

next: example GOME analysis

Page 8: Grids –  Achtergronden en praktijk in het EU Data Grid

Example: GOME analysis

Task: ozone is the component in the atmosphere that protects us from harmful UV radiation. Its concentration varies widely. What is happening?

the EnviSat satellite is orbiting the earth and measuring light absorption in the atmosphere

the absorption is related to the ozone concentration, but needs instrument corrections

ground-based observations give absolute concentrations

linking both datasets can give us the concentration everywhere

terabytes of data come in at several ground stations, and various labs need the final products

Grid can provide a good solution to this problem

next: GOME analysis on the Grid, domains

Page 9: Grids –  Achtergronden en praktijk in het EU Data Grid

Example: Ozone Analysis on the Grid

[Figure: ozone analysis workflow; labels: raw data (bit stream), NOPREGO, OPERA, LIDAR database, validation, visualize, resource broker]

next: DataGrid overview

Page 10: Grids –  Achtergronden en praktijk in het EU Data Grid

A Working Grid: the EU DataGrid

Objective:

build the next generation computing infrastructure providing intensive computation and analysis of shared large-scale databases, from hundreds of TeraBytes to PetaBytes, across widely distributed scientific communities

official start in 2001

21 partners

in the Netherlands: NIKHEF, SARA, KNMI

Pilot applications: earth observation, bio-medicine, high-energy physics

aim for production and stability

next: history of grids

Page 11: Grids –  Achtergronden en praktijk in het EU Data Grid

Realising the Grid Vision

The Grid was the logical next step at the end of the 1990s:

• Harnessing desktop power became commonplace (1988: Condor; later: SETI@Home, Entropia, Distributed.NET)

• Peer-to-peer data access protocols emerged (1999: Napster; later: Gnutella, KaZaA, BitTorrent)

• Network access became extremely fast (1997: wide-area bandwidth starts to double every 9 months)

• 1997: Globus starts developing basic middleware (1996: middleware by Legion; 2000: Unicore)

• Massive take-up of the Grid vision in 1999, led in Europe by the EU DataGrid; others include NASA-IPG, CrossGrid, GridLab, PPDG, Alliance, …

next: the EU DataGrid project

Page 12: Grids –  Achtergronden en praktijk in het EU Data Grid

Grid Security Infrastructure

Crucial in Grid computing: it gives Single Sign-On

GSI uses a Public Key Infrastructure with proxy-ing and delegation

multiple VOs per user, groups and role support

Example proxy subject: C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy

[Diagram: VOMS flow; labels: authentication request, query to the AuthDB, VOMS pseudo-cert, connect to providers (Grid Service 1, Grid Service 2), contracts]

next: information services overview

VOMS overview: Luca dell’Agnello and Roberto Cecchini, INFN and EDG WP6
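The subject shown on this slide ends in CN=proxy, which is how legacy Globus proxy certificates extend the subject of the user's own certificate. As a minimal sketch (not GSI code, and assuming that attribute values contain no "/" characters), the following hypothetical helpers `parse_dn` and `end_entity_dn` recover the user's identity from such a proxy subject:

```python
def parse_dn(dn: str) -> list[tuple[str, str]]:
    """Split a slash-separated subject string into (attribute, value) pairs."""
    pairs = []
    for part in dn.split("/"):
        part = part.strip()
        if not part:
            continue  # tolerate a leading or doubled slash
        key, _, value = part.partition("=")
        pairs.append((key.strip(), value.strip()))
    return pairs


def end_entity_dn(proxy_dn: str) -> str:
    """Drop trailing proxy components to recover the user's own subject."""
    pairs = parse_dn(proxy_dn)
    # Legacy proxies append CN=proxy or CN=limited proxy, once per delegation step.
    while pairs and pairs[-1] in (("CN", "proxy"), ("CN", "limited proxy")):
        pairs.pop()
    return "/".join(f"{key}={value}" for key, value in pairs)


proxy_subject = "C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy"
print(end_entity_dn(proxy_subject))  # C=IT/O=INFN/L=CNAF/CN=Pinco Palla
```

A real service would verify the full certificate chain before trusting any subject string; this sketch only illustrates the naming convention that makes single sign-on with delegation possible.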

Page 13: Grids –  Achtergronden en praktijk in het EU Data Grid

What is needed to get the work done

• Fabric information: what are the resources (computers, disk, tape) available to my VO? How do I access these resources (the “contact information”)?

• “Physical” meta-data: when was this dataset written? Where can I find copies of it ‘close’ to me?

• Contextual meta-data or ‘information’: which datasets contain feature “X”? Which DNA sequence corresponds to this protein?

• Actual storage, processing power, network connectivity

next: spitfire

Page 14: Grids –  Achtergronden en praktijk in het EU Data Grid

Spitfire: Access to Databases

based on the common EDG Trust and Authorization Manager

VO and Role mapping to database views

Access via

Browser

Web Service

Commands

Screenshots: Gavin McCance, Glasgow University and EDG WP2
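The idea of mapping a VO and role to a database view can be illustrated with a small sketch. This is not Spitfire's interface: the table `runs`, the views, the `VIEW_FOR` mapping and the function `query_as` are all hypothetical, and SQLite stands in for whatever database sits behind the service.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE runs (run_id INTEGER, owner_vo TEXT, quality TEXT, raw_path TEXT);
    INSERT INTO runs VALUES (1, 'atlas', 'good', '/raw/1');
    INSERT INTO runs VALUES (2, 'atlas', 'bad', '/raw/2');
    -- Ordinary members see only validated runs and no storage paths.
    CREATE VIEW runs_public AS SELECT run_id, quality FROM runs WHERE quality = 'good';
    -- Administrators see everything.
    CREATE VIEW runs_admin AS SELECT * FROM runs;
""")

# Hypothetical (VO, role) -> view mapping; an unknown pair gets no access at all.
VIEW_FOR = {
    ("atlas", "user"): "runs_public",
    ("atlas", "admin"): "runs_admin",
}


def query_as(vo: str, role: str) -> list[tuple]:
    """Return the rows visible to a caller with the given VO and role."""
    view = VIEW_FOR.get((vo, role))
    if view is None:
        raise PermissionError(f"no view configured for VO={vo!r}, role={role!r}")
    # The view name comes from the fixed mapping above, never from user input.
    return conn.execute(f"SELECT * FROM {view} ORDER BY run_id").fetchall()


print(query_as("atlas", "user"))   # [(1, 'good')]
print(query_as("atlas", "admin"))
```

Keeping the policy in views means that the browser, web service and command-line front ends can all share one authorisation decision.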

next: R-GMA

Page 15: Grids –  Achtergronden en praktijk in het EU Data Grid

Grid information: R-GMA

Relational Grid Monitoring Architecture

a Global Grid Forum standard

Implemented by a relational model

used by grid brokers
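A relational model means that information providers publish rows and consumers, such as a broker, ask SQL questions. The sketch below imitates that pattern with an in-memory SQLite database; it does not use the R-GMA API, and the table `compute_element`, the site names and the CPU counts are made-up illustration data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE compute_element (site TEXT, free_cpus INTEGER, vo TEXT)")


def publish(site: str, free_cpus: int, vo: str) -> None:
    """Producer side: a site publishes one row of status information."""
    db.execute("INSERT INTO compute_element VALUES (?, ?, ?)", (site, free_cpus, vo))


def sites_with_free_cpus(vo: str) -> list[tuple[str, int]]:
    """Consumer side: ask which sites can currently run work for a VO."""
    query = (
        "SELECT site, free_cpus FROM compute_element "
        "WHERE vo = ? AND free_cpus > 0 ORDER BY free_cpus DESC"
    )
    return db.execute(query, (vo,)).fetchall()


publish("nikhef", 12, "atlas")
publish("sara", 0, "atlas")
publish("cern", 40, "cms")

print(sites_with_free_cpus("atlas"))  # [('nikhef', 12)]
```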

next: RLS and RMC

Screenshots: R-GMA Browser, Steve Fisher et al., RAL and EDG WP3

Page 16: Grids –  Achtergronden en praktijk in het EU Data Grid

Replica Location Service

Search on file attributes (date, name, …)

Find replicas on (close) Storage Elements

[Diagram: Storage Elements SE1 (SARA) and SE2 (CERN), a cache at UvA, and Compute Elements at DAS-2 and CERN]

ATLAS Replica Service:

higgs1.dat, ... → sara:atlas/data/higgs1.dat, cern:lhc/atlas/higgses/1.dat

higgs2.dat, ... → cern:lhc/atlas/higgses/2.dat

next: CE and RB, brokering and LCAS

Page 17: Grids –  Achtergronden en praktijk in het EU Data Grid

Compute Brokering: reliable execution

Users can delegate all job actions to the Resource Broker … and go away

Reliable scheduling of jobs over the entire grid (as seen from the R-GMA information system)

Users are roaming, and can retrieve their results anywhere, anytime
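One way to picture the brokering step is as matchmaking: filter the resources the information system reports by the job's requirements, then rank what is left. The sketch below is a generic illustration under that assumption, not the EDG Resource Broker's algorithm; the `Resource` fields, the ranking by free CPUs and the resource names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    free_cpus: int
    vos: frozenset
    max_wall_hours: int


def match(resources, vo, cpus, wall_hours):
    """Keep resources that satisfy the job, best (most free CPUs) first."""
    suitable = [
        r for r in resources
        if vo in r.vos and r.free_cpus >= cpus and r.max_wall_hours >= wall_hours
    ]
    return sorted(suitable, key=lambda r: r.free_cpus, reverse=True)


# Made-up snapshot of what an information system might report.
RESOURCES = [
    Resource("ce-a", 12, frozenset({"atlas"}), 48),
    Resource("ce-b", 40, frozenset({"atlas", "cms"}), 24),
    Resource("ce-c", 80, frozenset({"cms"}), 72),
]

ranked = match(RESOURCES, vo="atlas", cpus=8, wall_hours=12)
print([r.name for r in ranked])  # ['ce-b', 'ce-a']
```

Because the ranked list has more than one entry, a broker built this way could resubmit to the next candidate if the first one fails, which is one route to the reliable execution this slide describes.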

next: EDG test bed overview

Page 18: Grids –  Achtergronden en praktijk in het EU Data Grid

Current EU DataGrid Facilities

[Map: EDG and LCG sites, with core sites marked; labelled sites: CERN, Lyon, RAL, NIKHEF, CNAF, Tokyo, Taipei, BNL]

~1000 CPUs

~100 TByte storage

several key databases

~60 sites, ~600 users in ~7 VOs

next: using EDG, VisualJob

Page 19: Grids –  Achtergronden en praktijk in het EU Data Grid

Using the DataGrid for Real

next: Portals

Screenshots: Krista Joosten and David Groep, NIKHEF

Page 20: Grids –  Achtergronden en praktijk in het EU Data Grid

Portals

next: conclusions and outlook

Screenshots: ICES/KIS and WTCW: VLAM-G; INFN-GRID and EDG: Genius; NPACI: Rocks

Page 21: Grids –  Achtergronden en praktijk in het EU Data Grid

What more is there to see and do?

The current Grids are only the beginning!

portals will get more users on the Grid

more functionality, better resilience, strong reliability

joining the Grid will be as simple as joining a file-sharing network

EGEE: a pan-European Grid Infrastructure being created today

The EU DataGrid project web site: www.edg.org

DutchGrid Platform: www.dutchgrid.nl

For other grid projects, see www.gridstart.org and www.enterthegrid.com