Cluster Usage with jLSF

Talk in the Supercomputing working group (AK Supercomputing) at the ZKI Herbsttagung (autumn meeting) 2014 in Kaiserslautern

Dr. Markus Hillenbrand, RHRK, TU Kaiserslautern

Agenda

• Motivation

• Background Information

• Functionality Groups

  • Visualisation

  • Job Submission

  • Statistics

  • Analysis

  • Accounting

  • Administration

• Outlook to Future Features

MOTIVATION

Why we develop jLSF for our cluster users and admins

Motivation

• User Support

  • Many similar support inquiries from users

  • Users lack experience with the command line

  • Demand for graphical job submission

• Cluster administration

  • Many configuration files and separate tools to apply changes

  • Quick and easy job monitoring and job control

  • One tool for the daily administration routine

• Accounting

  • Accessing historical data in LSF is tedious

  • Fine-grained accounting is not available

  • Retrieving detailed resource usage for analysis is nearly impossible

BACKGROUND

Some information to understand jLSF’s architecture

Data Flow

[Diagram: data flow. Components shown: JSON, MySQL DB, Cluster, Web Page, jLSF, Accounting, LSF C API, LSF Log Files]
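The slide names the components of the data flow but the transcript does not preserve how they are connected. As a rough sketch of one plausible reading (accounting records stored in a relational database and handed on as JSON), the following Python example uses an in-memory SQLite database as a stand-in for MySQL. The table layout, field names, and sample records are hypothetical and are not taken from jLSF.

```python
import json
import sqlite3

# Hypothetical accounting records: (user, queue, job slots, runtime in seconds).
# These values and field names are illustrative only, not jLSF's real schema.
RECORDS = [
    ("alice", "normal", 8, 3600),
    ("bob", "short", 1, 600),
    ("alice", "short", 4, 1800),
]


def load_records(conn, records):
    """Create the (assumed) jobs table and insert the sample records."""
    conn.execute(
        "CREATE TABLE jobs (user TEXT, queue TEXT, slots INTEGER, runtime_s INTEGER)"
    )
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?)", records)


def core_seconds_per_user(conn):
    """Aggregate slots * runtime per user and serialise the result as JSON."""
    rows = conn.execute(
        "SELECT user, SUM(slots * runtime_s) FROM jobs GROUP BY user ORDER BY user"
    ).fetchall()
    return json.dumps({user: total for user, total in rows}, sort_keys=True)


# SQLite stands in for the MySQL database named on the slide.
conn = sqlite3.connect(":memory:")
load_records(conn, RECORDS)
payload = core_seconds_per_user(conn)
print(payload)
```

A client such as a web page or a desktop tool could then consume the JSON string without needing direct database access, which is one possible reason for having JSON in the flow.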

VISUALISATION

jLSF uses graphics instead of bare numbers

Jobs (Pie Chart)

Jobs (History Chart)

Job Slots (Pie Chart)

Job Slots (History Chart)

Hosts

File Systems (Usage)

File Systems (I/O)

Resources

Aggregated RAM Usage

Servers (Admins only)

JOB SUBMISSION

jLSF helps inexperienced users submit their jobs

Submit Batch and Interactive Jobs

Submit an Interactive (GUI) Job

Submit a Batch Job

Submit a Batch Job (Command Line)

hillenbr@head2 [~] jsub fluent/15.0 --help
usage: java de.unikl.rhrk.lsf.bsub.ui.JobSubmission fluent/15.0
    --dispatch-job-at <Dispatch Job At>
    --email <Send Email To>
    --graphics <GUI/Graphics>             'gu' | 'gr' | 'g'
 -h,--help                                show this help
    --hostmodel <Host Model>              e.g. CORE_i7_3770, XEON_E5345,...
    --hosts <Hosts>
    --jobdescription <Job Description>
    --jobname <Job Name>
    --jobslots <Job Slots>                e.g. 1, 2, 4, 8, 16, 32,...
    --journal <Journal File>              files with suffixes .jou, .log, .dat
    --memory <Memory in MB>               e.g. 512, 1024, 2048, 4096,...
    --mode <Mode>                         '2d' | '2ddp' | '3d' | '3ddp'
    --project <Project>
    --queue <Queue>                       e.g. idle, short, normal,...
    --rerun <Rerun Job>                   'default' | 'yes' | 'never'
    --reservation <Reservation>
    --resources <Resources>
    --walltime <Wall Time>                e.g. 0:30, 1:00, 6:00, 12:00,...
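To illustrate how the options in the help output might be combined, here is a minimal Python sketch that assembles a `jsub` command line and checks the three options for which the help text lists fixed values. The option names and allowed values are copied from the help output; the `--name value` calling convention, the helper `build_jsub_command`, and the journal file name `run.jou` are assumptions made for illustration. The sketch only builds the argument list and does not run `jsub`.

```python
import shlex

# Fixed value sets exactly as listed in the help output.
ALLOWED = {
    "graphics": {"gu", "gr", "g"},
    "mode": {"2d", "2ddp", "3d", "3ddp"},
    "rerun": {"default", "yes", "never"},
}

# Long option names exactly as listed in the help output.
KNOWN_OPTIONS = {
    "dispatch-job-at", "email", "graphics", "hostmodel", "hosts",
    "jobdescription", "jobname", "jobslots", "journal", "memory", "mode",
    "project", "queue", "rerun", "reservation", "resources", "walltime",
}


def build_jsub_command(application, options):
    """Return an argv list for jsub (hypothetical helper).

    Assumes each option is passed as '--name value'; the help text
    suggests this form but does not state it explicitly.
    """
    argv = ["jsub", application]
    for name, value in options.items():
        if name not in KNOWN_OPTIONS:
            raise ValueError(f"unknown option: --{name}")
        if name in ALLOWED and str(value) not in ALLOWED[name]:
            raise ValueError(f"invalid value for --{name}: {value!r}")
        argv += [f"--{name}", str(value)]
    return argv


# Example values are taken from the "e.g." hints in the help output,
# except the journal file name, which is made up.
argv = build_jsub_command("fluent/15.0", {
    "queue": "short",
    "jobslots": 8,
    "memory": 2048,
    "mode": "3ddp",
    "walltime": "1:00",
    "journal": "run.jou",
})
print(shlex.join(argv))
```

Validating the enumerated values before submission mirrors what a graphical front end does implicitly with drop-down lists, which is the kind of help for inexperienced users the slides describe.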

STATISTICS

jLSF presents current data in a user-friendly way

Host Load and Job Slot Usage

Queues

Users, Groups and Job Status

User Information

Jobs

ANALYSIS

jLSF helps analyse jobs and their behaviour

Job History

Job Analysis

ACCOUNTING

jLSF simplifies accounting

Project Accounting (NE)

Project Accounting (Jobs)

Project Accounting (Software)

Software Accounting

Job Type Accounting

Runtime Accounting

Core Usage Ranking

Memory per Core Usage Ranking

Projects Ranking

Software Ranking

ADMINISTRATION

jLSF helps admins simplify their work

LSF and other Configuration Files

FUTURE FEATURES

jLSF is being continuously improved

Outlook

• Remote usage (from outside the cluster)

  • Directly at the workstation (Linux, Windows, OS X via Java Web Start)

  • Simplified functionality provided by an Android app

• Visualisation of the racks

  • Graphical presentation of system and batch status

  • Direct access to BMC information and operations

  • Trigger maintenance operations (shutdown, reboot, re-install, etc.)

• Optimise database structures

  • Redesign of table layout

  • Updated / materialised views

• Support for other batch systems