UNIVERSITY OF OULU, P.O. Box 8000, FI-90014 UNIVERSITY OF OULU, FINLAND

ACTA UNIVERSITATIS OULUENSIS

SERIES EDITORS

A SCIENTIAE RERUM NATURALIUM: Professor Esa Hohtola
B HUMANIORA: University Lecturer Santeri Palviainen
C TECHNICA: Postdoctoral research fellow Sanna Taskila
D MEDICA: Professor Olli Vuolteenaho
E SCIENTIAE RERUM SOCIALIUM: University Lecturer Hannu Heikkinen
F SCRIPTA ACADEMICA: Director Sinikka Eskelinen
G OECONOMICA: Professor Jari Juga

EDITOR IN CHIEF

PUBLICATIONS EDITOR: Publications Editor Kirsti Nurkkala

ISBN 978-952-62-0211-2 (Paperback)
ISBN 978-952-62-0212-9 (PDF)
ISSN 0355-3213 (Print)
ISSN 1796-2226 (Online)

ACTA UNIVERSITATIS OULUENSIS
C TECHNICA 466

OULU 2013

Thomas Schaberreiter

A BAYESIAN NETWORK BASED ON-LINE RISK PREDICTION FRAMEWORK FOR INTERDEPENDENT CRITICAL INFRASTRUCTURES

UNIVERSITY OF OULU GRADUATE SCHOOL;
UNIVERSITY OF OULU, FACULTY OF TECHNOLOGY, DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING;
INFOTECH OULU;
UNIVERSITY OF LUXEMBOURG, FACULTY OF SCIENCE, TECHNOLOGY AND COMMUNICATION, COMPUTER SCIENCE AND COMMUNICATIONS RESEARCH UNIT, LUXEMBOURG;
PUBLIC RESEARCH CENTRE HENRI TUDOR, SERVICE SCIENCE & INNOVATION, LUXEMBOURG

PhD-FSTC-2013-23

The Faculty of Sciences, Technology and Communication

Faculty of Technology

DISSERTATION

Defense held on 14/10/2013 in Oulu, Finland

to obtain the degree of

DOCTEUR DE L’UNIVERSITÉ DU LUXEMBOURG EN INFORMATIQUE

and

TEKNIIKAN TOHTORI
DOCTOR OF SCIENCE (TECHNOLOGY)

by

THOMAS SCHABERREITER
Born on 05 August 1982 in Hartberg (Austria)

A BAYESIAN NETWORK BASED ON-LINE RISK PREDICTION FRAMEWORK FOR INTERDEPENDENT CRITICAL INFRASTRUCTURES

Dissertation defense committee

Dr. Madjid Merabti, Chairman
Professor, Liverpool John Moores University

Dr. Jari Veijalainen, Vice Chairman
Professor, University of Jyväskylä

Dr. Lea Kutvonen
Docent, University of Helsinki

Dr. Pascal Bouvry, dissertation supervisor
Professor, Université du Luxembourg

Dr. Juha Röning, dissertation supervisor
Professor, University of Oulu

Dr. Djamel Khadraoui, dissertation co-supervisor
Centre de recherche public Henri Tudor

ACTA UNIVERSITATIS OULUENSIS
C Technica 466

THOMAS SCHABERREITER

A BAYESIAN NETWORK BASED ON-LINE RISK PREDICTION FRAMEWORK FOR INTERDEPENDENT CRITICAL INFRASTRUCTURES

Academic dissertation to be presented with the assent of the Doctoral Training Committee of Technology and Natural Sciences of the University of Oulu for public defence in Auditorium TS101, Linnanmaa, on 14 October 2013, at 12 noon

UNIVERSITY OF OULU, OULU 2013

Copyright © 2013
Acta Univ. Oul. C 466, 2013

Supervised by
Professor Juha Röning
Professor Pascal Bouvry
Doctor Djamel Khadraoui

Reviewed by
Docent Lea Kutvonen
Professor Jari Veijalainen

ISBN 978-952-62-0211-2 (Paperback)
ISBN 978-952-62-0212-9 (PDF)

ISSN 0355-3213 (Printed)
ISSN 1796-2226 (Online)

Cover Design
Raimo Ahonen

JUVENES PRINT
TAMPERE 2013

Schaberreiter, Thomas, A Bayesian network based on-line risk prediction framework for interdependent critical infrastructures.
University of Oulu Graduate School; University of Oulu, Faculty of Technology, Department of Computer Science and Engineering; Infotech Oulu; University of Luxembourg, Faculty of Science, Technology and Communication, Computer Science and Communications Research Unit, Luxembourg; Public Research Centre Henri Tudor, Service Science & Innovation, Luxembourg
Acta Univ. Oul. C 466, 2013
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland

Abstract

Critical Infrastructures (CIs) are an integral part of our society and economy. Services like electricity supply or telecommunication services are expected to be available at all times, and a service failure may have catastrophic consequences for society or the economy. Current CI protection strategies date from a time when CIs or CI sectors could be operated more or less self-sufficiently and interconnections among CIs or CI sectors, which may lead to cascading service failures in other CIs or CI sectors, were not as omnipresent as today.

In this PhD thesis, a cross-sector CI model for on-line risk monitoring of CI services, called the CI security model, is presented. The model makes it possible to monitor a CI service risk and to notify services that depend on it of possible risks, in order to reduce and mitigate possible cascading failures. The model estimates CI service risk by observing the CI service state as measured by base measurements (e.g. sensor or software states) within the CI service components and by observing the experienced service risk of the CI services it depends on (CI service dependencies). CI service risk is estimated in a probabilistic way using a Bayesian network based approach. Furthermore, the model allows CI service risk prediction in the short-term, mid-term and long-term future, given a current CI service risk, and it makes it possible to model interdependencies (a CI service risk that loops back to the originating service via dependencies), a special case that is difficult to model using Bayesian networks. The representation of a CI as a CI security model requires analysis. In this PhD thesis, a CI analysis method based on the PROTOS-MATINE dependency analysis methodology is presented in order to analyse CIs and represent them as CI services, CI service dependencies and base measurements. Additional research presented in this PhD thesis is related to a study of assurance indicators able to perform an on-line evaluation of the correctness of risk estimates within a CI service, as well as of risk estimates received from dependencies. A tool that supports all steps of establishing a CI security model was implemented during this PhD research. The research on the CI security model and the assurance indicators was validated based on a case study, and the initial results suggest its applicability to CI environments.

Keywords: Bayesian networks, critical infrastructures, dependency, dynamic Bayesian networks, interdependency, modelling, monitoring, risk estimation, risk prediction, simulation

Schaberreiter, Thomas, A Bayesian network based risk prediction method for the continuous operational analysis of interdependent critical infrastructures.
University of Oulu Graduate School; University of Oulu, Faculty of Technology, Department of Computer Science and Engineering; Infotech Oulu; University of Luxembourg, Faculty of Science, Technology and Communication, Computer Science and Communications Research Unit, Luxembourg; Public Research Centre Henri Tudor, Service Science & Innovation, Luxembourg
Acta Univ. Oul. C 466, 2013
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland

Tiivistelmä (Finnish abstract)

This doctoral thesis presents a cross-sector model for the continuous operational risk modelling of critical infrastructures. The model makes it possible to notify interdependent services of possible hazards and thereby to stop or slow down failures that affect one another and accumulate. The model analyses critical infrastructure service risk by examining the state of the critical infrastructure service, as measured by base measurements (for example sensor or software states) among the service components of the critical infrastructure, and by observing the experienced service risk of the critical infrastructure services on which the service depends (critical infrastructure service dependencies). Critical infrastructure service risk is estimated probabilistically using Bayesian networks. In addition, the model enables the prediction of future risks in the short, medium and long term, and enables the modelling of interdependencies, which are generally difficult to represent in Bayesian networks. Representing a critical infrastructure as a critical infrastructure security model requires analysis. This thesis presents a critical infrastructure analysis method based on the PROTOS-MATINE dependency analysis methodology. Critical infrastructures are represented as critical infrastructure services, dependencies among services and base measurements. In addition, assurance indicators are studied that can be used to examine, during operation, the correctness of the risk analysis of a critical infrastructure service, as well as of risk estimates received from dependencies. A tool that supports all stages of implementing the critical infrastructure security model was developed during the research. The critical infrastructure security model and the correctness of the assurance indicators were validated with a proof-of-concept study, and the preliminary results indicate that the method works.

Keywords: Bayesian networks, dynamic Bayesian networks, critical infrastructure, modelling, monitoring, dependency, risk assessment, risk prediction, simulation, interdependency

Schaberreiter, Thomas, Continuous Bayesian network based risk prediction in interdependent critical infrastructures.
University of Oulu Graduate School; University of Oulu, Faculty of Technology, Department of Computer Science and Engineering; Infotech Oulu; University of Luxembourg, Faculty of Science, Technology and Communication, Computer Science and Communications Research Unit, Luxembourg; Public Research Centre Henri Tudor, Service Science & Innovation, Luxembourg
Acta Univ. Oul. C 466, 2013
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland

Kurzfassung (German abstract)

This doctoral thesis presents a cross-sector model for the continuous risk estimation of critical infrastructures during operation. The model makes it possible to inform services that depend on another service about possible dangers, and thereby to stop or minimize the spread of risks to other parts. With the model, dangers in a service can be analysed by monitoring continuous measurements (for example sensors or software states) and by monitoring dangers in the services that represent a dependency. Dangers are estimated probabilistically using a Bayesian network. In addition, the model allows the prediction of future risks in the short-term, mid-term and long-term future, and it allows the modelling of mutual dependencies, which are generally difficult to represent with Bayesian networks. To represent a critical infrastructure as such a model, an analysis of the critical infrastructure has to be carried out. In this thesis, this analysis is supported by the PROTOS-MATINE method for dependency analysis. In addition to the presented model, this thesis presents a study of indicators that can evaluate the confidence in the accuracy of a risk estimate. The study addresses both the evaluation of risk estimates within services and the evaluation of risk estimates received from services that represent a dependency. A software tool that supports all aspects of creating the presented model was developed. Both the presented model for estimating risks in critical infrastructures and the indicators for checking the risk estimates were validated by means of a feasibility study. Initial results suggest that these concepts are applicable to critical infrastructures.

Keywords: dependencies, Bayesian networks, dynamic Bayesian networks, interdependencies, critical infrastructures, modelling, risk estimation, risk prediction, simulation, monitoring

Acknowledgements

The four-year journey to my PhD has been an interesting one, and I was able to experience a rare constellation of supervision at three institutions in two different European countries. I would like to thank the CRP Henri Tudor, as well as the University of Luxembourg and the University of Oulu, for supporting the supervision of my PhD thesis in this way, which allowed me to benefit from the knowledge of three institutions with different research focuses. I would like to thank Dr. Djamel Khadraoui for his trust in me and my skills, which enabled me to start the journey to my PhD at CRP Henri Tudor in Luxembourg. In the same spirit, I would like to thank Professor Pascal Bouvry and Professor Juha Röning for their support and supervision, which allowed me to be a PhD student at both the University of Luxembourg and the University of Oulu. Special thanks go to my colleague and friend Christian Wieser from the University of Oulu, who always supported me in any organizational and administrative tasks related to the University of Oulu whenever I could not be physically present. I would also like to thank the Luxembourgish Fonds National de la Recherche (FNR) for funding my PhD studies (grant number PHD-09-103).

My sincere gratitude goes to the co-authors of the original articles this thesis is based on. Without their incredible knowledge and the general as well as specific discussions that were sparked by our collaboration, this thesis would have turned out much differently. Specifically, I would like to thank Filipe Caldeira, who lives in Portugal and whom I only met in person twice. Our collaboration on multiple publications was only possible due to the achievements of modern communication technology. The numerous fruitful and productive telephone calls we had over the years have shown me that distances and borders are irrelevant, and once more highlighted the advantages and importance of European collaboration.

I would also like to thank the partners of the EU-FP7 project MICIE, in which the basic ideas of this PhD thesis originated. The discussions within the MICIE project helped me to identify the research focus of this thesis. Furthermore, I would like to thank the Grid’5000 project for allowing me to validate my research based on a case study within their project.

Last but not least, I would like to thank my family, especially my parents and siblings, for always supporting me in my decision to study abroad. The most extensive acknowledgements go to the most important person in my life, my girlfriend Tiina, who had to suffer with me through all the ups and downs during my PhD studies and who was always there for me. Without her, I could not have finished this thesis. Thank you!

List of abbreviations

BN Bayesian Network
CI Critical Infrastructure
CIA Confidentiality, Integrity, Availability
CII Critical Information Infrastructure
CIIP Critical Information Infrastructure Protection
CIP Critical Infrastructure Protection
CPT Conditional Probability Table
CRUTIAL Critical utility infrastructural resilience
DBN Dynamic Bayesian Network
ICT Information and Communication Technology
IDS Intrusion Detection System
IMM Input-output inoperability model
IRRIIS Integrated Risk Reduction of Information-based Infrastructure Systems
MICIE Tool for systemic risk analysis and secure mediation of data exchanged across linked CI information infrastructures
OrBAC Organization-based Access Control
OUSPG Oulu University Secure Programming Group
SCADA Supervisory Control and Data Acquisition
SLA Service Level Agreement
UML Unified Modelling Language

List of original articles

This thesis is based on the following original articles. Within the thesis report, they are referred to by the Roman numerals [I]-[VII].

I Schaberreiter T, Aubert J & Khadraoui D (2011) Critical infrastructure security modelling and RESCI-MONITOR: A risk based critical infrastructure model. In: IST-Africa Conference Proceedings: 1–8.

II Schaberreiter T, Kittilä K, Halunen K, Röning J & Khadraoui D (2011) Risk assessment in critical infrastructure security modelling based on dependency analysis (short paper). In: 6th international conference on critical information infrastructure security (CRITIS 2011): 213–217.

III Schaberreiter T, Bouvry P, Röning J & Khadraoui D (2012) A Bayesian network based critical infrastructure model. In: EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation II: 207–218.

IV Schaberreiter T, Bouvry P, Röning J & Khadraoui D (2013) Support tool for a Bayesian network based critical infrastructure risk model. In: EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation III: 53–75.

V Schaberreiter T, Varrette S, Bouvry P, Röning J & Khadraoui D (2013) Dependency analysis for critical infrastructure security modelling: A case study within the Grid’5000 project. In: Multidisciplinary Research and Practice for Information Systems, IFIP International Cross Domain Conference and Workshop on Availability, Reliability and Security (CD-ARES 2013): 269–287.

VI Schaberreiter T, Caldeira F, Aubert J, Monteiro E, Khadraoui D & Simones P (2011) Assurance and trust indicators to evaluate accuracy of on-line risk in critical infrastructures. In: 6th international conference on critical information infrastructure security (CRITIS 2011): 30–41.

VII Caldeira F, Schaberreiter T, Varrette S, Monteiro E, Simones P, Bouvry P & Khadraoui D (2013) Trust based interdependency weighting for on-line risk monitoring in interdependent critical infrastructures. International Journal of Secure Software Engineering (IJSSE) 4(4).

Discussion of the author’s contribution

The majority of the work presented in [I] was written by the author of this PhD thesis, based on his contributions to [1–3] (substantial contributions were made in formalizing the graph based representation of the CI security model using CI services, dependencies and base measurements). The author would like to thank Jocelyn Aubert, who played a major role in formalizing the basic concepts of CI security modelling and who contributed the weighted sum method for CI service risk estimation. The main contribution relating to RESCI-MONITOR was to formalize the architecture of the tool in a way that makes the CI security model deployable in complex and multi-stakeholder environments. The author would like to thank Jocelyn Aubert for the actual implementation of the tool. The author would also like to thank Djamel Khadraoui for important comments and reviews.

The main contributions of the author of this PhD thesis in [II] are the evaluation of the applicability of the PROTOS-MATINE method for establishing a CI security model in the context of complex, multi-stakeholder CI environments and the adaptation of the PROTOS-MATINE methodology to be used for CI decomposition (identification of CI services, base measurements and CI service dependencies). The author thanks Kati Kittilä and Kimmo Halunen for contributing their substantial knowledge of PROTOS-MATINE to this work and helping to formalize the resulting CI dependency analysis in the context of the CI security model. The author would also like to thank all other authors for important comments and reviews.

The research presented in [III] and [IV] was entirely performed by the author of this PhD thesis. The author thanks all other authors for important comments and reviews.

The main contribution of the author of this PhD thesis to the work presented in [V] was conducting the case study throughout all its phases, from the identification of CI security modelling elements on all decomposition levels to conducting interviews with CI experts and reviewing all provided information sources. The author thanks Sébastien Varrette, who is one of the experts within Grid’5000, for his cooperation and for providing his expertise in all areas of Grid’5000, as well as for contributing the description of the Grid’5000 infrastructure in the article. The author thanks all other authors for important comments and reviews as well.

The contribution of the author of this PhD thesis to the work presented in [VI] was the evaluation of possible assurance indicators and how they can be applied to the CI security model. Specifically, the base measurement assurance, which is seen as a continuation of the initial ideas on assurance in the context of the CI security model presented in previous publications, is a contribution of the author. The author thanks Jocelyn Aubert for his initial effort of highlighting the need for assurance in the context of CI service risk estimates. The author of this PhD thesis contributed the work on the CI security model to this publication, while Filipe Caldeira contributed his work on trust and reputation systems. The adaptation of the risk alert trust indicator and the behaviour trust indicator to the CI security model, based on the trust and reputation system by Filipe Caldeira, is to be seen as an equal contribution of the author of this PhD thesis and Filipe Caldeira, resulting from many discussions that enabled us to combine the CI security model and the trust and reputation system to obtain the presented assurance indicators. The author also thanks all other authors for important comments and reviews.

The author of this PhD thesis contributed the work on the CI security model in [VII], while Filipe Caldeira contributed his work on trust and reputation systems. The evaluation of how CI service risk alerts received from dependencies can be evaluated for correctness, and of how risk alert trust and behaviour trust can be adapted for this purpose, is to be seen as an equal contribution of the author of this PhD thesis and Filipe Caldeira, resulting from many discussions that enabled us to combine the CI security model and the trust and reputation system to evaluate the correctness of CI service risk alerts received from dependencies. Another contribution of the author was the evaluation of validation possibilities within the Grid’5000 project and the definition of the validation scenario. The author would like to thank Filipe Caldeira for contributing to the definition of the validation scenario and carrying out the experiments of the case study. The author would also like to thank Sébastien Varrette for contributing his expertise on Grid’5000 and introducing it in the article. The author would like to thank all other authors for important comments and reviews.

Remarks on selected publication forums

CRITIS, the International Conference on Critical Information Infrastructure Security, was chosen to publish [II] and [VI] and is, to the best of the author's knowledge, the only European conference focused on critical infrastructure protection (CIP). It unites experts from various research communities and disciplines involved in CIP. Publication in this forum presents an excellent opportunity to expose the work of this PhD to researchers and experts, as well as to stakeholders concerned with CIs and CIP.

The EVOLVE conference, which was chosen to publish [III] and [IV], is a unique conference which tries to build a bridge between probability, statistics, set oriented numerics and evolutionary computing and unites researchers from domains like computer science, mathematics, statistics and physics. This forum provides an excellent opportunity to expose the Bayesian network based component of this PhD thesis to a cross-domain audience and to evaluate the applicability and validity of the approach based on each domain’s unique viewpoint.

Contents

Abstract
Tiivistelmä
Kurzfassung
Acknowledgements
List of abbreviations
List of original articles
Contents
1 Introduction
  1.1 Thesis organization
  1.2 Definition of terms
  1.3 Research hypothesis, research requirements and research question
  1.4 Thesis overview
  1.5 Overview of research results
2 State-of-the-art analysis
  2.1 Critical infrastructure definitions
  2.2 Critical infrastructure dependency/interdependency
    2.2.1 Identifying, Understanding, and Analysing CI Interdependencies
    2.2.2 A generalized modelling framework to analyse interdependencies among infrastructure systems
  2.3 Critical infrastructure modelling and simulation
    2.3.1 The IRRIIS information model and simulation of interdependent critical infrastructures
    2.3.2 Critical utility infrastructural resilience (CRUTIAL)
    2.3.3 MICIE
    2.3.4 Graph Models of Critical Infrastructure Interdependencies
    2.3.5 Holistic-reductionistic model
    2.3.6 Conceptual modelling
    2.3.7 Critical infrastructure modelling using Petri nets
    2.3.8 Visual models of CIs
    2.3.9 Modelling CIs using genetic algorithms
    2.3.10 Spatio-temporal model
    2.3.11 Specific CI models
    2.3.12 Federated Agent-based Modelling and Simulation
    2.3.13 System dynamics, IDEF0 functional modelling and non-linear optimization
    2.3.14 Agent based modelling and simulation
    2.3.15 Intelligent agents reasoning about CI state
  2.4 Risk in critical infrastructures
    2.4.1 Risk Management for Critical Infrastructure Protection (CIP): Challenges, Best Practices and Tools
    2.4.2 Risk assessment in complex interacting infrastructures
    2.4.3 A Markov Game Theory-based Risk Assessment Model for Network Information System
    2.4.4 Multi-sensor Real-time Risk Assessment using Continuous-time Hidden Markov Models
    2.4.5 Knowledge-Based Framework for Real-Time Risk Assessment of Information Security Inspired by Danger Model
    2.4.6 Hierarchical, model-based risk management of CIs
    2.4.7 Risk Filtering, Ranking, and Management Framework Using Hierarchical Holographic Modelling
    2.4.8 Risk Management for Leontief-Based Interdependent Systems
    2.4.9 Using graph models to assess vulnerability in CI
    2.4.10 Security-oriented cyber-physical state estimation
    2.4.11 Risk modelling of interdependencies in CIs
    2.4.12 Operational support for CI security
    2.4.13 SERSCIS project
  2.5 Dependency analysis in complex systems
    2.5.1 A Case for Protocol Dependency
    2.5.2 Graphingwiki - a Semantic Wiki extension for visualizing and inferring protocol dependency
    2.5.3 Software Vulnerability vs. Critical Infrastructure - a Case Study of Antivirus Software
    2.5.4 Socio-technical Security Assessment of a VoIP System
  2.6 Analysis of the related work
3 Contribution
  3.1 Methodology introduction and dependency analysis
  3.2 Methodology refinement
  3.3 Assurance indicators
  3.4 Validation
    3.4.1 Grid’5000 case study implementation and experimentation
4 Discussion and Conclusion
References
Original articles

1 Introduction

Critical infrastructures (CIs) are the lifeline of our society and economy, enabling and providing essential services like electricity supply, transport or telecommunication. CIs are complex interacting systems on a national or international level owned by different stakeholders. Additionally, CIs and CI sectors may depend on the services provided by other CIs or CI sectors, and a failure in one CI can cascade and cause multi-sector CI failures. A review of CI definitions from different nations and international organizations is presented in Section 2.1.

In recent years, research related to critical infrastructure protection (CIP) has advanced after the lack of proper strategies in many aspects related to CIP was recognized. For example, CIs are usually seen as self-contained environments, and protection strategies are put in place by CI operators without taking the complex interactions with other CIs or CI sectors into account. CI stakeholders are reluctant to introduce cross-sector protection strategies or protection strategies involving other CI stakeholders from the same CI sector, mainly because doing so would involve increased information sharing, going against business interests in a possibly competitive environment. Another recent development related to CIP is the threat of deliberate attacks against CIs. A major concern is the increasing connectivity of critical information infrastructure (CII) to the Internet, which increases the attack surface and enables a new class of targeted attacks against CIs without the attacker’s physical presence. Current CI protection strategies usually consider CI availability as the main concern. Due to new attack types, modern CI protection strategies need to provide a more comprehensive approach to CI security and include other parameters, like data confidentiality or CI integrity, to be able to address risks of any kind.

CI modelling and simulation is a powerful tool for analysing important aspects related to CI behaviour and for highlighting the complex dependencies and interdependencies (mutual dependencies) within CIs and among CI sectors. CI models are generally used to identify weaknesses and thus to identify and mitigate risks. Some models are designed to model the structural aspects of CIs and are used to identify off-line risks, while other CI models try to capture the dynamic behaviour of CIs during operation and thus identify the on-line risk. However, the success of CI models highly depends on understanding the problems specific to CI environments and on building models that provide solutions to those problems and take the interests of all involved stakeholders into account. CI specific modelling problems include the complexity of CIs, the diversity of CI sectors, the dependency among CIs and the lack of information sharing due to the multi-stakeholder environment.

In this PhD thesis, based on the analysis of general problems in the field of CI modelling, a cross-sector CI model, called CI security model, is proposed. The CI security model aims at providing on-line risk monitoring of CI services and addresses the CI specific modelling challenges presented above. The model is based on observing the state of components used to provide a CI service, as well as on observing the services it depends on. CI service risk is estimated from those observations in a probabilistic way using Bayesian networks (BNs). Furthermore, the CI security modelling framework includes a CI dependency analysis method and allows risk prediction based on dynamic Bayesian networks (DBNs) as well as on-line re-evaluation and validation of risk estimates. A proof-of-concept validation of the CI security model in the context of a real-world case study is presented.

The research for this thesis followed this process:

– Analysis: Understanding the research area and its specifics. The main sources for the analysis were a detailed state-of-the-art analysis from various research fields related to CIs, as well as the EU-FP7 project MICIE. An overview of this analysis and an introduction to the MICIE project can be found in Chapter 2.

– Hypothesis: Based on the analysis of the research area, a hypothesis as well as modelling requirements and a research question were formalized. The results of this step can be found in Section 1.3.

– Methodology: Based on the hypothesis, a methodology was formalized. An overview of this methodology can be found in Section 1.4.

– Validation: To evaluate the feasibility of the proposed methodology, a case study based validation was conducted. An overview of the results of this thesis can be found in Section 1.5.

1.1 Thesis organization

The remainder of this PhD thesis is organized as follows: In Section 1.2, a definition of the terms used is presented. Section 1.3 introduces the research hypothesis, the model requirements and the research question concerning this PhD thesis and Section 1.4 presents an overview of the purpose of this research and the proposed methodology, while Section 1.5 gives an overview of the results of this thesis, as well as the estimated impact on CIP. In Chapter 2, the state-of-the-art in CI research is discussed by presenting related work and in Section 2.6 the related work is analysed for its relevance to this PhD thesis. Chapter 3 introduces the contribution of this PhD thesis by discussing the original articles this thesis is based on. The original articles are complemented by unpublished work related to validation in Section 3.4.1. Chapter 4 concludes the introductory part of the thesis by discussing the CI security model, as well as the initial modelling results.

Finally, the collection of the original articles this PhD thesis is based on is included. The research presented in [I] introduces the general modelling framework that forms the basis for this thesis and sets the context for the subsequent research. This work deals with the analysis of the context as well as the formulation of the methodology. The research presented in [II] introduces a dependency analysis method that makes it possible to represent CIs as CI security models. This paper formalizes part of the methodology of this PhD research. [III] introduces Bayesian network based risk estimation and risk prediction in the context of the CI security model. It represents part of the methodology of this PhD research. In [IV], a tool that implements the methodology is introduced. This is seen as part of the validation of the presented methodology. [V] presents a case study that validates the feasibility of the dependency analysis method presented in [II] and, based on the results of this case study, Section 3.4.1 presents unpublished work validating the Bayesian network based risk estimation presented in [III], using the tool presented in [IV]. This work is part of the validation of the presented methodology. The research presented in [VI] introduces assurance indicators to evaluate the accuracy of internal risk estimates and validates their applicability based on simulation. This work contributes to the methodology of this PhD thesis as well as to the validation. The research presented in [VII] introduces assurance indicators to evaluate the accuracy of risk estimates received from external sources and validates them based on a case study. This research contributes to the methodology of this PhD thesis as well as to the validation.

1.2 Definition of terms

In this Section, some terms used in this PhD thesis are defined to avoid ambiguous interpretation. In CI research, there is no generally accepted definition of some fundamental concepts and the interpretation of those concepts varies. The definition of terms in this Section should clarify how they are interpreted within this work. Furthermore, some terms were introduced during the work on this PhD thesis and this Section is used to define their meaning.

– Critical infrastructure (CI): While an infrastructure can be seen as a system of man-made artefacts controlled and operated by humans, a CI is an infrastructure that is so vital to society and economy that a disruption or destruction would have severe consequences. The term critical infrastructure is defined at the level of nation states or international organizations. A more detailed review of CI definitions from different nations or international organizations can be found in Section 2.1. A CI sector is the general classification of CIs into the services they provide to society (e.g. transport or energy).

– CI service: A CI service in this work is defined as a service a CI provides to consumers. A consumer can be a private customer, but in the model proposed in this thesis, which is concerned with monitoring CI services and dependencies, the private customer does not play a significant role. Consumers of CI services in this model are other CI services, either from within a CI or from other CIs or CI sectors. The concept of a CI service in this work is quite flexible: any system or component utilized within a CI can be defined as a CI service that provides its service to another CI service. The term subservice is used in this thesis to refer to services that a CI service uses to provide its service.

– CI service risk: Risk is a general term allowing various different interpretations, and the interpretation often depends on the context. For example, some might see the risk of financial loss as the main aspect, while others see the risk of failure as more important. A common definition states that a threat is a potential loss or damage to an asset and risk is a measure of the frequency of threat occurrence. In the CI security model, the term risk denotes the deviation of a CI service state from normal operation, as observed by system measurements and dependency risk states. Risks that are taken into account are the risk of loss of confidentiality, the risk of loss of integrity and the risk of degrading availability (CIA)¹. The term risk propagation is used in this thesis to illustrate that a risk in one CI service can spread to other CI services that depend on it.

¹ While the methodology formalized in this PhD thesis considers CIA risk indicators, the case study based validation only considers the risk of degrading availability (A). The only deviation in constructing different risk indicators is in the CI analysis phase: confidentiality (C) or integrity (I) dependencies are analysed instead of A dependencies. While the author recognizes that in many complex environments A related aspects are easier to analyse than C or I related aspects, it is assumed that the validation of the applicability of the A risk indicator is sufficient to infer the applicability of other risk indicators as well.

– Dependency: In the CI security model, a dependency is given if a CI service requires the services of another CI service, expressed by a risk relationship. A dependency is given if a CI service influences the CIA risk of another CI service in any way.

– Interdependency: In this work, an interdependency is given if two or more CI services mutually depend on each other, for example if an incident (and thus CIA risk) in one CI service cascades back to the originating service via other CI services.

– Base measurement: A base measurement is defined as an observable measurement within a CI that can be used to determine the on-line state of a CI service.

– Assurance: Assurance in the CI security model provides a level of confidence in the correctness of a calculated CI service risk.

1.3 Research hypothesis, research requirements and research question

In this Section, the research hypothesis and the identified requirements for building a model for on-line monitoring in the context of CI environments are discussed. Some of the requirements are specific to CI environments, while others can be seen as more general modelling requirements that are deemed to be important in the CI modelling context. The basis for the formalization of the hypothesis and the requirements was an analysis of the context of CI environments, based on a state-of-the-art analysis, as well as discussions during the EU-FP7 project MICIE, in which the author of this PhD thesis was involved. The state-of-the-art analysis is presented in Chapter 2, the MICIE project is introduced in Section 2.3.3. During the analysis phase, especially during discussions related to the MICIE project, a strong need for CI monitoring that includes dependencies was uncovered, but the CI specific modelling challenges discussed in the introduction make building such a model a non-trivial task. This analysis leads to the following hypothesis: “On-line monitoring of CIs that includes dependencies can be simplified if risk is used as a modelling base, given the modelling challenges of CI complexity, CI diversity, CI dependency/interdependency and the lack of information sharing”. The requirements to build such a model in the context of CI environments are listed below.

Starting from the identified requirements, a research question, which is the basis for the research presented throughout this thesis, is presented at the end of this Section.


– R1: The complexity of CIs, which can operate on a national or even international level, provides a major challenge for CI modelling. A CI model needs to include enough parameters to obtain useful and accurate modelling results, while at the same time providing enough abstraction to keep the model manageable in both human and computational effort. Furthermore, complex environments like CIs are composed of various different components and a model has to be able to represent them in an adequate way, while avoiding too many detailed component models.

– R2: The diversity of CI sectors presents a challenge in building cross-sector CI models. Cross-sector CI models are useful to provide insight into the complex dependencies among CI sectors; however, it is hard to find common modelling parameters that are valid for all CI sectors. Those models usually use a high level of abstraction (e.g. financial transactions or the flow of goods), but actual information about CI components or the on-line state of those components is not integrated into the model because of the challenge of providing a uniform representation in the cross-sector model. A model for on-line monitoring of dependent CIs has to present the data in a uniform way.

– R3: The dependencies and interdependencies among CI components and CI sectors present a major challenge for CI modelling. Dependencies among components can occur in any part or organizational level of a CI. Another class of dependencies, even harder to identify, are dependencies among CIs and CI sectors that belong to different stakeholders. A failure in one CI or CI component can cascade through dependencies to other parts of a CI or to other CIs/CI sectors and can even cascade back to the originating component via an interdependency. A CI dependency model needs to be flexible enough to capture dependencies at all technical and organizational levels within a CI, as well as cross-sector dependencies and interdependencies.

– R4: Information sharing, or the lack of information sharing among CIs, is seen as a major obstacle to building a cross-sector CI model, especially if the modelling goal is on-line monitoring, which involves the constant exchange of information about the current CI state. CI operators are reluctant to share information with external partners and possible competitors for confidentiality reasons. A CI model has to facilitate information sharing among CIs, either by providing an environment that governs information sharing by distinct policies and/or by minimizing the amount of confidential information to be shared.

– R5: In a CI environment, information about the CI is available in many different forms (e.g. manuals, data records, expert knowledge). A CI model should utilize all available information sources to provide an accurate model. In CI environments, it is assumed that extensive and deep expert knowledge is available. A CI model should acknowledge this by allowing expert input at any modelling phase.

– R6: Models are a reflection of real-world behaviour, but cannot fully capture all complex real-world relations and will always lose accuracy. This is in line with a famous quote by George Edward Pelham Box, who said “Essentially, all models are wrong, but some are useful”. In CI models, especially in CI models concerned with on-line monitoring (which might be used as decision support tools), wrong decisions due to faulty model output can have catastrophic consequences. Bearing in mind that fully accurate modelling output is unrealistic, a CI model has to reflect the uncertainty of modelling results so that the accuracy of results can be put in context and more informed decisions can be reached.

– R7: A CI model concerned with CI state monitoring can be of great value to a CI operator. CI operators are not only interested in the current CI state, but also in how the CI state evolves over time (e.g. after an incident). A CI model should be able to predict the future CI state, given the current CI state.

– R8: A general modelling problem is to verify the validity of modelling results. This is especially relevant for a dynamic model, for example CI state monitoring, where the output changes over time. A model should be able to provide an estimate of confidence in the provided result.

After identification of the model requirements, the following research question can be posed: “Is it feasible to build a model for on-line CI risk monitoring, fulfilling the requirements R1-R8?”

1.4 Thesis overview

The research conducted in this thesis tries to establish a risk-based on-line CI monitoring framework that includes internal (within the CI) as well as external (to other CIs/CI sectors) dependencies. The core of this framework is a CI security model that contains, for each CI service, a list of its dependencies (other CI services or base measurements) as well as a risk probability for each dependency state combination. The intended users of this framework are CI operators, who receive a monitoring component that gives a holistic overview of the system’s state. Depending on the state of each dependency, the associated risk level is visualized. Since actual deployment in CIs is unrealistic at this point, this thesis focuses on simulation or emulation of risk scenarios that can help CI operators to identify high-risk scenarios cascading through external and internal components. Such scenarios cannot be identified by monitoring tools that do not include dependencies; identifying them allows CI operators to deploy risk mitigation strategies. The remainder of this Section presents a detailed overview of the thesis and outlines how the research question, as well as the model requirements stated in Section 1.3, were addressed.

At the core of the thesis, a service-oriented, risk-based approach is used to model CIs. This approach was chosen to overcome the diversity of CIs (R2). CI services represent an interesting modelling entity since all CIs are service providers and thus CI services represent a common modelling base. For the same reasons risk, or in this context CI service risk, was selected as an indicator to model the dynamic or on-line behaviour of CI services. All CIs are concerned about the risk (e.g. service degradation or service failure) experienced by the services they provide. Risk is estimated from base measurement states and the CI service risk states of dependencies. In this thesis, risk is represented by the three risk indicators CIA (risk of loss of confidentiality, risk of loss of integrity and risk of degrading availability). Those indicators were chosen since they are commonly accepted and well established metrics in information security and are also valid in the context of system security, like modelling CI service risk. Although there are other information security indicators like authenticity or non-repudiation, the CIA indicators were selected since they are the most recognized and are seen to be the most useful for indicating on-line CI service risk. However, the model proposed in this thesis is designed for flexibility and if a CI operator is interested in additional risk indicators, they can be modelled according to the same principles as the CIA risk indicators. CI service risk is represented as a risk level (C, I or A) in five steps between one (normal operation) and five (maximum risk). Five risk levels were chosen as a trade-off between accuracy and manageability. A CI operator needs to be able to distinguish between low-risk and high-risk situations, while at the same time maintaining oversight in a crisis situation.

To be able to include the CI service risk of dependencies in service risk estimation, a graph-based representation was chosen. CI services are represented as nodes and the dependencies among nodes are represented as directed edges (interdependencies in the graph model form directed cycles) (R3). A graph represents a natural and intuitive way to model dependencies since it visually and/or mathematically connects two entities that have some kind of relation with each other. In this thesis, a dependency is given if the CIA of a CI service depends on another CI service and on-line risk estimates of one CI service are distributed to CI services depending on it. In the graph, base measurements can be included since they can be seen as a dependency of a CI service.
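The graph representation described above can be illustrated with a minimal sketch. It is not the implementation used in this thesis; the service names and the helper functions `build_graph`, `reachable` and `interdependent_services` are hypothetical and were chosen only for illustration. Each node maps to the set of nodes it depends on, base measurements appear as leaf nodes, and a service is part of an interdependency if it lies on a directed cycle.

```python
def build_graph(dependencies):
    """Return {node: set of nodes it depends on} from (service, dependency) pairs."""
    graph = {}
    for service, dependency in dependencies:
        graph.setdefault(service, set()).add(dependency)
        graph.setdefault(dependency, set())  # leaf nodes, e.g. base measurements
    return graph


def reachable(graph, start):
    """All nodes that `start` depends on, directly or transitively."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def interdependent_services(graph):
    """Nodes lying on a directed cycle, i.e. nodes that transitively depend on themselves."""
    return {node for node in graph if node in reachable(graph, node)}


# Hypothetical example: three services in a cycle plus one base measurement.
dependencies = [
    ("power_supply", "scada_network"),
    ("scada_network", "telecom"),
    ("telecom", "power_supply"),
    ("power_supply", "voltage_sensor"),  # base measurement treated as a dependency
]
graph = build_graph(dependencies)
cycle_members = interdependent_services(graph)
print(sorted(cycle_members))
```

In this hypothetical example the three services form an interdependency, while the base measurement is only a leaf of the graph.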

The risk-based representation of CI service states should facilitate information sharing among CIs (R4) that would like to include dependent CI service states in their risk estimation, since only abstract risk estimates are shared among CIs and no internal, possibly confidential information is needed. Those risk indicators are seen as the minimum amount of information that needs to be shared to enable monitoring of external dependencies. CI operators might still be concerned about sharing this information with external entities. It is assumed that contracts and service level agreements (SLAs) are still needed to support the information sharing effort, but the less information needs to be shared, the easier it will be for operators to provide the information.

In complex environments like CIs, CI services provided to customers use a multitude of components and/or depend on many internal as well as external services. Since many factors need to be taken into account, aggregating a CI service risk level poses a problem (R1). In this work, it is proposed to use decomposition to represent complex CI services by lower level services used to provide the CI service. The lower level CI services are defined as dependencies of the higher level service. In this way, complex large-scale infrastructures can be decomposed into entities that are easier to handle. CI service risk is aggregated (from base measurement states and dependency risk states) on each decomposition level and is distributed to the higher level services as the CI service risk of a dependency. The methodology used for CI decomposition is based on the PROTOS-MATINE methodology developed by OUSPG (Oulu University Secure Programming Group) and utilizes all available information sources (documents as well as interviews) on all organizational levels (R5) to identify, on all decomposition levels, the CI services, the dependencies among CI services and the base measurements used to define the on-line states of CI services. Analysing the information provided by the information sources, and possibly combining the information provided by multiple information sources, makes it possible to gain a holistic view of the analysed CI and to better identify and discard irrelevant information.
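The level-by-level aggregation described above can be sketched as a recursive evaluation of the decomposition. The following sketch rests on strong simplifying assumptions: the decomposition is acyclic, base measurement states are already expressed as risk levels from one to five, and the maximum is used as a stand-in aggregation function. The thesis itself aggregates risk with Bayesian networks; the names `service_risk`, `model` and `base_states` are hypothetical.

```python
def service_risk(service, model, base_states, aggregate=max):
    """Aggregate a risk level (1..5) for `service` from its dependencies.

    `model` maps each service to its dependencies (lower level services or
    base measurements); `base_states` holds the current level of each base
    measurement. Assumes the decomposition contains no cycles.
    """
    levels = []
    for dependency in model[service]:
        if dependency in base_states:
            levels.append(base_states[dependency])  # observed base measurement
        else:
            # lower level service: aggregate on its own level first
            levels.append(service_risk(dependency, model, base_states, aggregate))
    return aggregate(levels)


# Hypothetical two-level decomposition of one high level service.
model = {
    "grid_service": ["cluster", "network_link"],
    "cluster": ["node_load", "node_temperature"],
    "network_link": ["link_latency"],
}
base_states = {"node_load": 1, "node_temperature": 3, "link_latency": 2}

top_level_risk = service_risk("grid_service", model, base_states)
print(top_level_risk)
```

Passing a different `aggregate` function changes how the levels of one decomposition level are combined without changing the traversal.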

Estimation of CI service risk from base measurement states and the CI service risk states of dependencies is a classification problem. In this work, a Bayesian network based approach was chosen for this classification. Bayesian networks provide a way to calculate the probability of the state of a node, given the state of its parent nodes (which are in this work the dependencies of a CI service). To this end, each node holds a conditional probability table (CPT) containing the probability of each node state for each state combination of the parent nodes. Bayesian networks were chosen for various reasons. First, the graph structure of Bayesian networks is the same as the graph structure of the CI security model and therefore integrates well with the original ideas of the CI security model. Second, Bayesian networks allow probabilistic state estimation. In the context of CI service risk, an exact estimation of CI service risk is unrealistic since too many uncertain factors contribute to the risk estimation. The Bayesian network based risk estimation reflects this by providing a probabilistic estimate expressed in percent (R6). Furthermore, Bayesian networks allow the risk probabilities to be learned from past events, but also allow them to be estimated where learning is not possible (R5). Bayesian networks were chosen for this work in favour of other graph-based models like Markov models (which model the state change probabilities of a system) or Petri nets (which model the state transitions of complex systems or processes) since none of those models fulfil the requirements of the CI security model, which are the estimation (or classification) of a CI service risk state, given the state of dependencies, as well as the propagation of those risk estimates through the dependency graph.
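The role of the CPT can be illustrated with a small sketch for a single CI service node with two parents: one base measurement and one dependency. All numbers are invented for illustration, only three risk levels are used instead of five to keep the table short, and the two parents are assumed to be independent. The function name `estimate_risk` is hypothetical.

```python
# CPT of one CI service: for every parent state combination
# (base measurement state, dependency risk level) it stores the probability
# of the service being at risk level 1, 2 or 3. Values are illustrative only.
cpt = {
    ("normal", 1): [0.90, 0.08, 0.02],
    ("normal", 2): [0.60, 0.30, 0.10],
    ("normal", 3): [0.20, 0.50, 0.30],
    ("degraded", 1): [0.30, 0.50, 0.20],
    ("degraded", 2): [0.10, 0.50, 0.40],
    ("degraded", 3): [0.05, 0.25, 0.70],
}


def estimate_risk(cpt, p_measurement, p_dependency):
    """Marginalize the CPT over the (assumed independent) parent distributions."""
    estimate = [0.0, 0.0, 0.0]
    for (measurement, dependency_level), row in cpt.items():
        weight = p_measurement[measurement] * p_dependency[dependency_level]
        for i, probability in enumerate(row):
            estimate[i] += weight * probability
    return estimate


# Hard evidence: the measurement is degraded, the dependency reports level 2.
hard = estimate_risk(cpt, {"normal": 0.0, "degraded": 1.0}, {1: 0.0, 2: 1.0, 3: 0.0})

# Uncertain dependency: level 1 or level 2 with equal probability.
soft = estimate_risk(cpt, {"normal": 1.0, "degraded": 0.0}, {1: 0.5, 2: 0.5, 3: 0.0})

print([f"{100 * p:.0f}%" for p in hard])
print([f"{100 * p:.0f}%" for p in soft])
```

With hard evidence the estimate is simply the matching CPT row; with uncertain parent states it is a weighted mixture of rows, which is how uncertainty in a dependency carries over into the estimate of the dependent service.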

One aspect of the CI security model, the modelling of interdependencies (R3), is not trivial using Bayesian networks. Interdependencies represent a directed cycle in the graph model and cannot be handled by Bayesian networks since the effects of loop-back would produce wrong probability estimates. In this work, a method based on dynamic Bayesian networks (DBNs) for handling interdependencies is presented. The method is based on the estimation of the time an original event in one CI service needs to loop back to the service via a dependency cycle. A separate probability estimate is given for each time frame, providing a CI service risk estimate caused by the loop-back effects in each time frame.
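The idea of resolving a cycle over time frames can be sketched by unrolling the graph: every service gets one copy per time frame and every influence points from an earlier frame to a later one, so the unrolled graph contains no cycle. The sketch below only illustrates this structural idea under assumed propagation delays; it is not the method presented in [III], and the service names, delays and function names are hypothetical.

```python
def unroll(influences, n_frames):
    """Unroll (source, target, delay) influences into edges between
    (service, frame) copies. Every delay must be at least one time frame,
    so each edge points strictly forward in time."""
    edges = []
    for frame in range(n_frames):
        for source, target, delay in influences:
            if frame + delay < n_frames:
                edges.append(((source, frame), (target, frame + delay)))
    return edges


def loop_back_frames(edges, origin):
    """Time frames (> 0) in which an event in `origin` at frame 0 reaches
    `origin` again through the unrolled dependency cycle."""
    reached = {(origin, 0)}
    # Process edges in order of their source frame; because edges only point
    # forward in time, one pass is enough.
    for source, target in sorted(edges, key=lambda edge: edge[0][1]):
        if source in reached:
            reached.add(target)
    return sorted(frame for service, frame in reached
                  if service == origin and frame > 0)


# Hypothetical cycle: risk in "power" affects "telecom" after one frame,
# "telecom" affects "scada" after one frame, "scada" affects "power" after two.
influences = [("power", "telecom", 1), ("telecom", "scada", 1), ("scada", "power", 2)]
edges = unroll(influences, n_frames=10)
frames = loop_back_frames(edges, "power")
print(frames)
```

A separate risk estimate would then be attached to each of the returned frames, instead of feeding the loop-back effect into the estimate of the frame in which the event originated.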

Risk prediction in the context of the CI security model is introduced to be able to predict a future CI service risk state after an incident. It also accounts for the situation where a risk received from a dependency does not pose an immediate CI service risk, but poses a CI service risk in the future if it is not addressed. Risk prediction (R7) in this work is realized using DBNs. It represents the most likely risk in the short-term, mid-term and long-term future, given a current CI service risk state. The time frames for short-term, mid-term and long-term risk can be defined by each CI operator based on the preferred conception of the time frame durations.
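For a single CI service observed in isolation, prediction over time frames can be sketched as the repeated application of a transition table between two consecutive time slices. This is a strongly reduced illustration, not the DBN structure used in this thesis: the transition probabilities are invented, only three risk levels are used, and the horizons of one, three and six frames for short-term, mid-term and long-term are assumptions.

```python
# transition[i][j]: probability of moving from risk level i+1 in one time
# frame to risk level j+1 in the next frame. Values are illustrative only.
transition = [
    [0.9, 0.1, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.3, 0.7],
]


def predict(current, transition, frames):
    """Propagate a risk level distribution `frames` time frames ahead."""
    distribution = list(current)
    for _ in range(frames):
        distribution = [
            sum(distribution[i] * transition[i][j] for i in range(len(distribution)))
            for j in range(len(distribution))
        ]
    return distribution


def most_likely_level(distribution):
    """Risk level (1-based) with the highest probability."""
    return max(range(len(distribution)), key=lambda i: distribution[i]) + 1


current = [0.0, 1.0, 0.0]  # the service is currently at risk level 2
horizons = {"short-term": 1, "mid-term": 3, "long-term": 6}  # assumed durations
for name, frames in horizons.items():
    forecast = predict(current, transition, frames)
    print(name, most_likely_level(forecast), [round(p, 3) for p in forecast])
```

Each operator could replace the `horizons` mapping with the time frame durations that fit the infrastructure in question.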


To be able to evaluate the validity of calculated risk estimates (R8), this work conducts a study of assurance indicators able to evaluate the static inaccuracy (e.g. the systematic inaccuracy of sensors), as well as the dynamically changing inaccuracies of the on-line risk estimates. The dynamic inaccuracies are evaluated using a trust-based approach to validate the accuracy, as well as the general behaviour, of risk levels received from dependencies. The trust-based indicators are based on the evaluation of the difference between a received CI service risk level and the actually experienced service level (risk alert trust) and the evaluation of an entity for compliance with expected behaviour (behaviour trust).
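The idea behind risk alert trust can be illustrated with a simple sketch that compares each received risk level with the level actually experienced. The agreement measure and the exponentially weighted update used here are assumptions made for illustration only; the indicators actually used in this thesis are defined in the original articles. The function names and the `weight` parameter are hypothetical.

```python
def agreement(received, experienced, max_level=5):
    """1.0 if the received and experienced risk levels match, falling
    linearly to 0.0 at the largest possible difference (levels 1..max_level)."""
    return 1.0 - abs(received - experienced) / (max_level - 1)


def update_trust(trust, received, experienced, weight=0.2):
    """Blend the previous trust value with the newest observation."""
    return (1 - weight) * trust + weight * agreement(received, experienced)


# Hypothetical sequence of (received risk level, experienced risk level).
observations = [(2, 2), (3, 2), (5, 1)]
trust = 1.0
for received, experienced in observations:
    trust = update_trust(trust, received, experienced)
    print(f"received={received} experienced={experienced} trust={trust:.3f}")
```

A dependency whose reported risk levels repeatedly deviate from what is experienced loses trust, which in turn lowers the confidence placed in risk estimates that rely on it.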

1.5 Overview of research results

The results of this thesis include the formalization of a methodology for on-line risk monitoring and prediction in CIs, including dependencies. The methodology contains a dependency analysis method to be able to represent CIs as CI security models and assurance indicators that make it possible to evaluate the accuracy of on-line risk estimates. A tool that supports the realization of this methodology in practice was implemented. The tool supports the graphical representation of a CI security model (CI services, CI service dependencies and base measurements), pre-processing of data used to learn risk probabilities, learning of risk probabilities and estimation of risk probabilities where automatic learning is not possible. The output of the tool is a CI security model that includes, for each CI service, a list of dependencies as well as risk probabilities for each dependency state combination. Furthermore, the tool can be used to conduct simulation or emulation of risk scenarios based on the CI security model.

To validate the presented methodology and the implemented tool, a case-study based evaluation method was chosen. A case study was conducted in the context of Grid’5000, a real-world distributed computing grid in France and Luxembourg. The data centres of Grid’5000 are connected via dedicated fibre optic connections operated by an independent provider, which makes it possible to validate the methodology on two independently operated infrastructures with a strong dependency. The case study focuses on the availability of Grid’5000 and only includes an availability risk indicator. Since the risk indicators only differ in the dependency analysis part (analysis of confidentiality or integrity dependencies of a CI service instead of availability dependencies), it is assumed that the results obtained for the availability indicator are valid for the other risk indicators as well, although the author recognizes that confidentiality and integrity related parameters might be more difficult to analyse.

The case study made it possible to validate the presented dependency analysis method, the Bayesian network based risk estimation, the assurance indicators and the implemented tool. Some parts of the methodology could not be validated in the context of this case study. The special handling of interdependencies within the methodology could not be verified since no interdependency could be identified during CI dependency analysis, and risk prediction could not be verified since in the computing sector failures are usually corrected with a short time delay and long-term risk is not applicable. Although some parts of the methodology could not be validated during the case study, the major components of the methodology could be validated and the results suggest its feasibility. Thus the research question can be answered: “Yes, it is feasible to build a model for on-line CI risk monitoring, fulfilling the requirements R1-R8”.

The estimated impact of the research presented in this thesis, provided that CI operators adopt and utilize it, is enhanced reliability and security of CIs. If the CI security model is used for on-line monitoring, this results from holistic monitoring and early detection of propagating risks in dependent CIs. Reliability and security can also be enhanced if the CI security model is used for simulation or emulation of risk scenarios and CI operators deploy risk mitigation strategies based on the findings.


2 State-of-the-art analysis

After setting the goals and requirements in Sections 1.3 and 1.4, this Chapter presents a state-of-the-art analysis of research related to the goals and requirements of this PhD thesis. The Chapter opens by providing an overview of CI and CI sector definitions in Section 2.1. In the following Sections, related work in the research areas CI dependency/interdependency analysis (Section 2.2), CI modelling and simulation (Section 2.3) and CI risk analysis (Section 2.4) is presented. Section 2.5 presents work related to the PROTOS-MATINE dependency analysis methodology.

The related work presented in this Chapter provides a broad overview of research in the context of CIP, using a multitude of approaches to reach a common goal: to enhance the availability or security of CIs. In Section 2.6, the related work is set in context with the goals and requirements of the CI security model. This makes it possible to compare the similarities and differences of the approaches, as well as to identify gaps in the state-of-the-art.

2.1 Critical infrastructure definitions

The term “Critical infrastructure” is relatively new and has been defined in the past decades mainly by governments and government related institutions as a result of events that brought CIs and CI protection to public attention. Such events are always related to the failure of one or more CIs and have such a severe impact that they are publicly recognized. Recent events include the 2003 power blackout in the north of the USA and Canada [4], the 2006 power blackout in parts of Europe [5], deliberate malware attacks like the Stuxnet incident [6] or the September 11th 2001 attack on the World Trade Center in New York. The causes of those events can be manifold, ranging from human/technical error to natural disasters or deliberate attacks. Due to this variety of things that can go wrong, a clear definition of CIs, as well as strategies for better CI protection, are crucial.

The international CIIP (critical information infrastructure protection) handbook [7] aims at giving a broader picture of what defines CIs and how they can be protected by reviewing the definitions and protection policies of 25 nations and seven international organizations. Table 1 summarizes the definitions of some of those countries and organizations.


Table 1. CI definitions of selected nations and international organizations (cited from [7]).

Nation/Organization: Austria
CI definition: Natural resources; services; information technology facilities; networks; and other assets which, if disrupted or destroyed, would have a serious impact on the health, safety, or economic well-being of the citizens or the effective functioning of the government.
CIs: Institutions of the legislative, executive, and judiciary powers; infrastructure facilities of energy supply companies; information and communication technologies; infrastructure facilities that ensure the provision of vital goods; transport and traffic infrastructures.

Nation/Organization: France
CI definition: All infrastructures that are vital to the maintenance of primary social and economic processes are considered critical sectors in France.
CIs: Finance; industry; energy; the work of the judiciary; public health; the work of national civil authorities; electronic communication; audiovisual media and information technology; transport systems; water supply; food; space and research; the armed forces.

Nation/Organization: Germany
CI definition: Organizations or facilities whose failure or impairment would cause a sustained shortage of supplies, significant disruptions of public order, or other dramatic consequences for large parts of the population are defined as critical.
CIs: Transportation and traffic; energy; hazardous materials; telecommunications and information technology; financial; monetary and insurance systems; supply; government agencies; administration and justice; media; research facilities; cultural property.

Nation/Organization: United Kingdom
CI definition: [...] key elements of the national infrastructure that are crucial to the continued delivery of essential services to the UK. Without these key elements, essential services could not be delivered and the UK could suffer serious consequences, including severe economic damage, grave social disruption, or even large-scale loss of life.
CIs: Communications; emergency services; energy; finance; food; government and public services; public safety; health; transport; water.

Nation/Organization: European Union
CI definition: CIs consist of those physical and information technology facilities, networks, services and assets which, if disrupted or destroyed, would have a serious impact on the health, safety, security or economic well-being of citizens or the effective functioning of governments in the member states. CIs extend across many sectors of the economy and key government services.
CIs: Energy; information and communication technologies; water; food; health; financial system; public and legal order and safety; civil administration; transport; chemical and nuclear industry; space and research.

Nation/Organization: United States
CI definition: [...] the term ‘critical infrastructure’ means systems and assets, whether physical or virtual, so vital to the United States that the incapacity or destruction of such systems and assets would have a debilitating impact on security, national economic security, national public health or safety, or any combination of those matters.
CIs: Information technology; telecommunications; chemicals; commercial facilities; dams; commercial nuclear reactors, materials and waste; government facilities; transportation systems; emergency services; postal and shipping services; agriculture and food; public health and health care; drinking water and waste water treatment systems; energy; banking and finance; national monuments and icons; defence industrial base; critical manufacturing.

Summarizing the different notions of what defines a CI, it can be observed that although the explanations vary, the core definition of what is a CI is basically the same for all: A CI is an infrastructure that, if disrupted or destroyed, would have severe consequences for the economy and society of a nation.

It can also be observed that although some nations/organizations define CIs that do not appear in the definitions of others, there is a common set of CIs (like energy, telecommunication, financial sector, transport, ...) that appear in each definition in one form or another.

2.2 Critical infrastructure dependency/interdependency

In the literature, the terms dependency and interdependency are not always used in a consistent way. For example, in some articles, interdependency is used to define a dependency of any kind of one CI sector on another CI sector, whereas in other articles only a mutual dependency between CI sectors is seen as an interdependency. In this PhD thesis, the general definitions of the terms “dependency” and “interdependency” presented in Section 1.2 are used, but it is noted that the conceptions of those terms might differ in the related work presented in the following Sections.


2.2.1 Identifying, Understanding, and Analysing CI Interdependencies

In [8], Rinaldi, Peerenboom and Kelly present a detailed and well established analysis of CI interdependencies. In their work, a dependency is defined as “A linkage between two infrastructures, through which the state of one infrastructure influences or is correlated to the state of the other” and an interdependency as “A bidirectional relationship between two infrastructures through which the state of each infrastructure influences or is correlated to the state of the other”. For the majority of their work, they do not distinguish between the two definitions and denote any interconnection between CIs as an interdependency.

The interdependencies of CIs are described in six dimensions: The types of interdependencies, the infrastructure characteristics, the environment it operates in, the state of operation, as well as the type of failure and the coupling and response behaviour.

The interdependency types are categorized in four different classes: physical (the state of one infrastructure is dependent on the material output of another), cyber (the state of one infrastructure depends on information transmitted through the information infrastructure), geographic (a local environmental change can cause state changes in all geographically interdependent infrastructures) and logical (dependency among infrastructures that is not physical, cyber or geographic).

Infrastructure characteristics have an impact on how infrastructures interact with others. Such characteristics include temporal (time criteria for the delivery of a service are not the same for different CI sectors), operational (different CIs have different procedures for stressed or perturbed infrastructures), spatial (on which level of composition an infrastructure is considered, e.g. part, unit, system, ...) and organizational (ownership and regulation have a huge impact on the characteristics of an infrastructure).

It is stated in the paper that the environment an infrastructure works in has a substantial influence on the interdependencies it has to other infrastructures. The environmental influences are categorized into eight classes: business, economic, public policy, legal/regulatory, security, technical, health/safety, social/political. In each of those sectors, different decisions can be made or different regulations apply, which inevitably leads to a different configuration and operation of an infrastructure (which in turn influences the interdependencies to other infrastructures).

The current state of operation of an infrastructure has a direct impact on the state of operation of interdependent infrastructures. The states of operation can range from normal to stressed/disrupted and repair/restoration.


Another important characteristic of interdependencies is the types of failure that can occur within an infrastructure and how they may spread to other interdependent infrastructures. Different failure types are cascading failure (a disruption in one infrastructure causes a failure in another infrastructure), escalating failure (two or more independent failures in different infrastructures influence each other) and common cause failure (a common cause disrupts two or more infrastructures, e.g. natural disaster).

The last category defining infrastructure interdependencies is the coupling and response behaviour of infrastructures. Different coupling types are loose or tight (defines to which degree one infrastructure is dependent on another), linear (interdependencies to other infrastructures that occur in normal states of operation) or complex (an interdependency to another infrastructure that occurs in an unforeseen state of operation), inflexible (past experience will not affect future behaviour) or adaptive (learn from past experience to adapt future behaviour).

2.2.2 A generalized modelling framework to analyse interdependencies among infrastructure systems

In [9], Zhang and Peeta present a generalized framework using a network structure model and a market-based approach to capture the flow characteristics among CIs and interdependent CIs. They analyse previous work on interdependency modelling and identify six classes of previous interdependency models:

– Surveys and descriptive studies which give a qualitative overview of CI interdependencies.

– Simulation-based approaches, for example agent-based simulations.

– Input-output approaches using economic criteria to describe interdependency levels.

– Network-based approaches use the network structure of CIs to model interdependencies, which allows geographic interdependencies to be captured.

– System-of-Systems approaches where single CIs are modelled as a network of a larger, multi CI network.

The authors argue that none of the above models can capture at the same time interdependencies in the geographic network structure, as well as the dynamically changing dependency level caused by CI operation. In their framework, they propose to capture the geographic interdependency by modelling each CI as a graph of its network structure, where nodes represent geographic locations like cities and edges represent their connections, like for example power lines. Each CI (energy, transport, telecommunication) is represented by the same set of nodes, but the edges vary depending on the network topology. Interdependencies among CIs are modelled as links between the nodes of the individual CI models. To capture the dynamic interdependency level, a market interaction model is introduced that is based on computable general equilibrium (CGE) and spatial computable general equilibrium (SCGE), the latter capturing the spatial interdependencies among regions. CGE and SCGE model different types of supply-demand interactions among CIs based on the model of a producer and a household (and an additional transport agent to model the spatial dimension in the SCGE model). Producers represent nodes in the geographic model and transport agents represent the edges delivering the goods to the respective region.
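The layered network structure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of [9]: the node names, the edge lists and the helpers `validate` and `dependent_nodes` are hypothetical, and the CGE/SCGE market interaction model is left out entirely.

```python
# Hypothetical example data: every CI layer shares the same node set
# (geographic locations), but each layer has its own edges (its topology).
NODES = {"A", "B", "C"}

LAYERS = {
    "energy": {("A", "B"), ("B", "C")},   # e.g. power lines
    "telecommunication": {("A", "C")},    # e.g. communication links
}

# Interdependencies are links between nodes of different CI layers:
# ((supplying layer, node), (dependent layer, node)).
INTERDEPENDENCIES = {
    (("energy", "A"), ("telecommunication", "A")),
    (("energy", "C"), ("telecommunication", "C")),
}


def validate(layers, nodes):
    """Check that every edge of every layer only uses the shared node set."""
    return all(
        u in nodes and v in nodes
        for edges in layers.values()
        for u, v in edges
    )


def dependent_nodes(layer, node):
    """Return the (layer, node) pairs that directly depend on the given node."""
    return sorted(dst for src, dst in INTERDEPENDENCIES if src == (layer, node))
```

Keeping one shared node set and a separate edge set per CI mirrors the description that all CIs are represented by the same nodes while their edges follow the individual network topology.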

The main purpose of the model is to capture interdependencies and highlight their implications (cascading effects). This should aid decision makers in planning, design and operation of CI systems.

2.3 Critical infrastructure modelling and simulation

Modelling and simulation is seen as a key element to better understand CI interdependencies. For example, in [10] Rinaldi identifies eight major benefits:

– It can help to identify the potential national and economic security implications following a catastrophic infrastructure failure.

– Provide insight into CI operation during extreme and rare events (for example natural disasters).

– Help in understanding the recovery process after rare or extreme events that led to catastrophic failures.

– It can be important to the process of infrastructure risk analysis (vulnerability assessment, consequence analysis, threat assessment).

– Modelling and simulation can be used to derive and analyse infrastructure policies.

– It can be used to develop, test and validate infrastructure protection policies.

– In decision support tools that enable situational awareness (for example, monitoring and visualization), modelling and simulation can be used for what-if analysis by for example simulating the consequences of a decision.

– Modelling and simulation can be useful in exercises and training related to CI protection.


Critical infrastructure interdependency models are categorized in six different types in [10]:

– Aggregate supply and demand tools which evaluate the demand of CI services in a region and the ability to provide those services.

– Dynamic simulations where the flows of goods provided by CIs are the basis for model dynamics and the effects of policies, regulations and laws on CI operation can be examined.

– Agent-based models where the agents are used to model infrastructure behaviour, including the interactions among interdependent infrastructures.

– Physics-based models which provide a detailed model of the physical behaviour of an infrastructure based on engineering techniques (for example, power flow and stability analysis).

– Population mobility models which examine the movements through urban regions while goods are produced and consumed. By simulating the routines associated with this behaviour, strategies for the optimized planning of those routines can be developed.

– Leontief based input-output models which model the economic interconnections among CIs.

In [11], Svendsen and Wolthusen provide another overview of CI modelling approaches. In general, they argue that sector specific CI models are useful, but in order to adequately capture the dependencies and interdependencies of CIs, which they see as a major goal, a sufficient degree of abstraction needs to be introduced to CI models. They identify six different model types used in CI research:

– Economic models, like Leontief based input-output models, are used to capture the high-level dependencies of CIs based on the needed supplies.

– System dynamics approaches are used to model the dynamic behaviour of CIs based on events and the time between events, and are usually visualized by diagrams.

– Behavioural and game-theoretic models are used to model strategic interactions among entities (agents), allowing reasoning about the outcome of an event even when there is incomplete information about the opponent’s strategy. The authors state that this modelling type has not yet been used extensively in the CI sector.

– Graph- and network-based models are often used to capture dependencies among system entities. The flexibility of graph models allows modelling at various abstraction levels.

– Agent based models are used to model physical components as agents, which allows detailed physical behaviour as well as simple abstractions of physical behaviour to be implemented. Agent based models are well suited to model the dependencies among physical components.

– Physical and geospatial models are usually used to model detailed sector-specific CI models.

The remainder of this Section is used to introduce two European projects concerned with CI modelling and simulation, namely IRRIIS [12] (Integrated Risk Reduction of Information-based Infrastructure Systems) and CRUTIAL (CRitical UTility InfrastructurAL resilience), as well as a third European project, MICIE [13] (Tool for systeMIc risk analysis and secure mediation of data exchanged across linked CI information infrastructurEs), in which this PhD thesis originated. Furthermore, some other interesting CI modelling approaches are discussed. At the end of this Section, some of the modelling techniques closely related to CI simulation are examined.

2.3.1 The IRRIIS information model and simulation of interdependent critical infrastructures

In [14], Klein et al. present the information model and simulation methods developed in the IRRIIS project. The main challenge addressed in those models is the different nature and behaviour of CIs. Information models (abstractions of CIs) are needed that are sufficiently expressive to model physical as well as information and control aspects of CIs. At the same time, simulation techniques have to be found to address the variety of different CI systems and dependencies among infrastructures. A trade-off between accuracy and abstraction has to be found and well-defined semantics manageable by all CIs need to be used.

Therefore, the IRRIIS information model introduces three generalization models:

– A generic information model representing the top level abstraction assuming that all CIs have a common core information model that can be used to instantiate more specific models.

– A domain specific model that extends the generic model with domain specific information (e.g. electricity network, transportation network, ...).

– An instance level model that describes concrete CIs as an instantiation of the domain specific model (e.g. electricity network of Provider A, transportation network of area A, ...).

The most important generalization model is the generic model since the interdependencies are taken into account at this level. It is organized in three layers: the static model, the behaviour model, and the event and action model.

The static model is used to describe a CI by characterizing elements (e.g. services, dependencies, ...) at a point in time (snapshot). The behaviour model is concerned with system or service states and state transitions. State transitions in one system, component or service can propagate to other interdependent systems, components or services either instantaneously or with a time delay. The event and action model provides a mechanism to trigger state changes in the behaviour model (events) and to react to state changes in the behaviour model (actions).
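The propagation of state transitions with or without a time delay can be illustrated with a small event queue. This is a hedged sketch and not the IRRIIS implementation: the function `propagate`, the element names and the delay values are hypothetical, and each element is assumed to change state only once, at the earliest time a transition reaches it.

```python
import heapq


def propagate(initial, dependencies, horizon):
    """Return {element: time of its state change} for one triggering event.

    dependencies maps a source element to a list of (target, delay) pairs;
    a delay of 0 stands for instantaneous propagation.
    """
    changed = {}
    queue = [(0, initial)]  # (time, element), processed in time order
    while queue:
        time, element = heapq.heappop(queue)
        if element in changed or time > horizon:
            continue  # already changed earlier, or outside the simulated window
        changed[element] = time
        for target, delay in dependencies.get(element, []):
            heapq.heappush(queue, (time + delay, target))
    return changed


# Hypothetical dependencies: a power outage reaches telecom at once,
# the water supply after 5 time units and SCADA 2 units after telecom.
DEPENDENCIES = {
    "power": [("telecom", 0), ("water", 5)],
    "telecom": [("scada", 2)],
}
```

Calling `propagate("power", DEPENDENCIES, horizon=10)` lists every affected element together with the time at which the transition arrives; a shorter horizon cuts off the slower propagation paths.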

The simulation approach is based on federated simulation. Special purpose simulations are used to simulate different CIs and the IRRIIS information model is used to model interdependencies among the infrastructures on top of the special purpose simulations.

In [15], models and model types to represent CIs and their interdependencies that were investigated in the IRRIIS framework are presented. Generally, the modelling approaches are divided into general models that capture the interdependencies of infrastructures on a high level and special purpose infrastructure models that are used to model a specific CI. In the following Sections, the general models are detailed.

Models of causal networks - Leontief based model

Several authors propose a CI interdependency model based on the market equilibrium theory by Wassili Leontief [16]. In the literature, it is often referred to as an input-output inoperability model (IMM) [16, 17].

In [18], the IMM is described as the study of economic loss in interconnected economic sectors after the disruption in one particular economic sector. The input-output model mapped to the field of CI modelling states that a system can be fully described by the current level of perturbation of the system and by its interdependencies in a dynamic way. In [19], this is mathematically described in a time continuous form:

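\[
\frac{dx_i}{dt} = \sum_{k=1,\, k \neq i}^{l} \frac{\delta F_i}{\delta x_k} \frac{\delta x_i}{\delta t} + \frac{\delta F_i}{\delta u_i} \frac{\delta u_i}{\delta t}
\]

The term $\frac{\delta F_i}{\delta x_k}$ describes the system states of interdependent systems mapped to the current system by a function describing its level of interdependency. $\frac{\delta F_i}{\delta u_i}$ is a function describing the current state of perturbation of the system.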

The equation

\[
x_i(k+1) = \min\left\{\left[\varphi_i u_i + \sum_{j \neq i}^{l} p_{ij}\, x_j(k)\right], 1\right\}, \quad \forall i = 1, \dots, l
\]

is a time discrete approximation of the time continuous description, where $\varphi_i u_i$ represents the perturbation of the system and $p_{ij} x_j(k)$ represents the interdependency to other systems. The functioning level $x_i$ is located in a scale between 0 and 1, with 0 meaning the system does not work at all and 1 meaning the system is working at full capacity.
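The time discrete update can be evaluated step by step as in the following sketch. It implements only the formula as written above; the function name `iim_step` and all numeric values for the perturbation factors, the inputs and the coupling matrix are hypothetical and chosen for illustration.

```python
def iim_step(x, u, phi, p):
    """One discrete step: x_i(k+1) = min(phi_i*u_i + sum_{j != i} p_ij*x_j(k), 1)."""
    n = len(x)
    next_x = []
    for i in range(n):
        # Contribution of all other systems j, weighted by the coupling p_ij.
        coupling = sum(p[i][j] * x[j] for j in range(n) if j != i)
        next_x.append(min(phi[i] * u[i] + coupling, 1.0))
    return next_x


# Hypothetical two-system example: only system 0 is perturbed directly.
x = [0.0, 0.0]
u = [0.5, 0.0]
phi = [1.0, 1.0]
p = [[0.0, 0.4],
     [0.8, 0.0]]

trajectory = [x]
for _ in range(3):
    trajectory.append(iim_step(trajectory[-1], u, phi, p))
```

With these made-up values the perturbation of system 0 reaches system 1 one step later through the coupling term, and the `min` operator keeps every state value at or below 1.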

The function to evaluate the level of interdependency among infrastructures is mainly defined by expert knowledge to specify a certain level of interdependency. Therefore the model is to a high degree dependent on subjective information. [16] tries to overcome this lack of precision by the use of fuzzy numbers to add an uncertainty range to the model estimations.

In [17], an on-line tool based on the input-output inoperability model is proposed. In this scenario, the state estimators from one infrastructure are distributed to interdependent infrastructures which use them to calculate their own state estimators, given a picture of the global view of the structure of the interdependent network. The estimation tools at each infrastructure work independently; the issue of synchronizing the infrastructure state estimator exchange is pointed out in this work.

Articles like [20] are inspired by this approach to model close-to-reality infrastructure case studies and their interdependencies. In this case, a telecommunication and an energy infrastructure were modelled assuming geographical interdependency. Simulation results show that a perturbation of one infrastructure can cause an even more severe perturbation in the other infrastructure, which affects the quality of the provided service.

Common-mode failure model

The common-mode failure model tries to take into account the fact that the failure of one component usually increases the stress (and therefore failure likelihood) on other system components. For example, in a communication network, the failure of a router will lead to a re-routing of the communication through other routers and increase their load. The increased stress will directly affect the failure likelihood of the components.

The model was implemented using the Möbius tool which allows the simulation of Stochastic-Activity-Networks (SANs). SANs are the probabilistic extension of activity networks and allow the representation of the system characteristics parallelism, timeliness, fault tolerance and degradable performance [21].

A communication network was simulated in two scenarios, one taking the dependence of failure rates on data flow into account, while the other did not. Not surprisingly, the results show that a communication network becomes unavailable faster when taking the dependence on component failure into account, since the failure of one component increases the stress on other components.
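The qualitative result above can be reproduced with a very small calculation. The sketch below is not the SAN model built with Möbius; it assumes exponentially distributed failure times, an even redistribution of the total load over the surviving routers, a failure rate proportional to the load, and that the network is unavailable once all routers have failed. The function name and all parameter values are hypothetical.

```python
def mean_time_to_outage(n_routers, base_rate, total_load, nominal_load, load_dependent):
    """Mean time until all routers have failed (pure death process).

    With `alive` routers each failing at rate r, the next failure occurs
    after a mean time of 1 / (alive * r).
    """
    mean_time = 0.0
    for alive in range(n_routers, 0, -1):
        load = total_load / alive  # load is shared evenly by the survivors
        if load_dependent:
            rate = base_rate * load / nominal_load  # more load, higher failure rate
        else:
            rate = base_rate  # failure rate ignores the load
        mean_time += 1.0 / (alive * rate)
    return mean_time


dependent = mean_time_to_outage(4, 0.01, 4.0, 1.0, load_dependent=True)
independent = mean_time_to_outage(4, 0.01, 4.0, 1.0, load_dependent=False)
```

Under these assumptions the load-dependent scenario reaches the outage state in a shorter mean time than the load-independent one, which matches the direction of the reported simulation result.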

Stationary and Dynamic cascading models

Stationary and dynamic cascading models [22] aim to model cascading failures due to component failure, taking into account the fact that the re-distribution of the failed component load does not happen statically (from one moment to the other), but rather, in a continuous way (dynamic). The model aims to represent a generic model of complex interdependent systems, based on a simple flow model with only a few parameters.

Results show vast differences in simulation between the static and the dynamic failure model. The dynamic model is generally more precise in estimating the network vulnerability due to cascading failures.

2.3.2 Critical utility infrastructural resilience (CRUTIAL)

Several papers describe the CRUTIAL architecture [23–25]. The CRUTIAL project provides an architecture for global critical information infrastructures, taking into account computer-borne attacks and faults. The project aims to make those infrastructures resilient by addressing the fact that CI interdependencies are growing massively while no architecture considers the global view on those interconnected infrastructures. Conventional security mechanisms cannot be directly applied to CII protection. The CRUTIAL project does not aim at providing a detailed model of a specific infrastructure but at modelling a global view on interconnected infrastructures.

The architecture is based on an intrusion tolerant design. Resilience of infrastructure should be achieved through trusted-trustworthy operation (secure and trusted hardware is used to support intrusion tolerance for the rest of the system) and fully-transparent intrusion tolerance (intrusion tolerance is achieved without changing legacy composition of real-world CII components). Furthermore, methods to enable the non-stop operation of CII infrastructure despite the presence of faults and intrusions were developed (proactive-resilience). Trusted monitoring of components is achieved by a deviation detection component (detect behaviour different from normal behaviour) and a state diagnosis component (guess the internal state of a component based on deviation detection, e.g. through heuristic or probabilistic detection).

On top of that, in order to provide secure communication, a communication subsystem handling communication over local networks and the Internet was developed. In fact, this can be seen as a WAN-of-LANs, connecting the locally communicating components (LAN structure) via a global connection network (WAN structure). It connects all the local application level firewalls and intrusion detection systems together. Communication devices are built as trusted components, designed to be intrusion tolerant (by introducing redundancy and applying a majority vote on the outputs of the components). By building the devices intrusion tolerant on a local level, they become trustworthy components helping to establish secure communication on the global level since the communication components can be trusted.
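The majority vote over redundant outputs mentioned above can be illustrated as follows. This is a generic sketch of majority voting and not the CRUTIAL implementation; the function name and the example outputs are hypothetical, and a strict majority (more than half of the replicas) is assumed to be required.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the output produced by a strict majority of replicas, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Without a strict majority no output can be trusted.
    return value if count > len(outputs) / 2 else None
```

With three replicas, a single faulty or compromised replica is outvoted by the two correct ones, whereas three mutually different outputs yield no decision.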

Finally, the last main aspect of the CRUTIAL architecture is access control. The idea of access control in CRUTIAL is that every organization can define its own security policies. The global communication device is responsible for checking the local security policies in accordance with the global ones. For this purpose, OrBAC (organization-based access control) is proposed, which sees organizations as structured groups of entities where subjects play a specific role. Roles are introduced to structure the link between subjects and organizations. Similarly, a relationship between organizations and objects in organizations can be achieved. Access control can be enforced specific to an organization, to a role in an organization or to a specific subject/object. To address the collaboration among CIs, PolyOrBAC was introduced, which allows access control rules to be published and negotiated between different organizations and allows runtime access to remote services.
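The role-based structure of such an access decision can be sketched in a few lines. The example below is a strongly simplified illustration and not the OrBAC or PolyOrBAC model itself: organization, subject, role, action and object names are hypothetical, and in this sketch permissions are attached to roles only.

```python
# (organization, subject) -> roles the subject plays in that organization
ROLES = {
    ("GridCo", "alice"): {"operator"},
    ("TelcoCo", "bob"): {"auditor"},
}

# Permissions granted to a role within an organization:
# (organization, role, action, object)
PERMISSIONS = {
    ("GridCo", "operator", "read", "substation_status"),
    ("GridCo", "operator", "write", "breaker_setpoint"),
    ("TelcoCo", "auditor", "read", "router_log"),
}


def is_permitted(organization, subject, action, obj):
    """A subject may act if one of its roles in the organization grants it."""
    roles = ROLES.get((organization, subject), set())
    return any((organization, role, action, obj) in PERMISSIONS for role in roles)
```

Because roles are resolved per organization, the same subject can hold different rights in different organizations, which is the structuring idea described above.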

2.3.3 MICIE

The EU-FP7 project MICIE, presented in [26, 27], aims at establishing a middleware for information sharing among CIs so that CIs can include information received from dependencies in their infrastructure models to provide real-time monitoring. Three main areas were covered in the MICIE project:


– Information discovery: This part is specific to each CI and deals with discovering information that is relevant for information sharing, like data from various sensors in existing monitoring tools.

– Overcome the heterogeneity of data: Different CI sectors are concerned with different information. To be able to include exchanged information in CI models, a common representation has to be found. In the MICIE project, an ontology is used to create common metadata (concrete information about the ontology is omitted in the above publications and no discussion about the success of this approach is given).

– Secure information exchange: To allow secure information exchange, a secure mediation gateway, which needs to be deployed at each CI, was designed. The implementation follows the principles of secure communication (confidentiality, integrity, availability, authenticity and non-repudiation).

The design of the MICIE middleware allows CI operators to create sector specific models for their CIs, but at the same time, include real-time information about dependencies within their model. For this purpose, the MICIE project follows a holistic-reductionistic approach, where from each sector specific model (holistic model), key elements are identified and used to create overview models, also including key elements of dependencies (reductionistic model) to capture the relationships among CIs. The holistic-reductionistic approach will be further discussed in Section 2.3.5.

Bayesian network based interdependency analysis

An approach to CI interdependency analysis based on Bayesian networks was published in the context of the MICIE project. In [28, 29], Di Giorgio and Liberati present a dynamic Bayesian network (DBN) based approach to model CIs, including their interdependencies. As an input to their model, they take atomic events (like weather conditions, frequency disturbances or earthquakes) and model the impact of those events on CI components to draw conclusions on CI service availability (services provided to customers or internal services). To achieve this, the modelling approach is split in three levels:

– In the Atomic event level, events influencing the CI, like bad weather or earthquake, are modelled. The authors state that modelling this level is difficult because of the complexity and geographical dependence of those phenomena.

– In the Propagation level, the actual behaviour of infrastructure components (and dependencies to other components) and their behaviour in the presence of atomic events is modelled. The nodes in this level are either and/or nodes to model connections among components, or complex nodes describing underlying components based on failure rates. The temporal behaviour of the components after an event occurred is modelled using DBNs.

– The Service level models the availability of a service provided to a customer or used internally. The states of those nodes are either normal or failed.

The model was validated based on the case study of the MICIE project and shows promising results in the context of:

– Reliability analysis: An estimation of the reliability of a service can be given as the probability that a service will work without failure up to a given time.

– Event propagation analysis: The analysis of event propagation can be used to detect structural weaknesses in a CI, or it can be used directly after an event to assist an operator in finding the best response.

– Failure identification: A list of the most probable failure causes can be estimated using inference. This can help to reduce the time of failure localization and repair.
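How and/or nodes and failure identification by inference work can be illustrated on a tiny static example. The sketch below is not the model of [28, 29]: it ignores the temporal (DBN) part, assumes independent component failures with made-up probabilities, and interprets an "or" node as failing when any input has failed and an "and" node as failing only when all inputs have failed. All function names are hypothetical.

```python
from itertools import product


def joint_states(p_fail):
    """Yield (states, probability) for independent components; True = failed."""
    for states in product([False, True], repeat=len(p_fail)):
        prob = 1.0
        for failed, p in zip(states, p_fail):
            prob *= p if failed else 1.0 - p
        yield states, prob


def service_failed(states, gate):
    """'or' node: any failed input fails the service; 'and' node: all must fail."""
    return any(states) if gate == "or" else all(states)


def service_failure_probability(p_fail, gate):
    """P(service failed), summed over all component state combinations."""
    return sum(prob for states, prob in joint_states(p_fail) if service_failed(states, gate))


def failure_cause_posterior(p_fail, gate, index):
    """P(component `index` failed | service failed), by Bayes' rule."""
    evidence = service_failure_probability(p_fail, gate)
    joint = sum(
        prob
        for states, prob in joint_states(p_fail)
        if states[index] and service_failed(states, gate)
    )
    return joint / evidence
```

For two components with failure probabilities 0.1 and 0.02 behind an "or" node, the service fails with probability 0.118, and ranking the components by `failure_cause_posterior` puts the first component at the top of the list of probable causes.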

2.3.4 Graph Models of Critical Infrastructure Interdependencies

In [30], Svendsen and Wolthusen propose a graph based model for CI interdependency modelling. The original model proposed in [31] was extended in [32] to support infrastructures with a storage capability (if resource supply fails, operation can be temporarily maintained by accessing the previously stockpiled resource). In [33], the model is extended through statistical properties (e.g. effect of component failure over time, random failure and targeted attacks) to enhance model accuracy.

The modelling goal is to model large scale interactions among CI sectors. It is noted that in contrast to sector specific models the level of detail of such a model is limited.

The vertices of the graph model are interpreted as producers and/or consumers and their interactions are modelled by a defined set of dependencies (services that they produce or consume). The dependencies are modelled as directed edges, where the head node is dependent on the tail node. Furthermore, each vertex can have a buffer to store a certain amount of a service provided by another producer (e.g. oil, gas, ...). The buffer types are classified as ephemeral, storable and compressible, and storable and incompressible. Each edge has an assigned maximum capacity, as well as a lower threshold for the flow through the edge in order to model the capacity range of the dependency. The ability to provide a service/resource is defined by a response function for each edge, which is a function of the availability of resources in the source nodes and the maximum capacity/lower threshold of the edge. A vertex is functional if the internal needs are satisfied by receiving and/or generating a sufficient amount of resources. Therefore, the functional state of a vertex is a function of its internal state and the state of its dependencies. In case of a buffered resource (storable and compressible or storable and incompressible), the amount of the stockpiled resource is also part of this function.
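A minimal sketch of an edge response function and a buffered vertex is given below. The concrete form of the response function is an assumption made for illustration and is not taken from [30–33]: the flow is the available amount limited by the maximum capacity, and it drops to zero if it falls below the lower threshold. The function names and the buffer handling (surplus is not stored, a non-functional vertex leaves its buffer untouched) are hypothetical simplifications.

```python
def edge_flow(available, max_capacity, lower_threshold):
    """Assumed response function: capped by capacity, zero below the threshold."""
    flow = min(available, max_capacity)
    return flow if flow >= lower_threshold else 0.0


def step_vertex(need, inflow, buffer_level):
    """Return (functional, new_buffer_level) for one time step.

    The vertex is functional if inflow plus stockpiled resource covers its need.
    """
    deficit = need - inflow
    if deficit <= 0:
        return True, buffer_level  # need met by inflow alone
    if buffer_level >= deficit:
        return True, buffer_level - deficit  # cover the gap from the buffer
    return False, buffer_level  # need cannot be met
```

When the supply through an edge fails, a vertex with a sufficiently filled buffer remains functional for some steps, which corresponds to the storage capability described above.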

The model was applied to a telecommunication and energy use-case in [31]. Simulation results show that the introduction of redundant infrastructure decreases the probability of cascading failures and that the model can be used to assist in identifying infrastructure elements essential to the availability of services.

2.3.5 Holistic-reductionistic model

In [34], De Porcellinis et al. present a holistic-reductionistic approach for CI interdependency modelling. The holistic view on CIs provides a view on a well-defined set of functionalities in self-contained environments. Those environments are interconnected and can interact with each other by a limited set of relationships. Holistic models are usually concerned with a high level of abstraction and therefore enable a general model while losing accuracy.

In a reductionistic view on CIs, some elementary components of the infrastructures and their interconnections with each other are in the centre of attention. The boundaries of the infrastructures are not clearly defined, since for the model it does not matter to which infrastructure an elementary component is originally linked. This view on CIs is useful to represent the complexities of cross-infrastructure interactions, but the models tend to be complex and deep knowledge of the infrastructures is required to be able to model the systems.

A mixed holistic-reductionistic view on CIs aims to combine the advantages of the holistic and the reductionistic approaches. In the mixed holistic-reductionistic approach, reductionistic techniques are used to model interdependencies among components. Holistic modelling techniques are used to express the logical and functional dependencies among CIs. A CI is simultaneously represented as a holistic entity, as well as by interconnected components. The model is able to capture the high level relations among infrastructures, as well as the low-level interactions among components. Furthermore, it is possible to model interactions between the component level and the infrastructure level. This enables the modelling of infrastructures on a different level of granularity, while still enabling the holistic-reductionistic view on the infrastructures. It is stated that, due to the large number of interconnections, an explosion in complexity can occur. Furthermore, it is also mentioned that on the component level many required links are generally not well understood.

As an advancement of the model, a third layer between the reductionistic and the holistic layer is introduced to mediate interactions between the reductionistic and the holistic model. This layer is called the service layer and can be composed of services provided by one or more CIs. Those services are specific, high-level functions for reductionistic elements (belonging to one or more CIs). The advantage of this layer is that some events, for example high-level failures, are easier to model than on the reductionistic layer, but provide more information than on the holistic layer. The holistic layer interacts with the service layer by providing control actions through a management service.

2.3.6 Conceptual modelling

In [35], Sokolowski et al. apply the conceptual modelling method to CIs. The authors define a conceptual model as “[...] an abstraction, or simplified view, of the actual simulation (or implemented model) it is intended to represent”. Conceptual modelling methods accept the loss of accuracy caused by a necessary reduction of variables and relations. One way to obtain a conceptual model is to generate a potential model (which models as many variables and relations as possible) and to derive an actual model by applying contextual and situational values to the variables and relations defined in the potential model (which leads to the desired reduction in variables and relations). In other words, this approach is similar to the class concept of object-oriented programming. A general type of model is derived and has to be general enough to contain all potential situations (in terms of used variables and relations). By instantiating such a general class with concrete information about a specific situation, the model is narrowed down to a certain, smaller subset of variables and relations.

Since the potential model contains all the information about all possible scenarios, building the potential model is seen as a challenging task; creating an actual model from a potential model is seen as easier.

As a case study, an implementation of a reference scenario based on an energy delivery infrastructure, a clean water delivery infrastructure, a waste removal infrastructure, as well as a communication infrastructure and a transportation infrastructure is given. For each infrastructure, functional components were identified and modelled as a potential model. The properties of the potential model are based on the interchange of inputs and outputs among the CIs, resulting in a potential overall picture of all possible interdependencies. In a next step, the potential model could be used to implement a simulation of a derived actual model.

2.3.7 Critical infrastructure modelling using Petri nets

In [36], Gursesli and Desrochers present a CI model using Petri nets. Petri nets are a graph-based tool able to represent the main infrastructure components as the tokens of their states (in-service, failed, ...). Changes in the states are modelled as transitions between the states. In this work, the CIs electric power, oil, water, transportation, natural gas and telecommunication were considered. In more detail, it was considered how the failure of electric power cascades to the oil, natural gas and water distribution, as well as to a transportation and a telecommunication network. This representation makes it easy to model the behaviour of CI interdependencies over time. For example, if power is disrupted, natural gas and oil delivery will stop. Other infrastructures depending on natural gas or oil delivery will not be affected immediately, since natural gas reserves can be used. Only once the reserves are used up will the disruption have an effect. Petri nets represent a very powerful way to model this behaviour.
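The delayed effect of reserves can be illustrated with a minimal place/transition sketch. The place and transition names, the token counts and the firing order are illustrative assumptions and are not taken from [36].

```python
# Minimal place/transition net. All names and numbers are hypothetical.

def enabled(marking, transition):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[p] >= n for p, n in transition["in"].items())

def fire(marking, transition):
    """Return the marking that results from firing an enabled transition."""
    new = dict(marking)
    for p, n in transition["in"].items():
        new[p] -= n
    for p, n in transition["out"].items():
        new[p] = new.get(p, 0) + n
    return new

# Gas delivery needs power; the power token is put back after firing.
deliver_gas = {"in": {"power_on": 1}, "out": {"power_on": 1, "gas_reserve": 1}}
# A gas-dependent consumer uses one unit of reserve per firing.
consume_gas = {"in": {"gas_reserve": 1}, "out": {"consumer_served": 1}}
# A power failure moves the token from in-service to failed.
power_fails = {"in": {"power_on": 1}, "out": {"power_failed": 1}}

marking = {"power_on": 1, "power_failed": 0, "gas_reserve": 2, "consumer_served": 0}
marking = fire(marking, power_fails)

# Delivery has stopped, but the consumer keeps running on the reserve.
steps_served = 0
while enabled(marking, consume_gas):
    marking = fire(marking, consume_gas)
    steps_served += 1
```

In this sketch the consumer is only affected once the `gas_reserve` place is empty, which mirrors the time-delayed cascade described above.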

2.3.8 Visual models of CIs

In [37], Chakrabarty et al. present a method for visualizing CIs and their interdependencies. Based on the foundation of a mathematical model, the visual model aims to provide an interface which is easy to understand for decision makers. Some characteristics of the visual model are: The status of a component (a node in the visual representation) is reflected by a grey-scale colour scheme (dark grey means 100% of functionality, light grey means 0%). Different types of components are represented by nodes with different symbols. Physical characteristics are represented as a list. Geographic proximity in the real world is reflected by the proximity of the nodes in the visual model. Interdependence is represented by arcs, with the line ends denoting the type of interdependency.

The visual model aims to enhance the visibility of CI interdependencies to decision makers (analysing) and to speed up, for example, the recovery process after a failure by enabling decision makers to identify the most critical parts (monitoring) and set appropriate measures and priorities for counteractions (manipulating, controlling).

2.3.9 Modelling CIs using genetic algorithms

In [38], Permann proposes to use genetic algorithms to aid CI protection. A genetic algorithm is an optimization technique based on the theory of evolution by Charles Darwin. In this context it is used to search for optimal ways to protect CIs. Based on the outputs of a CI modelling system (e.g. the attributes and state values of assets) and additional information (such as a list of critical assets, the relative importance of each asset, ...), the genetic algorithm is applied to find critical sub-networks that had not been identified before, find ways to mitigate damages (either before or after an event) and, more generally, identify weaknesses in the given CI network.

In evolutionary theory, an initial population is followed by generations of variations of the initial population. The fitter an individual in the population is, the higher are its chances of reproduction (the population becomes more robust over time). Carried forward to genetic algorithms, this means that some parameter has to represent the fitness of an individual. Allowing for some randomness, the individuals with the highest scores are used for reproduction.

The fitness function of each CI asset takes into account the importance of the asset, the cost to protect/reinforce it and the recovery cost after failure (highly important assets and assets able to quickly recover from a failure have a higher rank in the population). Evaluation is done by initially removing assets from the population and simulating the model for a certain number of time steps. In order to obtain comparable results, the fitness of the whole population is calculated before and after simulation by summing up the fitness of all individual assets. Depending on the desired goal of a simulation, the simulation can run until a certain fitness value of the population is reached, until a steady state (convergence) is reached, or for a pre-defined time to obtain the results for a specific generation of the population.
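The interplay of asset fitness, population fitness and selection can be sketched as follows. The scoring formula, the asset names and all numbers are assumptions made for illustration; they are not the fitness function used in [38], and the simulation step between the two population evaluations is omitted.

```python
import random

# Hypothetical scoring: higher importance and lower costs give a higher score.
def asset_fitness(importance, protect_cost, recovery_cost):
    return importance / (1.0 + protect_cost + recovery_cost)

def population_fitness(assets):
    """Fitness of the whole population: sum over all individual assets."""
    return sum(asset_fitness(*a) for a in assets.values())

def select_parents(assets, k, rng):
    """Fitness-proportional selection: fitter assets are drawn more often,
    but the draw keeps some randomness."""
    names = list(assets)
    weights = [asset_fitness(*assets[n]) for n in names]
    return rng.choices(names, weights=weights, k=k)

# (importance, protect_cost, recovery_cost), made-up values
assets = {
    "substation": (10.0, 3.0, 1.0),
    "pump": (4.0, 1.0, 1.0),
    "router": (2.0, 0.0, 1.0),
}

before = population_fitness(assets)
# Remove one asset; a full evaluation would simulate the CI model here.
after = population_fitness({n: a for n, a in assets.items() if n != "pump"})
parents = select_parents(assets, k=2, rng=random.Random(0))
```

Comparing `before` and `after` gives a single number per removal experiment, which is what makes different removal scenarios comparable.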

2.3.10 Spatio-temporal model

In [39], Chou et al. propose a spatio-temporal model of CI interdependencies to be able to capture interdependency information during and after a disaster. To provide accurate information, the model contains a spatial component in order to capture the nature of moving objects and a temporal component in order to capture the changes of dependencies over time. Furthermore, the temporal component addresses the different requirements in time. In case of a disaster, cascading failures may spread within milliseconds (e.g. electrical infrastructure) or over weeks to months (destroyed transport infrastructure).

The model itself is represented as a UML diagram. It is divided into three main sections: the infrastructure and management classes that contain information about the infrastructure and the owner’s organization; the interdependency classes that contain information about the types of relationships to other interdependent infrastructures (Physical, Geographic and Cyber); and the infrastructure type classes that are introduced to take into account that different CIs have a different set of defining characteristics. The spatial component is introduced in the infrastructure classes by defining a location parameter. The temporal component is introduced in the infrastructure classes by defining a parameter containing the infrastructure status at a certain time, the expected maintenance period of an infrastructure (if the infrastructure needs service) and the expected recovery time of an infrastructure (if the infrastructure failed). In the interdependency classes, the temporal component is introduced by a parameter specifying the time after which the service failure of an interdependent infrastructure would start to affect the infrastructure’s own service.

2.3.11 Specific CI models

Another category of CI models are specific models aiming to represent specific CIs, usually modelled after a use-case scenario taken from a real-world CI. Those types of models will not play a major role for this work, but for completeness, some of them will be briefly discussed in this Section.

In [40], Laprie et al. provide a study of the interdependencies between an electricity and an information infrastructure. Starting from a study to identify the accidental and malicious failures/attacks in the two CIs, an interdependency model is given that takes into account cascading, escalating and common cause failures through the interdependencies. In [41], Chiaradonna et al. analyse the electric power system infrastructure in detail and model the interdependencies between the infrastructure itself and the ICT-based SCADA system used to control the infrastructure.


2.3.12 Federated Agent-based Modelling and Simulation

In [42], Casalicchio et al. propose to combine the methods of federated simulation and agent-based modelling and simulation to capture the interdependencies of CIs.

Federated simulation is based on partitioning a complex system into federates, each represented by a separate model. The federation of the separate models represents the complex system. Partitioning can be done in two directions, vertical and horizontal. Vertical partitioning tries to separate the system into hierarchies, for example starting from the physical layer up to the functional and organizational layer. Horizontal partitioning tries to separate the system into functional subsystems of the same level of hierarchy.

Agent-based simulation is based on modelling systems using interconnected, cooperative agents. An agent is an autonomous entity with location, capabilities and memory. The model of a complex system can be embedded in the agent’s behaviour.

Federated agent-based simulation tries to combine the advantages of both modelling and simulation approaches. Agent-based modelling allows complex systems composed of many sub-systems to be simulated, the behaviour of complex systems can be modelled by agents at many levels of abstraction, and it allows the behaviour of complex systems, such as the impact of unexpected perturbations, to be studied. Federated simulation allows existing models to be reused, since the partitioning step allows a more generic view on a complex system, and it allows each federated model to be simulated independently and thus enables distributed simulation.

A vertical boundary is used to separate the organizational and functional layer of a complex system from the physical layer. Subsequently, horizontal boundaries are introduced on the physical layer to identify sub-systems. Specialized modelling tools are used to model each sub-system at the physical layer. Agent-based modelling and simulation is used at the functional and organizational layer by defining an agent for each sub-system at the physical layer and modelling the interactions among the agents.

2.3.13 System dynamics, IDEF0 functional modelling and non-linear optimization

In [43], Min et al. propose to model and simulate critical national infrastructure interdependencies based on three techniques: System dynamics, IDEF0 functional modelling and non-linear optimization. Modelling and simulation takes place on two levels, the physical and the economic infrastructures of a system.

The system dynamics model is used to model the resources of a system, the alteration of those resources and the (positive or negative) feedback that is caused by those alterations. High-level relationships, interactions and feedback are captured by causal-loop diagrams. To capture the physical build-up and mathematical relationships of the interdependent systems, stock-and-flow diagrams are developed from the causal-loop diagrams. Stock-and-flow diagrams allow differential equations to be derived that describe the evolution of a system and can be used as a basis for simulations. The functional modelling technique IDEF0 was used to define the requirements for data and information exchange among simulation models. Finally, non-linear optimization was used to find values for control variables to optimize the system dynamics simulation (decision support system for the economic level).

The goal of this modelling approach is to find the optimum economic revenue in case of disruption of a certain infrastructure component by providing decision support at the economic level and thus to reduce the financial impact of disruptions.

2.3.14 Agent based modelling and simulation

In [44] and [45], Panzieri et al. propose a CI simulator based on agent-based simulation. The goal is to use a minimum amount of technical and specific data in order to ease the sharing of confidential infrastructure-related information. The main source of data is interviews with managers. A CI is divided into independent interacting agents that are defined by three quantities represented as system states: their operative level (how much of their capacity are they able to provide?), their requirements (what does each entity need to function correctly?) and their fault (the occurrence of a fault and fault types). The inputs and outputs of the agents (interaction with interdependent systems) are also defined by those system states (e.g. resources needed from and provided to interdependent systems). Furthermore, the model considers the physical, geographical and cyber interdependencies of systems and uses a fuzzy number representation to model the uncertainties of infrastructure information obtained via expert interviews.

The modelling goal is to evaluate the short-term effects of one or more faults and their propagation to interdependent systems, to assist in what-if analysis and to identify critical elements in the interdependent systems.


2.3.15 Intelligent agents reasoning about CI state

In [46], Tolone et al. introduce a CI interdependency modelling and simulation approach based on intelligent agents. Based on individual behavioural models of CIs, intelligent agents sense state changes in the models and collectively reason about the consequences of those state changes and which interdependent infrastructures might be affected. Both meta-knowledge about an agent’s own infrastructure and inter-infrastructure meta-knowledge are needed for the reasoning. The outcome of the reasoning is provided to an end user (e.g. an infrastructure operator) and can, for example, be used for decision support or predictive what-if analysis.

In [47], the authors detail the implementation of their approach using Cougaar, a Java-based agent architecture for large-scale distributed multi-agent systems.

2.4 Risk in critical infrastructures

In [18], Haimes et al. define risk management as a process for developing a set of decisions for understanding and recognizing the range of consequences and trade-offs of actions within an uncertain environment. Risk assessment and risk management provide an analysis and decision structure for policy formulation in interdependent systems in case of uncertainty and extreme events. In [48], Kröger reflects on modern risk and system weaknesses in CIs and discusses ways to address them (e.g. policy options, dialogue with stakeholders, the extension of currently existing modelling and simulation techniques).

2.4.1 Risk Management for Critical Infrastructure Protection (CIP): Challenges, Best Practices and Tools

In [49], Adar et al. identify the main difference of CIP risk management to be the level of complexity found in CIs. The main challenges in CIP risk management are identified as:

– The nature of CIP: Each CI sector has to be analysed separately.

– Growth in organizational complexity: Organizational structures and information systems are becoming increasingly complex in CIs.

– Dynamic aspects of risk: Structures and business models in CIs change and so should risk management.

– Need for compliance: Compliance with existing laws and regulations is a key issue for CIs.

– Efficiency and cost effectiveness: Cost-effective and efficient security solutions are demanded in order to be seriously considered for adoption by CI owners.

– Human factors: Risk management is an intuitive process that has to be done by domain experts and relies highly on the skills and knowledge of individuals.

To address the challenges that have to be faced, best practice strategies are proposed. Those strategies include:

– Framework and measurements: This addresses the development of a global risk management framework that allows its accomplishments to be measured. Centralizing risk management and risk analysis tasks is seen as a key for efficient risk management.

– Advanced risk analysis methods: There is no standardized way for risk analysis, therefore, it is crucial to pre-select suitable risk analysis methods for each individual organization (e.g. Common criteria, OCTAVE, CORBA or EESA).

– CIP models that can be used for risk analysis: Most risk analysis methods are based on information system models. It is important to integrate CI models that better capture the nature of CIs.

– Critical infrastructure layers: Critical infrastructures are organized in different layers that have to be understood for risk analysis within CIs (e.g. business/strategic layer, organizational layer, cyber layer, physical layer).

– Dependency between layers: The CI layers depend on each other (intra-dependency), just as infrastructures depend on each other (interdependency). This has to be understood and considered in risk analysis.

– Multi-dimensional impact vectors: The impact of an incident in a CI might have consequences on more than one dimension.

– Development and implementation of risk management tools: Tools that support risk management in CIs have to be developed and optimized.

2.4.2 Risk assessment in complex interacting infrastructures

In [50], Newman et al. propose a methodology for the probabilistic assessment of cascading failures in complex interacting systems to be able to determine the impact of coupling related to the risk of component failure.


Two different models to capture the nature of cascading failures are investigated: A simple probabilistic model and a dynamical model for coupled complex systems to capture the dynamic evolution of the risk of failure to other, interdependent systems. The focus of this work lies on the dynamics of failures and how this impacts risk assessment in interconnected systems.

The simple probabilistic cascading failure model relies on the assumption of a number n of identical system components with a random initial load. If the load of one component gets higher than the load it can handle, the component will fail and the load will be distributed to the other system components. This can be seen as the start of a cascade: the additional load might cause a failure in other system components, which again might cause failures in further system components. Applying this model to interconnected systems is seen as straightforward. Two independent systems, each with a number n of identical system components, are considered. If a component in the first system fails, its load is distributed to the internal system components and, in addition, the load of all components of the second system is increased by a fixed amount. This should not be seen as distributing the load of the failed component to the components of the second system, but rather as increasing the stress on the components of the second system because components in an interdependent system failed. It is also considered that the resulting failures in the second system might feed back to the first system due to the interconnection.
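The cascade mechanism for two coupled systems can be sketched as follows. The function name, the parameter names and the example loads are hypothetical, and adding a fixed `transfer` amount to each surviving component of the same system is a simplifying assumption; the exact load redistribution rule of [50] is not reproduced here.

```python
def cascade(loads_a, loads_b, limit, transfer, coupling):
    """Propagate failures until no surviving component exceeds `limit`.

    transfer: load added to each surviving component of the same system
              per failure (assumed fixed amount).
    coupling: stress added to each surviving component of the other
              system per failure.
    Returns the sets of failed component indices in system A and B.
    """
    loads = {"A": list(loads_a), "B": list(loads_b)}
    failed = {"A": set(), "B": set()}
    other = {"A": "B", "B": "A"}
    changed = True
    while changed:
        changed = False
        for name in ("A", "B"):
            for i in range(len(loads[name])):
                if i in failed[name] or loads[name][i] <= limit:
                    continue
                failed[name].add(i)
                changed = True
                for j in range(len(loads[name])):
                    if j not in failed[name]:
                        loads[name][j] += transfer
                for j in range(len(loads[other[name]])):
                    if j not in failed[other[name]]:
                        loads[other[name]][j] += coupling
    return failed["A"], failed["B"]

# Made-up initial loads; component 0 of system A starts above its limit.
a, b = [1.1, 0.9, 0.5], [0.95, 0.4]
coupled = cascade(a, b, limit=1.0, transfer=0.15, coupling=0.1)
uncoupled = cascade(a, b, limit=1.0, transfer=0.15, coupling=0.0)
```

With these example values, the failures in system A push one component of system B over its limit only when `coupling` is non-zero, and the loop also lets failures in B add stress back onto A.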

The dynamical complex system model is a cellular automata based model, a regular grid with fixed interaction rules. The dynamism of the model comes from the fact that systems have memory of their previous states. The interaction rules for uncoupled systems are simple: Each node has a certain probability of failure, the neighbour nodes of a failed node have a higher probability of failure and a failed node has a certain probability of being repaired. The dynamic evolution of the model is characterized by the following rules: The probability of a random failure is influenced by the state of the node in the previous time step, the probability of a node being repaired is influenced by the state of the node in the previous time step, the state of the neighbouring nodes at the previous time step influences the failure probability of a node and finally, all nodes are advanced to their new state. In interconnected systems, the behaviour is modelled analogously to the simple probabilistic cascading failure model (if a component in system A fails, the loads in system B are increased to account for the increased stress in system B). However, there are two differences in comparison to the simple model: The spatial structure of the systems is taken into account (not every node in system A is connected to every node in system B, but a random spatial selection is made) and the strength and direction of coupling is taken into account (the failure of different components in system A will cause different levels of increased stress on the components of system B).

The dynamical model was further developed in [51]. The original model is based on a square grid network, thus only four neighbours were taken into consideration by the model to propagate an initial failure to neighbouring nodes. In the extended model, arbitrary networks can be considered; the failure propagation is described by a certain probability for all the nodes that are connected to a failing node. At each time step, a component can be operating, failed or failing. The model follows the assumptions that for each time step a failed component can be repaired with a certain probability, a failing component becomes a failed component, an operating component has a certain failure probability if one of the nearest components is failing and finally, there is a certain probability that an operating component fails. In coupled systems, another rule applies: A component in system 2 fails if an associated component in system 1 has failed or is failing.
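The per-time-step rules of the extended model can be sketched for a single, uncoupled network as shown below. The state names, the function signature, the probability parameters and the three-node example are assumptions for illustration; the coupling rule between two systems is left out.

```python
import random

OPERATING, FAILING, FAILED = "operating", "failing", "failed"

def step(state, neighbours, p_fail, p_spread, p_repair, rng):
    """Advance every node one time step, based only on the previous state."""
    new = {}
    for node, s in state.items():
        if s == FAILED:
            # A failed component is repaired with probability p_repair.
            new[node] = OPERATING if rng.random() < p_repair else FAILED
        elif s == FAILING:
            # A failing component becomes a failed component.
            new[node] = FAILED
        else:
            # An operating component may fail because a connected component
            # is failing, or because of a random failure.
            near_failing = any(state[m] == FAILING for m in neighbours[node])
            hit = (near_failing and rng.random() < p_spread) or rng.random() < p_fail
            new[node] = FAILING if hit else OPERATING
    return new

# Arbitrary (non-grid) network: a simple chain a - b - c.
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
s0 = {"a": FAILING, "b": OPERATING, "c": OPERATING}

# Deterministic setting for illustration: failures always spread,
# nothing fails at random, nothing is repaired.
rng = random.Random(0)
s1 = step(s0, neighbours, p_fail=0.0, p_spread=1.0, p_repair=0.0, rng=rng)
s2 = step(s1, neighbours, p_fail=0.0, p_spread=1.0, p_repair=0.0, rng=rng)
```

Building the new state in a separate dictionary keeps the update synchronous, so a node that starts failing in this step does not influence its neighbours until the next one.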

The main finding presented in this work is that component failures are more likely to cascade in coupled systems than in uncoupled systems. This finding was reflected in both the simple probabilistic cascading failure model and the dynamical complex system model.

2.4.3 A Markov Game Theory-based Risk Assessment Model for Network Information System

In [52], Xiaolin et al. propose a risk assessment model based on the assumption that all possible future risks will influence the risk assessment in the present. Based on this assessment, a reinforcement/repair scheme is provided. The model is based on threat identification methods, vulnerability identification methods and asset identification methods. The model uses one Markov chain to model the spreading of potential threats, as well as another Markov chain to model the repair process, which is equivalent to the reduction of vulnerabilities to minimize threats. This can be seen as a game between vulnerabilities and threats. A threat agent increases the risk by threat spreading, while the vulnerability agent decreases the risk by repairing the vulnerabilities. Using those two Markov chains, the state changes over time can be modelled. After each state change, the threat agent can introduce threat spreading according to its strategy set, whilst the vulnerability agent can introduce repair actions according to its strategy set. After a game of n steps, the outcome of the Markov game is provided to a risk assessment module, which is then able to decide on a system risk assuming the largest risk provided by the Markov game. Experiments show that the Markov game risk assessment model performed a better risk assessment (by identifying potential risks) compared with a traditional risk assessment model (the traditional risk assessment method is not further specified in this publication).

In [53], a similar approach based on a hidden Markov model to enable real-time risk evaluation is proposed. Starting from various sources of heterogeneous data, a risk analysis module transforms them into homogeneous data representing confidentiality, integrity and availability. Each parameter can have six states; the transitions between the states are characterized by a probability, the likelihood of a transition from one state to a particular other state. Together with an observation probability matrix (given a series of events, the probabilities are different if a previous event had an impact or not), a discrete-time hidden Markov model can be developed to evaluate the real-time risk of assets in a system in terms of confidentiality, integrity and availability.

2.4.4 Multi-sensor Real-time Risk Assessment using Continuous-time Hidden Markov Models

In [54], Haslum et al. propose an approach to real-time risk assessment by taking measurements from system sensors (like intrusion detection systems). Continuous-time hidden Markov models are used for the estimation of risk, together with a risk computation method based on a weighted sum (to account for the varying reliability of sensors). The security of an asset is modelled by a fully connected Markov model with three states, representing good, under attack and compromised. The observation of the system is done by getting information from sensors, with the assumption that the messages received are in a subset of previously known/defined messages. For each sensor, a hidden Markov model exists that describes the state transition probabilities of the asset (probability that the asset reaches state x given it is in state y). Depending on the type of sensor input, those probabilities can be calculated using continuous-time hidden Markov models or discrete-time hidden Markov models. An expert has to evaluate the initial state distribution and state transition rates; combined with sensor observations at certain times (continuous or discrete), the transition probabilities can be estimated in real-time. To derive the risk for an asset from this calculation, the probabilities of a certain event have to be set in relation to the costs of the event. In order to derive the risk for the whole system, the risks of each single asset/sensor are summed up and weighted according to their assumed reliability.
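The final risk computation can be sketched as an expected cost per sensor followed by a reliability-weighted sum. The state probabilities, the cost values and the weights below are made-up inputs; in [54] the state probabilities would come from the hidden Markov models, and whether the weights are normalized is not specified here and is left open in this sketch.

```python
STATES = ("good", "attacked", "compromised")

def sensor_risk(state_probs, state_costs):
    """Expected cost: probability of each state times the cost of that state."""
    return sum(state_probs[s] * state_costs[s] for s in STATES)

def system_risk(sensors, state_costs):
    """Weighted sum of per-sensor risks; weights reflect assumed reliability."""
    return sum(weight * sensor_risk(probs, state_costs) for probs, weight in sensors)

# Hypothetical costs and estimated state distributions for two sensors.
costs = {"good": 0.0, "attacked": 10.0, "compromised": 100.0}
sensors = [
    ({"good": 0.80, "attacked": 0.15, "compromised": 0.05}, 1.0),  # reliable sensor
    ({"good": 0.50, "attacked": 0.30, "compromised": 0.20}, 0.5),  # less reliable sensor
]

risk_first = sensor_risk(sensors[0][0], costs)
risk_total = system_risk(sensors, costs)
```

Lowering the weight of the second sensor halves its contribution, so an alarming but unreliable sensor does not dominate the system risk.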

2.4.5 Knowledge-Based Framework for Real-Time Risk Assessment of Information Security Inspired by Danger Model

In [55], Hu et al. propose a framework for the real-time assessment of risk based on the danger theory. The starting point of the framework is a knowledge-based risk analysis. The human knowledge of how to protect a system, including the spread and development of knowledge throughout an organization, is mathematically modelled, assuming a defined set of assets and the corresponding knowledge about vulnerabilities, threats and security measures. Furthermore, a weight denoting the importance of each specific asset is defined. Those parameters are set in a mathematical relationship to be able to compute the risk of each asset.

The danger model in this work is inspired by the theory of the immune response in human cells. A cell under attack can send a danger or alarm signal to antigen-presenting cells (APCs) which would send helper cells, which in turn can send killer cells to eliminate the invaders. The danger model for real-time risk assessment uses the same principle. Based on the knowledge-based risk analysis, system components can send signals to analysis centres which are able to assess real-time risk based on the number and nature of received signals.

2.4.6 Hierarchical, model-based risk management of CIs

In [56], Baiardi et al. propose a risk management strategy based on a hyper-graph model of infrastructures, which allows the infrastructure to be represented at different levels of detail, as required. The model aims to detect complex attacks as well as to support risk mitigation. Each infrastructure component consists of an internal state and operations on this state. Each component has three security attributes (confidentiality, integrity and availability) that are derived from the internal states, and dependencies on other infrastructure components are modelled as security dependencies based on those parameters. Each controlled parameter of one component can influence the same or another parameter of another component. For example, the integrity of one component can influence the availability of another component, or the integrity of one component and the availability of another component can influence the confidentiality of a third component. By using hyper-graphs for the representation, this can easily be modelled. Hierarchical decomposition is used to decompose one single component into a more detailed representation of the component, again modelled as a hyper-graph. It is assumed that a complex attack is carried out by a series of simple attacks, each affecting a different component. An evolution graph is used to represent all the states that can be produced in an infrastructure by a sequence of simple attacks. A complex attack can be seen as a path through the evolution graph from a start point to an end point. Risk analysis is performed by assigning probabilities to a certain complex attack or evolution. Attacks that have a low probability, are not possible to implement or take too many steps (simple attacks) to succeed are removed from the graph. The input to this evaluation is assumed to be supported by historical data. Risk mitigation is modelled by assuming that a set of countermeasures can stop simple attacks and thus reduce the paths through the evolution graph. A tool was implemented that can compute strategies for stopping all the evolutions through a graph by eliminating a subset of simple attacks.
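The idea of complex attacks as paths through an evolution graph can be sketched with a plain directed graph. The state names, attack names and probabilities are invented, multiplying edge probabilities along a path is an assumption, and the tool described in [56] operates on hyper-graphs and is not reproduced here.

```python
def attack_paths(graph, start, goal, max_steps, blocked=frozenset()):
    """Enumerate sequences of simple attacks leading from start to goal.

    graph:   {state: [(attack_name, next_state, probability), ...]}
    blocked: simple attacks stopped by countermeasures.
    Returns a list of (attack_sequence, path_probability).
    """
    results = []

    def walk(state, path, prob, seen):
        if state == goal:
            results.append((path, prob))
            return
        if len(path) == max_steps:
            return  # too many simple attacks: this evolution is dropped
        for attack, nxt, p in graph.get(state, []):
            if attack in blocked or nxt in seen:
                continue
            walk(nxt, path + [attack], prob * p, seen | {nxt})

    walk(start, [], 1.0, {start})
    return results

# Hypothetical evolution graph.
graph = {
    "s0": [("phish", "s1", 0.5), ("scan", "s2", 0.9)],
    "s1": [("escalate", "goal", 0.4)],
    "s2": [("exploit", "s1", 0.2), ("bruteforce", "goal", 0.1)],
}

all_paths = attack_paths(graph, "s0", "goal", max_steps=3)
short_paths = attack_paths(graph, "s0", "goal", max_steps=2)
mitigated = attack_paths(graph, "s0", "goal", max_steps=3,
                         blocked={"escalate", "bruteforce"})
```

In this example, blocking the two simple attacks `escalate` and `bruteforce` removes every path to the goal, which corresponds to a mitigation strategy that stops all evolutions by eliminating a subset of simple attacks.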

2.4.7 Risk Filtering, Ranking, and Management Framework Using Hierarchical Holographic Modelling

In [57], Haimes et al. propose a framework for large-scale systems to identify, prioritize, assess and manage risk scenarios. This framework is organized in eight phases, namely scenario identification, scenario filtering, bi-criteria filtering and ranking, multi-criteria evaluation, quantitative ranking, risk management, safeguarding against missing critical items and operational feedback. The main objective of this framework is to achieve a risk ranking and filtering based on identified risk scenarios.

Hierarchical holographic modelling is used to understand the organization of an infrastructure and thus perform risk scenario identification. It is based on exploiting the hierarchical structure of an infrastructure and modelling the risk in each component of the structure to be able to estimate an overall system risk. The strength of this model is that each component in the hierarchy can be independently modelled by a different mathematical model. Scenario filtering is based on the assumption that risk can be filtered according to the needs and interests in the current situation, for example according to the scope and temporal domain of the situation the scenario is evaluated for. This filtering is done by decision makers based on expert experience, knowledge of the system and the role/responsibility of the decision maker. Bi-criteria filtering and ranking is based on the risk filtering process and takes into account two different types of information: likelihood and consequences. This can be represented in matrix form (e.g. five discrete steps for likelihood and the same for the consequences, each field in the matrix associated with a risk level) and gives a quantitative assessment of system risk, reducing the number of possible risk scenarios. Having completed the first three steps, in the multi-criteria evaluation step the scenario is evaluated against its ability to defeat the resilience, robustness and redundancy of the underlying system. This is done by defining criteria that describe this goal and evaluating each criterion against the scenario by assigning a risk level (e.g. high, medium, low) to it. In the quantitative ranking process, Bayes' theorem is used to determine the likelihood of each scenario, replacing expressions like “high”, “medium” and “low” with probabilities. It further reduces the number of possible risk scenarios. The input of the risk management step is a (small) set of remaining and likely risk scenarios. In this step, for each of the remaining scenarios, an evaluation is performed that takes into account the possible system modifications to reduce the risk and the costs as well as to maximize the benefits. This should result in a decision leading to a cost-effective risk mitigation strategy. In the safeguarding against missing critical items step, the proposed risk mitigation strategy is evaluated against the previously filtered scenarios in order to be sure that the final implementation will hold against the unfiltered risk scenarios as well. The operational feedback step is concerned with the fact that risk is not static and will evolve over time. Therefore, feedback and re-evaluation after applying the method is important.
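The bi-criteria matrix and the use of Bayes' theorem described above can be sketched in a few lines of Python. The thresholds that map a matrix field to a risk level, the scenario names and all probabilities are assumptions chosen for illustration; they are not values from [57].

```python
LEVELS = 5  # five discrete steps on each axis, as in the example above


def risk_level(likelihood, consequence):
    """Map two ordinal scores (1..5) to a coarse risk level (thresholds are assumed)."""
    if not (1 <= likelihood <= LEVELS and 1 <= consequence <= LEVELS):
        raise ValueError("scores must lie between 1 and 5")
    score = likelihood * consequence
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


def filter_scenarios(scenarios, keep=("high", "medium")):
    """Keep only scenarios whose matrix field has one of the given risk levels."""
    return [name for name, (lik, con) in scenarios.items()
            if risk_level(lik, con) in keep]


def posterior(prior, p_evidence_given_scenario, p_evidence_given_not):
    """Bayes' theorem for one scenario and one piece of evidence."""
    numerator = p_evidence_given_scenario * prior
    return numerator / (numerator + p_evidence_given_not * (1.0 - prior))


# Hypothetical scenarios: name -> (likelihood step, consequence step).
scenarios = {"flooded_substation": (2, 5), "operator_error": (4, 4), "minor_outage": (1, 2)}
remaining = filter_scenarios(scenarios)
updated = posterior(prior=0.1, p_evidence_given_scenario=0.8, p_evidence_given_not=0.2)
```

The first function corresponds to the qualitative filtering step, while `posterior` shows how a verbal level could be replaced by a probability once evidence is available.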

2.4.8 Risk Management for Leontief-Based Interdependent Systems

Based on the input-output inoperability model described in Section 2.3.1, Jiang and Haimes propose a risk management strategy for interdependent systems in [58]. After defining the Leontief inoperability model, optimization techniques can be used, for example, to minimize the overall inoperability or to maximize the overall productivity of the interdependent systems and thus reduce system risk in terms of financial losses.
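As a worked illustration, a commonly cited form of the inoperability input-output model is q = A*q + c*, where q is the vector of sector inoperabilities, A* the interdependency matrix and c* the initial perturbation. Whether this notation matches the one used in [58] is an assumption, and the two-sector numbers below are invented. The sketch solves the equation by fixed-point iteration, which converges when the interdependencies are sufficiently weak.

```python
def inoperability(a_star, c_star, iterations=200):
    """Iterate q <- A* q + c*, capping each entry at 1.0 (fully inoperable)."""
    n = len(c_star)
    q = [0.0] * n
    for _ in range(iterations):
        q = [
            min(1.0, c_star[i] + sum(a_star[i][j] * q[j] for j in range(n)))
            for i in range(n)
        ]
    return q


# Hypothetical sectors: index 0 = power, index 1 = telecommunication.
A_STAR = [
    [0.0, 0.2],  # power loses 20 % of the telecommunication inoperability
    [0.4, 0.0],  # telecommunication loses 40 % of the power inoperability
]
C_STAR = [0.1, 0.0]  # an event directly degrades power by 10 %

q = inoperability(A_STAR, C_STAR)
```

An optimization step, as proposed in [58], would then search for the mitigation measures that minimize a cost function of q; this is not shown here.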

2.4.9 Using graph models to assess vulnerability in CI

In [59], Chopade and Bikdash present an approach to assess the vulnerability of CIs using graph theory. In their work, the term “vulnerability” is mainly used in the context of structural vulnerability, where a component failure would lead to a large-scale service failure. In their model, the components of a CI are graph nodes and the edges represent a connection between the components. As an example, they use the power grid, where nodes could be power plants, stations and power users and the edges could be power lines. The graph is undirected and (initially) connected. The aim of the model is to perform simulations where elements are removed from the graph and to observe how the performance of the network is affected. The model is validated by analysing parts of a power grid, but it is stated that it is uncertain how realistic the results are, since unrealistically high failure rates (the removal of elements) were applied, which does not necessarily represent a realistic attack scenario.
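A removal experiment of this kind can be sketched as follows. The topology is a made-up toy grid, and the size of the largest connected component is used as a simple stand-in for network performance; the performance measures used in [59] may differ.

```python
from collections import deque

# Hypothetical undirected power-grid topology (adjacency sets).
GRID = {
    "plant": {"station_a", "station_b"},
    "station_a": {"plant", "user_1", "user_2"},
    "station_b": {"plant", "user_3"},
    "user_1": {"station_a"},
    "user_2": {"station_a"},
    "user_3": {"station_b"},
}


def largest_component(graph, removed=frozenset()):
    """Size of the largest connected component after removing the given nodes."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over the surviving nodes
            node = queue.popleft()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


intact = largest_component(GRID)
without_station = largest_component(GRID, removed={"station_a"})
```

Comparing `intact` with the value after each removal indicates which components are structurally critical in the toy example.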

2.4.10 Security-oriented cyber-physical state estimation

In [60], Zonouz et al. present a tool for detecting attacks on CIs based on the combination (stochastic information fusion) of data received from ICT systems (e.g. intrusion detection systems) and the physical infrastructure (e.g. existing state monitoring tools). Using this approach, attack detection accuracy can be enhanced, since false alarms and/or undetected attacks in the ICT systems can be supplemented by information about the actual state of the CI. For now, the research is focused on the power grid CI. The model was validated using a case study; the results indicate increased detection performance when ICT alerts are combined with physical infrastructure state estimation, and all simulated attacks could be detected.
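The benefit of fusing cyber and physical evidence can be illustrated with a naive Bayes update. This is a deliberately simplified stand-in and not the fusion method of [60]; it assumes that the observations are conditionally independent, and all probabilities are invented.

```python
def fuse(prior, likelihoods):
    """Posterior attack probability from independent observations.

    likelihoods is a list of pairs (P(observation | attack), P(observation | no attack)).
    """
    p_attack = prior
    p_clean = 1.0 - prior
    for given_attack, given_clean in likelihoods:
        p_attack *= given_attack
        p_clean *= given_clean
    return p_attack / (p_attack + p_clean)


IDS_ALERT = (0.9, 0.1)           # hypothetical IDS alert likelihoods
PHYSICAL_ANOMALY = (0.8, 0.05)   # hypothetical state-estimation anomaly likelihoods

ids_only = fuse(0.01, [IDS_ALERT])
ids_and_physical = fuse(0.01, [IDS_ALERT, PHYSICAL_ANOMALY])
```

With these numbers, an IDS alert alone leaves the attack hypothesis unlikely, whereas the additional physical anomaly raises it substantially.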

2.4.11 Risk modelling of interdependencies in CIs

In [61], Utne et al. present a method for qualitative vulnerability analysis in CIs. It allows a quantitative or semi-quantitative analysis of the risk associated with an event (in the context of likelihood and consequence of the event). The first step is concerned with identifying hazardous events. In the second step, selected events (chosen according to decision criteria like high risk, high consequences or suspected high interdependencies) are further analysed for possible interdependencies or cascading effects. The approach requires a detailed description of the event (like location, environment, spatial and temporal scales) in order to evaluate dependencies. Identified dependencies are described in a cascade diagram, a structural representation starting from the initiating event that lists, in a hierarchical manner, a series of events that lead to an undesired end state. A semi-quantitative analysis of each event can be performed by estimating a risk probability for a cascading scenario. For the initiating event, the frequency of occurrence, the extent and the duration of the event need to be estimated. Each subsequent event in the cascade diagram is assigned a conditional probability of occurrence, given the duration and extent of the parent event (a long duration and large extent of a parent event lead to a higher probability that a child event occurs). With this information about each event, a risk probability for the scenario can be estimated by calculation. Furthermore, measures to reduce interdependencies can be identified and a cost-benefit analysis for implementing risk reducing measures can be performed based on this analysis.
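One plausible reading of the calculation mentioned above is that the scenario frequency is the frequency of the initiating event multiplied by the conditional probabilities along the cascade. The following sketch assumes exactly that; the chain rule is an assumption about the calculation in [61], and the numbers are invented.

```python
def scenario_frequency(initiating_frequency, conditional_probabilities):
    """Frequency of the end state for one path through a cascade diagram."""
    frequency = initiating_frequency
    for probability in conditional_probabilities:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("conditional probabilities must lie in [0, 1]")
        frequency *= probability
    return frequency


# Hypothetical cascade: cable fire (0.5 per year) -> loss of power supply (0.2)
# -> loss of telecommunication services (0.1).
per_year = scenario_frequency(0.5, [0.2, 0.1])
```

Risk reducing measures would appear in such a calculation as lower conditional probabilities for the affected events.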

2.4.12 Operational support for CI security

In [62], Hurst et al. propose a method for identifying complex threats against CIs based on observing the CI behaviour. They recognize that conventional threat detection systems like intrusion detection systems (IDS) are not suited to detect complex attacks against CIs. They argue that the behaviour of CI operation can be determined by observing real-time sensor data (e.g. temperature, pressure, speed or flow rate). Their approach tries to recognize unusual behaviour from observing multiple sources of information. It allows proactive security by presenting the patterns of unusual behaviour to CI operators.

2.4.13 SERSCIS project

The EU-FP7 project SERSCIS [63] (Semantically Enhanced, Resilient and Secure Critical Infrastructure Services), as presented in [64–66], deals with fault monitoring and risk management in complex CIs to aid decision support in case of an emergency (deliberate attack or accidental failure). The approach is to build a semantic model of a running CI and reason about the likelihood of threats being carried out within the system (risk classification and periodic assessment). The operator of the CI is presented with information about the system, including system vulnerabilities, threat likelihood probabilities and explanation as well as threat impact. Furthermore, an automatic decision support tool provides suggestions for the operator to resolve the issue. The tool also allows the operator to send control commands to the CI to mitigate the impact of the event or stop cascading failure.

The CI models of the SERSCIS project are ontology based; all CI expertise as well as security related expertise is input at design time. The system behaviour is captured using relationships among types of services, as well as associated threats and controls. A dynamic model of run-time system composition is created by making instances of the previously designed components. No additional expertise is needed at this point. Multi-stakeholder dependencies are taken into account at run-time through dynamic composition. A model represents the viewpoint of one of the involved stakeholders.

2.5 Dependency analysis in complex systems

The dependency analysis methodology developed in this work is based on the PROTOS-MATINE dependency analysis methodology developed by OUSPG (Oulu University Secure Programming Group). This methodology was chosen since the goals of PROTOS-MATINE and its application areas are closely related to the goals of dependency analysis within this work. Furthermore, the author is affiliated with OUSPG, which gave access to the expertise and experience of exceptionally knowledgeable colleagues.

The main goals of the PROTOS-MATINE dependency analysis methodology can be summarized as follows:

– Find critical dependencies by analysing critical systems.
– Utilize all available information sources in a socio-technical approach (written sources as well as expert interviews).
– Utilize information sources at all organizational levels.
– Graphically visualize findings for analysis.

In the following Sections, some publications related to the PROTOS-MATINE methodology are presented.

2.5.1 A Case for Protocol Dependency

In [67], Eronen and Laakso present a methodology to discover vulnerabilities in protocols using dependency analysis. Protocols are the basis for higher level systems and vulnerabilities in the protocols can compromise those systems and put them at risk. During their research, it was uncovered that many implementations of the same protocol share the same vulnerabilities. Four different types (meta levels) of possible protocol vulnerabilities were discovered:

– A vulnerability exists in a single implementation of a protocol.
– A vulnerability exists in multiple implementations of a protocol.
– A vulnerability in a common sub-protocol might cause vulnerabilities in all protocols using it.
– A vulnerability in a basic encoding or encryption scheme might cause vulnerabilities in all protocols using it.

Their work focuses mainly on types 3 and 4. By visualizing the dependencies of a protocol (e.g. sub-protocols or encoding/encryption schemes) and linking this information with known vulnerabilities, possible vulnerabilities in the protocol can be uncovered. The proposed information sources for researching protocol dependencies are:

– Expert interviews.
– Technical specifications.
– Public reports of protocol security and vulnerabilities.
– Protocol usage information.
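Linking a protocol's dependencies with known vulnerabilities amounts to a reachability question on the dependency graph. The following sketch shows the principle; the dependency table is a simplified illustration, the set of vulnerable components is invented for the example, and neither is data from [67].

```python
# Illustrative dependency graph: protocol -> sub-protocols or encoding schemes it uses.
DEPENDS_ON = {
    "SIP": ["SDP", "TLS"],
    "TLS": ["X.509"],
    "X.509": ["ASN.1"],
    "SNMP": ["ASN.1"],
    "SDP": [],
    "ASN.1": [],
}
KNOWN_VULNERABLE = {"ASN.1"}  # assumed input, e.g. from public vulnerability reports


def potentially_affected(depends_on, vulnerable):
    """Protocols that directly or transitively depend on a vulnerable component."""

    def reaches(node, stack=()):
        if node in vulnerable:
            return True
        if node in stack:  # guard against cyclic dependencies
            return False
        return any(reaches(dep, stack + (node,)) for dep in depends_on.get(node, []))

    return {p for p in depends_on if p not in vulnerable and reaches(p)}


affected = potentially_affected(DEPENDS_ON, KNOWN_VULNERABLE)
```

A weakness in a shared encoding scheme thus surfaces in every protocol built on top of it, which corresponds to types 3 and 4 above.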

2.5.2 Graphingwiki - a Semantic Wiki extension for visualizing and inferring protocol dependency

In [68], Eronen and Röning introduce the tool Graphingwiki, a semantic extension of MoinMoin wiki, which allows data to be visualized by adding information (or knowledge) and setting the items in context to each other through additional descriptive information. Information can be gathered automatically from various sources like technical specifications, or it can be added or supplemented manually by adding expert knowledge. Furthermore, Graphingwiki includes logic reasoning capabilities for discovering relations among added information. Visualization is done in graph-based form, where the nodes represent the added data and the edges represent the relations among the data.

As a use-case scenario, protocol dependency analysis is given. Each networked system depends on various protocols which themselves depend on lower-level protocols. Identification of those dependencies is crucial for vulnerability analysis. One information source for identifying protocol dependencies is standards, which contain for example information about status types, relations with other standards or involved protocols. The desired information in standards can be extracted, for example, using scripts or, where automatic extraction is not possible, via manual input. After information gathering, the desired model for protocol dependencies can be extracted from the available information via logic reasoning and/or expert input.

2.5.3 Software Vulnerability vs. Critical Infrastructure - a Case Study of Antivirus Software

In [69], Eronen et al. present a case study in the context of antivirus vulnerability identification based on dependency analysis. The dependency analysis methodology used is the PROTOS-MATINE method, with its focus on gathering information from all possible information sources (written sources as well as expert interviews) and visualization using the tool Graphingwiki. The main information sources in this case study were expert interviews, specifications, market situation, historical data, public vulnerability data and usage scenarios.

As a result of the case study, besides finding out that existing implementation level vulnerabilities can make antivirus software ineffective against malware, it was discovered that many vulnerabilities in antivirus software are related to the archive formats used, which are common among many antivirus products. It is stated that the PROTOS-MATINE approach of information gathering and visualization was crucial in uncovering this critical dependency.

2.5.4 Socio-technical Security Assessment of a VoIP System

In [70], Pietikäinen et al. utilize the PROTOS-MATINE methodology of dependency analysis and Graphingwiki for dependency visualization in the context of providing a security audit in an active large-scale VoIP system. Information sources in the context of PROTOS-MATINE related to this case study were expert interviews, network documentation and network measurements. The incentive of this approach is that a visualization of the system structure containing all elements as well as their dependencies can help to identify structural weaknesses in the system. Furthermore, the structure of a system is often not as well understood by the people operating the system as it should be. A visualization of the system combining multiple information sources can help to avoid misunderstandings and thus strengthen security in certain scenarios.

As a result of the case study, it was shown that there was a mismatch between system documentation and actual implementation which could only be discovered by combining expert interviews with the documentation of the system. It was also uncovered that organizational guidelines (e.g. the use of secure passwords) are sometimes not followed in practice. A structural flaw that was detected during analysis was a substantial dependency of the system on one single system administrator who is responsible for all other administrators. The PROTOS-MATINE methodology as well as Graphingwiki have proven valuable for conducting security audits in large-scale systems.

2.6 Analysis of the related work

In this Section, the related work presented in the previous Sections is set into context with the goals and requirements of the CI security model presented in this PhD thesis. Table 2 lists the goals and requirements identified for this PhD thesis in Sections 1.3 and 1.4 and roughly classifies the commonality of each related article to those goals and requirements. To allow a meaningful classification, the parameters taken into account were grouped into Goal/Approach, Indicators and Requirements. The goal/approach section lists the main goals of the CI security model (monitoring and simulation), as well as the main approach (a model following a risk-based approach). Additionally, the list item CI analysis was introduced to be able to classify more general articles related to the analysis of CIs and CI environments. The indicators section lists the indicators taken into account by the CI security model, which are confidentiality, integrity and availability. The requirements section lists the modelling requirements identified for the CI security model in Section 1.3.

Table 2. Classification of related work.

Column groups:
Goal/Approach: CI monitoring; CI simulation; CI model; Risk-based approach; CI analysis
Indicators: Confidentiality; Integrity; Availability
Requirements: R1 Complexity of CIs; R2 Diversity of CIs; R3 CI dependency/interdependency; R4 Information sharing; R5 Multiple information sources; R6 Inclusion of result uncertainty; R7 Prediction; R8 Assurance of result validity

Related work | Marks
Section 2.1 ([4–7]) | X
Section 2.2.1 ([8]) | X X
Section 2.2.2 ([9]) | X X
Section 2.3 ([10]) | X X X
Section 2.3 ([11]) | X X
Section 2.3.1 ([14, 15]) | X X X X X X
Section 2.3.1 ([16, 17, 19, 20]) | X X X X X
Section 2.3.1 ([21]) | X X X
Section 2.3.1 ([22]) | X X X
Section 2.3.2 ([23–25]) | X X X X
Section 2.3.3 ([26, 27]) | X X X X X X X
Section 2.3.3 ([28, 29]) | X X X
Section 2.3.4 ([30–33]) | X X X X X
Section 2.3.5 ([34]) | X X X X X
Section 2.3.6 ([35]) | X X X X
Section 2.3.7 ([36]) | X X X
Section 2.3.8 ([37]) | X X X
Section 2.3.9 ([38]) | X X X
Section 2.3.10 ([39]) | X X X
Section 2.3.11 ([40, 41]) | X X X
Section 2.3.12 ([42]) | X X X
Section 2.3.13 ([43]) | X X X
Section 2.3.14 ([44, 45]) | X X X X X
Section 2.3.15 ([46, 47]) | X X X X
Section 2.4 ([18, 48, 49]) | X
Section 2.4.2 ([50, 51]) | X X
Section 2.4.3 ([52, 53]) | X X X X X X
Section 2.4.4 ([54]) | X X
Section 2.4.5 ([55]) | X X
Section 2.4.6 ([56]) | X X X X X X
Section 2.4.7 ([57]) | X X X
Section 2.4.8 ([58]) | X X X
Section 2.4.9 ([59]) | X X X
Section 2.4.10 ([60]) | X X
Section 2.4.11 ([61]) | X X
Section 2.4.12 ([62]) | X X X
Section 2.4.13 ([64–66]) | X X
Section 2.5 ([67–70]) | X X X X

The analysis of related work shows that each work covers some of the goals and requirements of the CI security model, but no work covers all of them. In general, it is hard to compare CI models since they differ in many aspects. The main differences that could be identified are:

– Model input/output: CI models differ in what kind of input parameters are taken into account and what is presented to the user as a modelling result.

– Model purpose: The general purpose of CI models differs, ranging from detailed representations to capture physical behaviour to high-level representations to capture high-level relationships/dependencies.

– Abstraction level: The detail of CI models differs. Similar to the model purpose, some models capture detailed physical processes, while others only consider high-level (e.g. economic) parameters.

– Modelling approaches: CI models using various types of modelling techniques were presented.

Table 2 shows that most of the presented models are concerned with off-line modelling and simulation of CI environments and that on-line monitoring is not a main concern. It also shows that most of the presented CI models only take availability of CIs into account and provide solutions for some of the CI related modelling requirements like complexity, diversity or dependency/interdependency. The risk-based CI models, on the other hand, take other indicators like confidentiality and integrity into account, while not providing adequate solutions for CI related modelling requirements. Based on this analysis, a gap in the state-of-the-art could be identified: There is no risk-based CI model for on-line monitoring that takes other risks than availability into account and at the same time addresses CI specific modelling requirements. Furthermore, additional modelling requirements that can be addressed by the CI security model, like information sharing, the use of multiple information sources to enhance modelling results, the inclusion of result uncertainty, prediction and the assurance of result validity, are not a major concern for the research presented in related work. The remainder of this Section is used to discuss related work sharing important goals or requirements with the CI security model in more detail. In general, no fundamental concepts presented in related work are adopted as part of the framework presented in this thesis. The MICIE project might be an exception to some degree, since the author's general understanding of CI environments and the general modelling challenges originates in the MICIE project and the framework presented in this thesis is based on this understanding.

The interdependency modelling framework presented in Section 2.2.2 ([9]) has the same objective of capturing dependencies among CI sectors as the CI security model. While it provides a solid framework for capturing dependencies based on network structure and a dynamic dependency model based on market interactions, it does not consider the interactions among infrastructure components and does not observe the system on the infrastructure level, limiting the model to supply/demand driven interactions. Furthermore, while the geography-driven network model might provide a common modelling ground for many CI sectors (e.g. energy, transport, telecommunication), since they serve the same geographic regions, other CI sectors like the banking sector or the air traffic sector might be harder to model using this approach.

The IRRIIS project presented in Section 2.3.1 ([14, 15]) shares the same idea of abstraction to be able to capture the complex relations among CIs and CI sectors. However, within the IRRIIS project, detailed sector specific models are used as a basis for an abstract cross-sector dependency model, which implies the need for handling diverse sector-specific data as an input to the cross-sector model. In the CI security model, the need for sector specific models is avoided by using CI services and risk (CIA) as common, sector independent information. While the IRRIIS approach allows for more detailed sector-specific models, the advantage of the CI security model is the use of one modelling approach for all CI sectors, based on cross-sector CI information (CI services and CI service risk).

The graph-based interdependency modelling approach presented in Section 2.3.4 ([30–33]) shares a core idea of the CI security modelling approach since it recognizes the advantages of using graphs to capture interactions in complex systems. However, the approaches differ fundamentally, since this model captures supply/demand driven interactions among CIs or CI components (graph nodes are producers/consumers and edges are resource dependencies). The main model purpose is availability and cascading failure analysis. The CI security modelling approach models CI services as nodes and CIA dependencies as edges and presents a more generalized view on CI service risk, taking into account more aspects than availability.

The research presented by the MICIE project introduced in Section 2.3.3 ([26, 27]), to the best of my knowledge, represents the closest relation to the research of this PhD thesis, since the main goals are real-time monitoring of CIs while also taking dependencies into account. One reason for those similarities is that the work on this PhD thesis originated in the MICIE project and parts of this work are a contribution to the MICIE project. Discussions among the project members helped to understand the general research problems in the field of CI security and helped to lay the foundations of this PhD thesis. Apart from the general goals, there are fundamental differences in the approaches. While the MICIE project tries to support sector specific CI models by enriching them with information about dependencies, this work tries to build a uniform model based on a set of common information. The main motivation behind this decision was to address the heterogeneity of CIs. Although the MICIE project uses an ontology to address this problem, to the best of my knowledge, it was not convincingly shown that, using this ontology, data can be represented in a uniform way, given the diversity of CI sectors. Using CI service risk as a common abstraction parameter simplifies information sharing by reducing it to a limited set of parameters. While the MICIE approach allows more flexibility since each CI operator can build a model that best represents their infrastructure, both in modelling detail as well as model output, using risk as model output can capture CI service states on a higher abstraction level and simplifies the inclusion of dependencies within the CI model, since the shared information is the same and no sector specific modelling is required.

The Bayesian network based interdependency analysis method introduced in Section 2.3.3 ([28, 29]) in the context of the MICIE project presents a Bayesian network based CI model. Although it shares some of the same basic ideas as the CI security model (probably because of the same roots in the MICIE project), there are fundamental differences in the model input/output and purpose. The model takes atomic events (such as weather or earthquake) as input, whereas the CI security model uses general sensor data defining the state of a service as input. Using sensor data makes it possible to capture all possible service state changes; for example, atomic events like weather or earthquake should result in a change of the sensor data if the service is affected. It also circumvents the difficulty of modelling the actual cause of atomic events. Another difference with regard to the CI security model is that the model output is binary, either indicating the service is working or it failed, whereas the CI security model is not only concerned with service availability, but also gives an operator the ability to monitor other relevant indicators like confidentiality and integrity and allows a more fine-grained analysis of quality-of-service by introducing intermediate steps of degraded service. Finally, the purpose of this model, which is mainly used for failure analysis, differs from the CI security model, which is concerned with on-line monitoring.

Research concerned with risk in CIs mainly considers risk from the viewpoint of risk assessment and mitigation in the context of availability and economic loss (structural analysis and the analysis of threats/vulnerabilities, as well as finding strategies to reduce risks) without taking on-line risk monitoring into account. Risk in this PhD thesis is mainly seen from the viewpoint of on-line risk experienced during CI operation; risk assessment and mitigation is part of the proposed CI dependency analysis methodology, but is not the main focus of this work. However, some research related to CI risk during operation was proposed:

The approach presented in Section 2.4.4 ([54]) uses sensors to detect if a component is under attack or compromised and uses a probabilistic model to draw conclusions about the attack state of the whole system. The CI security model presents a more general risk analysis since different types of risk (CIA) are considered and other events than deliberate attacks are taken into account.

The objectives of the SERSCIS project presented in Section 2.4.13 ([64–66]) are related to some of the ideas of the CI security model, since the project is concerned with real-time risk monitoring (where risk is seen as the risk of a threat being carried out). To the best of my knowledge, the SERSCIS approach is concerned with risk assessment in secluded CIs by evaluating the complex processes and dependencies within a CI, without taking dependencies to other, external CIs into account. In SERSCIS, CIs can be composed of multiple stakeholders; for example, in the air traffic sector, multiple external companies are used to manage business at airports. However, those companies are closely coupled with the airport and are not external to the air traffic sector. Modelling the dependencies between CIs and different CI sectors, having in mind the associated confidentiality and information sharing concerns, is one of the main goals of the CI security model and sets the two approaches apart. Furthermore, the SERSCIS approach deals with reasoning about whether an identified threat is currently being carried out, whereas the CI security model monitors the risk based on system measurements, independent of concrete threats.

The work presented in Section 2.4.12 ([62]) presents an interesting approach that shares a key idea of the CI security modelling approach: The authors of this work argue that observing system behaviour using sensor data helps to detect complex attacks or threats carried out against CIs. In the CI security model, a similar approach is taken to observe the security state of CI services. However, in their work, dependencies to other CIs are not taken into account. Furthermore, no discussion is presented on whether their approach scales to large-scale CIs on a national level, since it does not appear that their approach uses decomposition (or a similar method) to manage the complexity of large-scale CIs.

The PROTOS-MATINE methodology introduced in Section 2.5 ([67–70]) has proven to be useful in protocol dependency analysis, protocol vulnerability analysis and security audits in complex systems. In this PhD thesis, the PROTOS-MATINE methodology is enriched with new aspects and it is shown that it is useful in CI decomposition and dependency analysis in the context of the CI security model.

3 Contribution

In this Chapter, the contributions of this PhD thesis, as presented in the original articles this thesis is based on, are introduced. Furthermore, the main contribution of each original article is reviewed and the most important aspects of each article are highlighted. Where applicable, the contribution is compared to related work. Additional, unpublished work related to the validation of the CI security model is presented in Section 3.4.1.

3.1 Methodology introduction and dependency analysis

The work presented in [I] represents a summary of the most important results from two publications introducing the CI security model ([1, 2]) and one publication introducing RESCI-MONITOR, a tool that was implemented to show how the CI security model could be deployed in practice ([3]). Those three publications are not part of this PhD thesis, but important ideas of those publications are re-used in [I] in a more compact and structured way. The initial ideas of the CI security model, which are the service-oriented, risk-based modelling of CIs, including the risk of dependencies, forming the general structure of the CI security model (CI services, dependencies and base measurements), are presented and are the core contribution of this work. Furthermore, while not providing comprehensive solutions, the need for the validation of estimated risk (assurance) and for the partitioning of CIs (decomposition) is highlighted.

RESCI-MONITOR is an implementation of the CI security model allowing on-line monitoring of CIs, and its architecture allows flexible deployment in distributed and multi-stakeholder environments. The tool was submitted to initial proof-of-concept validation in the context of the MICIE project. However, since deployment of the CI security model in actual multi-stakeholder CI environments is unrealistic at this point (despite some effort, CI operators are reluctant to provide their infrastructure for research), this tool does not play any further role in this PhD thesis.

The service-oriented, risk-based approach of the CI security model provides solutions to the following research requirements listed in Section 1.3: The complexity of CIs (R1), the diversity of CIs (R2), CI dependency (R3) and the facilitation of information sharing among CIs (R4). The research presented in this work shares core ideas of the MICIE project presented in Section 2.3.3, like the need for on-line monitoring of CIs and the need to find solutions for CI specific modelling problems like the dependency/interdependency of CIs and information sharing among CIs. The MICIE project as well as the CI security model utilize a service-oriented approach to provide a common abstraction layer for CIs, since all CIs are service providers. A core difference between the approach taken by the CI security model and the MICIE project is the information exchanged among services. In the MICIE project, service specific information determining the service state for on-line monitoring is exchanged among dependent services. In the CI security model, the CI service state is represented by CI service risk and only this abstract information is exchanged among dependent services. This approach has two advantages compared to the approach taken by the MICIE project: First, information sharing among CIs is facilitated since no detailed, possibly confidential, service specific information is exchanged and second, the diversity of information from different CI sectors is addressed since risk is a common parameter that each CI sector is concerned with. Additionally, the MICIE project is mainly focused on the availability of CIs, while the CI security model also allows other parameters like confidentiality and integrity to be monitored.

The work presented in [II] shows how the PROTOS-MATINE method for dependency analysis can be utilized and adapted to the CI security model to analyse CIs and identify the modelling entities of the CI security model (CI services, base measurements and CI service dependencies). This method was specifically introduced to address the complexity of CIs and allows the decomposition of complex infrastructures into sub-items, represented as CI services. Lower-level CI services are seen as dependencies of the higher-level CI service. Furthermore, the method allows the identification of dependencies with other internal or external CI services as well as of base measurements to observe CI service states.

The PROTOS-MATINE method for dependency analysis, as introduced in Section 2.5, with its focus on providing a holistic view on the dependencies in different contexts (protocols, software, systems), based on the utilization of all available information sources on all organizational levels, has proven to be well suited for establishing a CI security model. The main contributing factors are the flexibility of PROTOS-MATINE, which allows dependencies among any kind of entity to be evaluated, and the utilization of any kind of information during the analysis process, which is in line with the diversity of CIs and the multitude of information available (written and human sources) within CI environments. One aspect that was neglected in [II] is the fact that the graphical representation of the analysis results is crucial for the success of the analysis, since discussions with experts based on a visual representation of initial results help to improve the model substantially. The need for graphical representation during the analysis process is highlighted in [V], where a validation of the approach is presented in the context of a case study.

The research on dependency analysis that leads to the adaptation and extension of the PROTOS-MATINE methodology was initiated to be able to apply the CI security model in practice. To be able to represent a complex infrastructure like a CI as CI services, dependencies among CI services and base measurements, a structured approach is required that respects the modelling requirements in the context of CIs. Specifically, the utilization of the PROTOS-MATINE dependency analysis methodology supports the CI security model to fulfil the following research requirements listed in Section 1.3: The complexity of CIs (R1), the diversity of CIs (R2) and the identification of CI dependencies and interdependencies (R3). Additionally, the PROTOS-MATINE dependency analysis methodology provides a solution for the requirement of using multiple information sources (R5) to receive a holistic and complete CI security model.

3.2 Methodology refinement

The work presented in [III] introduces a BN based approach to estimate risk in the context of the CI security model. It shows how CI service risk can be estimated from base measurement states and the CI service risk of dependencies using BNs, either by learning the risk probabilities from recorded past events or by using expert estimation of risk probabilities where learning is not possible. Furthermore, the work introduces the prediction of CI service risk in the short-term, mid-term and long-term future using DBNs and it introduces a method for handling the special case of interdependencies (directed cycles in the graph model), which are difficult to model using BNs. In the context of this work, interdependencies can be modelled using DBNs.

The work presented in [IV] is based on the publication of the Bayesian network based CI security model presented in [III] and complements the research by introducing a tool that fully implements the ideas of [III]. The tool allows the user to input a CI security model consisting of CI services, base measurements and dependencies and to pre-process the associated data used to learn conditional probabilities (base measurement normalization and estimation of CI service risk during incidents). Furthermore, the tool allows the input of additional information needed to handle interdependencies and CI service risk prediction. The tool supports automatic learning of CPTs from data and it supports the manual input of conditional probabilities where automatic learning is not possible. To be able to validate a CI security model, the tool supports the simulation or emulation of incidents and visualizes the associated CI service risk, as well as its effects on the CI service risk of CI services that depend on it. A notable feature of the tool is the graphical representation of all steps, which should simplify the process of building a CI security model and facilitate expert input.

Using BNs to estimate risk probabilities from data records is, to the best of the author’s knowledge, a novel approach for risk estimation in CIs. In the context of the CI security model, this approach provides a simple way of estimating risk based on previously recorded incidents. At the same time, it allows risk to be estimated manually for incidents that did not occur before and it allows advanced features like handling of interdependencies and risk prediction. The following research requirements listed in Section 1.3 can be satisfied by using the BN based risk estimation: The inclusion of result uncertainty (R6), since the difference of the risk estimate from 100% certainty can be used as a measure of uncertainty. The utilization of multiple information sources (R5), since risk estimates that cannot be learned from data records can be supplemented by expert estimation, and the inclusion of risk prediction (R7). Risk prediction is also a research focus of the MICIE project presented in Section 2.3.3. In the MICIE project, the prediction of a future state is achieved by simulation of scenarios after an incident, using specific CI simulators. This approach results in exact predictions if simulators are available that can adequately capture the complex behaviour and interactions of CIs over time. The prediction of future CI states in the CI security model on the other hand is based on experience, by analysing past events in data records. The advantage of this approach is that no CI specific simulators are required, which substantially simplifies the prediction. A drawback of this approach is that, if no data records are available, future risk probabilities cannot be adequately learned. The CI security model addresses this issue by allowing expert estimation where automatic learning of risk probabilities is not possible.

3.3 Assurance indicators

The work presented in [VI] evaluates indicators able to determine the confidence in on-line CI service risk levels calculated by a CI service within the CI security model. Three different indicators were identified:

– Base measurement assurance, which defines the static accuracy class of each base measurement a CI service depends on.

– Risk alert trust, which evaluates the trust in the correctness of a calculated risk level based on an evaluation of the difference between a calculated CI service risk level and the actually experienced service level as a measure of quality-of-service.

– Behaviour trust, which evaluates the trust in the behaviour of an entity (for example a base measurement) for compliance to expected behaviour (for example unrealistic sensor readings that would suggest a sensor fault rather than an incident).

The research presented in this work is based on the CI security model presented in [I] without taking the Bayesian network based CI security model presented in [III] into account. All three presented indicators are also valid within the Bayesian network based CI security model, but the Bayesian network based approach would additionally allow the inclusion of a CI service assurance indicator that utilizes the uncertainty reflected by Bayesian probabilities, by evaluating the distance of a risk probability to 100% certainty. The work on such an assurance indicator is a possibility for future improvement.
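The idea of an assurance indicator based on the distance of a risk probability to 100% certainty can be illustrated with a minimal sketch. The function name `certainty_distance` and the example distribution are hypothetical and are not part of the CI security model or of the tool presented in [IV]; taking the most probable risk level as the risk estimate is also an assumption made only for this illustration.

```python
def certainty_distance(risk_distribution):
    """Return the most probable risk level and the distance of its
    probability to 1.0 (100% certainty).

    risk_distribution maps risk levels (1-5) to probabilities.
    Assumption: the risk estimate is the most probable level.
    """
    total = sum(risk_distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    level = max(risk_distribution, key=risk_distribution.get)
    return level, 1.0 - risk_distribution[level]


# Hypothetical distribution over the five CI service risk levels.
example = {1: 0.05, 2: 0.10, 3: 0.70, 4: 0.10, 5: 0.05}
level, uncertainty = certainty_distance(example)
```

A value of `uncertainty` close to 0 would indicate a risk estimate with high certainty, while larger values would indicate that the probability mass is spread over several risk levels.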

The work presented in [VII] presents indicators similar to the risk alert trust and behaviour trust presented in [VI] to evaluate the trust in the correctness of on-line CI service risk levels received from services a CI service depends on. The risk alert trust of a dependency is evaluated based on the comparison of the received risk level with the experienced quality-of-service. The behaviour trust is evaluated based on the general expected behaviour of the received risk value (for example, behaviour trust is influenced if a dependency is expected to update the CI service risk level on a regular basis, but fails to do so for a certain amount of time).

The research presented in this work is based on the CI security model presented in [I] without taking the Bayesian network based CI security model presented in [III] into account. One aspect of this research is not compatible with the Bayesian network based CI security model: In the original CI security model, a weight describing the importance of a dependency to the risk of a CI service needs to be defined to be able to calculate CI service risk using the weighted sum method. In this work, the risk alert trust and behaviour trust in the dependency risk estimates dynamically influence the initial weight and thus the importance of the dependency based on the current trust level. The Bayesian network based CI security model does not require the definition of weights since it estimates CI service risk in a probabilistic way based on the state of parent nodes. However, the research presented in this work could still be used in the Bayesian network based CI security model by including it as a CI service risk assurance indicator as presented in [VI]. The work on such an assurance indicator is a possibility for future improvement.
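The interplay between dependency weights and trust described above can be sketched as follows. This is only an illustration under stated assumptions: the exact formulas of [I] and [VII] are not reproduced here, and the function `trust_weighted_risk`, the multiplicative combination of weight and trust, and the normalization by the sum of effective weights are all hypothetical choices.

```python
def trust_weighted_risk(dependencies):
    """Weighted mean of dependency risk levels with trust-scaled weights.

    dependencies: list of (risk_level, initial_weight, trust) tuples,
    where trust is assumed to lie in [0, 1].
    Assumption: trust scales the initial weight multiplicatively.
    """
    effective = [(risk, weight * trust) for risk, weight, trust in dependencies]
    total_weight = sum(w for _, w in effective)
    if total_weight == 0:
        raise ValueError("no trusted dependency available")
    return sum(risk * w for risk, w in effective) / total_weight


# Two dependencies with equal initial weight; the first one is only
# half trusted, so it contributes less to the resulting risk value.
risk = trust_weighted_risk([(5, 1.0, 0.5), (1, 1.0, 1.0)])
```

In this sketch, a dependency whose trust drops to 0 no longer influences the result, which mirrors the idea that the importance of a dependency follows its current trust level.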

The research presented in this Section addresses the following research requirement listed in Section 1.3: The assurance of result validity (R8), by re-evaluating the accuracy of risk estimates provided by the CI security model. The analysis of related work presented in Table 2 of Section 2.6 has shown that the assurance of result validity was not a major research focus in CIP research and therefore this approach can be seen as a novel contribution to CIP.

3.4 Validation

To validate the proposed methodology, a case study based validation was conducted, which includes the validation of the methodology and the dependency analysis method presented in Section 3.1 ([I], [II]), the Bayesian network based risk estimation and the tool presented in Section 3.2 ([III], [IV]) and the assurance indicators presented in Section 3.3 ([VI], [VII]).

The work presented in [V] introduces a proof-of-concept validation of the dependency analysis method presented in [II] in the context of a case study within the Grid’5000 project. The case study details the CI security model structure for the availability risk indicator. Grid’5000 operates a distributed academic computing grid with locations mainly in France and Luxembourg. The network interconnection among the sites is realized by a dedicated fibre optic network backbone operated by external providers in France and Luxembourg, which allows the case study to be conducted on two independently operated infrastructures. The results of the case study have shown that a CI security model can be realized by analysing real-world CIs using the method presented in [II], with its focus on utilizing all available information sources (written and human) to get a holistic view of CIs. It was shown that the decomposition of CIs as well as the identification of the CI security modelling entities (CI services, base measurements and dependencies among the entities) is possible at all decomposition levels. It was shown that the graphical representation of analysis results helps to improve modelling results by facilitating discussions to find mistakes in model representation. It was experienced that, aside from technical manuals, CI experts are a valuable source for identifying base measurement normalization bounds. The results of this case study suggest the applicability of the dependency analysis method to CI environments. Based on the results of the case study presented in [V], Section 3.4.1 details the validation of the BN based risk estimation.

The assurance indicators presented in Section 3.3 were validated to evaluate their applicability in the context of the CI security model. In [VI], the assurance indicators were validated based on the simulation of a constructed example. The results suggest the validity of the approach and the applicability in the context of the CI security model. In [VII], the trust-based indicators were validated based on a realistic scenario, using real-world data sets, in the context of Grid’5000. The results of this case study based validation give even more confidence in the validity and applicability of the presented assurance indicators.

3.4.1 Grid’5000 case study implementation and experimentation

In this Section, the Grid’5000 case study presented in [V] is implemented using the tool presented in [IV]. The presented research is based on unpublished work and is presented as an addition to the original articles to complete the work on this PhD thesis with the validation of the Bayesian network based CI security model presented in [III]². As in the case study presented in [V], only the availability risk indicator is taken into account.

² Parts of the content of this Section were published in [IV] to complement the work with a practical example.

The work presented in this Section introduces some simplifications to the Grid’5000 CI security model. The main purpose of this validation is to show that all information needed to provide the Bayesian network based CI security model can be provided by CI experts. The following information needs to be provided by CI experts: the pre-processing of data records by introducing base measurement normalization bounds, the CI service risk estimation during recorded incidents and the CI service risk probability estimation where automatic learning is not possible. In general, to reduce the load for CI experts, this case study only considers a maximum of three parent nodes (dependencies) of a CI service. With five states per parent node, three parent nodes equal 5^3 = 125 dependency state combinations in the CPT of the CI service, which is considered manageable for a CI expert when reviewing automatically learned probabilities as well as providing probability estimates where learning is not possible. The exponential growth of state combinations with increasing parent nodes (for example, five parent nodes equal 5^5 = 3125 dependency state combinations) limits the number of parent nodes to three in this proof-of-concept validation. However, it should be noted that in a real-world set-up where CI operators are dedicated to provide a CI security model, CI services with more than three dependencies are considered realistic. Bayesian networks also allow the introduction of intermediate nodes to distribute the parents of a node to more than one (intermediate) node, which reduces the dependencies of a single node and thus counteracts the exponential growth of node state combinations.
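The growth of dependency state combinations and the effect of intermediate nodes can be made concrete with a small calculation. The following is a minimal sketch, assuming that every node (including intermediate nodes) has five states and that parents are grouped greedily into intermediate nodes with at most three parents each. The grouping strategy and the function names are hypothetical; note also that introducing intermediate nodes changes the model, since it assumes that the joint influence of the parents can be factorized in this way.

```python
NUM_STATES = 5  # five normalized states per node, as in this case study


def cpt_parent_combinations(num_parents, num_states=NUM_STATES):
    """Number of parent state combinations in the CPT of a single node."""
    return num_states ** num_parents


def layered_combinations(num_parents, max_parents=3, num_states=NUM_STATES):
    """Total parent state combinations over all CPTs when parents are
    grouped under intermediate nodes with at most max_parents parents.

    Assumption: intermediate nodes also have num_states states.
    """
    if max_parents < 2:
        raise ValueError("max_parents must be at least 2")
    total = 0
    nodes = num_parents
    while nodes > max_parents:
        full, rest = divmod(nodes, max_parents)
        # One intermediate node (and CPT) per full group of parents.
        total += full * num_states ** max_parents
        # A remainder of one parent is passed on without an intermediate node.
        if rest > 1:
            total += num_states ** rest
        nodes = full + (1 if rest else 0)
    # CPT of the CI service node itself.
    return total + num_states ** nodes


direct = cpt_parent_combinations(5)   # all five parents on one node
layered = layered_combinations(5)     # groups of three and two parents
```

For five parents, the single CPT with 3125 combinations is replaced in this sketch by three smaller CPTs (125, 25 and 25 combinations), at the cost of two additional intermediate nodes that have to be given a meaning by the CI expert.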

Some aspects of the Bayesian network based CI security model cannot be validated by the Grid’5000 case study. First, during the analysis of Grid’5000, no interdependency (a directed cycle in the CI security model graph) was detected. It is assumed that it is unrealistic to detect an interdependency within a single CI (or in this case, two closely coupled CIs), since a dependency cycle represents an unstable element in the system. Therefore, dependency cycles are avoided in system design. Interdependencies are expected to be present in cross-sector dependency models where a dependency loop spans multiple sectors and causes service failures in unexpected ways. Second, risk prediction is omitted in the Grid’5000 case study since an incident that causes a CI service risk usually does not result in a long lasting risk. In the computing and telecommunication sector, CI service risk caused by a component failure (hardware or software) is usually corrected with only short delays, which makes the estimation of risk in the short-term, mid-term and long-term obsolete.

The general simplifications and limitations lead to the following concrete simplifications in the Grid’5000 CI security model:

– Only the site Luxembourg and the network segment Luxembourg-Nancy are considered since they cover the main structure of the Grid’5000 CI security model, which is composed of sites and network segments.

– The connection point Esch-sur-Alzette is omitted, since the performance measures used as base measurements are also available directly for the network segment Luxembourg-Nancy, measured by the equipment in the connection points Nancy and Luxembourg. The connection is treated as if the connection point Esch-sur-Alzette did not exist and any failure in the equipment of this connection point will be reflected in the performance measures of the other connection points.

– The Renater tickets are not considered as base measurements since they occur too infrequently.

– The site users and the site staff are not considered since they do not present a major aspect contributing to the CI service risk of site Luxembourg. In this way, any concerns about privacy related to the publication of employee and user data are avoided.

– Diesel generator fuel level and planned interruption of electricity supply base measurements are not considered since they rarely occur.

– The remaining base measurements for determining the environmental factors in the server room are directly connected to the server room CI service, which greatly simplifies the model.

– The interconnect and fast interconnect CI services of the cluster are not considered, since they are only defined by the status of Nagios plug-ins. The usefulness of Nagios plug-in status base measurements is already validated by the service nodes of the cluster and the removal of the interconnect and fast interconnect CI services reduces the dependencies of the cluster, which greatly simplifies the model.

– The Nagios plug-in status base measurements for the computing nodes are not considered, since the usefulness of Nagios plug-in status base measurements is already validated by the service nodes of the cluster.

– Each tool that observes multiple entities (for example multiple computing nodes or service nodes) is treated as one base measurement (for example, OAR status or Kadeploy phoenix status). This means that not every computing node or service node is monitored separately, but each entity monitored by a tool (computing node or service node) reporting a problem is reflected by the same base measurement. This reduces the granularity and precision of risk estimates, but reduces the complexity of this case study substantially since the dependencies of the concerned CI services are reduced considerably.

The resulting CI security model for the availability risk indicator of site Luxembourg can be seen in Figure 1. The nodes labelled “SERVICES” are CI services, the nodes labelled “BASE MEASUREMENT” are base measurements. The arrows between the nodes are dependencies.

Discussion of base measurement data sets

In this Section, the base measurement data records used to automatically learn the CI service risk probabilities are discussed. First it should be noted that the base measurement data records for the server room base measurements (Struxureware cooling system status, Struxureware electricity supply status and smoke detector status) are not available in this case study since for now those measurements are only observable, but no data records are kept. Therefore, the server room risk probabilities will be estimated solely by expert estimation.


Fig 1. Grid’5000 case study graph.

Unfortunately, most of the presented data sets are not optimal for automatic learning either, since incidents and failures rarely occurred and therefore states that would result in a high CI service risk are not available. A good example for this behaviour are the packet loss network performance measures measured at the connection point Luxembourg (Figure 2) and at the connection point Nancy (Figure 4). Although data records are available for a long period of time (nine months) with a respectable amount of data (5604 data points), the packet loss in both cases rarely exceeds 0%, with a maximum of 20% (normalization level 2 according to the base measurement normalization presented in [V]). A similar behaviour can be observed for the link status base measurements at the connection point Luxembourg (Figure 3) and the connection point Nancy (Figure 5), which are available for 22 months (connection point Luxembourg) and 48 months (connection point Nancy) and include 7431 data points each, but rarely indicate a link failure (although the link status data records are missing for a substantial amount of time). An interesting additional aspect of the network performance measurement data sets is that they are stored in a round-robin database format, which means that recent data points are stored at a higher frequency than older data points. This also explains why a similar amount of data points in the link status data set is available at the connection point Luxembourg, although the link status data at the connection point Nancy is available for a much longer time period. Due to the structure of the available network performance data sets, automatic risk probability learning will only be possible for low risk CI service states. The remaining risk probabilities will have to be determined by expert estimation.

[Figure: plot “Packet loss for link segment Luxembourg-Nancy”; x-axis: Date (01/10/11 to 01/07/12); y-axis: Packet loss (%), 0 to 20.]

Fig 2. Packet loss of network segment Luxembourg-Nancy.

[Figure: plot “Link status for link segment Luxembourg-Nancy”; x-axis: Date (01/09/10 to 01/07/12); y-axis: Link status (on/off), 0 to 2.]

Fig 3. Link status of network segment Luxembourg-Nancy.

[Figure: plot “Packet loss for link segment Nancy-Luxembourg”; x-axis: Date (01/10/11 to 01/07/12); y-axis: Packet loss (%), 0 to 20.]

Fig 4. Packet loss of network segment Nancy-Luxembourg.

[Figure: plot “Link status for link segment Nancy-Luxembourg”; x-axis: Date (01/07/08 to 01/07/12); y-axis: Link status (on/off), 0 to 2.]

Fig 5. Link status of network segment Nancy-Luxembourg.

The Nagios status base measurement data set is presented in Figure 6. The status information was parsed from log files and the textual state explanation (OK, WARNING, UNKNOWN, CRITICAL) was mapped to numeric states (1, 2, 3 and 5) to be able to visualize the data set. The data set contains 4356 data points collected over the time span of one month. It can be observed that the normalization states 1 and 2 occur substantially more frequently than the states 3 and 5, which suggests that automatic learning for the high CI service risk state probabilities will be less accurate and will have to be supplemented by expert estimation.

[Figure: plot “Nagios status”; x-axis: Date (21/04/12 to 26/05/12); y-axis: Status (1-3, 5).]

Fig 6. Nagios status for site Luxembourg.
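The mapping of textual Nagios states to numeric normalization states described above can be sketched as a simple lookup. Only the mapping itself (OK, WARNING, UNKNOWN, CRITICAL to 1, 2, 3 and 5) is taken from the text; the function names and the example state sequence are hypothetical, and the parsing of the actual log file format is deliberately left out.

```python
from collections import Counter

# Mapping as described in the text; state 4 is not used for Nagios.
NAGIOS_STATE_TO_LEVEL = {"OK": 1, "WARNING": 2, "UNKNOWN": 3, "CRITICAL": 5}


def normalize_nagios_states(states):
    """Map textual Nagios states to numeric normalization states."""
    return [NAGIOS_STATE_TO_LEVEL[s.strip().upper()] for s in states]


def state_histogram(levels):
    """Count how often each normalization state occurs in a data set."""
    return Counter(levels)


# Hypothetical sequence of states, as they might be extracted from a log.
levels = normalize_nagios_states(["OK", "OK", "warning", "CRITICAL", "OK"])
histogram = state_histogram(levels)
```

A histogram of this kind makes the imbalance between low and high normalization states visible before any probabilities are learned.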

The Puppet status base measurement is visualized in Figure 7. The status information was parsed from log files and the textual state explanation (unchanged, changed, unresponsive, pending and failed) was mapped to numeric states (1-5) for visualization. The data set contains 28987 data points collected over the time span of three months. The majority of data points are located in normalization states 1 and 2, with very few data points indicating failed deployments in normalization state 5. Again, those data points will not be sufficient to learn the risk probabilities for high CI service risk states.

[Figure: plot “Puppet status”; x-axis: Date (10/03/12 to 19/05/12); y-axis: Status (1-5).]

Fig 7. Puppet status for site Luxembourg.

The Kadeploy phoenix status presented in Figure 8 was extracted from log files and the textual state representation (not broken, broken for first time, try to kareboot, try to kadeploy, notify admins) was mapped to numeric states (1-5) for visualization. The textual state representation is slightly different from the base measurement normalization bounds presented in [V], since the log messages that are available for this case study differ from the actual Kadeploy phoenix state, but they have a similar meaning. The data set contains 1446 data points collected over the time period of nine months. The normalization states are more equally distributed in this data set than in the other presented data sets, but still a higher density in the normalization states 1 and 2 is experienced. An adequate learning of the high CI service risk state probabilities will still be difficult and it is expected that expert estimation is needed to supplement the missing or not adequately learned probabilities. In any case, the OAR status base measurement records could not be extracted from Grid’5000 at this point due to technical and organizational difficulties. The OAR status data records are used in combination with the Kadeploy phoenix status data records to learn the risk probabilities of the computing nodes CI service. Since the OAR status data records are not available, the Kadeploy phoenix status data records cannot be used to learn the computing nodes CI service risk probabilities, which therefore have to be supplemented by expert estimation.

[Figure: plot “Kadeploy Phoenix status”; x-axis: Date (01/09/11 to 01/06/12); y-axis: Status (1-5).]

Fig 8. Kadeploy phoenix status for site Luxembourg.

Data pre-processing

The remaining CI services depending on base measurements with available data sets for automatic probability learning are the Service Nodes, depending on the Puppet Status Service Nodes and the Nagios Status Service Nodes, as well as the Connection Point Luxembourg and the Connection Point Nancy which depend on the respective Link Status and Packet Loss performance measures.

The data pre-processing for the Service Nodes is visualized in Figure 9. In the Base measurement normalization tables section, the normalization bounds for the two base measurements are specified. The normalization bounds are set to utilize the already normalized data set values according to Figure 6 and Figure 7. Approximate missing values is set to use the last available value for both base measurements to account for measures that are taken at slightly different time points in the two data sets. The Data section contains a list of all available time stamps in Unix time format and the respective normalized base measurement value at that time point. The CI service risk is estimated by CI experts based on the estimation of the experienced CI service risk during a certain time period and added to the Service Risk column. The Data section in Figure 9 shows a portion of the data where the Service Nodes risk was estimated to be 5 due to an incident monitored by the Nagios status base measurement.

Fig 9. Data pre-processing for Service Nodes CI service.
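The approximation of missing values by the last available value, used to align two base measurement data sets recorded at slightly different time points, can be illustrated as follows. This is a minimal sketch and not the implementation of the tool presented in [IV]; the function names and the example time stamps and values are hypothetical.

```python
from bisect import bisect_right


def last_value_at(series, timestamp):
    """Return the latest value recorded at or before timestamp, or None.

    series: list of (unix_time, normalized_value), sorted by time.
    """
    times = [t for t, _ in series]
    idx = bisect_right(times, timestamp)
    return series[idx - 1][1] if idx else None


def align_last_available(series_a, series_b):
    """Build one row per time stamp occurring in either series, filling
    gaps in each series with its last available value."""
    timestamps = sorted({t for t, _ in series_a} | {t for t, _ in series_b})
    return [
        (t, last_value_at(series_a, t), last_value_at(series_b, t))
        for t in timestamps
    ]


# Hypothetical normalized records (Unix time, normalization state).
nagios = [(100, 1), (160, 5)]
puppet = [(110, 2), (170, 1)]
rows = align_last_available(nagios, puppet)
```

Each resulting row holds one dependency state combination, to which the CI expert would then add the estimated CI service risk for that time point.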

The data pre-processing for the connection points Luxembourg and Nancy is the same as for the service nodes with the difference that, due to the packet loss and link status data sets presented in Figures 2-5 which rarely indicate a fault, the CI expert estimated a CI service risk of 1 for the connection points Luxembourg and Nancy during the entire time period presented in the data records.

Risk probability learning and estimation

The risk probabilities for the Service Nodes CI service are visualized in Figure 10. The probabilities were automatically learned using the previously pre-processed data sets. The learning statistics of this learning process are presented in Table 3. Only some of the probabilities, mainly in dependency state combinations containing the normalization values of 1 and 2, but also dependency state combinations where the Nagios status is 5, could be correctly learned. The other state combinations, which rarely occur in the data sets, could not be learned at all or were wrongly classified. The normalization state 4 is not a valid state for the Nagios base measurement and therefore the risk probabilities for dependency state combinations containing a Nagios state 4 are set to 0 for all possible CI service risk states. The learned risk probabilities were reviewed by the CI expert and all risk probabilities that could not be learned or were wrongly classified were manually re-estimated.

Automatic risk probability learning for the Connection Point Luxembourg and Connection Point Nancy only resulted in useful learning results in situations where the dependency state combination indicates no CI service risk, which is the majority of data points in the data sets. Since the CI service risk for the whole time period was estimated to be 1, the rare occasions where either the packet loss or link status performance measures indicated a problem were wrongly classified to a CI service risk of 1 and were re-evaluated by expert estimation. The remaining CI service risk probabilities that could not be learned from data, as well as the CI service risk probabilities of all other CI services presented in this case study that could not be automatically learned due to missing data records, were estimated by the CI expert.

Fig 10. Risk probabilities for Service Nodes CI service.


Table 3. Risk probability learning statistics for Service Nodes CI service.

Dependency state    Risk state occurrence                Total number
combinations        1      2      3      4      5        of occurrences

1,1                 5894   965    50     7      85       7001
1,2                 1401   845    15     1      29       2291
1,3                 0      0      16     0      22       38
1,4                 0      0      0      1      0        1
1,5                 0      11     15     5      291      322
2,1                 4149   255    15     3      27       4449
2,2                 82     525    10     1      10       628
2,3                 0      0      43     0      0        43
2,4                 0      0      0      5      0        5
2,5                 0      0      1      0      136      137
3,1                 657    0      0      0      0        657
3,2                 0      44     0      0      0        44
3,3                 0      0      0      0      0        0
3,4                 0      0      0      0      0        0
3,5                 0      0      0      0      29       29
4,1                 962    0      0      0      0        962
4,2                 0      684    0      0      0        684
4,3                 0      0      22     0      0        22
4,4                 0      0      0      8      0        8
4,5                 0      0      0      0      134      134
5,1                 345    0      0      0      5        350
5,2                 0      152    0      0      4        156
5,3                 0      0      4      0      0        4
5,4                 0      0      0      2      0        2
5,5                 0      0      0      0      32       32
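To illustrate how occurrence counts such as those in Table 3 relate to risk probabilities, the following minimal sketch converts the counts of a dependency state combination into relative frequencies. Using plain relative frequencies is an assumption made for this illustration; the learning procedure of the tool presented in [IV] is not reproduced here. The three rows are copied from Table 3, and the function name is hypothetical.

```python
# Occurrence counts per risk state (1-5) for selected dependency state
# combinations, copied from Table 3.
TABLE3_COUNTS = {
    (1, 1): [5894, 965, 50, 7, 85],
    (1, 5): [0, 11, 15, 5, 291],
    (3, 3): [0, 0, 0, 0, 0],
}


def relative_frequencies(counts):
    """Return the relative frequency of each risk state, or None if the
    dependency state combination never occurred in the data records."""
    total = sum(counts)
    if total == 0:
        return None  # nothing to learn from; expert estimation needed
    return [c / total for c in counts]


learned = {combo: relative_frequencies(c) for combo, c in TABLE3_COUNTS.items()}
```

For the combination 1,1, risk state 1 accounts for 5894 of 7001 occurrences, whereas the combination 3,3 never occurs in the data records, so no probability can be derived for it from the data.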

Case study results and conclusion

As a result of the case study, it is possible to conduct failure simulations or emulations and observe their cascading effect throughout the system. For example, a simple failure emulation was conducted in the scenario presented in Figure 11. At the beginning of the emulation, the system is in an initial state which represents a risk of 1 for all CI services. After a few seconds, the OAR status base measurement is set to state 2, which represents a small risk to the Cluster as well as the Site Luxembourg. A few seconds later, the Packet loss for both the Connection Point Luxembourg and the Connection Point Nancy is raised to a high value, which represents a service risk of 4 for both connection points. Due to the highly degraded network link, the Site Luxembourg raises its estimated service risk to 3. Another few seconds later, the link status for both connection points indicates a complete link failure, which raises the CI service risk in both connection points to 5, but the CI service risk of the Site Luxembourg remains at 3. A link failure represents a medium risk to the Site Luxembourg, since it only loses its connection to the other sites of the grid, but the site remains functional. The last failure emulated in this scenario, which is highlighted in Figure 11, is a complete failure of the cooling system, which poses a high risk to the Server Room. With this additional failure, the CI service risk of the Site Luxembourg reaches a risk level of 5.

Fig 11. Emulation of a simple failure scenario.

More general results of the case study suggest that expert estimation of CI service risk during incidents in the data pre-processing phase, as well as risk probability estimation where automatic learning is not possible, are feasible. The CI expert in this case study was comfortable with providing the risk probability estimates, which suggests that a CI expert familiar with the CI systems has the knowledge to evaluate the most probable CI service risk, given a certain dependency state combination. The CI expert was less comfortable, on the other hand, with providing the CI service risk estimation during incidents. The expert regarded it as a tedious procedure to go through all data records and assign each of them a CI service risk. Most small failures that did not lead to major incidents are forgotten, and it is hard or impossible to classify them correctly a long time after the failure occurred. According to the CI expert, it would be much easier to assign a CI service risk to an incident directly after the incident, but this was not done in this case study. Although the CI expert experienced the CI service risk estimation as cumbersome, he attested the procedure to be generally feasible.

Some elements of the CI security model could not be validated during the Grid’5000 case study. Firstly, due to the structure of Grid’5000, which does not contain any interdependencies among components, the CI security modelling element that allows interdependencies to be handled could not be validated. Secondly, the case study did not allow validation of the risk prediction component of the CI security model (CI service risk estimation in the short-term, mid-term and long-term future after an incident), since Grid’5000 is mainly concerned with the current CI service risk and faults are corrected with short time delay. Thirdly, the automatic risk probability learning element of the CI security model could not be validated sufficiently. While the available data records allowed an initial validation, the base measurement data records available for this case study did not contain enough data about incidents to be able to learn the CI service risk probabilities for all possible dependency state combinations.

To conclude this Section, the Grid’5000 case study implementation has shown that the most essential inputs needed during the implementation phase, namely the expert estimation of CI service risk during incidents in the data pre-processing phase and the risk probability estimation where automatic learning is not possible, can be provided by CI experts. Other aspects of the CI security model, like automatic risk probability learning, risk prediction and handling of interdependencies, could not be sufficiently validated due to the nature of the Grid’5000 infrastructure, which represents a real-world infrastructure and provides realistic data sets. Elements that are not available in this real-world set-up cannot be validated in a case study concerned with realistic observations, but during this case study, no modelling difficulties could be identified that would suggest that those concepts would not be applicable to CI environments. It is assumed that a more complex case study involving multiple CI sectors would increase the chances of a more complete validation. This should include the identification of interdependencies among CI sectors, the need for risk prediction in CI environments and a more comprehensive validation of the risk probability learning component due to the availability of more complete base measurement data records. Although it is, of course, hard to draw a final verdict on the validity of the CI security model based on only one case study, the initial results obtained during this case study suggest the applicability of the CI security model to CI environments in the context of on-line risk monitoring.


4 Discussion and Conclusion

In this PhD thesis, the author has presented a cross-sector model for on-line CI risk monitoring, called the CI security model. The model is service based and takes the risk within the service, as well as the risk of dependencies, into account. The model tries to address modelling challenges in the CI domain, like the complexity of CIs, the diversity of CIs or the dependencies or interdependencies among CIs or CI sectors. Information sharing among CIs is facilitated by the abstract, risk based representation of CI service states, which allows the state of dependencies to be monitored without sharing internal, possibly confidential information.

CI security modelling relies on a dependency analysis methodology proposed in this work to analyse CIs and identify the modelling entities of the CI security model (CI services, base measurements to observe the CI service state and dependencies among CI services, as well as among CI services and base measurements). CI service risk within the CI security model is estimated from dependency states in a probabilistic way using a Bayesian network based approach which estimates the most probable CI service risk, given a dependency state combination. The Bayesian network based approach utilizes dynamic Bayesian networks to allow risk prediction by estimating the most probable risk in the short-term, mid-term and long-term future, given a dependency state combination. Interdependencies, which represent directed cycles in the CI security model graph, are difficult to model using Bayesian networks. In this work, a dynamic Bayesian network based approach was introduced to be able to model interdependencies in the context of the CI security model. A tool supporting all aspects of the implementation of a Bayesian network based CI security model was presented. Furthermore, to enable an on-line evaluation of the correctness of a risk estimate, a study of assurance indicators was conducted. These indicators validate the CI service risk estimates produced within a CI service as well as the CI risk estimates received from dependencies.
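The difficulty with directed cycles can be illustrated with a small sketch. It assumes, as one common way of using dynamic Bayesian networks for this purpose, that an edge which would close a cycle is delayed by one time slice, so that a service at slice t reads the state of the other service from slice t-1. This is not necessarily the exact construction used in the CI security model; the service names, the choice of the delayed edge and the helper functions are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def is_acyclic(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    predecessors = {}
    for parent, child in edges:
        predecessors.setdefault(parent, set())
        predecessors.setdefault(child, set()).add(parent)
    try:
        # static_order() raises CycleError once it is iterated over a cyclic graph.
        tuple(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


def unroll(edges, delayed, slices):
    """Unroll a dependency graph over a number of time slices.

    Edges listed in `delayed` connect slice t-1 to slice t; all other edges
    stay within their slice. Nodes of the result are (service, slice) pairs.
    """
    unrolled = []
    for t in range(slices):
        for parent, child in edges:
            if (parent, child) in delayed:
                if t > 0:
                    unrolled.append(((parent, t - 1), (child, t)))
            else:
                unrolled.append(((parent, t), (child, t)))
    return unrolled


# Hypothetical interdependency: two services that depend on each other.
EDGES = [("power", "telecom"), ("telecom", "power")]
DELAYED = {("telecom", "power")}

if __name__ == "__main__":
    print("original graph acyclic:", is_acyclic(EDGES))
    print("unrolled graph acyclic:", is_acyclic(unroll(EDGES, DELAYED, 3)))
```

The original two-node graph contains a cycle and therefore cannot serve directly as the structure of a Bayesian network, whereas the unrolled graph is a directed acyclic graph over (service, slice) nodes.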

The proposed model was validated based on a case study within the Grid’5000 project, a distributed academic computing grid. The case study presents an analysis of one computing site of Grid’5000 and a portion of the network backbone that interconnects the computing sites. The dependency of the computing sites on the network backbone represents a dependency between two independently operated complex infrastructures within the computing sector and the telecommunication sector. The case study presented a CI security model for an availability risk indicator and has shown that complex infrastructures can be decomposed and represented as a CI security model following the proposed dependency analysis method. It was shown that learning of risk probabilities from data records following the proposed Bayesian network based risk estimation is possible and that CI experts are able to provide the additional information (CI service risk estimation during incidents) needed to learn the probabilities. It was also shown that it is possible to use expert estimation to estimate CI service risk probabilities where automatic learning is not possible. Furthermore, the proposed assurance indicators were validated based on a simulation using data from Grid’5000 and it was shown that the estimation of trust in the validity of CI service risk estimates and in the general behaviour of CI service risk estimates is possible.

Some aspects of the CI security model could not be validated by the Grid’5000 case study. During the analysis of Grid’5000, no interdependency was identified. Interdependencies, which represent an unstable factor due to the loop-back effect, are more likely to be present in cross-sector CI models than in a single CI or two closely coupled CIs like the computing sites and network backbone of Grid’5000. The risk prediction component of the CI security model could not be validated since within Grid’5000 short-term, mid-term and long-term risks do not play a major role. Risks due to a failure in software or hardware components are corrected with short time delays, which makes risk prediction unnecessary. Furthermore, not all identified observable base measurements are recorded or, if they are recorded, they do not contain enough information about incidents to be able to learn the risk probabilities for all possible dependency states. Base measurements with more complete data records would have allowed a more thorough validation of the proposed risk learning component.

Future research in the context of the CI security model could focus on evaluating additional risk indicators. For now, CIA (confidentiality, integrity and availability) risk is considered, but CI operators might be interested in additional indicators. Another possibility for future work is the evaluation of additional assurance indicators, for example, based on the uncertainty of Bayesian risk estimates, as outlined in Section 3.3. One of the major goals for future work is the validation of the CI security model in the context of a more extensive case study involving multiple CI sectors. Establishing a cross-sector CI security model increases the chance of identifying interdependencies and it is assumed that commercial CI operators keep more comprehensive data records of incidents, which would allow a better validation of the risk learning component. Furthermore, it is assumed that risk prediction in those CIs is an important aspect and a validation of the proposed risk prediction component would be possible. In addition to the availability risk indicator, the explicit validation of confidentiality and integrity risk indicators is a goal for possible future case studies. Another goal for future work is a cross-sector deployment of the CI security model to assist CI operators in on-line risk monitoring. This goal seems unrealistic at this point due to the organizational structure of multi-stakeholder CI environments, which makes any kind of cooperation difficult, especially in the context of a research project. A step towards this goal would be to evaluate the general usefulness of the CI security modelling approach by presenting it to CI stakeholders and evaluating their interest, which is seen as an important step towards validating the CI security model, aside from validating the technical aspects presented above.

To conclude this PhD thesis, the author wishes to state that the CI security modelling framework has proven its validity and applicability in a first proof-of-concept validation and no technical obstacles that would prevent implementation or deployment were identified. Most aspects of the CI security model, ranging from dependency analysis to risk estimation and the evaluation of the assurance in the correctness of risk estimates, could be validated and the results suggest that those concepts are applicable to more complex environments. However, to be able to draw more informed conclusions, the CI security model should be validated by conducting a more comprehensive case study or a deployment to actual CI environments.


References

1. Aubert J, Schaberreiter T, Incoul C, Khadraoui D & Gateau B (2010) Risk-based methodologyfor real-time security monitoring of interdependent services in critical infrastructures. In:International Conference on Availability, Reliability, and Security (ARES’10), pp. 262–267.

2. Aubert J, Schaberreiter T, Incoul C & Khadraoui D (2010) Real-time security monitoring ofinterdependent services in critical infrastructures. case study of a risk-based approach. In:21th European Safety and Reliability Conference (ESREL 2010).

3. Schaberreiter T, Bonhomme C, Aubert J, Incoul C & Khadraoui D (2010) Support tooldevelopment for real-time risk prediction in interdependent critical infrastructures. In:Risk and Trust in Extended Enterprises (RTEE2010) Workshop. ISSRE Wksp 2010. IEEEInternational Symposium on Sofware Reliability Engineering.

4. Group NS (2004) Technical analysis of the august 14, 2003, blackout: What happened,why, and what did we learn? Technical report, North American Electric Reliability Council(NERC).

5. GmbH EN (2006) Bericht über den stand der untersuchungen zum hergang und ursachen derstörung des kontinentaleuropäischen stromnetzes am samstag, 4. november 2006 nach 22:10uhr. Technical report, E.ON Netz GmbH.

6. Langner R (2011) Stuxnet: Dissecting a cyberwarfare weapon. IEEE Security Privacy 9(3):49–51.

7. Brunner EM & Suter M (2009) International CIIP Handbook 2008/2009, volume 4. Centerfor Security Studies, ETH Zurich.

8. Rinaldi SM, Peerenboom JP & Kelly TK (2001) Identifying, understanding, and analyzingcritical infrastructure interdependencies. IEEE Control Systems Magazine 21: 11–25.

9. Zhang P & Peeta S (2011) A generalized modeling framework to analyze interdependenciesamong infrastructure systems. Transportation Research Part B: Methodological 45(3):553–579.

10. Rinaldi S (2004) Modeling and simulating critical infrastructures and their interdependencies.In: Proceedings of the 37th Annual Hawaii International Conference on System Sciences.

11. Svendsen NK & Wolthusen SD (2012) Critical infrastructure protection. chapter Modellingapproaches, pp. 68–97. Springer-Verlag, Berlin, Heidelberg.

12. IRRIIS. [Online; Accessed: 3/2013] http://www.irriis.org.13. MICIE. [Online; Accessed: 3/2013] http://www.micie.eu/.14. Klein R, Rome E, Beyel C, Linnemann R, Reinhardt W & Usov A (2009) Critical Information

Infrastructure Security, volume 5508/2009, chapter Information Modelling and Simulationin Large Interdependent Critical Infrastructures in IRRIIS, pp. 36–47. Springer Berlin /Heidelberg.

15. Bloomfield R, Popov P, Salako K, Wright D, Buzna L, Ciancamerla E, Blasi SD, MinichinoM & Rosato V (2008). Analysis of critical infrastructure dependence - an IRRIIS perspective.IRRIIS document.

16. Setola R, Porcellinis SD & Sforna M (2009) Critical infrastructure dependency assessmentusing the input-output inoperability model. International Journal of Critical InfrastructureProtection 2(4): 170–178.

99

http://www.irriis.org

http://www.micie.eu/

17. Gasparri A, Oliva G & Panzieri S (2009) On the distributed synchronization of on-line IIMinterdependency models. In: 7th IEEE International Conference on Industrial Informatics(INDIN 2009), pp. 795–800.

18. Haimes Y, Santos J, Crowther K, Henry M, Lian C & Yan Z (2008) IFIP International Federa-tion for Information Processing, volume 253/2007, chapter Risk Analysis in InterdependentInfrastructures, pp. 297–310. Springer Boston.

19. Issacharoff L, Bologna S, Rosato V, Dipoppa G, Setola R & Tronci E (2006). A dynamicalmodel for the study of complex system’s interdependence. In: Proc. of Int. Workshop onComplex Network and Infrastructure Protection (CNIP 06).

20. Rosato V, Issacharoff L, Tiriticco F, Meloni S, Porcellinis SD & Setola R (2008) Modellinginterdependent infrastructures using interacting dynamical models. International Journal ofCritical Infrastructures 4(1-2): 63–79.

21. Sanders WH & Meyer JF (2002) Stochastic activity networks: formal definitions and concepts.In: Lectures on formal methods and performance analysis: first EEF/Euro summer school ontrends in computer science, pp. 315–343. Springer-Verlag New York, Inc.

22. Simonsen I, Buzna L, Peters K, Bornholdt S & Helbing D (2008) Transient dynamicsincreasing network vulnerability to cascading failures. Physical Review Letters 100(21).

23. Veríssimo P, Neves NF, Correia M, Deswarte Y, El Kalam AA, Bondavalli A & Daidone A(2008) Architecting Dependable Systems V, volume 5135/2008 of Lecture Notes in ComputerScience, chapter The CRUTIAL Architecture for Critical Information Infrastructures, pp.1–27. Springer Berlin / Heidelberg.

24. Bessani A, Sousa P, Correia M, Neves NF & Verissimo P (2008) The CRUTIAL way ofcritical infrastructure protection. IEEE Security and Privacy 6(6): 44–51.

25. Verissimo P, Neves NF & Correia M (2008) The CRUTIAL reference critical information in-frastructure architecture: a blueprint. International Journal of System of Systems Engineering1(1-2): 78–95.

26. Simões P, Capodieci P, Minicino M, Ciancamerla E, Panzieri S, Castrucci M & Lev L (2010)An alerting system for interdependent critical infrastructures. In: Proceedings of the 9thEuropean Conference on Information Warfare and Security. Academic Conferences Limited.

27. Castrucci M, Neri A, Caldeira F, Aubert J, Khadraoui D, Aubigny M, Harpes C, Simones P,Suraci V & Capodieci P (2012) Design and implementation of a mediation system enablingsecure communication among critical infrastructures. International Journal of CriticalInfrastructure Protection 5(2): 86–97.

28. Di Giorgio A & Liberati F (2011) Interdependency modeling and analysis of criticalinfrastructures based on dynamic bayesian networks. In: 19th Mediterranean Conference onControl Automation (MED), pp. 791–797.

29. Di Giorgio A & Liberati F (2012) A bayesian network-based approach to the criticalinfrastructure interdependencies analysis. IEEE Systems Journal 6(3): 510–519.

30. Svendsen NK & Wolthusen SD (2007) Graph models of critical infrastructure interdepend-encies. In: Proceedings of the 1st international conference on Autonomous Infrastructure,Management and Security (AIMS’07), pp. 208–211. Springer-Verlag, Berlin, Heidelberg.

31. Svendsen N & Wolthusen S (2007) Critical Infrastructure Protection, volume 253/2007of IFIP International Federation for Information Processing, chapter 24 - MultigraphDependency Models for Heterogeneous Infrastructures. Springer Boston.

32. Svendsen NK & Wolthusen SD (2007) Connectivity models of interdependency in mixed-typecritical infrastructure networks. Inf. Secur. Tech. Rep. 12(1): 44–55.

100

33. Svendsen N & Wolthusen S (2007) Analysis and statistical properties of critical infrastructureinterdependency multiflow models. In: IEEE SMC Information Assurance and SecurityWorkshop (IAW’07), pp. 247–254.

34. Porcellinis S, Oliva G, Panzieri S & Setola R (2009) A holistic-reductionistic approach formodeling interdependencies. In: Critical Infrastructure Protection III, volume 311 of IFIPAdvances in Information and Communication Technology, pp. 215–227. Springer BerlinHeidelberg.

35. Sokolowski J, Turnitsa C & Diallo S (2008) A conceptual modeling method for criticalinfrastructure modeling. In: 41st Annual Simulation Symposium (ANSS 2008), pp. 203–211.

36. Gursesli O & Desrochers A (2003) Modeling infrastructure interdependencies using petrinets. In: IEEE International Conference on Systems, Man and Cybernetics, pp. 1506–1512.

37. Chakrabarty M & Mendonca D (2004) Integrating visual and mathematical models for themanagement of interdependent critical infrastructures. In: IEEE International Conference onSystems, Man and Cybernetics, pp. 1179–1184.

38. Permann M (2007) Toward developing genetic algorithms to aid in critical infrastructuremodeling. In: IEEE Conference on Technologies for Homeland Security, pp. 192–197.

39. Chou CC, Chen CT, Tseng SM & Lin JD (2008) A spatiotemporal model for persistingcritical infrastructure interdependencies. In: Proceedings of the 2008 Fifth InternationalConference on Fuzzy Systems and Knowledge Discovery (FSKD’08), pp. 489–493. IEEEComputer Society, Washington, DC, USA.

40. Laprie JC, Kanoun K & Kaaniche M (2007) Modelling Interdependencies Between theElectricity and Information Infrastructures, volume 4680/2007, chapter in Computer Safety,Reliability, and Security, pp. 54–67. Springer Berlin / Heidelberg.

41. Chiaradonna S, Lollini P & Di Giandomenico F (2007) On a modeling framework forthe analysis of interdependencies in electric power systems. In: 37th Annual IEEE/IFIPInternational Conference on Dependable Systems and Networks (DSN’07), pp. 185–195.

42. Casalicchio E, Galli E & Tucci S (2007) Federated agent-based modeling and simulationapproach to study interdependencies in it critical infrastructures. In: 11th IEEE InternationalSymposium on Distributed Simulation and Real-Time Applications (DS-RT 2007), pp.182–189.

43. Min HSJ, Beyeler W, Brown T, Son YJ & Jones AT (2007) Toward modeling and simulationof critical national infrastructure interdependencies. IIE Transactions 39(1): 57–71.

44. Panzieri S, Setola R & Ulivi G (2004) An agent based simulator for critical interdependentinfrastructures. In: 2nd International Conference on Critical Infrastructures (CRIS 2004).

45. Panzieri S, Setola R & Ulivi G (2005) An approach to model complex interdependentinfrastructures. In: 16th IFAC World Congress.

46. Tolone WJ, Wilson D, Raja A, Xiang WN, Hao H, Phelps S & Johnson EW (2004) Criticalinfrastructure integration modeling and simulation. In: Second Symposium on Intelligenceand Security Informatics (ISI-2004). Lecture Notes in Computer Science Nr.3073, pp.214–225.

47. Tolone WJ, Wilson D, Raja A, Xiang WN & Johnson EW (2004) Applying cougaar tointegrated critical infrastructure modeling and simulation. In: Open Cougaar Conference,New York City, pp. 3–10.

48. Kröger W (2008) Critical infrastructures at risk: A need for a new conceptual approach andextended analytical tools. Reliability Engineering and System Safety 93(12): 1781–1787.

101

49. Adar E & Wuchner A (2005) Risk management for critical infrastructure protection (CIP) chal-lenges, best practices tools. In: First IEEE International Workshop on Critical InfrastructureProtection.

50. Newman D, Nkei B, Carreras B, Dobson I, Lynch V & Gradney P (2005) Risk assessmentin complex interacting infrastructure systems. In: Proceedings of the 38th Annual HawaiiInternational Conference on System Sciences (HICSS’05).

51. Carreras BA, Newman DE, Gradney P, Lynch VE & Dobson I (2007) Interdependent risk ininteracting infrastructure systems. In: 40th Annual Hawaii International Conference onSystem Sciences, HICSS 2007.

52. Xiaolin C, Xiaobin T, Yong Z & Hongsheng X (2008) A markov game theory-based riskassessment model for network information system. In: International Conference on ComputerScience and Software Engineering, pp. 1057–1061.

53. Tan X, Zhang Y, Cui X & Xi H (2008) Using hidden markov models to evaluate the real-timerisks of network. In: IEEE International Symposium on Knowledge Acquisition and ModelingWorkshop (KAM 2008), pp. 490–493.

54. Haslum K & Arnes A (2006) Multisensor real-time risk assessment using continuous-timehidden markov models. In: International Conference on Computational Intelligence andSecurity, volume 2, pp. 1536–1540.

55. Hu ZH, Ding YS & Huang JW (2008) Knowledge based framework for real-time riskassessment of information security inspired by danger model. In: International Conferenceon Security Technology (SECTECH’08), pp. 91–94.

56. Baiardi F, Telmon C & Sgandurra D (2009) Hierarchical, Model-based Risk Management ofCritical Infrastructures. Reliability Engineering & System Safety 94(9): 1403–1415.

57. Haimes YY, Kaplan S & Lambert JH (2002) Risk filtering, ranking, and managementframework using hierarchical holographic modeling. Risk Analysis 22(2).

58. Jiang P & Haimes Y (2004) Risk management for leontief-based interdependent systems.Risk Analysis 24(5).

59. Chopade P & Bikdash M (2011) Critical infrastructure interdependency modeling: Usinggraph models to assess the vulnerability of smart power grid and scada networks. In: 8thInternational Conference Expo on Emerging Technologies for a Smarter World (CEWIT), pp.1–6.

60. Zonouz S, Rogers K, Berthier R, Bobba R, Sanders W & Overbye T (2012) Scpse: Security-oriented cyber-physical state estimation for power grid critical infrastructures. IEEETransactions on Smart Grid 3(4): 1790–1799.

61. Utne I, Hokstad P & Vatn J (2011) A method for risk modeling of interdependencies incritical infrastructures. Reliability Engineering & System Safety 96(6): 671–678.

62. Hurst W, Merabti M & Fergus P (2012) Operational support for critical infrastructuresecurity. In: IEEE 14th International Conference on High Performance Computing andCommunication & IEEE 9th International Conference on Embedded Software and Systems(HPCC-ICESS), pp. 1473–1478.

63. SERSCIS. [Online; Accessed: 3/2013] http://www.serscis.eu/.64. Kostopoulos D, Leventakis G, Tsoulkas V & Nikitakos N (2012) An intelligent fault

monitoring and risk management tool for complex critical infrastructures: The SERSCISapproach in air-traffic surface control. In: 14th International Conference on ComputerModelling and Simulation (UKSim), pp. 205–210.

102

http://www.serscis.eu/

65. Hall-May M & Surridge M (2010) Resilient critical infrastructure management using serviceoriented architecture. In: International Workshop On Coordination in Complex SoftwareIntensive Systems (COCOSS).

66. Surridge M, Chakravarthy A, Hall-May M, Chen X, Nasser B & Nossal R (2012) SERSCIS:Semantic modelling of dynamic, multi-stakeholder systems. In: Proceedings of the 2ndSESAR Innovation Days, Braunschweig, Germany (EUROCONTROL).

67. Eronen J & Laakso M (2005) A case for protocol dependency. In: IEEE InternationalWorkshop on Critical Infrastructure Protection, pp. 22–32. IEEE Computer Society.

68. Eronen J & Röning J (2006) Graphingwiki - a semantic wiki extension for visualising andinferring protocol dependency. In: Proceedings of the First Workshop on Semantic Wikis –From Wiki To Semantics. ESWC2006.

69. Eronen J, Karjalainen K, Puuperä R, Kuusela E, Halunen K, Laakso M & Röning J (2009) Soft-ware vulnerability vs. critical infrastructure - a case study of antivirus software. InternationalJournal on Advances in Security 2(1): 72–89.

70. Pietikäinen P, Karjalainen K, Eronen J & Röning J (2010) Socio-technical security assessmentof a voip system. In: The Fourth International Conference on Emerging Security Information,Systems and Technologies (SECURWARE 2010).

103

Original articles

I Schaberreiter T, Aubert J & Khadraoui D (2011) Critical infrastructure security modelling and RESCI-MONITOR: A risk based critical infrastructure model. In: IST-Africa Conference Proceedings: 1–8.

II Schaberreiter T, Kittilä K, Halunen K, Röning J & Khadraoui D (2011) Risk assessment in critical infrastructure security modelling based on dependency analysis (short paper). In: 6th international conference on critical information infrastructure security (CRITIS 2011): 213–217.

III Schaberreiter T, Bouvry P, Röning J & Khadraoui D (2012) A Bayesian network based critical infrastructure model. In: EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation II: 207–218.

IV Schaberreiter T, Bouvry P, Röning J & Khadraoui D (2013) Support tool for a Bayesian network based critical infrastructure risk model. In: EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation III: 53–75.

V Schaberreiter T, Varrette S, Bouvry P, Röning J & Khadraoui D (2013) Dependency analysis for critical infrastructure security modelling: A case study within the Grid’5000 project. In: Multidisciplinary Research and Practice for Information Systems, IFIP International Cross Domain Conference and Workshop on Availability, Reliability and Security (CD-ARES 2013): 269–287.

VI Schaberreiter T, Caldeira F, Aubert J, Monteiro E, Khadraoui D & Simones P (2011) Assurance and trust indicators to evaluate accuracy of on-line risk in critical infrastructures. In: 6th international conference on critical information infrastructure security (CRITIS 2011): 30–41.

VII Caldeira F, Schaberreiter T, Varrette S, Monteiro E, Simones P, Bouvry P & Khadraoui D (2013) Trust based interdependency weighting for on-line risk monitoring in interdependent critical infrastructures. International Journal of Secure Software Engineering (IJSSE) 4(4).

Reprinted with permission from IIMC ([I]), Springer Science + Business Media B.V. ([II, III, IV, V, VI]) and IGI Global ([VII]).

Original publications are not included in the electronic version of the dissertation.
