· Abstract automata DFAs Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva 7 property...

CategoricalAutomataLearningFramework

Gerco van Heerdt Matteo Sammartino Alexandra Silva

LiVe @ ETAPS 2017

calf-project.org

http://calf-projec.org

Our team

Alexandra Silva UCL

Gerco van Heerdt UCL

Joshua Moerman Radboud University

Maverick Chardet ENS Lyon

Collaborators:

Bartek Klin Warsaw University

Michal Szynwelski Warsaw University

Matteo Sammartino UCL

Automata learning

Learnerqueries

answers

builds

Systemblack-box

S

automatonmodel of

S

Automata learning

Learnerqueries

answers

builds

Systemblack-box

S

automatonmodel of

S

No formal specification available? Learn it!

set of system behaviors is a regular language

Finite alphabet of system’s actions A

L � A�

L* algorithm (D.Angluin ’87)



L � A�


Learner TeacherL



L � A�


Learner TeacherL

Q:w � L?A: Y/N



L � A�


Learner TeacherL

Q:w � L?A: Y/N

L(H) = L?Q:A: Y / N + counterexample

H = hypothesis automatonH



L � A�


Learner TeacherL

Q:w � L?A: Y/N

Minimal DFA accepting L L

builds

L(H) = L?Q:A: Y / N + counterexample

H = hypothesis automatonH

Non-deterministic

Weighted

Register

Mealy Machines

A zoo of automata

AlternatingUniversal

Probabilistic

Non-deterministic

Weighted

Register

Mealy Machines

Algorithms Correctness proofs

involved and hard to check

A zoo of automata


Probabilistic

Non-deterministic

Weighted

Register

Mealy Machines

Algorithms Correctness proofs

involved and hard to check

A zoo of automata


Probabilistic

Category theory comes to the rescue!

Abstract automata

DFAs

Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva 7

property stating that for all s1

, s2

œ S such that ›(s1

) = ›(s2

) we must have L(s1

) = L(s2

).The Lı algorithm ensures this by having Á œ E, using that L(s) = ›(s)(Á) for any s œ S.

4 A General Correctness Theorem

In this section we work towards a general correctness theorem. We then show how it appliesto the ID algorithm, to the algorithm by Arbib and Zeiger and to Lı.

S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)

The key observation for the correctness theoremis the following. Let w = (S ‡≠æ T, T

fi≠æ P ) bea wrapper with minimization (S e≠æ H, H

m≠æ P ).If ‡ œ E , then the factorization system gives us aunique diagonal „ in the left square of (3), which by (the dual of) Proposition ?? satisfies„ œ E . Similarly, if fi œ M, we have Â in the right square of (3), with Â œ M. Composing thetwo diagrams and using again the diagonal property, one sees that „ and Â must be mutuallyinverse. We can conclude that H and T are isomorphic. Now, if w is a wrapper producedby a learning algorithm and T is the state space of the target minimal automaton, as inExample 2, our reasoning hints at a correctness criterion: ‡ œ E and fi œ M upon terminationensure that H is (isomorphic to) T . Of course, the criterion will have to guarantee that theautomata, not just the state spaces, are isomorphic.

We first show that the argument above lifts to F -algebras f : FT æ T , for an arbitraryendofunctor F : C æ C preserving E .

I Lemma 11. For a wrapper w = (‡, fi) and an F -algebra f , if ‡ œ E, then w is f -closed; iffi œ M, then w is f-consistent.

Proof. If ‡ œ E , then let „ be as in (3) and define closef = „ ¶ f ¶ F‡; if fi œ M, then let Âbe as in (3) and define consf = fi ¶ f ¶ FÂ. J

I Proposition 12. For a wrapper w = (‡, fi) and an F -algebra f , if ‡ œ E and w is f-consistent, then „ as given in (3) is an F -algebra homomorphism (T, f) æ (H, ◊f ); if fi œ Mand w is f-closed, then Â as given in (3) is an F -algebra homomorphism (H, ◊f ) æ (T, f).Proof. Assume that ‡ œ E and w is FT T H

FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m

1 definition of ›f

2 (3)3 functoriality, (3)4 definition of ◊f

5 closedness

f -consistent. The proof for the otherpart is analogous. Using Lemma 11 wesee that ◊f indeed exists. Because F‡is epic and m monic, it su�ces to showm¶„¶f ¶F‡ = m¶◊f ¶F„¶F‡, whichis done on the right. (The definition of◊f can be found in the proof of Proposition 9.) J

I Corollary 13. If ‡ œ E and fi œ M, then „ as given in (3) is an F -algebra isomorphism(T, f) æ (H, ◊f ).

Now we enrich F -algebras with initial and final states, obtaining a notion of automaton in acategory. Then we give the full correctness theorem for automata. We fix objects I and Y inC, which will serve as initial state selector and output of the automaton, respectively.I Definition 14 (Automaton). An automaton in C is an object Q FQ

Q

I Y

”Q

outQinitQ

of C equipped with an initial state map initQ : I æ Q, an outputmap outQ : Q æ Y , and dynamics ”Q : FQ æ Q. An input systemis an automaton without an output map; an output system is anautomaton without an initial state map.



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ


Endofunctor = automaton typeF : C � C

Category = universe of state-spacesC

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




DFAs

F = (�)⇥A

C = Set

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




Q⇥ADFAs

F = (�)⇥A

C = Set

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




Q⇥A

1

DFAs

F = (�)⇥A

C = Set

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




Q⇥A

1q0 2 Q

DFAs

F = (�)⇥A

C = Set

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




Q⇥A

1q0 2 Q

2

DFAs

F = (�)⇥A

C = Set

Abstract automata

DFAs



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




Q⇥A

1q0 2 Q F ✓ Q

2

DFAs

F = (�)⇥A

C = Set

Abstract learning

Abstract observation data structure

Abstract learning


approximates



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ


Target minimal automaton

Abstract learning


approximates



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ



Hypothesis automaton



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness



Now we enrich F -algebras with initial and final states, obtaining a notion of automaton in acategory. Then we give the full correctness theorem for automata. We fix objects I and Y inC, which will serve as initial state selector and output of the automaton, respectively.I Definition 14 (Automaton). An automaton in C is an object Q FH

H

I Y

”H

outHinitH


abstract closedness and consistency

Abstract learning


General correctness theorem=Guidelines for implementation

approximates



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ






, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




H

I Y

”H

outHinitH



Abstract learning


General correctness theorem=Guidelines for implementation

approximates



, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ




, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




Q

I Y

”Q

outQinitQ






, s2


) = ›(s2

) we must have L(s1

) = L(s2




S T

H P

‡

e fi„

m

S H

T P

e

‡ mÂ

fi

(3)








FS P

FT FH H

f „

fi m›f

F ‡ F e closef

F ‡ 1

2

F „

3

4

5

◊f

m



5 closedness




H

I Y

”H

outHinitH



CALF: Categorical Automata Learning FrameworkGerco van Heerdt, Matteo Sammartino, Alexandra Silva

(arXiv:1704.05676)

Other automata & optimizations

NomVect

Set DFAsNominal automataWeighted automata

Change base category


NomVect


Change base category


Side-effects (via monads)NFAs

Partial automata

Universal automataAlternating automata

PowersetPowerset with intersection

Double powersetMaybe monad

NomVect


Change base categoryObservation tables

Discrimination trees

Change main data structure



Partial automata




NomVect




Change main data structureLearning Nominal Automata (POPL ’17)

Joshua Moerman, Matteo Sammartino, Alexandra Silva, Bartek Klin, Michal Szynwelski



Partial automata




NomVect




Change main data structureLearning Nominal Automata (POPL ’17)

Joshua Moerman, Matteo Sammartino, Alexandra Silva, Bartek Klin, Michal Szynwelski



Partial automata




Learning Automata with Side-effectsGerco van Heerdt, Matteo Sammartino, Alexandra Silva

(arXiv:1704.08055)

Automaton type

Optimizations

Minimization algorithms Testing algorithms

Automata Learning algorithms

Connections with other algorithms

Ongoing and future work

• Library & tool to learn control + data-flow models (as nominal automata)

• Applications:

• Specification mining • Network verification, with • Verification of cryptographic protocols • Ransomware detection

· Abstract automata DFAs Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva 7 property...

Documents

Transcript of · Abstract automata DFAs Gerco van Heerdt, Matteo Sammartino, and Alexandra Silva 7 property...