Anomaly Based Intrusion Detection in Distributed Applications
Anomaly Based Intrusion Detection in Distributed Applications without global clock
Eric Totel, Mouna Hkimi, Michel Hurfin, Mourad Leslous, Yvan Labiche
SEC2-2016, 5 July 2016

Outline of the Presentation
• Position of the problem
• Building a distributed application behavior model
  • Partial Event Ordering
  • Automaton recognizing sequences
  • Temporal properties
• Applying Detection on an example
• Results on a distributed file system application
CentraleSupelec

Intrusion Detection in Distributed Systems
• Several nodes running processes
• Intrusion Detection Systems are deployed
  • On the network (NIDS)
  • On each node (HIDS)
• Local detection of compromise
  • No relationship between the states of the several nodes
  • Each alert emitted takes into account the state of only one node
• Current solutions
  • Alert correlation: requires a total ordering of all alerts
  • DIDS: requires a total ordering of all events analyzed
• In cloud environments, virtual machines are often desynchronized (clock drift)

The Case of a Distributed Application
• How to enhance the detection?
• Statement
  • The states of the different nodes are not independent
  • As such, the behaviors of the different nodes of the application are not independent
  • The actions performed by the nodes are causally dependent on each other
    • Local actions
    • Messages exchanged
• Solution
  • Build a reference model that takes into account the causal dependencies between the nodes
  • Without relying on a total ordering of the events (no global clock)

Logs and partial ordering
• On each node, a process produces a totally ordered log
• Partial ordering of the events on different nodes (Lamport happened-before relation): $\forall e \in E^\alpha, \forall f \in E^\alpha$, $e \prec_\alpha f$ if
  • e occurred before f in the same log $E^\alpha_i$, or
  • e is a message send and f its receipt, or
  • there exists g such that $e \prec_\alpha g$ and $g \prec_\alpha f$
• How to learn the right sequences of actions performed by the distributed processes?
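
The three rules of the happened-before relation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `happened_before` is hypothetical, event names are assumed unique within an execution, and a send written `c!m` is assumed to match the receipt written `c?m`.

```python
from itertools import product

def happened_before(logs):
    """Return the set of pairs (e, f) such that e happened before f.

    `logs` holds one totally ordered log (list of event names) per process.
    Assumptions of this sketch: event names are unique, a send is written
    'c!m' and its receipt 'c?m'.
    """
    order = set()
    # Rule 1: e occurs before f in the same log.
    for log in logs:
        for i, e in enumerate(log):
            for f in log[i + 1:]:
                order.add((e, f))
    # Rule 2: a message send precedes its receipt.
    events = [e for log in logs for e in log]
    for e in events:
        if "!" in e:
            receipt = e.replace("!", "?", 1)
            if receipt in events:
                order.add((e, receipt))
    # Rule 3: transitivity, applied until no new pair appears.
    changed = True
    while changed:
        changed = False
        for (e, g), (g2, f) in product(list(order), repeat=2):
            if g == g2 and (e, f) not in order:
                order.add((e, f))
                changed = True
    return order

# Two logs: process 1 sends m on channel c1, process 2 receives it.
logs_alpha = [["a", "b", "c1!m"], ["d", "c1?m", "e"]]
hb = happened_before(logs_alpha)
print(("a", "b") in hb, ("b", "d") in hb, ("d", "b") in hb)
```

Events that are related in neither direction, such as b and d here, are concurrent, which is why several total orderings of the same execution are possible.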

Example of Logs
• A trace: two logs of two processes

Execution α
     $E^\alpha_1$   $E^\alpha_2$
1    a              d
2    b              c1?m
3    c1!m           e

[Figure: time diagram of execution α; p1 performs a, b, c1!m; p2 performs d, c1?m, e]
• On this execution, $a \prec_\alpha b$
• No order relation between b and d
• {a, b, d, c1!m, c1?m, e} is a valid sequence … but not the only one!

Notion of a valid sequence
• Observed correct normal sequence
• Compliant with the partial relationship
• A sequence of events is valid iff it orders the events of the execution consistently with $\prec_\alpha$
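
As an illustration, the validity of a candidate sequence can be checked directly against the logs. The sketch below is an assumption-laden reading of the definition, not the authors' code: `is_valid_sequence` is a hypothetical name, event names are assumed unique, and sends and receipts are matched by replacing `!` with `?`. Checking the per-log order and the send/receipt pairs is enough, because the rest of the partial order follows by transitivity.

```python
def is_valid_sequence(sequence, logs):
    """Check that `sequence` is a total ordering compatible with the logs."""
    events = [e for log in logs for e in log]
    # The sequence must contain exactly the events of the execution.
    if sorted(sequence) != sorted(events):
        return False
    position = {e: i for i, e in enumerate(sequence)}
    # Events of one log must keep their local order.
    for log in logs:
        if any(position[e] > position[f] for e, f in zip(log, log[1:])):
            return False
    # A receipt must not appear before the matching send.
    for e in events:
        if "!" in e:
            receipt = e.replace("!", "?", 1)
            if receipt in position and position[receipt] < position[e]:
                return False
    return True

logs_alpha = [["a", "b", "c1!m"], ["d", "c1?m", "e"]]
print(is_valid_sequence(["a", "b", "d", "c1!m", "c1?m", "e"], logs_alpha))
print(is_valid_sequence(["a", "d", "c1?m", "b", "c1!m", "e"], logs_alpha))
```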

Generation of valid sequences (1)
• Generating the lattice of consistent cuts
• A valid sequence is a sequence of events consumed by a path in the lattice of consistent cuts
[Figure: lattice of consistent cuts of execution α; p1 axis: a, b, c1!m; p2 axis: d, c1?m, e]
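
A minimal sketch of this construction, assuming a cut is represented by the number of events already consumed in each log and is consistent when every receipt it contains is matched by its send. The names `consistent_cut_lattice` and `valid_sequences` are hypothetical, and the `c!m` / `c?m` naming convention is an assumption of the example.

```python
def consistent_cut_lattice(logs):
    """Return the start cut and, for each reachable consistent cut,
    the list of (event, next_cut) transitions leaving it."""
    def consistent(cut):
        seen = {e for log, n in zip(logs, cut) for e in log[:n]}
        return all(e.replace("?", "!", 1) in seen for e in seen if "?" in e)

    start = tuple(0 for _ in logs)
    edges = {}
    frontier = [start]
    while frontier:
        cut = frontier.pop()
        if cut in edges:
            continue
        edges[cut] = []
        for p, log in enumerate(logs):
            if cut[p] < len(log):
                # Consume the next event of process p.
                nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
                if consistent(nxt):
                    edges[cut].append((log[cut[p]], nxt))
                    frontier.append(nxt)
    return start, edges

def valid_sequences(logs):
    """Enumerate the event sequences read along every path of the lattice."""
    start, edges = consistent_cut_lattice(logs)

    def walk(cut):
        if not edges[cut]:
            yield []
        for event, nxt in edges[cut]:
            for rest in walk(nxt):
                yield [event] + rest

    return list(walk(start))

logs_alpha = [["a", "b", "c1!m"], ["d", "c1?m", "e"]]
start, edges = consistent_cut_lattice(logs_alpha)
sequences = valid_sequences(logs_alpha)
print(len(edges), "consistent cuts,", len(sequences), "valid sequences")
```

On the two-log example this enumeration stays small, but the number of cuts can grow quickly with the number of concurrent events.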

Generation of valid sequences (2)
• Generation of an automaton containing all the paths in the lattice of consistent cuts
[Figure: the lattice of consistent cuts of execution α and the corresponding automaton, whose transitions are labelled a, b, c1!m, d, c1?m, e]

Automaton from several executions

Execution α                          Execution β
     $E^\alpha_1$   $E^\alpha_2$          $E^\beta_1$   $E^\beta_2$
1    a              d                     a             f
2    b              c1?m                  c1!m          c1?m
3    c1!m           e                                   g

[Figure: automata built from executions α and β]
• Merge the start states of all the automata
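
One possible way to sketch the merge, assuming each execution contributes the automaton of its lattice of consistent cuts and that all start states are collapsed into a single state named `START`. The function names and the state encoding are hypothetical. Because two executions may begin with the same event, the merged automaton is simulated here as a nondeterministic one.

```python
def lattice_edges(logs):
    """Transitions (cut, event, next_cut) between reachable consistent cuts."""
    def consistent(cut):
        seen = {e for log, n in zip(logs, cut) for e in log[:n]}
        return all(e.replace("?", "!", 1) in seen for e in seen if "?" in e)

    start = tuple(0 for _ in logs)
    todo, done, edges = [start], set(), []
    while todo:
        cut = todo.pop()
        if cut in done:
            continue
        done.add(cut)
        for p, log in enumerate(logs):
            if cut[p] < len(log):
                nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
                if consistent(nxt):
                    edges.append((cut, log[cut[p]], nxt))
                    todo.append(nxt)
    return start, edges

def merged_automaton(executions):
    """Union of the per-execution automata with a single shared start state."""
    transitions, finals = set(), set()
    for name, logs in executions.items():
        start, edges = lattice_edges(logs)
        for src, event, dst in edges:
            s = "START" if src == start else (name, src)
            d = "START" if dst == start else (name, dst)
            transitions.add((s, event, d))
        # The final state is the cut in which every log is fully consumed.
        finals.add((name, tuple(len(log) for log in logs)))
    return transitions, finals

def accepts(automaton, sequence):
    transitions, finals = automaton
    current = {"START"}
    for event in sequence:
        current = {d for (s, e, d) in transitions if s in current and e == event}
    return bool(current & finals)

executions = {
    "alpha": [["a", "b", "c1!m"], ["d", "c1?m", "e"]],
    "beta": [["a", "c1!m"], ["f", "c1?m", "g"]],
}
automaton = merged_automaton(executions)
print(accepts(automaton, ["f", "a", "c1!m", "c1?m", "g"]))
```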

Analysis of the automaton
• Contains only the observed valid sequences
• In practice
  • In a highly distributed application, it is very difficult to exhibit all the behaviors of the application due to concurrency
  • It is thus very difficult to learn a complete behavior model
• Solution
  • Generalization of the automaton
  • Allows new, unlearned behaviors to be introduced
  • Ensures that all the original valid sequences are included in the generalized automaton

Generalization (k-tail algorithm)
• k=1 (a low k permits a higher generalization)
• Advantage: can introduce new valid unlearned sequences of events
• Disadvantage: can introduce incorrect sequences of events at the same time
[Figure: the automaton before and after generalization with k=1]
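
The effect of k can be illustrated with a small sketch of one common formulation of k-tail merging: two states are merged when the sets of event sequences of length at most k that can be read from them are identical. The automaton below is a hypothetical toy example (two learned sequences, a b c and x b d), not the one on the slide, and the function names are illustrative.

```python
def k_tails(transitions, state, k):
    """Set of event sequences of length <= k that can be read from `state`."""
    tails = {()}
    frontier = {((), state)}
    for _ in range(k):
        frontier = {(seq + (e,), d) for seq, s in frontier
                    for (src, e, d) in transitions if src == s}
        tails |= {seq for seq, _ in frontier}
    return frozenset(tails)

def generalize(transitions, finals, start, k):
    """Merge states with identical k-tails; a merged state is named by its tails."""
    states = {start} | {s for s, _, _ in transitions} | {d for _, _, d in transitions}
    signature = {s: k_tails(transitions, s, k) for s in states}
    merged = {(signature[s], e, signature[d]) for s, e, d in transitions}
    return merged, {signature[f] for f in finals}, signature[start]

def accepts(transitions, finals, start, sequence):
    current = {start}
    for event in sequence:
        current = {d for s, e, d in transitions if s in current and e == event}
    return bool(current & finals)

# Toy automaton learned from the sequences a b c and x b d (hypothetical).
learned = {(0, "a", 1), (1, "b", 2), (2, "c", 3),
           (0, "x", 4), (4, "b", 5), (5, "d", 6)}
learned_finals = {3, 6}

k1 = generalize(learned, learned_finals, 0, k=1)
k2 = generalize(learned, learned_finals, 0, k=2)
print(accepts(*k1, ["a", "b", "d"]), accepts(*k2, ["a", "b", "d"]))
```

With k=1 the two states that can only read b are merged, so the unlearned sequence a b d becomes accepted; with k=2 they stay distinct and the sequence is rejected. Whether such a new sequence is a legitimate behavior or an incorrect one cannot be decided from the automaton alone.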

How to deal with incorrect sequences?
• Duality of models
  • Automaton: exhaustive list of sequences
  • Temporal properties: properties on the types of events
• Temporal invariants
  • Borrowed from the field of software testing
  • Three invariants considered (a and b are event types)
    • a is always followed by b
    • a is never followed by b
    • a always precedes b
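
The three invariant types can be mined from a set of sequences with a direct check over every ordered pair of event types. The sketch below is illustrative only: the short names `AFby`, `NFby` and `AP` and the function names are assumptions, and the input is the set of four valid sequences of the two-log example used earlier in the deck.

```python
from itertools import permutations

def always_followed_by(seq, a, b):
    # Every occurrence of a has a later occurrence of b.
    return all(b in seq[i + 1:] for i, e in enumerate(seq) if e == a)

def never_followed_by(seq, a, b):
    # No occurrence of a has a later occurrence of b.
    return all(b not in seq[i + 1:] for i, e in enumerate(seq) if e == a)

def always_precedes(seq, a, b):
    # Every occurrence of b has an earlier occurrence of a.
    return all(a in seq[:i] for i, e in enumerate(seq) if e == b)

CHECKS = {"AFby": always_followed_by,
          "NFby": never_followed_by,
          "AP": always_precedes}

def mine_invariants(sequences):
    """Return the triples (kind, a, b) that hold on every sequence."""
    types = sorted({e for seq in sequences for e in seq})
    found = set()
    for a, b in permutations(types, 2):
        for kind, check in CHECKS.items():
            if all(check(seq, a, b) for seq in sequences):
                found.add((kind, a, b))
    return found

valid = [
    ["d", "a", "b", "c1!m", "c1?m", "e"],
    ["a", "d", "b", "c1!m", "c1?m", "e"],
    ["a", "b", "d", "c1!m", "c1?m", "e"],
    ["a", "b", "c1!m", "d", "c1?m", "e"],
]
invariants = mine_invariants(valid)
print(("AP", "a", "b") in invariants, ("AP", "d", "b") in invariants)
```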

Invariants on our example
• Total of 59 invariants
[Figure: the automaton of the example and the generalized automaton; labels: Generalization, Model checking]

Duality of models
• Model: generalized automaton
• Invariants that can be violated by the generalized automaton (total of 10 invariants)
• Non-acceptable sequence: {a, b, c1!m, d, c1?m, g}
[Figure: the generalized automaton]

Valid/Accepted/Acceptable sequences
• Invariants are computed on the original lattice of consistent cuts
  • Invariants on valid sequences of events
• Invariants are less restrictive than the automaton
• We consider a sequence acceptable if it is accepted by the generalized automaton and complies with the invariants
[Figure: three sets of sequences: $\Sigma$ (valid sequences), $\Sigma'$ (sequences accepted by the generalized automaton), $\Sigma''$ (acceptable sequences)]
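
The definition of an acceptable sequence combines the two models, which can be sketched as a conjunction of two checks. Everything specific in this example is an assumption: the automaton is replaced by a stand-in predicate that accepts any sequence, and the two invariants are illustrative ones that hold on the α and β executions shown earlier, not necessarily those computed by the authors.

```python
def holds(invariant, seq):
    """Evaluate one invariant (kind, a, b) on a sequence of events."""
    kind, a, b = invariant
    if kind == "AFby":   # a is always followed by b
        return all(b in seq[i + 1:] for i, e in enumerate(seq) if e == a)
    if kind == "NFby":   # a is never followed by b
        return all(b not in seq[i + 1:] for i, e in enumerate(seq) if e == a)
    if kind == "AP":     # a always precedes b
        return all(a in seq[:i] for i, e in enumerate(seq) if e == b)
    raise ValueError(f"unknown invariant kind: {kind}")

def is_acceptable(seq, accepted_by_automaton, invariants):
    """Acceptable = accepted by the automaton and compliant with all invariants."""
    return accepted_by_automaton(seq) and all(holds(inv, seq) for inv in invariants)

# Stand-in for a generalized automaton that over-generalizes (assumption).
accept_all = lambda seq: True
# Illustrative invariants: f always precedes g, b is never followed by g.
invariants = {("AP", "f", "g"), ("NFby", "b", "g")}

suspect = ["a", "b", "c1!m", "d", "c1?m", "g"]
normal = ["a", "c1!m", "f", "c1?m", "g"]
print(is_acceptable(suspect, accept_all, invariants))
print(is_acceptable(normal, accept_all, invariants))
```

The invariants filter out a sequence that an over-generalized automaton would let through, which is the role the deck assigns to the second model.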

Detection algorithm
• Given a trace
• Is this trace compliant
  • With the generalized automaton?
  • With the temporal invariants?
• Two strategies
  • All total orderings of the events of the trace are compliant with the model
  • At least one total ordering of the events of the trace is compliant with the model
• In practice
  • Strategy "all" is more time consuming
  • Similar false positive rate in both approaches
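
The two strategies can be sketched as follows, assuming the model is abstracted as a predicate `is_compliant` over one total ordering (in the deck, acceptance by the generalized automaton plus the invariants). The function names, the `c!m` / `c?m` matching convention, and the choice to flag a trace that has no consistent total ordering are assumptions of this example.

```python
def total_orderings(logs):
    """Yield every total ordering of the trace compatible with happened-before."""
    def extend(cut, prefix):
        if all(n == len(log) for n, log in zip(cut, logs)):
            yield prefix
            return
        seen = set(prefix)
        for p, log in enumerate(logs):
            if cut[p] < len(log):
                event = log[cut[p]]
                # A receipt cannot be consumed before its send.
                if "?" in event and event.replace("?", "!", 1) not in seen:
                    continue
                nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
                yield from extend(nxt, prefix + [event])
    yield from extend(tuple(0 for _ in logs), [])

def detect(logs, is_compliant, strategy="any"):
    """Return True when the trace is flagged as anomalous."""
    orderings = list(total_orderings(logs))
    if not orderings:
        return True  # e.g. a receipt whose send was never logged
    results = (is_compliant(seq) for seq in orderings)
    compliant = all(results) if strategy == "all" else any(results)
    return not compliant

# Stand-in model: a single learned sequence (assumption for the example).
learned = {("a", "b", "d", "c1!m", "c1?m", "e")}
is_compliant = lambda seq: tuple(seq) in learned

trace = [["a", "b", "c1!m"], ["d", "c1?m", "e"]]
tampered = [["a", "c1!m"], ["d", "c1?m", "e"]]  # event b removed
print(detect(trace, is_compliant, "any"), detect(trace, is_compliant, "all"))
print(detect(tampered, is_compliant, "any"))
```

Strategy "any" can stop at the first compliant ordering, whereas "all" has to examine every ordering of a compliant trace, which is consistent with the remark that it is more time consuming.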

Simple Example: e-commerce
• 3 processes: article buying, 70 possible different behaviours
[Figure: automata of the three processes, with the following events]
  • Server (p1): P2-P1?SEARCH, P1-P2!AVAILABLE, P2-P1?BUY, P1-P2!SOLD, P3-P1?SEARCH, P1-P3!AVAILABLE, P3-P1?BUY, P1-P3!SOLD
  • Process (p2): P2-P1!SEARCH, P1-P2?AVAILABLE, P2-P1!BUY, P1-P2?SOLD
  • Process (p3): P3-P1!SEARCH, P1-P3?AVAILABLE, P3-P1!BUY, P1-P3?SOLD

Detection Accuracy
• Simulations of an intrusion
  • Removing an event
  • Modifying the order of events
  • Adding new events
• These modifications violate the integrity of the distributed logs
• They are detected by the approach
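
The deck does not describe how the simulated intrusions were produced. As a purely hypothetical illustration, the three kinds of modification can be expressed as small helpers that return a tampered copy of the logs and leave the original untouched; all names below are assumptions.

```python
import copy

def remove_event(logs, process, index):
    """Return a copy of the logs with one event removed."""
    mutated = copy.deepcopy(logs)
    del mutated[process][index]
    return mutated

def swap_events(logs, process, i, j):
    """Return a copy of the logs with two events of one log exchanged."""
    mutated = copy.deepcopy(logs)
    log = mutated[process]
    log[i], log[j] = log[j], log[i]
    return mutated

def add_event(logs, process, index, event):
    """Return a copy of the logs with a new event inserted."""
    mutated = copy.deepcopy(logs)
    mutated[process].insert(index, event)
    return mutated

logs_alpha = [["a", "b", "c1!m"], ["d", "c1?m", "e"]]
print(remove_event(logs_alpha, 0, 1))
print(swap_events(logs_alpha, 0, 0, 1))
print(add_event(logs_alpha, 1, 0, "x"))
```

Each tampered trace can then be submitted to the detection step to see whether it is flagged.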

Generalization and False Positive Rate
• Learning phase with 10, 20, 30, 40, 50, 60 traces
• With a generalization parameter k = 1, 2, 3, 4, 5
• Result
  • The generalization decreases the rate of false positives, even with a low number of traces learnt
[Figure: False Positive Rate as a function of k (1 to 5), one series per number of traces learned (10, 20, 30, 40, 50, 60)]

Real World Evaluation: XtreemFS
• High Availability Distributed Replicated File System
• Intrusion Detection approach applied on a simple configuration of the nodes

Experimental setup
• Writing of a set of files
  • 500 files used to learn the model
  • 1640 files written to measure the false positive rate
• Traces obtained on each node by instrumenting the code of the file servers
  • One trace for a complete file write

Model Size
• As the number of traces used to learn the model grows
  • The number of invariants decreases
  • The size (number of states) of the automaton grows (k-tail applied with k=1)
[Figure: Model Size. Number of invariants and number of states as a function of the number of traces (10 to 500)]