Dependable Intrusion Tolerance Alfonso Valdes (valdes@sdl.sri.com)



SLIDE 1

Dependable Intrusion Tolerance

Alfonso Valdes (valdes@sdl.sri.com), Magnus Almgren, Dan Andersson, Steve Cheung, Bruno Dutertre, Yves Deswarte, Josh Levy, Hassen Saïdi, Tomas Uribe

October 2001

Acknowledgements: Research sponsored under DARPA Contract N66001-00-C-8058. Views presented are those of the authors and do not represent the views of DARPA or the Space and Naval Warfare Systems Center.

SLIDE 2

Dependable Intrusion Tolerance

• Intrusion Detection to Date
  ◦ Seeks to detect an arbitrary number of attacks in progress
  ◦ Relies on signature analysis and probabilistic (including Bayes) techniques
  ◦ Response components immature
  ◦ No concept of intrusion tolerance
• New Emphasis
  ◦ Detection, damage assessment, and recovery
  ◦ Finite number of attacks or deviations from expected system behavior
  ◦ Seek a synthesis of intrusion detection, unsupervised learning, and proof-based methods for the detection aspect
  ◦ Concepts from fault tolerance are adapted to ensure delivery of service (possibly degraded)

SLIDE 3

Outline

• Architecture overview
• Sensor subsystem
• Proxy functionality
• Stopping Code Red
• Selecting optimal response
• Summary

SLIDE 4

Architecture

[Figure: architecture diagram. Labels: External traffic, Proxy, Proxy-AS Subnet, App Server (×4), Sensor Subnet, EMERALD Network Appliance.]

SLIDE 5

Architecture (2)

[Figure: architecture diagram with a bank of proxies. Labels: External traffic, Proxy (×4), Proxy-AS Subnet, App Server (×4), Sensor Subnet, EMERALD Network Appliance.]

SLIDE 6

Sensor Subsystem for Situational Awareness

• EMERALD host and network sensors detect a variety of known attacks.
• EMERALD probabilistic sensors potentially detect novel attacks, but perhaps symptomatically (and therefore after the fact).
• Content agreement and challenge/response protocols detect corrupted content, regardless of the mechanism by which it became corrupted.
• On-line verifiers check overall system compliance with the system specification by formal means at run time.

SLIDE 7

The Sensor Picture

[Figure: sensor placement diagram. Labels: Application Server (×4), each with EMERALD Host Monitor, Critical APP, EMERALD APP Monitor, Proof Based Trigger; Tolerance Proxy with EMERALD Host Monitor, Proxy function, On-Line Verifier; IDS Network Appliance with EMERALD AMI, Net Experts, Blue Sensor, eBayes-TCP; Correlation.]

Note: The Net appliance has a passive interface for the network traffic. Net appliance and app servers have write-only access to sensor subnet (for alert reporting). Proxies use sensor subnet for alert reporting and management.

SLIDE 8

Proxy Implementation

• Basic functionality:
  ◦ Accept HTTP connection
  ◦ Read client HTTP request
  ◦ Check ACLs
  ◦ Load balancing
  ◦ Send reply to client
• Functionality to implement intrusion tolerance:
  ◦ Effect change of policy if needed
  ◦ Check content agreement (depends on dynamic policy)
  ◦ Challenge/response protocol monitors file system integrity
  ◦ Alert the sensor subsystem if required

SLIDE 9

Ensuring Correct Content

• In agreement modes, we compare content from more than one APP server
• To save time and bandwidth, we compare MD5 checksums from all polled servers rather than the full content
• If these agree, we obtain content from one of the servers and verify its MD5 at the proxy
• If this agrees with the previous MD5 check, the content is forwarded to the client
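The agreement check on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the proxy's actual code: the function names `md5_hex` and `agreed_content`, and the idea of passing the content fetch as a callable, are assumptions made for the example.

```python
import hashlib
from typing import Callable, Optional, Sequence


def md5_hex(content: bytes) -> str:
    """Hex MD5 digest of a reply body."""
    return hashlib.md5(content).hexdigest()


def agreed_content(reported_md5s: Sequence[str],
                   fetch_content: Callable[[], bytes]) -> Optional[bytes]:
    """Return content only if it passes both checks, else None.

    reported_md5s: digests reported by the polled servers (hypothetical input).
    fetch_content: callable that retrieves the body from one server.
    """
    # Check 1: every polled server must report the same checksum.
    if not reported_md5s or len(set(reported_md5s)) != 1:
        return None
    # Check 2: the body actually received must hash to that checksum.
    content = fetch_content()
    if md5_hex(content) != reported_md5s[0]:
        return None
    return content
```

Comparing digests first means the full body crosses the Proxy-AS subnet only once per request, which is the efficiency argument the slide makes.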

SLIDE 10

Four policy levels

• Benign: 1 GET request
• Duplex (default regime at system start)
  ◦ 1 HEAD (get MD5 only) and 1 GET (MD5 plus content).
  ◦ If the MD5s agree, send content to client
  ◦ Otherwise, go to Triplex
• Triplex
  ◦ 2 HEAD requests and 1 GET request.
  ◦ If all MD5s agree, send content to client. If only a majority agrees, consider the minority app server (AS) compromised: send content to client, rebuild that AS, and continue in Triplex
• Full Agreement
• Transition to a more permissive regime after some time of normal activity
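One way to read the four policy levels is as a small state machine. The sketch below is illustrative only: the names `Regime`, `step`, and `relax` are hypothetical, and the slide does not say what happens when Triplex yields no majority, so escalating to Full Agreement in that case is an assumption.

```python
from collections import Counter
from enum import IntEnum
from typing import List, Optional, Tuple


class Regime(IntEnum):
    BENIGN = 1
    DUPLEX = 2
    TRIPLEX = 3
    FULL = 4


def step(regime: Regime, md5s: List[str]) -> Tuple[Regime, Optional[str], List[int]]:
    """Apply one request's MD5 reports (list index = polled server).

    Returns (next regime, digest to serve or None, suspected server indexes).
    """
    top, votes = Counter(md5s).most_common(1)[0]
    if votes == len(md5s):
        return regime, top, []            # all agree: serve content
    if regime == Regime.DUPLEX:
        return Regime.TRIPLEX, None, []   # disagreement: go to Triplex
    if regime == Regime.TRIPLEX and votes > len(md5s) / 2:
        suspects = [i for i, d in enumerate(md5s) if d != top]
        return regime, top, suspects      # majority: serve, rebuild minority
    # Assumption (not stated on the slide): no majority escalates to FULL.
    return Regime.FULL, None, []


def relax(regime: Regime) -> Regime:
    """Move one level toward a more permissive regime after quiet time."""
    return Regime(max(Regime.BENIGN, regime - 1))
```

For example, `step(Regime.TRIPLEX, ["a", "b", "a"])` keeps the regime at Triplex, serves the content with digest `"a"`, and flags server 1 for rebuild.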

SLIDE 11

Stopping Code Red (and NIMDA)

[Figure: Code Red scenario. Labels: Distributed Proxy Bank, IDS Appliance, IIS.]

  • 1. 3/4 of Code Red attempts miss the IIS server
  • 2. IDS detects attempt. System invokes agreement mode
  • 3. In case of a successful infection, corrupt content is detected and reinfection attempts are blocked
  • 4. Clients get valid content while compromised server is rebuilt

SLIDE 12

Selecting the Optimal Response

• System responses include increased agreement modes, restarting servers, or restarting proxies
• These responses can all temporarily degrade system performance
• Responses are invoked based on imperfect or delayed situational awareness
• Approach:
  ◦ Define objective functions for the system (percent of dropped requests and percent of invalid replies)
  ◦ Estimate the degree to which the system state optimizes the objective
  ◦ Consider the present state and its likely evolution under the available responses. Response actions have a cost with respect to the objective.
  ◦ Select the response that best optimizes the objective over time
  ◦ Elements of dynamic programming and Markov Decision Processes
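To make the Markov Decision Process idea concrete, here is a toy value-iteration sketch. Everything in it is hypothetical: the two states, the two actions, the costs, the transition probabilities, and the discount factor are invented for illustration and are not taken from the authors' model. It also assumes the true state is observable, whereas the slides stress that responses rely on imperfect sensor reports.

```python
from typing import Dict, Tuple

# model[state][action] = (immediate cost, {next_state: probability})
Model = Dict[str, Dict[str, Tuple[float, Dict[str, float]]]]


def q_value(model: Model, values: Dict[str, float], gamma: float,
            state: str, action: str) -> float:
    """Immediate cost plus discounted expected future cost."""
    cost, transitions = model[state][action]
    return cost + gamma * sum(p * values[nxt] for nxt, p in transitions.items())


def value_iteration(model: Model, gamma: float = 0.9, sweeps: int = 500):
    """Return (expected discounted cost per state, cost-minimizing policy)."""
    values = {s: 0.0 for s in model}
    for _ in range(sweeps):
        values = {s: min(q_value(model, values, gamma, s, a) for a in model[s])
                  for s in model}
    policy = {s: min(model[s],
                     key=lambda a, s=s: q_value(model, values, gamma, s, a))
              for s in model}
    return values, policy


# Hypothetical numbers: cost is the fraction of requests dropped or
# answered invalidly during one step.
TOY_MODEL: Model = {
    "healthy": {
        "wait": (0.0, {"healthy": 0.9, "compromised": 0.1}),
        "reboot": (0.2, {"healthy": 1.0}),
    },
    "compromised": {
        "wait": (1.0, {"compromised": 1.0}),
        "reboot": (0.2, {"healthy": 1.0}),
    },
}
```

With these made-up numbers, `value_iteration(TOY_MODEL)` selects `wait` in the healthy state and `reboot` in the compromised state: rebooting has a short-term cost, but leaving a compromised server in service costs more over time.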

SLIDE 13

Simulation Analysis of MDP

• System is modeled as 14 Poisson processes
  ◦ Processes include client requests, server replies, challenge/response requests (from proxy, to assess content validity), random failures, attacks (which make transitions between attack states), IDS false alarms, IDS detections, ...
  ◦ Process rates are state dependent
  ◦ Requests, attacks, and failures are always ON. The reply process is ON if there are active requests. False alarms are always ON; detections are ON if there are active attacks in a detectable state.
• System performance is based on true state. Tolerance response is based on sensor reports
  ◦ Responses include various levels of content agreement as well as server reboot
• Objective: Minimize dropped requests and requests with invalid replies (the latter come from a root-compromised app server)
  ◦ All tolerance responses have a cost with respect to these objectives, but not responding can also have a cost

SLIDE 14

Initial Results

• Requests arrive at 1000/unit time. Total reply capacity is 4000/unit time. Attack rate is 50/unit time.
• Redundancy is beneficial, but with diminishing returns beyond 2 App Servers (total server capacity is 4000/unit time)
• Frequent challenge/response requests improve system objectives

App Servers         % Drop   % Invalid
1@4000/time         3.62     2.78
2@2000/time each    0.04     1.26
3@1333/time each    0.16     0.59
4@1000/time each    0.99     0.51

λ Challenge         % Drop   % Invalid
(first row incomplete in source; only the value 26.21 survives)
100                 0.43     1.89
500                 0.99     0.51
1000                0.31     0.33

SLIDE 15

Summary

• Developing an intrusion tolerant server architecture
• Key feature is redundant capability provided by diverse implementation
• A variety of IDS, symptom detectors, and on-line verifiers provide situational awareness
• Stepped policy response enforces content agreement in suspicious situations

SLIDE 16

(Backup) Poisson Processes

• Poisson process: Event stream where inter-event times have an exponential distribution. The parameter is referred to as the process rate, typically denoted λ
• Mathematical properties of multiple simultaneous Poisson processes lead to tractable implementation:
  ◦ Overall process is Poisson, with overall rate equal to the sum of the rates of the individual processes:

    \lambda_{\mathrm{overall}} = \sum_i \lambda_i

  ◦ Next event is of a given class with the following probability:

    P(\text{next event is of class } i) = \frac{\lambda_i}{\lambda_{\mathrm{overall}}}
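These two properties (the rates add, and the next event's class is chosen in proportion to its rate) suggest a simple event loop for simulating several Poisson processes at once. The Python sketch below is a generic illustration, not the authors' simulator: the function names are hypothetical, and it keeps the rates fixed, whereas the slides say the real rates depend on system state.

```python
import random
from typing import Dict, Iterator, Tuple


def next_event_probabilities(rates: Dict[str, float]) -> Dict[str, float]:
    """P(next event is of class i) = rate_i / sum of all rates."""
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


def simulate(rates: Dict[str, float], n_events: int,
             seed: int = 0) -> Iterator[Tuple[float, str]]:
    """Yield (event time, event class) for the merged process."""
    rng = random.Random(seed)
    names = list(rates)
    weights = [rates[name] for name in names]
    total = sum(weights)
    t = 0.0
    for _ in range(n_events):
        # Inter-event time of the merged process is exponential(total rate).
        t += rng.expovariate(total)
        # The event's class is drawn in proportion to its individual rate.
        yield t, rng.choices(names, weights=weights)[0]
```

For instance, with rates of 1000 for requests, 50 for probe attacks, and 5 for false detections, `next_event_probabilities` gives requests a probability of 1000/1055 of being the next event. A state-dependent simulator would recompute the rate table after each event before drawing the next one.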

SLIDE 17

Proxy Capabilities Simulated

• IDS detect probes and root compromises, but occasionally fail to detect, are too slow, or generate false alerts
• Asset distress monitor (blue sensor) can detect a “down” server by rate of failed requests
• Proxy detects AHBL when request queue overflows
• Challenge/Response: Periodically issues a request to all servers, for which the reply is known
  ◦ Can detect compromised server if reply is invalid
  ◦ Can detect a “down” server
  ◦ These detections are typically much later than from an IDS
• Available responses are:
  ◦ Invoke a content agreement regime for client requests with 2..n servers
  ◦ Reboot a server

SLIDE 18

Processes and Rates

Process               Rate per unit time   Comment
Request               1000
Reply                 4000 total           Active if there are active requests
Challenge/Response    500                  Compete with client requests for server bandwidth
Non-malicious crash   1
Reboot                100                  So E(reboot time)=0.01
Probe attack          50
Probe_to_root         10
Probe_to_crash        5
Probe_to_term         5
Root_to_crash         5
Root_to_term          5                    Attack in this state compromises host
Probe_detect          10
Root detect           50                   Must detect before root_to_term
False Detect          5

Note: Time units not specified. These rates should be viewed as relative.