Visual analysis of anonymized location and movement data



slide-1
SLIDE 1

Network Architectures and Services, Georg Carle Faculty of Informatics Technische Universität München, Germany

Bachelor Thesis

Visual analysis of anonymized location and movement data

Supervisor: Prof. Dr.-Ing. Georg Carle
Advisor: Dipl.-Inf. Johann Schlamp
Timo Lamprecht
6th Semester SS13, Bachelor Computer Science

slide-2
SLIDE 2

Agenda

➢ Goals & Benefits
➢ Basics
(MEASRDROID, related work, 2-d-tree, k-anonymity)
➢ Design
➢ Implementation
➢ Evaluation

09/10/2013 Timo Lamprecht – Visual analysis of anonymized location and movement data 2

slide-3
SLIDE 3

Goals & Benefits

Goals:

➢ Visualization

(Street map, dynamics e.g. zoom function, time bar)

➢ Further use

(Adaptable or expandable, generic objects)

➢ Data management

(Increasing amount of data, acceptable performance)

➢ Anonymization

(Sensitive location data)

Benefits:

➢ Study behaviour of people

(soccer matches, demonstrations, concerts)

➢ Traffic infrastructure

(analysing frequently used routes)

➢ Public transportation

(analysing different time intervals)

➢ Traffic planning

(adjust traffic lights)

➢ Personal navigation

(avoid rush-hour traffic)

Display of locations and movement tendencies


slide-4
SLIDE 4

Basics - MEASRDROID

Android devices with the MEASRDROID application installed → Database server

Data is measured and sent every 15 minutes (by default)

Sensor information

  • Gravity
  • Gyroscope
  • Humidity
  • Light
  • Magnetic field

Network information

  • Reachable WLAN hotspots
  • Telephony information
  • IPv4 and IPv6 routes

Location information

  • Position
  • Satellite information

General information

  • Kernel build
  • Timestamp
  • Client ID


slide-5
SLIDE 5

Basics - Related work

Community Seismic Network

➢ Display of all participating seismic devices
➢ Visualization with rectangles
➢ Intuitive
➢ Limited number of objects
➢ Anonymization maintained?


slide-6
SLIDE 6

Basics - k-anonymity

Motivation:

Non-anonymized data can cause massive legal problems → take anonymity seriously

➢ Data consists of Identifiers, Quasi-Identifiers and Sensitive Data
➢ Each combination of Quasi-Identifier values must be generalized until it occurs at least k times
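
The k-anonymity condition above can be checked mechanically. The following is a minimal sketch, not the thesis implementation: the field names (`zip`, `age_group`, `sensor`) and the helper `generalize_zip` are hypothetical and only serve to show how generalizing a quasi-identifier can turn a non-anonymous table into a 2-anonymous one.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in combos.values())


def generalize_zip(record, digits):
    """Hypothetical generalization step: keep only the first `digits` of the ZIP code."""
    out = dict(record)
    zip_code = record["zip"]
    out["zip"] = zip_code[:digits] + "*" * (len(zip_code) - digits)
    return out


# Illustrative records (made up for this example).
records = [
    {"zip": "85748", "age_group": "20-29", "sensor": "a"},
    {"zip": "85747", "age_group": "20-29", "sensor": "b"},
    {"zip": "80333", "age_group": "30-39", "sensor": "c"},
    {"zip": "80335", "age_group": "30-39", "sensor": "d"},
]
quasi = ["zip", "age_group"]

raw_ok = is_k_anonymous(records, quasi, k=2)          # every ZIP is unique
generalized = [generalize_zip(r, 3) for r in records]
gen_ok = is_k_anonymous(generalized, quasi, k=2)      # ZIP prefixes now repeat
```

With the raw records each combination is unique, so the check fails; after truncating the ZIP code to three digits each combination occurs twice.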


slide-7
SLIDE 7

Basics - 2-d-tree

➢ Each node splits a plane into 2 half-planes
➢ Splitting coordinate alternates
➢ Each leaf is a 2-dimensional point
➢ Does not have to be balanced
➢ Efficiently searchable data structure: O(log(n)) when the tree is balanced
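
A minimal sketch of a 2-d-tree with alternating split coordinate follows. It uses the common variant in which every node (not only the leaves) stores a point, which is an assumption and differs slightly from the leaf-based description above; the names `Node`, `insert` and `contains` are chosen for illustration.

```python
class Node:
    def __init__(self, point, axis):
        self.point = point   # (x, y)
        self.axis = axis     # 0 = split on x, 1 = split on y
        self.left = None     # points with a smaller coordinate on `axis`
        self.right = None    # points with a greater or equal coordinate


def insert(node, point, depth=0):
    """Insert a point; the splitting coordinate alternates with the depth."""
    if node is None:
        return Node(point, depth % 2)
    if point[node.axis] < node.point[node.axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node


def contains(node, point):
    """Follow a single root-to-leaf path, so the cost is the tree height."""
    while node is not None:
        if node.point == point:
            return True
        if point[node.axis] < node.point[node.axis]:
            node = node.left
        else:
            node = node.right
    return False


root = None
for p in [(5, 4), (2, 6), (8, 1), (3, 3), (7, 7)]:
    root = insert(root, p)
```

No rebalancing is done here, so the search cost depends on the insertion order; the logarithmic bound only holds while the tree stays roughly balanced.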


slide-8
SLIDE 8

Design


slide-9
SLIDE 9

Design – Data security

No sensitive data outside the database server
→ Prevents a data leak even if the web server gets compromised

The web server has no access rights to the database server
→ Database server initiates data transfer


slide-10
SLIDE 10

Design – Data protection

Problem:

➢ Handling of sensitive location data
➢ Avoid showing precise positions of single users
(Problem: a position at a single house can be linked to a person via the telephone book)

Solution:

➢ Show areas instead of single positions
➢ Aggregate with the help of k-anonymity
➢ There have to be at least k clients in an area

Interpretation of k-anonymity


slide-11
SLIDE 11

Design - Visualization

➢ Location analysis

  • Rectangles to mark areas
  • Available if there are at least k persons in the area
  • Opacity should depend on available information for the area

➢ Movement analysis

  • Arrows to mark directions
  • Available if at least k persons went in the same direction
  • Minimal length of movement (dependent on zoom level) necessary for counting
  • Stroke weight depends on the number of clients moved
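
The movement rules above can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: movements are given as hypothetical `(client_id, d_lat, d_lon)` tuples in degrees, directions are bucketed into eight compass sectors using a flat-earth approximation, and `min_length` stands in for the zoom-dependent minimal length.

```python
import math
from collections import defaultdict

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def direction_of(d_lat, d_lon):
    """Map a displacement to one of eight compass sectors (0 deg = north, 90 deg = east)."""
    angle = math.degrees(math.atan2(d_lon, d_lat)) % 360
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]


def movement_arrows(moves, k, min_length):
    """Return {direction: number of distinct clients} for directions with >= k clients.

    Movements shorter than `min_length` (in degrees) are not counted.
    """
    clients = defaultdict(set)
    for client_id, d_lat, d_lon in moves:
        if math.hypot(d_lat, d_lon) < min_length:
            continue
        clients[direction_of(d_lat, d_lon)].add(client_id)
    return {d: len(c) for d, c in clients.items() if len(c) >= k}


# Illustrative movements (made up): clients 1 and 2 go north, 3 barely moves, 4 goes east.
moves = [
    (1, 0.02, 0.0),
    (2, 0.03, 0.001),
    (3, 0.0001, 0.0),
    (4, 0.0, 0.05),
    (1, 0.05, 0.0),
]
arrows = movement_arrows(moves, k=2, min_length=0.01)
```

Counting distinct clients rather than movements keeps one very active client from producing an arrow alone; the returned count could then be mapped to a stroke weight.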


slide-12
SLIDE 12

Design - 2-d-tree adaption

➢ First split at 0° longitude
➢ Equal-sized half-planes
➢ Increasingly precise grid evolves
➢ Relation between tree level and zoom level of street map
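
A sketch of how such a fixed, equal-halves grid could be computed is given below. It assumes that even tree levels halve the longitude range (starting at 0°) and odd levels halve the latitude range. The function `depth_for_zoom` and its `offset` parameter are purely hypothetical and only illustrate one possible relation between zoom level and tree level.

```python
def cell_for(lat, lon, depth):
    """Return (south, west, north, east) of the grid cell containing the point at `depth`."""
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    for level in range(depth):
        if level % 2 == 0:
            # Even levels halve the longitude range; the first split is at 0 degrees.
            mid = (west + east) / 2
            if lon < mid:
                east = mid
            else:
                west = mid
        else:
            # Odd levels halve the latitude range.
            mid = (south + north) / 2
            if lat < mid:
                north = mid
            else:
                south = mid
    return south, west, north, east


def depth_for_zoom(zoom, offset=4):
    """Hypothetical mapping: one zoom step halves both axes, i.e. two tree levels."""
    return 2 * zoom + offset


# Cell containing Munich (approx. 48.137 N, 11.575 E) at two tree depths.
first_split = cell_for(48.137, 11.575, 1)
fourth_level = cell_for(48.137, 11.575, 4)
```

Because the split positions depend only on the depth and not on the data, the same cell boundaries are obtained for every data set and time interval.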


slide-13
SLIDE 13

Implementation


slide-14
SLIDE 14

Implementation - Form

➢ Data packages available on web server
➢ Dynamic data requests to minimize network traffic
➢ Data package depends on selected time settings


slide-15
SLIDE 15

Implementation - 2-d-tree buildup

➢ Information aggregation in each node
➢ Calculation of all necessary information for the drawable object
➢ Recursive method
➢ Anonymization check during build-up
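
The build-up steps above might look roughly like the following recursive sketch. It is an assumption-laden illustration: measurements are hypothetical `(client_id, lat, lon)` tuples, nodes are plain dictionaries, and a subtree is dropped as soon as it contains fewer than `k` distinct clients.

```python
def build(points, bounds, k, max_depth, depth=0):
    """Recursively build the tree; return None if fewer than k clients are in `bounds`.

    points: list of (client_id, lat, lon); bounds: (south, west, north, east).
    """
    clients = {client_id for client_id, _, _ in points}
    if len(clients) < k:
        return None  # anonymization check: area must not become drawable
    south, west, north, east = bounds
    node = {
        "bounds": bounds,
        "clients": len(clients),        # aggregated information for the drawable object
        "measurements": len(points),
        "children": [],
    }
    if depth == max_depth:
        return node
    if depth % 2 == 0:
        # Split the longitude range into two equal halves.
        mid = (west + east) / 2
        halves = [
            ((south, west, north, mid), [p for p in points if p[2] < mid]),
            ((south, mid, north, east), [p for p in points if p[2] >= mid]),
        ]
    else:
        # Split the latitude range into two equal halves.
        mid = (south + north) / 2
        halves = [
            ((south, west, mid, east), [p for p in points if p[1] < mid]),
            ((mid, west, north, east), [p for p in points if p[1] >= mid]),
        ]
    for child_bounds, child_points in halves:
        child = build(child_points, child_bounds, k, max_depth, depth + 1)
        if child is not None:
            node["children"].append(child)
    return node


# Illustrative measurements (made up): three clients near Munich, one far away.
points = [
    (1, 48.137, 11.575),
    (2, 48.249, 11.651),
    (3, 48.40, 11.74),
    (4, -23.55, -46.63),
]
WORLD = (-90.0, -180.0, 90.0, 180.0)
tree = build(points, WORLD, k=2, max_depth=2)
```

In this example the western half-plane holds only one client and is therefore omitted, so the isolated client only appears as part of the root's aggregate count.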


slide-16
SLIDE 16

Implementation – File transfer

Database server → Web server → Client

  • Database server: creation of data file
  • Database server → Web server: frequent synchronization of completely processed data files
  • Client → Web server: dynamic data request via JS AJAX
  • Web server → Client: sending requested data file

slide-17
SLIDE 17

Evaluation


slide-18
SLIDE 18

Evaluation – GPS data

➢ Overall distance travelled: 65,000 kilometres
➢ 15,000 GPS measurements available
➢ Number of satellites used per measurement:

  • Usually 4 satellites necessary (4 dimensions of space and time)
  • Average between 5 and 6 satellites
  • Probably measurement errors where just 2 satellites were used
  • Measurement of 3 of the 4 dimensions possible with 3 satellites
  • Precise position with many satellites

➢ American NAVSTAR GPS consists of only 32 satellites, but ~60 different and unique GPS signals were measured
→ Ground-based augmentation system used (e.g. Munich airport)

slide-19
SLIDE 19

Evaluation – Location analysis

Global overview

➢ Overview of the distribution of the MEASRDROID application
➢ Almost all measurements were taken in Europe
➢ High measurement density around Munich
➢ Almost no measurements in the north-east of Germany
➢ Course of the Autobahn visible
(Munich → Nuremberg → Kassel, Munich → Augsburg → Stuttgart)
➢ Precise rectangles between Munich and Garching
➢ High density along this axis

slide-20
SLIDE 20

Evaluation – Movement analysis

➢ A lot of movement between Munich and Garching
➢ Bigger arrows in north-south direction than in east-west direction
→ more people moved in north-south direction
➢ Almost no movement off the axis Munich – Garching – Neufahrn

slide-21
SLIDE 21

Network Architectures and Services, Georg Carle Faculty of Informatics Technische Universität München, Germany

Thank you

Timo Lamprecht – Visual analysis of anonymized location and movement data

slide-22
SLIDE 22

Backup


slide-23
SLIDE 23

Basics - GPS

➢ Position determination with satellites
➢ NAVSTAR-II as the most used system
➢ 2010: GLONASS, 2020: GALILEO, Beidou
➢ At least 24 active satellites necessary for global coverage
➢ Precision up to 10 meters

Functionality:

➢ Broadcast of position and time
➢ Measurement of time difference to determine radius
➢ Lack of high-precision clocks in GPS receivers
➢ 4th satellite to measure 4th dimension (time)
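
The functionality above corresponds to the standard pseudorange formulation, sketched here with notation chosen for this example: the receiver position $(x, y, z)$ and its clock offset $b$ are the four unknowns, satellite $i$ broadcasts its position $(x_i, y_i, z_i)$ and send time $t_i$, $t_r$ is the reception time on the receiver clock, and $c$ is the speed of light. Satellite clock errors and atmospheric delays are ignored.

```latex
\rho_i = c\,(t_r - t_i)
       = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2} + c\,b,
\qquad i = 1, \dots, 4
```

Each satellite contributes one equation, so four satellites are needed to solve for the three position coordinates and the clock offset.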


slide-24
SLIDE 24

Design - Achievements

Achievements:

➢ Overall number of GPS measurements
➢ Overall distance travelled
➢ Number of GPS measurements within 1° x 1°
➢ Number of GPS measurements within 100 seconds
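
Two of these achievements can be sketched as follows, assuming a track is a chronologically ordered list of hypothetical `(lat, lon)` pairs. The distance uses the haversine great-circle formula with a mean Earth radius of 6371 km, which is one common choice and not necessarily the method used in the thesis.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def total_distance_km(track):
    """Overall distance travelled: sum over consecutive measurements."""
    return sum(haversine_km(*a, *b) for a, b in zip(track, track[1:]))


def measurements_per_degree_cell(track):
    """Number of measurements within each 1 x 1 degree cell, keyed by (floor(lat), floor(lon))."""
    return Counter((math.floor(lat), math.floor(lon)) for lat, lon in track)


# Approximate coordinates of Munich, Garching and Nuremberg.
track = [(48.137, 11.575), (48.249, 11.651), (49.45, 11.08)]
total = total_distance_km(track)
cells = measurements_per_degree_cell(track)
```

The sum of straight segments between measurements underestimates the real route length when measurements are 15 minutes apart, so the result is best read as a lower bound.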

Achievement server Database server


slide-25
SLIDE 25

Overview

Database server → Transfer → Web server → Client

  • Database server (SQLAlchemy, Python): make data anonymous, build up of data structure
  • Transfer (JSON, existing server scripts)
  • Client (JavaScript, HTML, CSS): integrate data into a street map, presentation on a website

slide-26
SLIDE 26

Implementation – Generic objects

➢ JavaScript prototype framework to simulate classes

➢ Set of functions (* = class name)

  • Constructor: function *
  • *.prototype.createGoogleObject = function
  • *.prototype.setVisible = function
  • *.prototype.getInformationString = function
  • *.prototype.recalculateAttributes = function

➢ Necessary for each supported object type (rectangle, circle, arrow, …) to ensure expandability

slide-27
SLIDE 27

Implementation – SQLAlchemy

➢ SQLAlchemy
➢ Object-oriented notation
➢ Selection of all necessary information
➢ Filter on time interval

slide-28
SLIDE 28

Implementation – Storing of 2-d-tree

➢ Recursive method
➢ Latitude, longitude, children: Node information and tree structure
➢ Attributes: Information for drawable object
➢ Information: Additional information (number of persons, size of area)
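
A recursive serialization along these lines could look like the sketch below. The key names follow the list above, but the `TreeNode` class, the JSON target format and the example values (`opacity`, `persons`, `area_km2`) are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    latitude: float
    longitude: float
    attributes: dict      # information for the drawable object
    information: dict     # additional information shown to the user
    children: list = field(default_factory=list)


def to_dict(node):
    """Recursively convert a node and its subtree into JSON-compatible dictionaries."""
    return {
        "latitude": node.latitude,
        "longitude": node.longitude,
        "attributes": node.attributes,
        "information": node.information,
        "children": [to_dict(child) for child in node.children],
    }


# Illustrative two-node tree (values made up).
leaf = TreeNode(48.2, 11.6, {"opacity": 0.8}, {"persons": 5, "area_km2": 12.0})
root = TreeNode(0.0, 0.0, {"opacity": 0.3}, {"persons": 9, "area_km2": 500.0}, [leaf])

serialized = json.dumps(to_dict(root))
restored = json.loads(serialized)
```

Nesting the children inside each node keeps the tree structure in the file itself, so a client can walk it without rebuilding the tree.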
