
1

The Design of an Acquisitional Query Processor For Sensor Networks

By: Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein and Wei Hong Presented by: Ibrahim Noorzaie

2

Two Questions that come to mind

One of the major purposes of Sensor Networks?

Monitoring (data collection using sensors)

What is the biggest limitation of sensor networks?

Power

3

Outline

Main ideas
Introduction
Sensor Networks Overview
Acquisitional Query Language
Power-Aware Optimization
Power Sensitive Dissemination and Routing
Processing Queries
Topics from supplemental paper
Summary and Conclusion

4

Main Ideas

Acquisitional Issues

When and how often data is physically acquired and delivered to query processing operators

SQL is extended with simple constructs that control data acquisition

Influence query optimization, dissemination and execution

TinyDB: a special-purpose distributed query processor designed for sensor networks

Techniques to reduce power consumption

5

Introduction

One of the main differences between WSNs and other ad hoc networks is minimal human interaction

Interaction here means configuring network routes, recharging, or tuning parameters

Energy efficiency is needed to support long-term operation

Availability of data not assumed at the time of request

Requests for data (queries) are non-blocking and can run for days or months

TinyDB is an acquisitional query processor

Supports select, join, project and data aggregation with power consumption in mind

Some questions about data acquisition answered

When should data be sampled?
Which sensor nodes are relevant to a data request (query)?
In what order should data be sampled?
Is it worth processing a particular sample?

6

Basic Architecture

Queries are submitted at the base-station

Parsing, optimization at the base-station


8

Sensor Network Overview

An early application of WSNs was natural habitat monitoring

Some restrictions/requirements:

No maintenance for up to a year
Monitor sea birds
Monitor burrows (coming/going, temperature, light, etc.)
Power consumption is a serious issue
Data collection intervals (cycles)

9

Properties of Sensor Devices

Power is the main issue
Data collection cycles must be frequent enough to represent the data correctly
Longer sleep times give longer operation of the system
TinyDB is well suited to data collection and power consumption schemes
Power consumption in motes with TinyDB occurs in four phases (sleeping, processing, processing/listening, transmitting)

In the sleeping phase the processor and radio are idle

10

Communication in Sensor Networks

Typical range for a low-power radio is from a few feet to about 100 ft
Radio range depends on environmental conditions
Multi-hop networking is useful in such situations

Communication in Mica motes

Broadcast: any node within radio range can overhear a transmission (snooping)

Acknowledgments come only from neighboring nodes

Per-message, link-level (single-hop) acknowledgments from neighbors
No end-to-end acknowledgments

11

Communication in Sensor Networks

Queries in Sensor networks

Queries are disseminated using a routing tree
Several trees can be built because a node can have multiple candidate parents; this could help support multiple queries

Query: SELECT light WHERE x > 3 AND x < 7


13

Acquisitional Query Language

Focus is on when and how often samples are acquired
ACQL (the acquisitional query language) is a subset of standard SQL with added support for WSN functions
Queries are distributed, and data collection can last for a specified amount of time
A single table, sensors, represents all the sensors in the network
Data is sampled from sensors at regular intervals; each interval is called an epoch
A tuple is one row in the sensors table (one node's readings for one epoch)

Comparing three types of queries:

English query: Return nodeid, light and temperature data from sensors.

Standard SQL query: SELECT nodeid, light, temp FROM sensors;

TinyDB query: SELECT nodeid, light, temp FROM sensors SAMPLE INTERVAL 1s FOR 10s
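The effect of the SAMPLE INTERVAL ... FOR ... clause can be pictured with a small Python sketch. This is only an illustration, not TinyDB code: the function name is hypothetical, and it assumes the first epoch starts at time 0 and that the end of the FOR period is exclusive.

```python
def sample_times(interval_s, duration_s):
    """Return the epoch start times (seconds) for a query that samples
    every `interval_s` seconds for `duration_s` seconds.

    Assumption: the first sample is taken at t = 0 and the end of the
    duration is exclusive. The slides do not state either detail.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    count = int(duration_s // interval_s)
    return [i * interval_s for i in range(count)]


# SAMPLE INTERVAL 1s FOR 10s: ten epochs, one tuple per node per epoch
epochs = sample_times(1, 10)
print(len(epochs), epochs[0], epochs[-1])
```

Each node would contribute one (nodeid, light, temp) tuple per entry in `epochs`.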

14

Acquisitional Query Language

The sensors table is a virtual, unbounded table
All data is streamed back to the root in an online fashion

Certain operations such as sort and symmetric join are not possible on an unbounded stream

Storage point (window): data can be buffered in a node for use by other queries

Makes it possible to do sorts, joins, etc.
Uses interpolation when sample intervals differ

CREATE STORAGE POINT recentlight SIZE 8 AS (SELECT nodeid, light FROM sensors SAMPLE INTERVAL 10s)

SELECT count(*) FROM sensors AS s, recentlight AS r1 WHERE r1.nodeid = s.nodeid AND s.light < r1.light SAMPLE INTERVAL 10s

15

Acquisitional Query Language

TinyDB can use aggregate functions to reduce the amount of traffic in the network
The following type of query is also known as a sliding-window query

It reports the average volume over the last 30s once every 5s, sampling once per second:

SELECT WINAVG(volume, 30s, 5s) FROM sensors SAMPLE INTERVAL 1s
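A rough Python sketch of what a WINAVG(volume, 30s, 5s) style sliding window computes is given below. The function name and the choice to emit output only once a full window exists are assumptions made for illustration; the slides do not say how TinyDB handles partial windows.

```python
def winavg(samples, window, slide):
    """Sliding-window average.

    samples: readings taken one sampling interval apart (1 s here)
    window:  number of most recent samples averaged (30 for 30s at 1 s)
    slide:   number of samples between outputs (5 for 5s at 1 s)

    Assumption: nothing is reported until the first full window exists.
    """
    out = []
    for end in range(window, len(samples) + 1, slide):
        recent = samples[end - window:end]
        out.append(sum(recent) / window)
    return out


# 40 seconds of made-up volume readings: 0, 1, 2, ..., 39
print(winavg(list(range(40)), 30, 5))
```

Only the averages travel toward the root, which is how the aggregate reduces traffic compared with shipping every one-second sample.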

Each query is assigned an id and can be stopped at any time with the “STOP QUERY id” command; alternatively, a time limit can be set with the FOR clause

16

Event-Based Queries

TinyOS uses events to conserve power

Events are triggered either by another query or the OS

TinyDB uses the same model in data collection

Events trigger data collection cycles
Events can also serve as stopping conditions for queries

Event-based query:

ON EVENT bird-detect(loc): SELECT AVG(light), AVG(temp), event.loc FROM sensors AS s WHERE dist(s.loc, event.loc) < 10m SAMPLE INTERVAL 2s FOR 30s

Events are currently local (not distributed)

17

Event-Based Queries

18

Lifetime-Based Queries

Users (researchers) should not have to be concerned with low-level details such as power management

Lifetime-based query

SELECT nodeid, accel FROM sensors LIFETIME 30 days

TinyDB figures out best sampling rate by considering:

Joules of energy remaining
Cost of transmitting sensor data
Cost of accessing sensor data
Etc.


19

Lifetime-Based Queries

Estimation

Available power per hour
Energy to collect and transmit one sample (including children's data)

20

Lifetime-Based Queries

In the previous calculation the user has no control over the sampling rate
The user can set a minimum sample rate to keep the data meaningful:

MIN SAMPLE RATE r

If the estimated rate is lower than r, then r is used
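The estimation can be sketched in Python from the two quantities named on the slides: energy available per hour, and energy needed to collect and transmit one sample. The function and parameter names are hypothetical, the numbers are made up, and the real TinyDB cost model has more terms than this.

```python
def samples_per_hour(joules_remaining, lifetime_hours, joules_per_sample,
                     min_rate=None):
    """Estimate a sustainable sampling rate for a LIFETIME query.

    joules_per_sample is assumed to already include the cost of
    forwarding children's data. min_rate models MIN SAMPLE RATE r:
    if the estimate falls below it, r is used instead (as the slide says).
    """
    energy_per_hour = joules_remaining / lifetime_hours
    rate = energy_per_hour / joules_per_sample
    if min_rate is not None and rate < min_rate:
        rate = min_rate
    return rate


# Made-up numbers: 7200 J left, LIFETIME 30 days (720 h), 0.5 J per sample
print(samples_per_hour(7200.0, 720, 0.5))               # no minimum set
print(samples_per_hour(7200.0, 720, 0.5, min_rate=30))  # minimum wins
```

Note that when the minimum rate wins, the node spends energy faster than the lifetime target allows, so the requested lifetime may not be met.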


22

Power-Aware Optimization

We discuss (power) optimization, query dissemination and execution
Queries are parsed (into a binary form) and optimized at the base station before dissemination
The optimizer needs information (metadata) about nodes

Metadata can include local attributes, events and user-defined functions

23

Metadata Management

Nodes in TinyDB keep a catalog of local information called metadata
Metadata is used in query optimization, query dissemination and result processing
Metadata is also used to keep information on TinyDB’s aggregate functions

Aggregates are classified as monotonic, and as exemplary or summary
COUNT is monotonic, MIN is exemplary, AVERAGE is summary

Metadata of a single attribute

24

Ordering of Sampling and Predicates

We discuss how metadata is used in query optimization
Power consumption depends on sensor type
Short-circuiting can be very useful for conserving power: if a cheap predicate fails, the more expensive sensor need not be sampled

SELECT accel, mag FROM sensors WHERE accel > c1 AND mag > c2 SAMPLE INTERVAL 1s
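To see why the sampling order matters, the sketch below compares the expected energy of the two possible orders for this query. It is an illustration only: the cost and selectivity numbers are invented, the function names are hypothetical, and the predicates are assumed to be independent.

```python
from itertools import permutations


def expected_cost(order, cost, selectivity):
    """Expected energy per epoch when sensors are sampled in `order` and
    evaluation stops at the first predicate that fails.

    cost[a]:        energy to sample attribute a (arbitrary units)
    selectivity[a]: probability that the predicate on a passes
    """
    total, p_reach = 0.0, 1.0
    for attr in order:
        total += p_reach * cost[attr]   # paid only if we get this far
        p_reach *= selectivity[attr]
    return total


def cheapest_order(attrs, cost, selectivity):
    """Try every order (fine for a handful of predicates)."""
    return min(permutations(attrs),
               key=lambda o: expected_cost(o, cost, selectivity))


# Invented numbers: the magnetometer costs 5x the accelerometer per sample.
cost = {"accel": 1.0, "mag": 5.0}
selectivity = {"accel": 0.1, "mag": 0.5}

print(expected_cost(("accel", "mag"), cost, selectivity))
print(expected_cost(("mag", "accel"), cost, selectivity))
print(cheapest_order(["accel", "mag"], cost, selectivity))
```

With these numbers, sampling the cheap and selective accelerometer first avoids most magnetometer samples.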


25

Ordering of Sampling and Predicates

Order in processing aggregates is also important
This reordering is called exemplary aggregate pushdown

Applicable to any exemplary aggregate, e.g. MAX or MIN
In the following query, it may be cheaper to first check whether the current light reading is greater than the previous maximum. If it is not, the more expensive magnetometer need not be sampled.

SELECT MAX(light) FROM sensors WHERE mag > x SAMPLE INTERVAL 8s
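A minimal Python sketch of this pushdown for the MAX(light) query follows. The names are hypothetical, and each magnetometer read is modelled as a zero-argument callable so that the number of expensive samples can be counted.

```python
def max_light_where_mag(readings, x):
    """Compute MAX(light) over readings whose mag > x, sampling the
    magnetometer only when the light reading could change the maximum.

    readings: iterable of (light, read_mag) pairs, where read_mag is a
              callable standing in for the expensive magnetometer sample.
    Returns (max_light_or_None, number_of_mag_samples_taken).
    """
    best = None
    mag_samples = 0
    for light, read_mag in readings:
        if best is not None and light <= best:
            continue  # cannot raise the maximum, so skip the costly sample
        mag_samples += 1
        if read_mag() > x:
            best = light
    return best, mag_samples


# Made-up epochs: (light reading, magnetometer reading behind a callable)
epochs = [(10, lambda: 5), (8, lambda: 9), (12, lambda: 1), (11, lambda: 9)]
print(max_light_where_mag(epochs, 3))
```

Here one of the four magnetometer samples is skipped and the answer is the same as evaluating the WHERE clause first.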

26

Event Query Batching to Conserve Power

An interesting, useful and common-sense approach to saving power
In the event-based model, many instances of similar internal queries may be running at the same time, one for each event
The following query starts a new instance of the query each time an event e occurs

ON EVENT e (nodeid) SELECT a1 FROM sensors AS s WHERE s.nodeid = e.nodeid SAMPLE INTERVAL d FOR k

Our goal is to reduce power consumption by reducing sampling

27

Event Query Batching to Conserve Power

Multi-query optimization based on rewriting

The events of type e are stored in a buffer
Uses a blackboard approach
The internal query services all events in the buffer
The event buffer is checked on each sampling
More power is saved, but exact sampling times are not guaranteed for each event
Some extra random sampling to offset latency
Sliding-window join between events and sensors, with window size k
Events older than k seconds are dropped from the buffer

SELECT s.a1 FROM sensors AS s, events AS e WHERE s.nodeid = e.nodeid AND e.type = e AND s.time - e.time <= k AND s.time > e.time SAMPLE INTERVAL d
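The rewritten query can be pictured as a single sampling loop on one node that serves every buffered event. The Python sketch below is an illustration under that reading; the function name and timestamps are invented, and it implements only the join condition s.time > e.time AND s.time - e.time <= k.

```python
from collections import deque


def run_batched(events, sample_times, k):
    """Serve all buffered events with one sensor sample per epoch.

    events:       sorted event timestamps (seconds) on this node
    sample_times: sorted sampling instants, SAMPLE INTERVAL d apart
    k:            window size in seconds
    Returns a list of (sample_time, events_served) pairs; the sensor is
    sampled only at epochs where the buffer is non-empty.
    """
    pending = deque(events)
    buffer = deque()
    results = []
    for t in sample_times:
        while pending and pending[0] < t:      # events that have occurred
            buffer.append(pending.popleft())
        while buffer and t - buffer[0] > k:    # drop events older than k
            buffer.popleft()
        if buffer:
            results.append((t, list(buffer)))  # one sample, many events
    return results


print(run_batched([0.5, 1.5], [1, 2, 3, 4, 5], k=2))
```

In this made-up run the sensor is sampled three times, whereas one query instance per event would have sampled four times (twice for each event).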

28

Event Query Batching to Conserve Power

This approach offers significant power savings when many events occur close together
Asynchronous events occurring with increasing duration have higher costs, since more simultaneous queries can start executing
The number of events has no effect on the cost of the stream-join approach


30

Power Sensitive Dissemination and Routing

We discussed data and query dissemination in the past several papers
Here we discuss dissemination and routing with power optimization in mind

In multi-hop networks, queries disseminate from sink to source, node by node
A significant amount of power can be saved if a parent can determine that a child has zero probability of producing data for a query
Parent nodes keep track of the attributes of their children by maintaining a Semantic Routing Tree (SRT) based on metadata
The tree limits the scope of a query, reducing dissemination, processing and transmission costs


31

Semantic Routing Tree (SRT)

An SRT is used for tracking which nodes can participate in which queries (by attribute)
The tree is not adaptable to dynamic attributes

Shouldn’t this be 2: [1,1]

32

Semantic Routing Tree (SRT)

SRT built in two phases

Root/parent sends out “SRT build request” containing the name of attribute A

[Figure, built up over slides 32-36: a five-node network (nodes 1-5, with a root node) running SELECT light WHERE x > 3 AND x < 7. Node locations: (1,7), (5,3), (10,3), (8,7), (4,12). As the build request propagates, each node is marked SRT(x); the x intervals shown are 4: [5,5], 5: [10,10], 1: [1,1], 3: [5,10].]
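The two SRT phases can be sketched in Python using x values like those in the five-node example above. The topology and node names below are assumptions (the extracted slides do not make the parent/child links unambiguous), and the functions are illustrative rather than TinyDB's implementation.

```python
def build_intervals(children, x, node, intervals):
    """Bottom-up pass: record, for every node, the [lo, hi] range of x
    covered by that node and all of its descendants."""
    lo = hi = x[node]
    for child in children.get(node, []):
        c_lo, c_hi = build_intervals(children, x, child, intervals)
        lo, hi = min(lo, c_lo), max(hi, c_hi)
    intervals[node] = (lo, hi)
    return lo, hi


def disseminate(children, x, intervals, node, lo, hi, visited, matches):
    """Top-down pass for the predicate lo < x < hi: forward the query only
    to children whose subtree interval overlaps the open range (lo, hi)."""
    visited.append(node)
    if lo < x[node] < hi:
        matches.append(node)
    for child in children.get(node, []):
        c_lo, c_hi = intervals[child]
        if c_hi > lo and c_lo < hi:
            disseminate(children, x, intervals, child, lo, hi,
                        visited, matches)


# Hypothetical topology and names; only the x coordinates follow the slides.
children = {"root": ["a", "b"], "b": ["c", "d"]}
x = {"root": 4, "a": 1, "b": 8, "c": 5, "d": 10}

intervals = {}
build_intervals(children, x, "root", intervals)

visited, matches = [], []
disseminate(children, x, intervals, "root", 3, 7, visited, matches)
print(intervals["b"], visited, matches)
```

For the query x > 3 AND x < 7, the subtree with interval [1,1] and the leaf with interval [10,10] never receive the query, which is where the dissemination savings come from.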


37

Maintaining SRTs

SRTs are limited to constant attributes

Maintenance is important because

New nodes can appear in the network
Link qualities can change
Existing nodes can fail

When new nodes appear in the network or link quality changes

A parent selection message is sent to the new parent n
If n’s interval changes, it notifies its parent, and so on up to the root

Handling failure of existing nodes

Each parent keeps the active query id and last epoch for each child
Send the query to the child and wait for t epochs
On no response, request retransmission of ranges from the remaining children
If the new interval differs in size from the previous one, send the update up the tree

38

Evaluation of Benefits of SRT

Parent selection is an important issue
The benefit of an SRT depends on the quality of the clustering of children beneath their parents

This is especially true in the case of geographic routing

We saw this in the last paper, where a specific region of the network can be selected for flooding instead of the whole network.

Three policies for SRT parent selection

Random approach (pick any parent based on communication reliability)
Closest-parent approach (closest attribute value)

Each parent reports the value of its index attribute with the SRT build request, and each child node picks the parent whose value is closest to its own.

Clustered approach (listen to siblings' selections and then select own parent)

Same approach as closest-parent, but also listens to siblings' selections. This tries to minimize the spread of attribute values
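The closest-parent policy is simple enough to sketch in a few lines of Python. The function name and node ids are hypothetical; ties are broken by whichever candidate was heard first, which is an arbitrary choice the slides do not specify.

```python
def closest_parent(own_value, advertised):
    """Pick the candidate parent whose advertised index attribute value is
    closest to this node's own value.

    advertised: {parent_id: value carried in that parent's SRT build request}
    """
    if not advertised:
        raise ValueError("no candidate parents heard")
    return min(advertised, key=lambda p: abs(advertised[p] - own_value))


# A node with x = 5 hears build requests from parents at x = 1 and x = 8.
print(closest_parent(5, {"p1": 1, "p2": 8}))
```

The clustered policy would additionally take the overheard choices of sibling nodes into account before deciding; that step is not sketched here.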

39

Evaluation of Benefits of SRT

Some simulation results

Clustered approach was superior
Beat the random approach by 25% on average
Beat the closest-parent approach by 10% on average
Real sensor network deployments show significant correlation

40

Evaluation of Benefits of SRT

Advantages of SRT

Provides an efficient mechanism for disseminating queries and collecting results for queries over constant attributes
SRT can reduce overall network traffic

Very useful in geographic distribution

Disadvantages

Maintenance and construction costs


42

Processing Queries-execution

Sequence of operations

Sleep, wake, sample sensors, apply operators to data generated, deliver results

Details of query execution were covered in the last presentation (TAG)


43

Processing Queries-Prioritizing Data Delivery

Sampled data is queued in a radio buffer, which can overflow
The system has to decide whether to discard the overflowing data or other data in the queue, or to use aggregation to combine them

To make a better decision, data is assigned priorities

Three prioritization schemes

Naïve: no tuple is more valuable than any other. FIFO behavior observed in dropping tuples
Winavg: works the same as Naïve but averages the two tuples at the head to make space. A count is associated with the head, since it may be the average of many tuples
Delta: a tuple is assigned an initial score relative to its difference from the most recent tuple successfully transmitted from this node. Tuples with the lowest score are dropped if overflow occurs. Out-of-order delivery is allowed.
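The three schemes can be compared with a small Python sketch of what happens when a tuple arrives at a full queue. All names are hypothetical and the queue is a plain list. One assumption to flag: for the naive scheme the sketch drops the oldest tuple, which is one reading of "FIFO behavior"; dropping the incoming tuple would be an equally plausible reading.

```python
def enqueue_naive(queue, value, capacity):
    """Naive: all tuples equal. Assumed here: drop the oldest on overflow."""
    if len(queue) >= capacity:
        queue.pop(0)
    queue.append(value)


def enqueue_winavg(queue, value, capacity):
    """Winavg: queue holds (value, count) pairs, capacity >= 2. On overflow
    the two entries at the head are merged into a count-weighted average."""
    if len(queue) >= capacity and len(queue) >= 2:
        (v1, c1), (v2, c2) = queue[0], queue[1]
        queue[0:2] = [((v1 * c1 + v2 * c2) / (c1 + c2), c1 + c2)]
    queue.append((value, 1))


def enqueue_delta(queue, value, capacity, last_sent):
    """Delta: queue holds (score, value) pairs, where score is the absolute
    difference from the last transmitted value. On overflow the entry with
    the lowest score is evicted."""
    queue.append((abs(value - last_sent), value))
    if len(queue) > capacity:
        queue.remove(min(queue, key=lambda entry: entry[0]))


naive, winavg_q, delta = [], [], []
for v in [2.0, 4.0, 6.0, 8.0, 10.0]:
    enqueue_naive(naive, v, 3)
    enqueue_winavg(winavg_q, v, 3)
for v in [11, 20, 10.5, 30]:
    enqueue_delta(delta, v, 2, last_sent=10)
print(naive, winavg_q, delta)
```

In the delta queue the readings close to the last transmitted value (11 and 10.5) are the ones evicted, which matches the observation that delta emphasizes extremes while winavg smooths them.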

44

Processing Queries-Prioritizing Data Delivery

The sampling rate was set k times faster than the delivery rate, so that on average 1 out of k tuples is delivered
Observations: Delta emphasizes extremes and Winavg dampens extremes

45

Adapting Rates and Power Consumption

Sampling and transmission rates are selected and adjusted to limit the frequency of network-related losses and the fill rates of queues
These techniques are not available in non-acquisitional query processing systems
Initially, TinyDB’s optimizer selects a transmission and sample rate based on current network load conditions and the currently requested sample rates and lifetimes
Adapting to newer conditions provides better power optimization

46

Adapting Rates and Power Consumption

Observation: n motes transmitting at higher rates deliver less data than the same number of motes transmitting at lower rates.


48

Supplemental paper

Title: “Medians and Beyond: New Aggregation Techniques for Sensor Networks”
Covers more complex queries such as median, consensus, histogram of the data distribution and range queries

Average vs. median

For average, each child node reports two integers (sum and count) to its parent
Median is more complicated, since all distinct values have to be kept, which increases message size and thus bandwidth
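The constant-size state for average can be sketched in Python as a (sum, count) pair that is merged at each parent. The function names are hypothetical and the readings are made up.

```python
def leaf_state(reading):
    """Partial state a node creates from its own reading: (sum, count)."""
    return (reading, 1)


def merge(a, b):
    """Combine two partial states; the result is still just two numbers."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state):
    """Turn the root's partial state into the average."""
    total, count = state
    return total / count


# A parent with reading 10 merges the states of children reading 20 and 30.
state = leaf_state(10)
for child_reading in (20, 30):
    state = merge(state, leaf_state(child_reading))
print(state, finalize(state))
```

No comparable two-number state exists for an exact median, which is why the message size grows with the number of distinct values.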

Quantile digest (q-digest) is a data structure that can be used to process more complex queries with user-controlled error tolerance.

Error tolerance trades off against memory and bandwidth consumption: a tighter error bound costs more of both
Q-digest preserves high-frequency values while compressing low-frequency values. This is useful when there is wide variation in the frequency of values.

Properties of q-digest include the error-memory trade-off, confidence factor and multiple queries.
Q-digest provides information about the distribution of data values, but not about where the values occurred.


49

Summary/Conclusion

TinyDB provides data management as well as better power management, which can extend the lifetime of nodes.
Other research has focused on how to process, energy-efficiently, data that is already available.
In this work, the prior existence of data is not assumed.
This work is concerned with processing as well as acquiring data energy-efficiently.
Query optimization does reduce power consumption.
SRT is a useful way to reduce network traffic and power consumption.
TinyDB offers simple aggregates, e.g. COUNT, SUM and AVERAGE.
For more complex aggregation such as median, quantile digests can be used.