Charu C. Aggarwal, T J Watson Research Center, IBM Corporation, Hawthorne, NY, USA
On Biased Reservoir Sampling in the Presence of Stream Evolution
VLDB Conference, Seoul, South Korea, 2006
Synopsis Construction in Data Streams
[Figure: fractional reservoir utilization vs. progression of stream (points), comparing variable reservoir sampling and fixed reservoir sampling]
[Figures, three panels: absolute error vs. user-specified horizon, comparing biased reservoir and unbiased reservoir]
[Figure: absolute error vs. progression of data stream, comparing biased reservoir and unbiased reservoir]
[Figures, two panels: classification accuracy vs. progression of stream (points), comparing unbiased reservoir and biased reservoir]