Top-k Queries over Uncertain Scores
Qing Liu, Debabrota Basu, Talel Abdessalem, Stéphane Bressan
CoopIS 2016 Qing Liu et.al. Top-k Queries over Uncertain Scores 1 / 19
Introduction
◮ Modern recommendation systems leverage some forms of
◮ Crowdsourcing Platforms
◮ easily announce their needs to the crowd / get access to the information they need
◮ choose the highest quality / most competitively priced
◮ Examples: TripAdvisor
◮ collaborative user or crowdsourced collection of information, e.g., user-generated ratings and reviews, to recommend travel plans and hotels, vacation rentals and restaurants.
Introduction
◮ Crowdsourcing and Collaborative Economy:
◮ communities or crowds rent, share, sell products or services
Introduction
◮ Independent collection of information → uncertainty and
◮ Objects (services, vacation rentals and restaurants...) have
Introduction
◮ Ranking is one of the building blocks of recommendation.
◮ A top-k query returns the sequence of the k objects with the
◮ Price of the apartments.
[Figure: distribution plot for the apartment-price example; horizontal axis ticks 2000 to 6000, vertical axis ticks 0.2 to 1.4 (× 10⁻³).]
◮ With uncertain scores, a top-k query can only return an
Related Work
◮ Soliman, Ilyas and Ben-David [Soliman and Ilyas, 2009] study
◮ In this paper, we consider probabilistic top-k queries under
Problem Definition
◮ O: a set of n objects;
◮ s(o_i): the score of an object o_i ∈ O;
◮ X_i: a random variable equal to s(o_i);
◮ f_i: bounded continuous probability density function of X_i;
◮ π(k) = [o_1, · · · , o_k]: a sequence of k objects in O;
◮ Pr(π(k)): the probability that π(k) is the top-k sequence;
[Equation: Pr(π(k)) written as a nested integral whose lower limits are −∞; the integrand and upper limits are not recoverable.]
◮ (Objective:) Probabilistic top-k sequence: the π(k) that
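One way to make Pr(π(k)) concrete is the nested integral below. It is a sketch under an assumption not stated on the slide, namely that the scores X_1, …, X_n are mutually independent. The symbol F_j (the cumulative distribution function of X_j) is introduced here for convenience. The integral requires X_1 > X_2 > ⋯ > X_k and every remaining score to fall below X_k; the slide's own formula may be written differently.

```latex
\Pr\bigl(\pi^{(k)}\bigr)
  = \int_{-\infty}^{\infty} \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_{k-1}}
    \Bigl[\, \prod_{i=1}^{k} f_i(x_i) \Bigr]
    \Bigl[\, \prod_{j=k+1}^{n} F_j(x_k) \Bigr]
    \, dx_k \cdots dx_1,
\qquad
F_j(x) = \int_{-\infty}^{x} f_j(t)\, dt .
```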
Solutions
◮ Naive: calculate Pr(π(k)) for every possible sequence π(k) and
◮ n!/(n−k)! possible sequences to examine.
◮ Branch-and-Bound [Soliman et al., 2010]: prune some π(k)s.
◮ Worst case: n!/(n−k)! possible sequences to examine.
◮ Soliman’s Algorithm [Soliman et al., 2010]: searches the space of
◮ In this paper, we explore the variants of Markov chain Monte
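To get a feel for why exhaustive enumeration does not scale, the count n!/(n−k)! from the bullets above can be computed directly. This is a minimal sketch; the function name `num_sequences` and the sample values of n and k are illustrative choices, not taken from the slides.

```python
import math
from itertools import permutations

def num_sequences(n: int, k: int) -> int:
    """Number of ordered length-k sequences drawn from n objects: n!/(n-k)!."""
    return math.perm(n, k)

# Cross-check the closed form against explicit enumeration for a tiny case.
objects = ["o1", "o2", "o3", "o4"]
assert num_sequences(4, 2) == len(list(permutations(objects, 2)))

# The count grows quickly with n and k.
for n, k in [(7, 3), (20, 5), (100, 10)]:
    print(f"n={n}, k={k}: {num_sequences(n, k)} candidate sequences")
```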
Solutions
◮ Soliman’s Algorithm
◮ Initial state: a rank over the n objects
◮ Candidate State:
[Figure: two candidate-generation examples over the ranking o_1 o_2 o_3 o_4 o_5 o_6 o_7, with the top-k prefix marked. In the first, o_1 o_2 o_3 o_4 o_5 o_6 o_7 becomes o_1 o_3 o_2 o_4 o_5 o_6 o_7, labelled Pr(o_2 < o_3) and then Pr(o_2 < o_4). In the second, it becomes o_1 o_2 o_3 o_4 o_6 o_5 o_7, labelled Pr(o_6 > o_5) and then Pr(o_6 > o_4).]
◮ Acceptance Probability:
$$\alpha = \min\left( \frac{\Pr(\pi^{(k)}_{t+1}) \cdot \Pr(\pi_t \mid \pi_{t+1})}{\Pr(\pi^{(k)}_{t}) \cdot \Pr(\pi_{t+1} \mid \pi_t)},\ 1 \right)$$
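The acceptance rule above has the shape of the standard Metropolis-Hastings ratio. A minimal sketch follows; `mh_accept_prob`, `mh_step` and their argument names are hypothetical, and the numeric inputs are made up for illustration. How Pr(π(k)) and the proposal probabilities are actually computed is outside this sketch.

```python
import random

def mh_accept_prob(p_new: float, p_old: float,
                   q_back: float, q_fwd: float) -> float:
    """Metropolis-Hastings acceptance probability.

    p_new, p_old: target probabilities of the candidate and current top-k.
    q_fwd:  proposal probability Pr(pi_{t+1} | pi_t).
    q_back: proposal probability Pr(pi_t | pi_{t+1}).
    """
    if p_old <= 0.0 or q_fwd <= 0.0:
        # Assumption of this sketch: always leave a zero-probability state.
        return 1.0
    return min(1.0, (p_new * q_back) / (p_old * q_fwd))

def mh_step(current, candidate, alpha: float, rng: random.Random):
    """Move to the candidate with probability alpha, otherwise stay put."""
    return candidate if rng.random() < alpha else current

# Illustrative numbers only: a candidate four times less probable,
# proposed by a symmetric move.
alpha = mh_accept_prob(p_new=0.01, p_old=0.04, q_back=0.5, q_fwd=0.5)
state = mh_step("current ranking", "candidate ranking", alpha, random.Random(0))
print(alpha, state)
```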
Solutions
◮ Swap and SwapEXP Algorithm
◮ Initial state: a rank over the n objects
◮ Candidate State:
[Figure: the ranking o_1 o_2 o_3 o_4 o_5 o_6 o_7, with the top-k prefix marked, becomes o_1 o_5 o_3 o_4 o_2 o_6 o_7 after o_2 and o_5 are swapped.]
◮ Acceptance Probability:
Swap: $$\alpha = \min\left( \frac{\Pr(\pi^{(k)}_{t+1}) \cdot \frac{1}{kn}}{\Pr(\pi^{(k)}_{t}) \cdot \frac{1}{kn}} = \frac{\Pr(\pi^{(k)}_{t+1})}{\Pr(\pi^{(k)}_{t})},\ 1 \right)$$
SwapEXP: $$\alpha = \min\left( \frac{\widetilde{\Pr}(\pi^{(k)}_{t+1})}{\widetilde{\Pr}(\pi^{(k)}_{t})} = \exp\bigl(\beta(\Pr(\pi^{(k)}_{t+1}) - \Pr(\pi^{(k)}_{t}))\bigr),\ 1 \right), \quad \text{where } \widetilde{\Pr}(\pi^{(k)}) = C_\beta^{-1} \exp\bigl(\beta \Pr(\pi^{(k)})\bigr)$$
◮ SwapEXP is more likely to reject the “worse” candidate state.
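The two acceptance rules can be compared numerically. The sketch below assumes a swap proposal that picks one of the k top positions and one of the n positions uniformly, which is how the 1/(kn) factor is read here. It uses a hypothetical β = 10 and made-up probabilities; how the two rules compare in practice depends on β and on the magnitude of Pr(π(k)).

```python
import math
import random

def propose_swap(ranking: list, k: int, rng: random.Random) -> list:
    """Swap a uniformly chosen top-k position with a uniformly chosen position.

    Each (i, j) pair has probability 1/(k*n), so the proposal is symmetric.
    """
    n = len(ranking)
    i = rng.randrange(k)   # position inside the top-k prefix
    j = rng.randrange(n)   # any position in the ranking
    candidate = ranking[:]
    candidate[i], candidate[j] = candidate[j], candidate[i]
    return candidate

def alpha_swap(p_new: float, p_old: float) -> float:
    """Swap: ratio of target probabilities, clipped at 1."""
    return min(1.0, p_new / p_old)

def alpha_swap_exp(p_new: float, p_old: float, beta: float) -> float:
    """SwapEXP: exponentiated difference of target probabilities, clipped at 1."""
    return min(1.0, math.exp(beta * (p_new - p_old)))

rng = random.Random(0)
print(propose_swap(["o1", "o2", "o3", "o4", "o5", "o6", "o7"], k=3, rng=rng))

# A "worse" candidate: p_new < p_old (illustrative numbers only).
print(alpha_swap(0.2, 0.4))                 # 0.5
print(alpha_swap_exp(0.2, 0.4, beta=10.0))  # exp(-2), about 0.135
```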
Solutions
◮ ReSample and ReSampleEXP Algorithm
◮ Initial state: a rank over the n objects
◮ Candidate State:
[Figure: the ranking with scores o_1: 9, o_2: 8, o_3: 6, o_4: 5, o_5: 4, o_6: 3, o_7: 2, with the top-k prefix marked, becomes o_1: 9, o_2: 8, o_5: 7, o_3: 6, o_4: 5, o_6: 3, o_7: 2 after the score of o_5 changes from 4 to 7.]
◮ Acceptance Probability:
ReSample: $$\alpha = \min\left( \frac{\Pr(\pi^{(k)}_{t+1}) \cdot \Pr(\pi_t \mid \pi_{t+1})}{\Pr(\pi^{(k)}_{t}) \cdot \Pr(\pi_{t+1} \mid \pi_t)},\ 1 \right)$$
ReSampleEXP: $$\alpha = \min\left( \frac{\Pr(\pi_t \mid \pi_{t+1})}{\Pr(\pi_{t+1} \mid \pi_t)} \cdot \exp\bigl(\beta(\Pr(\pi^{(k)}_{t+1}) - \Pr(\pi^{(k)}_{t}))\bigr),\ 1 \right)$$
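The candidate step illustrated above can be sketched as follows: change the score of a single object, then re-sort. The function name `resample_one` is hypothetical, and the new score is passed in explicitly so that the slide's example (the fifth object moving from 4 to 7) is reproduced. In a sampler the new score would presumably be drawn from that object's score distribution.

```python
def resample_one(scores: dict, obj: str, new_score: float) -> tuple:
    """Return (new_scores, new_ranking) after replacing one object's score."""
    new_scores = dict(scores)          # leave the current state untouched
    new_scores[obj] = new_score
    # Rank objects by descending score.
    ranking = sorted(new_scores, key=new_scores.get, reverse=True)
    return new_scores, ranking

scores = {"o1": 9, "o2": 8, "o3": 6, "o4": 5, "o5": 4, "o6": 3, "o7": 2}
_, ranking = resample_one(scores, "o5", 7)
print(ranking)  # ['o1', 'o2', 'o5', 'o3', 'o4', 'o6', 'o7']
```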
Solutions
◮ ReSampleAll Algorithm
◮ Initial state: a rank over the n objects
◮ Candidate State:
[Figure: the ranking with scores o_1: 9, o_2: 8, o_3: 6, o_4: 5, o_5: 4, o_6: 3, o_7: 2, with the top-k prefix marked, becomes o_3: 10, o_2: 9, o_5: 8, o_1: 6, o_7: 4, o_6: 3, o_4: 2 after every score is resampled.]
◮ Acceptance Probability: ReSampleAll: α = 1
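Because every candidate is accepted, ReSampleAll amounts to repeatedly drawing a full set of scores and recording the resulting top-k prefix. The sketch below assumes independent uniform scores on intervals [l_i, u_i]. The intervals, the function name `resample_all_topk` and the step count are made up for illustration. It reports the most frequently seen prefix as an estimate of the probabilistic top-k sequence.

```python
import random
from collections import Counter

def resample_all_topk(intervals: dict, k: int, steps: int, seed: int = 0):
    """Estimate the most probable top-k sequence by resampling all scores.

    intervals maps an object name to (low, high) of its uniform score.
    Returns (best_sequence, estimated_probability).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(steps):
        # Draw a fresh score for every object.
        scores = {o: rng.uniform(lo, hi) for o, (lo, hi) in intervals.items()}
        # Rank by descending score and keep the top-k prefix.
        prefix = tuple(sorted(scores, key=scores.get, reverse=True)[:k])
        counts[prefix] += 1
    best, hits = counts.most_common(1)[0]
    return best, hits / steps

# Hypothetical uncertain scores; o1 and o2 overlap, o3 and o4 overlap.
intervals = {"o1": (0.6, 1.0), "o2": (0.5, 0.9), "o3": (0.1, 0.4), "o4": (0.0, 0.3)}
best, prob = resample_all_topk(intervals, k=2, steps=5000)
print(best, round(prob, 3))
```

Each step sorts all n sampled scores, which is simple but not the cheapest way to extract a top-k prefix.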
Performance Evaluation
◮ Datasets: synthetic datasets
Table: Distributions
                Setting 1       Setting 2      Setting 3
median score    G(0.5, 0.05)    G(0.5, 0.2)    U[0, 1]
width           G(0.5, 0.05)    G(0.5, 0.2)    U[0, 1]
[Figure: three panels of plots; axis ticks run from 0.1 to 1 and from 1 to 8.]
◮ default: uniform score distributions, median score of o_i: (l_i + u_i)/2, width: u_i − l_i
◮ Metrics
◮ Probability of the Probabilistic top-k sequence (higher → better)
◮ Convergence of the Markov chains (Gelman-Rubin Convergence Diagnostic)
◮ Efficiency (Complexity and runtime)
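For reference, the Gelman-Rubin diagnostic compares the between-chain and within-chain variance of a scalar quantity tracked along several chains; values near 1 suggest the chains agree. The sketch below implements the basic form of the statistic with the standard library. Which scalar the authors track along each chain is not stated on the slide, so the inputs here are made-up numbers and `gelman_rubin` is a hypothetical helper name.

```python
from statistics import mean, variance

def gelman_rubin(chains: list) -> float:
    """Basic potential scale reduction factor for equal-length chains.

    Assumes the within-chain variance W is strictly positive.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    w = mean(variance(c) for c in chains)   # within-chain variance W
    b = n * variance(chain_means)           # between-chain variance B
    var_hat = (n - 1) / n * w + b / n       # pooled variance estimate
    return (var_hat / w) ** 0.5

# Chains that agree give a value near (here slightly below) 1.
print(gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]]))
# Chains stuck in different regions give a value far above 1.
print(gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]]))
```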
Performance Evaluation: Effectiveness of Six Algorithms
[Figure: Probability (axis ticks 10⁻⁸ to 10⁻⁵) versus Chain Length (axis scaled by 10⁴) for Soliman, Swap, SwapEXP, ReSample, ReSampleEXP and ReSampleAll. Panels: (a) Dataset5, (b) Dataset21. Two further panels, (c) Dataset5 and (d) Dataset21, survive as axis ticks only.]
Performance Evaluation: Convergence of Six Algorithms
[Figure: Gelman-Rubin Diagnostic (axis ticks 2 to 10) versus Chain Length (axis scaled by 10⁴) for Soliman, Swap, SwapEXP, ReSample, ReSampleEXP and ReSampleAll. Panels: (e) Dataset5, (f) Dataset21. Two further panels, (g) Dataset5 and (h) Dataset21, survive as axis ticks only.]
Performance Evaluation: Efficiency
Table: Worst Case Time Complexity of Generating Next State
                   Soliman    Swap(EXP)    ReSample(EXP)    ReSampleAll
Time Complexity    O(nk)      O(1)         O(n)             O(n log k)
Table: Runtime Per Step of the Algorithms (seconds)
                    Soliman    Swap      SwapEXP    ReSample    ReSampleEXP    ReSampleAll
Runtime Per Step    0.0058     1.9128    0.1163     0.0523      0.0071         0.9056
Conclusion
◮ We explore the design space for Metropolis-Hastings Markov
◮ We verify through extensive experiments that the proposed
◮ ReSampleAll is the best, since it samples directly from the