Mining Airfare Data to Minimize Ticket Purchase Price



SLIDE 1

Mining Airfare Data to Minimize Ticket Purchase Price

Oren Etzioni (UW) Craig Knoblock (USC) Alex Yates (UW) Rattapoom Tuchinda (USC)

SLIDE 2

Etzioni, UW 2

Price change over time for American Airlines flight #192:223, LAX-BOS, departing on Jan. 2.

SLIDE 3

Consumers’ Dilemma

To buy or not to buy… that is the question.

Data mining → price drops

SLIDE 4

Advisor Model

  • 1. Consumer wants to buy a ticket.
  • 2. Hamlet: ‘buy’ (this is a good price).
  • 3. Or: ‘wait’ (a better price will emerge).
  • 4. Notify consumer when price drops.

SLIDE 5

Arbitrage Model

  • 1. “going price” is $900.
  • 2. Hamlet anticipates a price of $400.
  • 3. Hamlet offers a $600 fare.
  • 4. Hamlet buys when the price drops to $400.
  • 5. Consumer saves $300; Hamlet earns $200.

(of course, Hamlet could lose money!)
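
The arithmetic behind the arbitrage example can be checked with a minimal sketch. The function and parameter names are illustrative assumptions, not part of the original slides; the second call shows the loss case mentioned above, using a made-up purchase price of $700.

```python
def arbitrage_outcome(going_price, offered_fare, purchase_price):
    """Return (consumer_saving, hamlet_margin) for a single ticket.

    hamlet_margin is negative when the fare never drops below the
    price Hamlet promised to the consumer.
    """
    consumer_saving = going_price - offered_fare
    hamlet_margin = offered_fare - purchase_price
    return consumer_saving, hamlet_margin


# Numbers from the slide: going price $900, offer $600, purchase at $400.
print(arbitrage_outcome(900, 600, 400))  # (300, 200)

# Hypothetical loss case: the fare only drops to $700.
print(arbitrage_outcome(900, 600, 700))  # (300, -100)
```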

SLIDE 6

Will Flights sell out?

  • 1. Watch the number of empty seats.
  • 2. Upgrade to business class.
  • 3. Place on another flight and give a free ticket.

In our experiment: upgrades were sufficient.

SLIDE 7

Is Airfare Prediction Possible???

Complex “yield management” algorithms.

  • airlines have tons of historical data.

Exogenous events create randomness.
How about the stock market? True markets are unpredictable.
For Hamlet, prices are set by the airlines!

SLIDE 8

Surprising Experimental Result

Savings: buy immediately versus Hamlet.
Optimal: buy at the best possible time.

Though it be madness, yet there be method in it.

HAMLET’s savings were 61.8% of optimal!

SLIDE 9

Data Set

Used Fetch.com’s data collection infrastructure.
Collected over 12,000 price observations:

– Lowest available fare for a one-week roundtrip.
– LAX-BOS and SEA-IAD.
– 6 airlines including American, United, etc.
– 21 days before each flight, every 3 hours.

SLIDE 10

Learning Task Formulation

Input: price observation data.
Algorithm: label observations (decision points); run learner.
Output: classify each decision point → buy versus wait.

SLIDE 11

Formulation Fine Points

Want to learn from the latest data. Run learner nightly to produce a new model.

– Learner is trained on data gathered to date.

Learned policy is a sequence of 21 models. Test set: 8 * 21 decision points for the last 1/3 of the flights.

SLIDE 12

Labeling Training Data

IF price drops between O and now THEN label(O) = wait
ELSE label(O) → Pr(price will drop between now and takeoff)

[Timeline diagram: O, now, takeoff; spans marked 5 days and 11 days]

We estimate Pr based on behavior of past flights.
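
A minimal sketch of this labeling step, under stated assumptions: the data layout (a dict from hours-before-takeoff to fare), the function names, and the 0.5 threshold that turns the estimated probability into a label are all hypothetical and are not given in the slides.

```python
def estimate_drop_probability(past_flights, hours_left):
    """Fraction of past flights whose fare later fell below its level
    at `hours_left` hours before takeoff.

    past_flights: list of dicts mapping hours-before-takeoff -> fare
    (an assumed layout).
    """
    usable = 0
    drops = 0
    for prices in past_flights:
        if hours_left not in prices:
            continue
        usable += 1
        later = [p for h, p in prices.items() if h < hours_left]
        if later and min(later) < prices[hours_left]:
            drops += 1
    return drops / usable if usable else 0.0


def label_observation(price_at_o, prices_since_o, drop_probability,
                      threshold=0.5):
    """Label observation O as 'wait' or 'buy'.

    A drop already seen between O and now gives 'wait'. Otherwise the
    label depends on the estimated probability of a future drop; the
    threshold is an assumption made for this sketch.
    """
    if any(p < price_at_o for p in prices_since_o):
        return "wait"
    return "wait" if drop_probability >= threshold else "buy"


past = [{48: 500, 24: 400}, {48: 500, 24: 550}]
pr = estimate_drop_probability(past, 48)
print(pr, label_observation(500, [520], pr))
```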

SLIDE 13

Candidate Approaches

Fixed: “asap”, 14 days prior, 7 days, …
By hand: an expert looks at the data.
Time series:

– Not effective at price jumps!

Reinforcement learning: Q-learning.

– Used in computational finance.

Rule learning: Ripper, …

Time series model: $P_t = F(P_{t-1}, P_{t-2}, \ldots, P_1)$.

SLIDE 14

Ripper

IF hours-before-takeoff ≥ 252 AND price ≥ 2223 AND route = LAX-BOS THEN wait.

  • Features include price, airline, route, hours-before-takeoff, etc.

  • Learned 20-30 rules…
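
To make the rule format concrete, here is a minimal sketch of how one learned rule could be applied to a decision point. Only the rule with thresholds 252 and 2223 on LAX-BOS comes from the slide; the function names, the ordered rule list, and the "buy" default are assumptions for illustration.

```python
def example_rule(hours_before_takeoff, price, route):
    """The rule shown on the slide; returns None when it does not fire."""
    if hours_before_takeoff >= 252 and price >= 2223 and route == "LAX-BOS":
        return "wait"
    return None


def classify(hours_before_takeoff, price, route, rules, default="buy"):
    """Apply rules in order; the first rule that fires decides.

    The 'buy' default is an assumption for this sketch.
    """
    for rule in rules:
        label = rule(hours_before_takeoff, price, route)
        if label is not None:
            return label
    return default


print(classify(300, 2500, "LAX-BOS", [example_rule]))  # wait
print(classify(100, 2500, "LAX-BOS", [example_rule]))  # buy
```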

SLIDE 15

Simple Time Series

Predict price using a fixed window of k price observations weighted by α.

We used a linearly increasing function for α.

$$p_{t+1} = \frac{\sum_{i=1}^{k} \alpha(i)\, p_{t-k+i}}{\sum_{i=1}^{k} \alpha(i)}$$
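
The weighted-window forecast can be sketched in a few lines. The choice α(i) = i is one possible linearly increasing weighting and is an assumption; the slides do not give the exact function. The function name is also illustrative.

```python
def predict_next_price(prices, k):
    """Weighted average of the last k prices, with weight alpha(i) = i,
    so the most recent observation counts the most.
    """
    if k < 1 or not prices:
        raise ValueError("need k >= 1 and at least one price")
    window = prices[-k:]                  # the k most recent observations
    weights = range(1, len(window) + 1)   # alpha(i) = i (assumed)
    weighted_sum = sum(w * p for w, p in zip(weights, window))
    return weighted_sum / sum(weights)


# (1*100 + 2*200 + 3*300) / (1 + 2 + 3) = 1400 / 6
print(predict_next_price([100, 200, 300], k=3))
```

Because the forecast is an average of recent prices, it lags behind sudden changes, which fits the earlier remark that time series methods are not effective at price jumps.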

SLIDE 16

Q-learning

Natural fit to problem

$$Q(a, s) = R(a, s) + \gamma \cdot \max_{a'} Q(a', s')$$

$$Q(b, s) = -\mathrm{price}(s)$$

$$Q(w, s) = \begin{cases} -300000 & \text{if flight sells out after } s, \\ \max\big(Q(b, s'),\, Q(w, s')\big) & \text{otherwise.} \end{cases}$$
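
A minimal sketch of these Q-values computed backward over one known price trajectory: buying is worth the negative fare, and waiting is worth the best value at the next decision point, or -300000 if the flight sells out. Treating the last decision point as the one after which no ticket is available is an assumption of this sketch, as are the function names; how the values are generalized to unseen flights is not shown here.

```python
SELL_OUT_PENALTY = -300000  # value of waiting when the flight then sells out


def q_table(prices):
    """Return (q_buy, q_wait) lists indexed by decision point.

    prices: fares at successive decision points for one flight.
    Waiting at the last decision point is treated as a sell-out
    (an assumption for this sketch).
    """
    if not prices:
        raise ValueError("need at least one price")
    n = len(prices)
    q_buy = [-p for p in prices]
    q_wait = [0] * n
    q_wait[n - 1] = SELL_OUT_PENALTY
    # Work backward: waiting now is worth the better action at the next point.
    for s in range(n - 2, -1, -1):
        q_wait[s] = max(q_buy[s + 1], q_wait[s + 1])
    return q_buy, q_wait


def policy(prices):
    """Pick the action with the larger Q-value at each decision point."""
    q_buy, q_wait = q_table(prices)
    return ["buy" if b >= w else "wait" for b, w in zip(q_buy, q_wait)]


print(policy([900, 400, 700]))  # ['wait', 'buy', 'buy']
```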

SLIDE 17

Hamlet

Stacking with three base learners:

  • 1. Ripper (e.g., R=wait)
  • 2. Time series
  • 3. Q-learning (e.g., Q=buy)

Ripper used as the meta-level learner. Output: classifies each decision point as ‘buy’ or ‘wait’.
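
A minimal sketch of how the base learners' outputs could be turned into meta-level features. Two things here are assumptions and not the authors' method: the time series forecast is converted to a label by comparing it with the current price, and a plain majority vote stands in for the Ripper rules learned at the meta level. All names are illustrative.

```python
def meta_features(ripper_label, predicted_price, current_price, q_label):
    """Build the feature record passed to the meta-level learner."""
    return {
        "R": ripper_label,
        # Assumed conversion: a forecast below today's fare suggests waiting.
        "TS": "wait" if predicted_price < current_price else "buy",
        "Q": q_label,
    }


def majority_vote(features):
    """Stand-in for the learned meta-level rules (NOT the source's method)."""
    waits = sum(1 for label in features.values() if label == "wait")
    return "wait" if waits >= 2 else "buy"


feats = meta_features("wait", 450.0, 500.0, "buy")
print(feats, majority_vote(feats))
```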

SLIDE 18

Experimental Results

Real price data; simulated passengers.

– Uniform distribution over decision points.
– Requesting specific flights (sensitivity analysis: also a 3-hour interval).

Learner run once per day on “past data”.
Execution: label each purchase point until buy (or sell out).
Compute savings (or loss).
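
The execution step for one simulated passenger can be sketched as follows. The function name, the list-based data layout, and modeling a sell-out as paying a hypothetical `upgrade_fare` are assumptions made for illustration.

```python
def simulate_passenger(prices, labels, arrival, upgrade_fare):
    """Net savings relative to buying immediately at `arrival`.

    prices, labels: per decision point, in time order.
    The passenger follows the labels from `arrival` until the first 'buy'.
    If no 'buy' occurs, the case is treated as a sell-out and the
    passenger pays `upgrade_fare` (assumed model of the penalty).
    """
    cost_now = prices[arrival]
    for t in range(arrival, len(prices)):
        if labels[t] == "buy":
            return cost_now - prices[t]   # positive = saving, negative = loss
    return cost_now - upgrade_fare


prices = [900, 700, 400]
print(simulate_passenger(prices, ["wait", "wait", "buy"], 0, 1500))   # 500
print(simulate_passenger(prices, ["wait", "wait", "wait"], 0, 1500))  # -600
```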

SLIDE 19

Net Savings by Method

[Bar chart: Savings by Method; dollar axis from $0 to $350,000]

Method        % savings
Time Series   -9.5%
Q-Learning    3.4%
By Hand       3.8%
Ripper        3.8%
Hamlet        4.4%
Optimal       7.0%

  • Net savings = cost now – cost at purchase point.
  • Penalty for sell out = upgrade cost, incurred 0.42% of the time (Hamlet).
  • Total ticket cost is $4,579,600.

SLIDE 20

Sensitivity Analysis

Passenger requests any nonstop flight in a 3-hour interval:

[Bar chart: Interval Savings; dollar axis from $0 to $350,000]

Method        % savings
Time Series   -5.7%
Q-Learning    3.3%
By Hand       3.6%
Ripper        3.8%
Hamlet        4.2%
Optimal       7.1%

SLIDE 21

Upgrade Penalty

Method        Upgrade Cost   % Upgrades
Optimal       $0             0%
By hand       $22,472        0.36%
Ripper        $33,340        0.45%
Time Series   $693,105       33.00%
Q-learning    $29,444        0.49%
Hamlet        $38,743        0.42%

SLIDE 22

Discussion

76% of the time, no savings were possible.
Uniform distribution over 21 days.
33% of the passengers arrived in the last week.
No passengers arrived more than 21 days before takeoff.
Simulation understates possible savings!

SLIDE 23

Savings on “Feasible” Flights

Method        Net Savings
Optimal       30.6%
By hand       21.8%
Ripper        20.1%
Time Series   25.8%
Q-learning    21.8%
Hamlet        23.8%

Comparison of Net Savings (as a percent of total ticket price) on Feasible Flights

SLIDE 24

Related Work

Trading agent competition.

– Auction strategies

Temporal data mining. Time Series. Computational finance.

SLIDE 25

Future Work

More tests: international, multi-leg, hotels, etc.
Cost-sensitive learning (tried MetaCost).
Additional base learners.
Bagging/boosting.
Refined predictions.
Commercialization: patent, license.

SLIDE 26

Conclusions

  • 1. Dynamic pricing is prevalent.
  • 2. Price mining à la Hamlet is feasible.
  • 3. Price drops can be surprisingly predictable.
  • 4. Need additional studies and algorithms.
  • 5. Great potential to help consumers!

All’s well that ends well.

SLIDE 27

Savings by Method

Method        Savings    Losses    Upgrade Cost   % Upgrades   Net Savings   % Savings   % of Optimal
Optimal       $320,572   $0        $0             0%           $320,572      7.0%        100.0%
By hand       $228,318   $35,329   $22,472        0.36%        $170,517      3.8%        53.2%
Ripper        $211,031   $4,689    $33,340        0.45%        $173,002      3.8%        54.0%
Time Series   $269,879   $6,138    $693,105       33.00%       -$429,364     -9.5%       -134.0%
Q-learning    $228,663   $46,873   $29,444        0.49%        $152,364      3.4%        47.5%
Hamlet        $244,868   $8,051    $38,743        0.42%        $198,074      4.4%        61.8%

  • Savings over “buy now”.
  • Penalty for sell out = upgrade cost.
  • Total ticket cost is $4,579,600.
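
As a worked check, the Hamlet and Time Series figures on this slide are consistent with net savings = savings - losses - upgrade cost, and with "% of Optimal" being net savings divided by the optimal net savings. The function name below is illustrative; the numbers are taken from the slide.

```python
def net_savings(savings, losses, upgrade_cost):
    """Net savings as gross savings minus losses and upgrade penalties."""
    return savings - losses - upgrade_cost


optimal = net_savings(320_572, 0, 0)
hamlet = net_savings(244_868, 8_051, 38_743)
time_series = net_savings(269_879, 6_138, 693_105)

print(hamlet, round(100 * hamlet / optimal, 1))            # 198074 61.8
print(time_series, round(100 * time_series / optimal, 1))  # -429364 -133.9
```

The Time Series ratio comes out as -133.9 here against -134.0 on the slide, a difference in rounding only.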

SLIDE 28

Sensitivity Analysis

Passenger requests any nonstop flight in a 3 hour interval:

Method        Net Savings   % of Optimal   % Upgrades
Optimal       $323,802      100.0%         0.0%
By hand       $163,523      55.5%          0.0%
Ripper        $173,234      53.5%          0.0%
Time Series   -$262,749     -81.1%         6.3%
Q-Learning    $149,587      46.2%          0.2%
Hamlet        $191,647      59.2%          0.1%

SLIDE 29

Another Chart

[Bar chart: Savings by Method; gross savings and net savings for Time Series, Q-learning, By hand, Ripper, Hamlet, and Optimal; dollar axis from ($500,000) to $400,000]