A Two-Stage Method for Commodity Price Trend Forecasting
SIGIR Workshop: FinIR, 30th July 2020
ustc_youdu: Bingjie Liang, Huixin Liu and Chujing He
University of Science and Technology of China
- Problem Description
The main task is to build prediction models that use data from 2003 to 2018 to predict the price movement direction of six metals in 2019 at three time horizons.
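The direction-prediction task can be illustrated with a minimal sketch of how binary trend labels might be derived from a closing-price series. The horizons of 1, 20 and 60 trading days match the task names in the results; the labeling rule (1 if the future close is strictly higher than the current close, otherwise 0) and the helper name `trend_labels` are assumptions for illustration, not taken from the slides.

```python
from typing import List, Optional

# Horizons in trading days, matching Task1(1d), Task2(20d), Task3(60d).
HORIZONS = {"1d": 1, "20d": 20, "60d": 60}


def trend_labels(close: List[float], horizon: int) -> List[Optional[int]]:
    """Label day t with 1 if close[t + horizon] > close[t], else 0.

    Assumed rule for illustration. The last `horizon` days have no
    future price available, so they are labeled None.
    """
    labels: List[Optional[int]] = []
    for t in range(len(close)):
        if t + horizon < len(close):
            labels.append(1 if close[t + horizon] > close[t] else 0)
        else:
            labels.append(None)
    return labels


# Toy closing prices (made-up values, not LME data).
close = [100.0, 101.0, 99.0, 102.0, 103.0]
labels_1d = trend_labels(close, 1)
labels_2d = trend_labels(close, 2)
```

With real data, the same function would be called once per metal and per horizon in `HORIZONS`.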
- Data Description
- Time series data
– Daily transaction data in LME for six metals
– Daily transaction data of main relevant commodities and financial indices
- Textual data
– Analyst Reports published by institutional traders and News Reports, collected from both English and Chinese sources
- Data Preprocessing
- Fields in the daily transaction data
– Opening price
– Highest price
– Lowest price
– Closing price
– Trading volume
– Open interest
- Feature Selection
- Use different feature combinations to train traditional classification models
– Naïve Bayes, KNN, Random Forest, SVM
- Input of classification model
– Feature values of the current trading day and the expected trading day
- Important features
– Open_Price, High_Price, Low_Price, Close_Price
- Diagram: feature values of the current day and the expected day (input) → classification model → trend label: 0/1 (output)
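The feature-combination search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it enumerates every non-empty subset of the four price features, builds each input vector from the current-day and expected-day values, and scores a small hand-written KNN classifier (one of the four model families listed) on synthetic data. The helpers `to_vector`, `knn_predict`, `evaluate_combinations` and `make_toy_samples`, the choice k=3, and the toy data are all hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

FEATURES = ["Open_Price", "High_Price", "Low_Price", "Close_Price"]
Row = Dict[str, float]
Sample = Tuple[Row, Row, int]  # (current day, expected day, trend label)


def to_vector(sample: Sample, features: Sequence[str]) -> List[float]:
    """Concatenate the chosen features of the current and expected day."""
    current, expected, _ = sample
    return [current[f] for f in features] + [expected[f] for f in features]


def knn_predict(train: List[Sample], query: Sample,
                features: Sequence[str], k: int = 3) -> int:
    """Majority vote among the k nearest training samples (squared L2)."""
    q = to_vector(query, features)
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2
                          for a, b in zip(to_vector(s, features), q)),
    )[:k]
    votes = sum(s[2] for s in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def evaluate_combinations(train: List[Sample], valid: List[Sample],
                          k: int = 3) -> Dict[Tuple[str, ...], float]:
    """Validation accuracy for every non-empty feature combination."""
    scores: Dict[Tuple[str, ...], float] = {}
    for size in range(1, len(FEATURES) + 1):
        for combo in combinations(FEATURES, size):
            hits = sum(knn_predict(train, s, combo, k) == s[2] for s in valid)
            scores[combo] = hits / len(valid)
    return scores


def make_toy_samples(n: int, seed: int) -> List[Sample]:
    """Synthetic samples for illustration only (not LME data)."""
    rng = random.Random(seed)

    def row(close: float) -> Row:
        open_ = close + rng.uniform(-1, 1)
        return {"Open_Price": open_, "High_Price": max(open_, close) + 1.0,
                "Low_Price": min(open_, close) - 1.0, "Close_Price": close}

    samples: List[Sample] = []
    for _ in range(n):
        cur = rng.uniform(90, 110)
        exp = cur + rng.uniform(-5, 5)
        samples.append((row(cur), row(exp), int(exp > cur)))
    return samples


train = make_toy_samples(80, seed=0)
valid = make_toy_samples(20, seed=1)
scores = evaluate_combinations(train, valid)
best_combo = max(scores, key=scores.get)
```

Four features give 15 combinations, so an exhaustive search is cheap here; the same loop could wrap any of the other listed classifiers in place of `knn_predict`.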
- Model Structure
- First Stage: Prediction
– Input: Close_Price (lag=5), normalized, as the sequence x_{t-lag+1} ⋯ x_{t-1} x_t (Length = lag)
– Model: LSTM Network
– Output: Close_Price_Pred at three horizons: x_{t+1} (Close_Price_Pred_1d), x_{t+20} (Close_Price_Pred_20d), x_{t+60} (Close_Price_Pred_60d)
- Second Stage: Classification
– Input: Close_Price together with Close_Price_Pred_1d, Close_Price_Pred_20d or Close_Price_Pred_60d
– Classifiers, in slide order: Random Forest, LGB Classifier, Random Forest
– Output: 1d label: 0/1, 20d label: 0/1, 60d label: 0/1
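The data flow of the two stages can be sketched in plain Python. This is a hedged illustration of the input shapes only: it builds windows of length lag=5 from a closing-price series, pairs each window with the closing prices 1, 20 and 60 days ahead as first-stage targets, and forms the (Close_Price, Close_Price_Pred) pairs that the second-stage classifiers would receive. The slides say only "normalize", so per-window min-max scaling is an assumption, as are the helper names `min_max_normalize`, `build_windows` and `second_stage_inputs`. The LSTM and the classifiers themselves are not implemented here.

```python
from typing import Dict, List, Sequence, Tuple

LAG = 5                 # window length, from Close_Price(lag=5)
HORIZONS = (1, 20, 60)  # prediction horizons in trading days


def min_max_normalize(window: Sequence[float]) -> List[float]:
    """Scale a window to [0, 1]; assumed normalization scheme."""
    lo, hi = min(window), max(window)
    span = hi - lo
    if span == 0:
        return [0.0] * len(window)
    return [(v - lo) / span for v in window]


def build_windows(
    close: Sequence[float],
    lag: int = LAG,
    horizons: Sequence[int] = HORIZONS,
) -> List[Tuple[List[float], Dict[int, float]]]:
    """Return (normalized window, {horizon: future close}) pairs.

    The window for day t is close[t-lag+1 .. t]; targets are kept in
    raw price units in this sketch.
    """
    samples = []
    for t in range(lag - 1, len(close) - max(horizons)):
        window = close[t - lag + 1 : t + 1]
        targets = {h: close[t + h] for h in horizons}
        samples.append((min_max_normalize(window), targets))
    return samples


def second_stage_inputs(close_t: float,
                        preds: Dict[int, float]) -> Dict[int, List[float]]:
    """Pair the current close with each predicted close, per horizon."""
    return {h: [close_t, p] for h, p in preds.items()}


# Toy series (made-up values): 70 days of steadily rising prices.
close = [float(100 + i) for i in range(70)]
samples = build_windows(close)

# Stand-in for LSTM outputs: reuse the true targets of the first sample.
stage2 = second_stage_inputs(close[LAG - 1], samples[0][1])
```

In the pipeline shown on the slide, the values passed as `preds` would come from the trained LSTM rather than from the true future prices used as a stand-in here.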
- Prediction Curve
Figure: Prediction curves for Aluminum on validation set
- Results
Total         Task1(1d)     Task2(20d)    Task3(60d)
55.18225736   50.00000000   48.74835310   66.79841897
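The reported Total is numerically consistent with the unweighted mean of the three task scores. The short check below shows the arithmetic; reading Total as a simple average is an inference from the numbers, not something the slides state.

```python
# Scores as reported on the results slide.
task_scores = {
    "Task1(1d)": 50.00000000,
    "Task2(20d)": 48.74835310,
    "Task3(60d)": 66.79841897,
}
reported_total = 55.18225736

# Unweighted mean of the three task scores.
mean_score = sum(task_scores.values()) / len(task_scores)
difference = abs(mean_score - reported_total)
```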