
SLIDE 1

Do NOT measure correlated observables, but train an Artificial Intelligence to predict them

Boram Yoon
Los Alamos National Laboratory
Lattice 2018, East Lansing, Michigan, USA, July 22-28, 2018
arXiv:1807.05971

SLIDE 2

Prediction of C3pt from C2pt

[Figure: gA, gS, gT, gV, comparing "Genuine" and "ML Prediction" estimates]

Genuine: Directly measured on 2263 confs

ML Prediction: Directly measured on 400 confs + ML prediction on 1863 confs

Systematic error due to ML prediction included in error bars

SLIDE 3

Lattice QCD Observables are Correlated

Markov Chain Monte Carlo Trajectory of Gibbs Samples: U(1), U(2), U(3), U(4), U(5), U(6), U(7), U(8), U(9)

Observables measured on the same configuration share its fluctuations, e.g.

U(1): {Mπ(1), Fπ(1), C3pt:A(1), C3pt:V(1), …}
U(7): {Mπ(7), Fπ(7), C3pt:A(7), C3pt:V(7), …}

Expectation value

$\langle O_X \rangle \approx \frac{1}{N} \sum_{n=1}^{N} O_X^{(n)}$
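The ensemble average above can be sketched in a few lines. This is only an illustration on synthetic numbers: the array `o_x` is a hypothetical stand-in for an observable measured on N configurations, and the naive standard error assumes the samples are independent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption) for an observable O_X on N configurations.
N = 1000
o_x = rng.normal(loc=1.0, scale=0.1, size=N)

# Ensemble average <O_X> ~ (1/N) * sum_n O_X^(n).
mean = o_x.sum() / N

# Naive standard error, valid only for independent samples.
stderr = o_x.std(ddof=1) / np.sqrt(N)

print(f"<O_X> = {mean:.4f} +/- {stderr:.4f}")
```

On a real Markov chain the samples are autocorrelated, so the error estimate would need binning or a similar treatment.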

SLIDE 4

Correlation Map of Nucleon Observables

Correlation between proton (uud) 3-pt and 2-pt correlation functions

Clover-on-HISQ

a = 0.089 fm, Mπ = 313 MeV, τ = 10a, t = τ/2

Using these correlations, C3pt can be estimated from C2pt on each configuration

[Figure: map of |Correlation coefficient| (color scale 0.2 to 1) among C2pt, C3pt(S,U), C3pt(V,U), C3pt(A,U), C3pt(T,U), C3pt(S,D), C3pt(V,D), C3pt(A,D), C3pt(T,D)]
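A correlation map of this kind can be computed from per-configuration measurements with `numpy.corrcoef`. The sketch below uses made-up data: the three arrays are hypothetical stand-ins for C2pt and two C3pt channels, and a shared random component is injected by hand to mimic the correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_conf = 500

# Hypothetical per-configuration data: a shared fluctuation makes the
# "2pt" and "3pt" values correlated with each other.
shared = rng.normal(size=n_conf)
c2pt = shared + 0.3 * rng.normal(size=n_conf)
c3pt_a = 0.8 * shared + 0.6 * rng.normal(size=n_conf)
c3pt_s = 0.4 * shared + 0.9 * rng.normal(size=n_conf)

# Rows are observables, columns are configurations.
data = np.vstack([c2pt, c3pt_a, c3pt_s])

# Matrix of |correlation coefficient| between every pair of observables.
corr = np.abs(np.corrcoef(data))
print(np.round(corr, 2))
```

Entries close to 1 mark pairs where one observable carries most of the information about the other.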

SLIDE 5

Machine Learning

One can consider the machine learning (ML) process as a data fitting

The machine F has very general fitting functional form with huge number of free parameters

The free parameters are determined from large number of training data: $F(X_i) \approx Y_i$

For example,

$X_i$: pixels of a picture
$Y_i$: “cat” or “dog”

Machine F

Input: $X_i = (x_i^1, x_i^2, x_i^3, \ldots)$

Output: $Y_i$

SLIDE 6

Machine Learning on Lattice QCD Observables

Assume N+M indep. measurements
Common observables $X_i$ on all N+M
Target observable $O_i$ on first N

Machine F

Input: $X_i = (o_i^1, o_i^2, o_i^3, \ldots)$

Output: $O_i$

1) Train machine F to yield $O_i$ from $X_i$ on the Training Data

2) Predict $O_i$ of the Test data from $X_i$

$F(X_i) = O_i^P \approx O_i$

[Training Data]: N samples $(X_i, O_i)$
[Test Data]: M samples $(X_i)$
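The two steps (train on the N labeled measurements, predict on the remaining M) might look as follows with scikit-learn's `GradientBoostingRegressor`. The data here are synthetic and the functional relation between `X` and `O` is an arbitrary assumption chosen only for illustration; it is not taken from the lattice calculation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
N, M = 300, 700  # labeled and unlabeled measurements

# Hypothetical data: X_i are "common observables", O_i is the target.
X = rng.normal(size=(N + M, 3))
O = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=N + M)

F = GradientBoostingRegressor(random_state=0)
F.fit(X[:N], O[:N])        # 1) train on the first N measurements
O_pred = F.predict(X[N:])  # 2) predict the target on the other M

# In a real application O[N:] is unknown; here it is used as a check.
corr = np.corrcoef(O_pred, O[N:])[0, 1]
print(f"correlation between prediction and truth: {corr:.3f}")
```

A high correlation means the prediction tracks the target configuration by configuration, but it does not by itself guarantee that the average of the predictions is unbiased.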

SLIDE 7

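Prediction Bias

$F(X_i) = O_i^P \approx O_i$

Simple average

$\overline{O} = \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^P$

is not correct due to prediction bias

Prediction = TrueAnswer + Noise + Bias
ML prediction may have bias

$\langle O_i^P \rangle \neq \langle O_i \rangle$

Bias = $\langle O_i^P - O_i \rangle$

[Figure: illustration of High Bias vs. Low Bias and Low Variance vs. High Variance]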

SLIDE 8

Bias Correction

Average of predictions on test data with bias correction

$\overline{O} = \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^P + \frac{1}{N_b} \sum_{i=N_t+1}^{N_t+N_b} \left( O_i - O_i^P \right)$

Expectation value, $\langle \overline{O} \rangle = \langle O_i^P \rangle + \langle O_i - O_i^P \rangle = \langle O_i \rangle$

Training data should not overlap with bias correction data
Not efficient: small training/bias correction data

[Training Data]: $N_t$ samples $(X_i, O_i)$
[Bias Correction Data]: $N_b$ samples $(X_i, O_i)$
[Test Data]: $M$ samples $(X_i)$
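A small numerical sketch of the bias-corrected average, assuming the labeled set is split as N = N_t + N_b. To make the bias visible, the "machine" here is deliberately poor (a least-squares line through the origin, fitted with plain NumPy); this is a stand-in chosen for illustration, not the BDT used in the talk, and the data are synthetic with a known true mean of 1.

```python
import numpy as np

rng = np.random.default_rng(3)
N_t, N_b, M = 200, 200, 5000

# Hypothetical data with true mean <O> = 1.
x = rng.normal(size=N_t + N_b + M)
O = 1.0 + x + 0.2 * rng.normal(size=x.size)

# Deliberately biased machine: least-squares line through the origin,
# trained on the first N_t samples only.
c = np.dot(x[:N_t], O[:N_t]) / np.dot(x[:N_t], x[:N_t])


def predict(x_in):
    return c * x_in


bias_set = slice(N_t, N_t + N_b)    # labeled, not used for training
test_set = slice(N_t + N_b, None)   # unlabeled in a real application

naive = predict(x[test_set]).mean()                      # biased average
correction = (O[bias_set] - predict(x[bias_set])).mean()  # <O - O^P>
corrected = naive + correction

print(f"naive = {naive:.3f}, corrected = {corrected:.3f}")
```

The naive average misses the constant offset entirely, while the corrected average recovers it, at the price of the extra statistical noise coming from the N_b bias-correction samples.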

SLIDE 9

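Bias Correction - Cross Validation

Average of predictions on test data with bias correction

$\overline{O} = \frac{1}{K} \sum_{k=1}^{K} \left[ \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^{P,k} + \frac{1}{m} \sum_{j=1}^{m} \left( O_j^{k} - O_j^{P,k} \right) \right]$

$K = N/m$, $m \ll N$

Full training data & precise bias estimation
Systematic error of ML prediction naturally included in error estimation

For each $k$, the $N$ labeled samples are split into $N-m$ training samples and $m$ bias correction samples:

$k = 1 \rightarrow F_1$, $O_i^{P,1} = F_1(X_i)$

$k = 2 \rightarrow F_2$, $O_i^{P,2} = F_2(X_i)$

$k = 3 \rightarrow F_3$, $O_i^{P,3} = F_3(X_i)$

…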
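The cross-validation estimator can be sketched as a loop over K = N/m folds: each fold trains on N-m labeled samples, predicts the test data, and uses the held-out m samples for its bias correction. As an assumption made only for illustration, the machine is again a deliberately biased least-squares line through the origin on synthetic data with true mean 1; the helper name `train` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, m = 400, 2000, 40
K = N // m  # number of folds

# Hypothetical data with true mean <O> = 1.
x = rng.normal(size=N + M)
O = 1.0 + x + 0.2 * rng.normal(size=x.size)
x_lab, O_lab, x_test = x[:N], O[:N], x[N:]


def train(x_tr, O_tr):
    """Return a deliberately biased machine: a line through the origin."""
    c = np.dot(x_tr, O_tr) / np.dot(x_tr, x_tr)
    return lambda x_in: c * x_in


estimates = []
for k in range(K):
    held = np.zeros(N, dtype=bool)
    held[k * m:(k + 1) * m] = True           # m samples for bias correction
    F_k = train(x_lab[~held], O_lab[~held])  # train on the other N - m
    pred_avg = F_k(x_test).mean()            # (1/M) sum_i O_i^{P,k}
    bias_corr = (O_lab[held] - F_k(x_lab[held])).mean()
    estimates.append(pred_avg + bias_corr)

O_bar = np.mean(estimates)                   # average over the K folds
print(f"cross-validated estimate: {O_bar:.3f}")
```

Every labeled sample is used once for bias correction and K-1 times for training, which is the efficiency gain over a single fixed split.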

SLIDE 10

Prediction of C3pt from C2pt

Boosted Decision Tree Regression

Input: $X_i = \{ C_{2pt}(\tau),\ 0 \le \tau/a \le (\tau/a)_{\max} \}$

Output: $C_{3pt}^{A,S,T,V}(\tau, t)$

$C_{2pt}(\tau) \rightarrow C_{3pt}^{A,S,T,V}(\tau, t)$

SLIDE 11

Decision Tree Regression

Target: $C_{3pt}$ at $\tau/a = 10$, $t/a = 5$

Input: $\{ C_{2pt}(\tau),\ 0 \le \tau/a \le 20 \}$

Output: $C_{3pt}(10, 5)$

SLIDE 12

Boosted Decision Tree (BDT)

Iterative boosting

$F_0$ = [Simple DT $h_0$]

$F_1 = F_0$ + [Simple DT $h_1$ that corrects residual error of $F_0$]

$F_2 = F_1$ + [Simple DT $h_2$ that corrects residual error of $F_1$]

$F_3 = F_2$ + [Simple DT $h_3$ that corrects residual error of $F_2$]

…

$F_n = F_{n-1} + h_n$

$F(X) = F_{N_{boost}}(X)$

In this study, $N_{boost}$ = 200-500
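The iteration F_n = F_{n-1} + h_n can be written out by hand with shallow `DecisionTreeRegressor` trees, each fitted to the residual of the current prediction. This is a bare sketch of the idea on a toy function, not the production setup: there is no shrinkage or subsampling, the tree depth of 2 and the count of 50 trees are arbitrary choices, and library implementations of gradient boosting differ in these details.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)

# Toy regression problem (assumption): y = x0^2 + 2*x1.
X = rng.uniform(0, 10, size=(300, 2))
y = X[:, 0] ** 2 + 2 * X[:, 1]

n_boost = 50              # arbitrary; the study quotes 200-500
trees = []
F = np.zeros_like(y)      # running prediction on the training set
errors = []

for n in range(n_boost):
    # Simple DT h_n fitted to the residual error of F_{n-1}.
    h = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y - F)
    trees.append(h)
    F = F + h.predict(X)  # F_n = F_{n-1} + h_n
    errors.append(np.mean((y - F) ** 2))


def predict(X_new):
    """Evaluate the boosted model: the sum of all trees."""
    return sum(h.predict(X_new) for h in trees)


print(f"training MSE after first tree: {errors[0]:.2f}")
print(f"training MSE after last tree:  {errors[-1]:.2f}")
```

Because each tree predicts the mean residual in its leaves, the training error cannot increase from one step to the next; how well this carries over to unseen data depends on the number and depth of the trees.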

SLIDE 13

Decision Tree $h_n$ for $C_{3pt}(10, 5)$

SLIDE 14

Decision Tree $h_n$ for $C_{3pt}(10, 5)$

SLIDE 15

Decision Tree $h_n$ for $C_{3pt}(10, 5)$

SLIDE 16

Prediction of C3pt from C2pt

Training and Test performed for
Clover-on-HISQ
a = 0.089 fm, Mπ = 313 MeV
Measurements: 2263 confs × 64 srcs
# of Training data: 400 confs
# of Test data: 1863 confs

Predictions of $C_{3pt}(10, 5) / C_{2pt}(10)$

[Figure: histograms (Frequency) at τ = 10, t = 5 for the Axial, Vector, Scalar and Tensor C3pt, in units of 10^-16, comparing C3pt(Genu) - <C3pt(Genu)> with C3pt(Genu) - C3pt(Pred)]

SLIDE 17

Prediction of C3pt from C2pt

[Figure: gA(u-d) and gV(u-d) versus t - τ/2 for τ = 8, 10, 12, 14 and τ = ∞; panels (a) Train, (b) Genuine, (c) Pred.[2pt], (d) Pred.[2pt+3pt(12)]]

SLIDE 18

Prediction of C3pt from C2pt

[Figure: gS(u-d) and gT(u-d) versus t - τ/2 for τ = 8, 10, 12, 14 and τ = ∞; panels (a) Train, (b) Genuine, (c) Pred.[2pt], (d) Pred.[2pt+3pt(12)]]

SLIDE 19

Prediction of C3pt from C2pt

Results extrapolated to τ → ∞

Charge   Genuine                  Pred.[C2pt]           Pred.[C2pt+C3pt(12)]
gS       0.985(22)                1.013(30)             1.008(21)
gA       1.2304(48)               1.2243(67)            1.2268(54)
gT       1.0312(52)               1.0342(61)            1.0304(54)
gV       1.0432(20)               1.0412(23)            1.0413(21)
Data     2263 DM (Direct Meas.)   400 DM + 1863 Pred.   400 DM + 1863 Pred.

SLIDE 20

Quark Chromo EDM (cEDM)

Simulation in presence of CPV cEDM interaction

$S = S_{QCD} + S_{cEDM}$, $S_{cEDM} = -\frac{i}{2} \int d^4x\, \tilde{d}_q\, g_s\, \bar{q} (\sigma \cdot G) \gamma_5 q$

Schwinger source method

Include cEDM term in valence quark propagators by modifying Dirac operator

$D_{clov} \rightarrow D_{clov} + i \varepsilon \sigma^{\mu\nu} \gamma_5 G_{\mu\nu}$

cEDM contribution to nEDM can be obtained by calculating vector form-factor F3 with propagators including cEDM & $O_{\gamma_5} = \bar{q} \gamma_5 q$

[Figure: quark-line diagrams (u, d, d) built from propagators P and P_ε, with sequential (seq) propagators]

SLIDE 21

Prediction of C2pt^CPV from C2pt

()*+,

(

.

(/01234/

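Predict C2pt for cEDM and γ5 insertions from C2pt without CPV

CPV interactions ⇒ phase in neutron mass

$\left( i p_\mu \gamma_\mu + m e^{-2i\alpha\gamma_5} \right) u_N = 0$

At leading order, α can be obtained from

$C_{2pt}^{P} \equiv \mathrm{Tr}\left[ \gamma_5 \langle N \bar{N} \rangle \right]$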

SLIDE 22

Prediction of C2pt^CPV from C2pt

Boosted Decision Tree Regression

Input: $X_i = \{ \mathrm{Re}, \mathrm{Im}[ C_{2pt}^{S,P}(t) ],\ 0 \le t/a \le 16 \}$

Output: $\mathrm{Im}\, C_{2pt}^{P}(t)$ for the cEDM and γ5 insertions

Training and Test performed for
Clover-on-HISQ
a = 0.12 fm, Mπ = 305 MeV
Measurements: 400 confs × 64 srcs
# of Training data: 100 confs
# of Test data: 300 confs

[Figure: histograms (Frequency) at t = 10 for C2pt(P, cEDM) and C2pt(P, γ5), in units of 10^-11, comparing C2pt(Genu) - <C2pt(Genu)> with C2pt(Genu) - C2pt(Pred)]

SLIDE 23

Prediction of C2pt^CPV from C2pt

[Figure: α_cEDM and α_γ5 versus t, comparing Genuine and ML Prediction]

α_cEDM

Genuine: 0.0527(16)
Prediction: 0.0523(16)

α_γ5

Genuine: -0.1462(14)
Prediction: -0.1462(16)

Genuine: DM on 400 confs
Prediction: DM on 100 confs + ML prediction on 300 confs

SLIDE 24

Summary

Machine learning is used to predict unmeasured observables from measured observables
Unbiased estimator using cross-validation is presented
Demonstrated for two lattice QCD calculations:

1) Prediction of C3pt from C2pt
2) Prediction of C2pt^CPV from C2pt

The approach can be applied to various lattice calculations and reduce measurement cost

SLIDE 25

BDT with scikit-learn Python ML Library

>>> import numpy
>>> from sklearn.ensemble import GradientBoostingRegressor
>>>
>>> X = numpy.random.uniform(size=(100,2))*10   # 100 random samples
>>> y = [x[0]**2 + 2*x[1] for x in X]
>>>
>>> gbr = GradientBoostingRegressor()
>>> gbr.fit(X,y)                                # Training
>>>
>>> gbr.predict([[3,4]])                        # 3^2 + 2×4 = 17
array([15.20630936])
>>> gbr.predict([[6,3]])                        # 6^2 + 2×3 = 42
array([42.77231812])
>>> gbr.predict([[8,5]])                        # 8^2 + 2×5 = 74
array([74.14274825])

$X = [(x_1, y_1), (x_2, y_2), \ldots]$
$y = [x_1^2 + 2y_1,\ x_2^2 + 2y_2,\ \ldots]$

SLIDE 26

Comparison of Regression Models

                            Linear Regression    BDT                  Neural Network
Speed                       Fastest              Fast                 Slow
Performance                 Bad for nonlinear    Okay                 Possibly better
Tuning Parameters           None or a few        Few; not sensitive   Many; sensitive
Overfitting Risk            Very Low             Low                  High
Training Data Requirement   Small                Medium               Large
Interpretability            Yes                  Somewhat             Not likely
