Do NOT measure correlated observables, but train an Artificial Intelligence to predict them

Boram Yoon, Los Alamos National Laboratory. Lattice 2018, East Lansing


slide-1
SLIDE 1

Do NOT measure correlated observables, but train an Artificial Intelligence to predict them

Boram Yoon, Los Alamos National Laboratory
Lattice 2018, East Lansing, Michigan, USA, July 22-28, 2018
arXiv:1807.05971

slide-2
SLIDE 2

Prediction of C3pt from C2pt

[Figure: nucleon charges gA, gS, gT, and gV, comparing genuine and ML-predicted values]

  • Genuine: directly measured on 2263 confs
  • ML Prediction: directly measured on 400 confs + ML prediction on 1863 confs. Systematic error due to ML prediction is included in the error bars.

slide-3
SLIDE 3

Lattice QCD Observables are Correlated

Markov Chain Monte Carlo trajectory of Gibbs samples: U(1), U(2), U(3), U(4), U(5), U(6), U(7), U(8), U(9)

Observables measured on each configuration, e.g.
  U(1): {Mπ(1), Fπ(1), C3pt:A(1), C3pt:V(1), …}
  U(7): {Mπ(7), Fπ(7), C3pt:A(7), C3pt:V(7), …}

Expectation value:

$\langle O_X \rangle \approx \frac{1}{N} \sum_{n=1}^{N} O_X^{(n)}$

slide-4
SLIDE 4

Correlation Map of Nucleon Observables

  • Correlation between proton (uud) 3-pt and 2-pt correlation functions
  • Clover-on-HISQ: a = 0.089 fm, Mπ = 313 MeV, τ = 10a, t = τ/2
  • Using these correlations, C3pt can be estimated from C2pt on each configuration

[Figure: correlation map among C2pt, C3pt(S,U), C3pt(V,U), C3pt(A,U), C3pt(T,U), C3pt(S,D), C3pt(V,D), C3pt(A,D), C3pt(T,D); color scale: |Correlation coefficient| from 0.2 to 1]
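A correlation map of this kind could be computed along the following lines. This is a minimal sketch on synthetic data: the arrays `c2pt`, `c3pt`, and `unrelated` are hypothetical stand-ins for per-configuration measurements, not the lattice data behind the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_confs = 2000  # hypothetical number of configurations

# Assumption: a shared fluctuation makes two of the observables correlated.
shared = rng.normal(size=n_confs)
c2pt = shared + 0.5 * rng.normal(size=n_confs)
c3pt = 0.8 * shared + 0.5 * rng.normal(size=n_confs)
unrelated = rng.normal(size=n_confs)

# Rows are observables, columns are configurations.
corr = np.corrcoef(np.vstack([c2pt, c3pt, unrelated]))

# The map shows the magnitude of the correlation coefficient.
abs_corr = np.abs(corr)
print(np.round(abs_corr, 2))
```

Entries of `abs_corr` close to 1 mark pairs of observables where one carries a lot of information about the other, which is what makes a prediction from the cheaper observable plausible.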

slide-5
SLIDE 5

Machine Learning

  • One can consider the machine learning (ML) process as a data fitting
  • The machine F has a very general fitting functional form with a huge number of free parameters
  • The free parameters are determined from a large number of training data: $F(X_i) \approx y_i$
  • For example, $X_i$: pixels of a picture; $y_i$: “cat” or “dog”

Input: $X_i = (x_i^1, x_i^2, x_i^3, \ldots)$ → Machine F → Output: $y_i$

slide-6
SLIDE 6

Machine Learning on Lattice QCD Observables

  • Assume N+M indep. measurements
  • Common observables $X_i$ on all N+M
  • Target observable $O_i$ on first N

Input: $X_i = (o_i^1, o_i^2, o_i^3, \ldots)$ → Machine F → Output: $O_i$

1) Train machine F to yield $O_i$ from $X_i$ on the Training Data
2) Predict $O_i$ of the Test Data from $X_i$: $F(X_i) = O_i^P \approx O_i$

Data layout: N samples $(X_i, O_i)$ [Training Data], M samples $(X_i)$ [Test Data]
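The train-then-predict procedure can be sketched as follows. This is an illustration under stated assumptions, not the setup used in the talk: the data are synthetic, the three input columns and the functional form of the target are invented, and scikit-learn's `GradientBoostingRegressor` stands in for the machine F.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
N, M = 400, 1600  # hypothetical sizes of training and test data

# Assumption: 3 input observables per measurement; the target depends on them plus noise.
X = rng.normal(size=(N + M, 3))
O = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=N + M)

X_train, O_train = X[:N], O[:N]  # target measured on the first N
X_test = X[N:]                   # target treated as unmeasured on the remaining M

# 1) Train F to yield the target from the inputs on the training data.
F = GradientBoostingRegressor(random_state=0)
F.fit(X_train, O_train)

# 2) Predict the target on the test data.
O_pred = F.predict(X_test)

# Only possible here because the synthetic target is known everywhere.
print(np.corrcoef(O_pred, O[N:])[0, 1])
```

In a real calculation the last line is not available, since the target is not measured on the test data; that is why the quality of the prediction has to be assessed on labeled data.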

slide-7
SLIDE 7

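Prediction Bias

  • $F(X_i) = O_i^P \approx O_i$
  • Simple average

$\overline{O} = \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^P$

    is not correct due to prediction bias
  • Prediction = TrueAnswer + Noise + Bias
  • ML prediction may have bias: $\langle O_i^P \rangle \neq \langle O_i \rangle$, Bias = $\langle O_i^P \rangle - \langle O_i \rangle$

[Figure: bias-variance illustration with labels High Bias, Low Bias, Low Variance, High Variance]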

slide-8
SLIDE 8

Bias Correction

  • Average of predictions on test data with bias correction

$\overline{O} = \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^P + \frac{1}{N_b} \sum_{i=N_t+1}^{N_t+N_b} \left( O_i - O_i^P \right)$

  • Expectation value: $\langle \overline{O} \rangle = \langle O_i^P \rangle + \langle O_i - O_i^P \rangle = \langle O_i \rangle$
  • Training data should not overlap with bias correction data
  • Not efficient: small training/bias correction data

Data layout: $N_t$ samples $(X_i, O_i)$ [Training Data], $N_b$ samples $(X_i, O_i)$ [Bias Correction Data], M samples $(X_i)$ [Test Data]
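A bias-corrected average of this form might look as follows in code. This is a sketch on synthetic data: the sizes `N_t`, `N_b`, `M`, the toy target, and the use of scikit-learn's `GradientBoostingRegressor` as the machine are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
N_t, N_b, M = 300, 100, 1600  # hypothetical sizes
n_total = N_t + N_b + M

# Synthetic inputs and target; the true mean of the target is 1.5 by construction.
X = rng.normal(size=(n_total, 3))
O = 1.0 + X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.normal(size=n_total)

train = slice(0, N_t)               # used only to fit the machine
bias_corr = slice(N_t, N_t + N_b)   # labeled, but never seen during training
test = slice(N_t + N_b, n_total)    # target treated as unmeasured

F = GradientBoostingRegressor(random_state=0).fit(X[train], O[train])

# First term: plain average of the predictions on the test data.
mean_pred = F.predict(X[test]).mean()

# Second term: average of (truth - prediction) on the bias correction data.
bias_term = (O[bias_corr] - F.predict(X[bias_corr])).mean()

O_bar = mean_pred + bias_term
print(O_bar)
```

Keeping the bias correction samples out of the training set matters: residuals evaluated on training data are artificially small and would underestimate the bias.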

slide-9
SLIDE 9

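Bias Correction: Cross Validation

  • Average of predictions on test data with bias correction

$\overline{O} = \frac{1}{K} \sum_{k=1}^{K} \left[ \frac{1}{M} \sum_{i=N+1}^{N+M} O_i^{P,k} + \frac{1}{m} \sum_{j=1}^{m} \left( O_j^{k} - O_j^{P,k} \right) \right], \qquad K = N/m, \quad m \ll N$

  • Full training data & precise bias estimation
  • Systematic error of ML prediction naturally included in error estimation

Each split k divides the N labeled samples into N−m for training and m for bias correction:
  k = 1 → $F_1$, $O_i^{P,1} = F_1(X_i)$
  k = 2 → $F_2$, $O_i^{P,2} = F_2(X_i)$
  k = 3 → $F_3$, $O_i^{P,3} = F_3(X_i)$
  …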
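The cross-validated version can be sketched in the same way. Again this is an illustration on synthetic data: `KFold` from scikit-learn is one convenient way to generate the K splits and is an assumption here, as are the sizes and the toy target.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(3)
N, M, K = 400, 1600, 8  # hypothetical sizes; each held-out block has m = N / K = 50 samples

# Synthetic inputs and target; the true mean of the target is 1.5 by construction.
X = rng.normal(size=(N + M, 3))
O = 1.0 + X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.normal(size=N + M)
X_lab, O_lab, X_test = X[:N], O[:N], X[N:]

estimates = []
for train_idx, held_idx in KFold(n_splits=K).split(X_lab):
    # Train machine k on N - m labeled samples.
    F_k = GradientBoostingRegressor(random_state=0)
    F_k.fit(X_lab[train_idx], O_lab[train_idx])
    # Average prediction of machine k on the test data.
    mean_pred = F_k.predict(X_test).mean()
    # Bias estimate of machine k from the m held-out labeled samples.
    bias_term = (O_lab[held_idx] - F_k.predict(X_lab[held_idx])).mean()
    estimates.append(mean_pred + bias_term)

# Average the K bias-corrected estimates.
O_bar = np.mean(estimates)
print(O_bar)
```

Every labeled sample is used for training in K−1 of the splits and for bias correction in exactly one, so none of the labeled data is set aside permanently.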

slide-10
SLIDE 10

Prediction of C3pt from C2pt

Boosted Decision Tree Regression

Input: $X_i = \{ C_{\rm 2pt}(\tau) \}$ for τ/a from 0 up to a maximum time separation
Output: $C_{\rm 3pt}^{A,S,T,V}(\tau, t)$

$C_{\rm 2pt}(\tau)$ → F → $C_{\rm 3pt}^{A,S,T,V}(\tau, t)$

slide-11
SLIDE 11

Decision Tree Regression

Input: $\{ C_{\rm 2pt}(\tau),\ 0 \le \tau/a \le 20 \}$
Output: $C_{\rm 3pt}(\tau/a = 10,\ t/a = 5)$

slide-12
SLIDE 12

Boosted Decision Tree (BDT)

  • Iterative boosting

$F_0$ = [Simple DT $h_0$]
$F_1 = F_0$ + [Simple DT $h_1$ that corrects residual error of $F_0$]
$F_2 = F_1$ + [Simple DT $h_2$ that corrects residual error of $F_1$]
$F_3 = F_2$ + [Simple DT $h_3$ that corrects residual error of $F_2$]
…
$F_n = F_{n-1} + h_n$

$F(X) = F_{N_{\rm boost}}(X)$

  • In this study, $N_{\rm boost}$ = 200-500
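The iterative boosting idea can be written out by hand with shallow trees. This is a sketch, not the implementation used in the study: the toy target, the tree depth, and the `learning_rate` shrinkage factor are assumptions (setting `learning_rate = 1.0` gives the plain "previous model plus new tree" recursion).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=(500, 2))
y = X[:, 0] ** 2 + 2 * X[:, 1]  # toy target, chosen for illustration

n_boost = 200
learning_rate = 0.1  # assumption: shrinkage factor, commonly used in gradient boosting

trees = []
errors = []
prediction = np.zeros_like(y)
for n in range(n_boost):
    # Each simple tree is fit to the residual error of the current model.
    residual = y - prediction
    h_n = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residual)
    prediction = prediction + learning_rate * h_n.predict(X)
    trees.append(h_n)
    errors.append(np.mean((y - prediction) ** 2))

def F(X_new):
    """Final model: the (shrunk) sum of all simple trees."""
    return learning_rate * sum(h.predict(X_new) for h in trees)

print(errors[0], errors[-1])
```

The training error in `errors` cannot increase from one stage to the next, because each tree is a least-squares fit to the current residual; how well the final model generalizes has to be checked on data not used for training.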
slide-13
SLIDE 13

Decision Tree $h_n$ for C3pt(10, 5)

slide-14
SLIDE 14

Decision Tree $h_n$ for C3pt(10, 5)

slide-15
SLIDE 15

Decision Tree $h_n$ for C3pt(10, 5)

slide-16
SLIDE 16

Prediction of C3pt from C2pt

[Figure: histograms (Frequency) at τ=10, t=5 for the Axial, Vector, Scalar, and Tensor channels; x-axes C3pt(A), C3pt(V), C3pt(S), C3pt(T) in units of 10^-16; each panel compares C3pt(Genu) − <C3pt(Genu)> with C3pt(Genu) − C3pt(Pred)]

  • Training and Test performed for
  • Clover-on-HISQ
  • a = 0.089 fm, Mπ = 313 MeV
  • Measurements: 2263 confs × 64 srcs
  • # of Training data: 400 confs
  • # of Test data: 1863 confs
  • Predictions of C3pt(10, 5) / C2pt(10)

slide-17
SLIDE 17

Prediction of C3pt from C2pt

[Figure: gA(u-d) and gV(u-d) versus t − τ/2 for τ = 8, 10, 12, 14 and τ = ∞; panels: (a) Train, (b) Genuine, (c) Pred.[2pt], (d) Pred.[2pt+3pt(12)]]

slide-18
SLIDE 18

Prediction of C3pt from C2pt

[Figure: gS(u-d) and gT(u-d) versus t − τ/2 for τ = 8, 10, 12, 14 and τ = ∞; panels: (a) Train, (b) Genuine, (c) Pred.[2pt], (d) Pred.[2pt+3pt(12)]]

slide-19
SLIDE 19

Prediction of C3pt from C2pt

  • Results extrapolated to τ → ∞

        Genuine                   Pred.[C2pt]            Pred.[C2pt+C3pt(12)]
gS      0.985(22)                 1.013(30)              1.008(21)
gA      1.2304(48)                1.2243(67)             1.2268(54)
gT      1.0312(52)                1.0342(61)             1.0304(54)
gV      1.0432(20)                1.0412(23)             1.0413(21)
Data    2263 DM (Direct Meas.)    400 DM + 1863 Pred.    400 DM + 1863 Pred.

slide-20
SLIDE 20

Quark Chromo EDM (cEDM)

  • Simulation in presence of CPV cEDM interaction

$S = S_{\rm QCD} + S_{\rm cEDM}, \qquad S_{\rm cEDM} = -\frac{i}{2} \int d^4x \, \tilde{d}_q \, g_s \, \bar{q} (\sigma \cdot G) \gamma_5 q$

  • Schwinger source method: include cEDM term in valence quark propagators by modifying Dirac operator

$D_{\rm clov} \rightarrow D_{\rm clov} + i \varepsilon \sigma^{\mu\nu} \gamma_5 G_{\mu\nu}$

  • cEDM contribution to nEDM can be obtained by calculating vector form-factor F3 with propagators including cEDM & $O_{\gamma_5} = \bar{q} \gamma_5 q$

[Figure: quark-line diagrams with u, d, d quarks, built from propagators P, P_ε, and sequential (seq) propagators]

slide-21
SLIDE 21

Prediction of C2pt with CPV from C2pt without CPV

  • Predict C2pt for cEDM and γ5 insertions from C2pt without CPV
  • CPV interactions → phase in neutron mass

$\left( i p_\mu \gamma_\mu + m \, e^{-2 i \alpha \gamma_5} \right) u_N = 0$

  • At leading order, α can be obtained from

$C_{\rm 2pt}^{P} \equiv {\rm Tr}\left[ \gamma_5 \langle N \bar{N} \rangle \right]$

slide-22
SLIDE 22

Prediction of C2pt with CPV from C2pt without CPV

[Figure: histograms (Frequency) at t=10 for the cEDM and γ5 insertions; x-axes C2pt(P, cEDM) and C2pt(P, γ5) in units of 10^-11; each panel compares C2pt(Genu) − <C2pt(Genu)> with C2pt(Genu) − C2pt(Pred)]

Boosted Decision Tree Regression
Input: X_i = {Re, Im[C2pt(0 ≤ t/a ≤ 16)]}, using two C2pt components (one of them C2pt^P)
Output: Im C2pt^P(t) for the cEDM and γ5 insertions

  • Training and Test performed for
  • Clover-on-HISQ
  • a = 0.12 fm, Mπ = 305 MeV
  • Measurements: 400 confs × 64 srcs
  • # of Training data: 100 confs
  • # of Test data: 300 confs

slide-23
SLIDE 23

Prediction of C2pt with CPV from C2pt without CPV

[Figure: α_cEDM and α_γ5 versus t, comparing Genuine and ML Prediction]

  • α_cEDM: Genuine 0.0527(16), Prediction 0.0523(16)
  • α_γ5: Genuine -0.1462(14), Prediction -0.1462(16)
  • Genuine: DM on 400 confs
  • Prediction: DM on 100 confs + ML prediction on 300 confs

slide-24
SLIDE 24

Summary

  • Machine learning is used to predict unmeasured observables from measured observables
  • Unbiased estimator using cross-validation is presented
  • Demonstrated for two lattice QCD calculations:
    1) Prediction of C3pt from C2pt
    2) Prediction of C2pt with CPV from C2pt without CPV
  • The approach can be applied to various lattice calculations and can reduce measurement cost

slide-25
SLIDE 25

BDT with scikit-learn Python ML Library

>>> import numpy
>>> from sklearn.ensemble import GradientBoostingRegressor
>>>
>>> X = numpy.random.uniform(size=(100,2))*10   # 100 random samples
>>> y = [x[0]**2 + 2*x[1] for x in X]
>>>
>>> gbr = GradientBoostingRegressor()
>>> gbr.fit(X,y)                                # Training
>>>
>>> gbr.predict([[3,4]])                        # 3**2 + 2*4 = 17
array([15.20630936])
>>> gbr.predict([[6,3]])                        # 6**2 + 2*3 = 42
array([42.77231812])
>>> gbr.predict([[8,5]])                        # 8**2 + 2*5 = 74
array([74.14274825])

$X = \{ (a_1, b_1), (a_2, b_2), \ldots \}$
$y = \{ a_1^2 + 2 b_1,\ a_2^2 + 2 b_2,\ \ldots \}$

slide-26
SLIDE 26

Comparison of Regression Models

                             Linear Regression    BDT                   Neural Network
Speed                        Fastest              Fast                  Slow
Performance                  Bad for nonlinear    Okay                  Possibly better
Tuning Parameters            None or a few        Few; not sensitive    Many; sensitive
Overfitting Risk             Very Low             Low                   High
Training Data Requirement    Small                Medium                Large
Interpretability             Yes                  Somewhat              Not likely