Ray-Traced Global Illumination for Games: Massively Parallel Path Space Filtering (PowerPoint presentation)



slide-1
SLIDE 1

Ray-Traced Global Illumination for Games: Massively Parallel Path Space Filtering

Nikolaus Binder and Alexander Keller


slide-9
SLIDE 9

Principles of Image Synthesis

Solving the visibility problem

[Diagram: camera, surface point P, and light source L]

  Rasterization: clipping, Z-buffer, shadow maps
  Reyes: dicing, kind of a Z-buffer, shadow maps
  Ray tracing: acceleration data structure, tracing rays with arbitrary origins, shadow rays

slide-10
SLIDE 10

Path tracing on a budget

slide-11
SLIDE 11

Massively Parallel Path Space Filtering


slide-18
SLIDE 18

Massively Parallel Path Space Filtering

Sharing instead of splitting

filtering beyond screen space

algorithm

  1. generate paths, select and store vertices
  2. average contributions with similar vertex descriptors
  3. use the averaged contributions


slide-22
SLIDE 22

Massively Parallel Path Space Filtering

Bottleneck: Calculating averages

include many “close by” contributions in the average
  – efficient culling by range search
  – but still have to iterate over all of them
  – and every vertex needs to do this individually
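The per-vertex averaging described above can be sketched in a few lines of Python (toy data: positions paired with scalar contributions, and a plain radius test standing in for the range search). Every vertex iterates over all others individually, so the work is quadratic in the number of vertices:

```python
def per_vertex_average(vertices, radius=1.0):
    # vertices: list of (position, contribution) pairs
    result = []
    for x, _ in vertices:                      # every vertex individually...
        total, count = 0.0, 0
        for y, c in vertices:                  # ...iterates over all others: O(N^2)
            if sum((a - b) ** 2 for a, b in zip(x, y)) <= radius ** 2:
                total += c                     # "close by": within the search radius
                count += 1
        result.append(total / count)           # count >= 1: a vertex is close to itself
    return result
```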


slide-25
SLIDE 25

Massively Parallel Path Space Filtering

Principle

[Images: input → local averaging → average per cell]

instead of calculating one average per vertex, calculate one average per cell
  – cell identified by quantizing a descriptor (x_i, ...)
  – proximity defined by equality after quantization instead of distance
  – worst-case complexity O(N) instead of O(N²)


slide-31
SLIDE 31

Massively Parallel Path Space Filtering

Resolving quantization artifacts

[Images: input → average per cell → with jittering]

jitter the descriptor (x_i, ...) on store and lookup
  – hides quantization artifacts
  – resulting uniform noise amenable to (existing) post filtering

amounts to stochastic evaluation of interpolation


slide-35
SLIDE 35

Massively Parallel Path Space Filtering

Hashing instead of searching

descriptors for selected vertices include

world space location x, and optionally normal n, incident angle ω, and BRDF layer


slide-39
SLIDE 39

Massively Parallel Path Space Filtering

Storing and looking up data with quantized descriptors

fast updates, no pre-processing

access in constant time
  – requires an injective mapping (x, n, ...) → [0, M)

⇒ hash map


slide-43
SLIDE 43

Massively Parallel Path Space Filtering

Fast hash map

trade a larger table size for faster access

simple, fast hash functions

linear probing for collision resolution

use a second hash of the descriptor instead of storing full keys
  – may fail, but is very, very unlikely
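A hedged sketch of such a table in Python: linear probing, and a second hash stored in place of the full key. The hash functions are illustrative stand-ins, not the ones used in the talk, and the table is assumed never to fill up:

```python
M = 1 << 10                    # table size (deliberately generous)

def h1(key):                   # slot hash (illustrative)
    return hash(key) % M

def h2(key):                   # verification hash, stored instead of the full key
    return hash((key, 0x9E3779B9)) & 0xFFFFFFFF

table = [None] * M             # each slot: (verification_hash, value)

def insert(key, value):
    i, v = h1(key), h2(key)
    while table[i] is not None and table[i][0] != v:
        i = (i + 1) % M        # linear probing on collision
    table[i] = (v, value)      # matching v overwrites (same key, w.h.p.)

def lookup(key):
    i, v = h1(key), h2(key)
    while table[i] is not None:
        if table[i][0] == v:   # second hash matches: almost certainly our key
            return table[i][1]
        i = (i + 1) % M
    return None                # hit an empty slot: key not present
```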


slide-50
SLIDE 50

Massively Parallel Path Space Filtering

Linear instead of quadratic

finding the hash table location i, for both averaging and querying:

  l  ← level_of_detail(|p_cam − x|)
  x′ ← x + jitter(n) · scale · 2^l
  l′ ← level_of_detail(|p_cam − x′|)
  x̃  ← ⌊ x′ / (scale · 2^l′) ⌋
  i  ← hash(x̃, ...) mod table_size
  v  ← hash2(x̃, n, ...)

jittering before quantization hides discretization artifacts in uniform noise

slide-51
SLIDE 51

Massively parallel path space filtering at second bounce (2ms@HD)

slide-52
SLIDE 52

Massively Parallel Path Space Filtering

Temporal filtering vs. temporal accumulation

exponential moving average with constant α:

  f̃ = (1 − α) f_old + α f_new

  – will not converge over time
  – lags and blurs over time
  – in fact a low-pass filter

slide-53
SLIDE 53

Massively Parallel Path Space Filtering

Temporal filtering vs. temporal accumulation

cumulative moving average:

  f̃ = (1 − 1/N) f_old + (1/N) f_new

  – converges over time
  – vanishing adaptivity with increasing number of samples
  – equivalent to an exponential moving average with α = 1/N

slide-54
SLIDE 54

Massively Parallel Path Space Filtering

Temporal filtering vs. temporal accumulation

exponential moving average with adaptive α:

  f̃ = (1 − α) f_old + α f_new

  – temporal gradients determine α
  – α = 1/N if no temporal changes are detected
  – α ∈ (1/N, 1] depending on the amount of change

slide-55
SLIDE 55

Massively parallel path space filtering at first bounce (1ms@HD)

slide-56
SLIDE 56

Reinforcement Learning


slide-58
SLIDE 58

Reinforcement Learning

Goal: maximize reward

a state transition yields a reward r_{t+1}(a_t | s_t) ∈ ℝ

learn a policy π_t
  – to select an action a_t ∈ A(s_t)
  – given the current state s_t ∈ S

[Diagram: the agent in state s_t issues action a_t; the environment returns the next state s_{t+1} and the reward r_{t+1}(a_t | s_t)]

slide-65
SLIDE 65

Reinforcement Learning

Maximize reward by learning importance sampling

structural equivalence of integral equation and Q-learning

  L(x, ω) = L_e(x, ω) + ∫_{S²₊(x)} f_s(ω_i, x, ω) cos θ_i · L(h(x, ω_i), −ω_i) dω_i

  Q′(s, a) = (1 − α) Q(s, a) + α ( r(s, a) + γ ∫_A π(s′, a′) Q(s′, a′) da′ )

graphics example: learning the incident radiance

  Q′(x, ω) = (1 − α) Q(x, ω) + α ( L_e(y, −ω) + ∫_{S²₊(y)} f_s(ω_i, y, −ω) cos θ_i · Q(y, ω_i) dω_i )

to be used as a policy for selecting an action ω in state x to reach the next state y := h(x, ω)

  – the learning rate α is the only parameter left

◮ Technical Note: Q-Learning

slide-66
SLIDE 66

approximate solution Q stored on discretized hemispheres across the scene surface

slide-67
SLIDE 67

2048 paths traced with BRDF importance sampling in a scene with challenging visibility

slide-68
SLIDE 68

Path tracing with online reinforcement learning at the same number of paths

slide-69
SLIDE 69

Metropolis light transport at the same number of paths


slide-74
SLIDE 74

Reinforcement Learning

Principle

radiance L is the light sources L_e plus the transported radiance T_f L:

  L = L_e + T_f L

use an approximation, e.g. discretized hemispheres at selected locations in the scene, to store

  L_c = L_e + T_f L_c

for guiding importance sampling towards where the radiance comes from

learn the approximation by

  L′_c = (1 − α) L_c + α (L_e + T_f L_c)
       = (1 − α) L_c + α (L_e + Σ_i f_{r,i} L_{c,i})

using the current approximation instead of tracing single paths at higher variance


slide-76
SLIDE 76

Reinforcement Learning

Principle

shorter expected path length

dramatically reduced number of paths with zero contribution

challenges
  – product importance sampling proportional to the integrand, i.e. policy γ · π times value Q
  – efficient representation of the value Q

◮ S9900 - Irradiance Fields: RTX Diffuse Global Illumination for Local and Cloud Graphics
◮ Learning light transport the reinforced way
◮ Machine learning and integral equations
◮ Neural importance sampling

slide-77
SLIDE 77

Photon-Guided Shadow Rays

slide-78
SLIDE 78

Photon-Guided Shadow Rays

Photon maps similar to massively parallel path space filtering

incorporate knowledge about visibility to improve efficiency

control the number of shadow rays

look up photons around a shading point and trace shadow rays towards their origins
  – photon origins used as virtual point light sources (VPLs)
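A hedged sketch of the lookup-then-trace idea in Python; the photon storage, the visibility test `trace`, and the ray budget are hypothetical stand-ins:

```python
def photon_guided_shadow_rays(shading_point, photons, radius, max_rays, trace):
    # photons: list of (hit_position, origin) pairs; origins act as VPLs
    near = [p for p in photons
            if sum((a - b) ** 2 for a, b in zip(p[0], shading_point)) <= radius ** 2]
    visible = 0
    for _, origin in near[:max_rays]:      # budget controls the number of shadow rays
        if trace(shading_point, origin):   # shadow ray towards the photon's origin
            visible += 1
    return visible
```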

slide-79
SLIDE 79

Light hierarchy

slide-80
SLIDE 80

Photon-guided shadow rays (PGSR)

slide-81
SLIDE 81

PGSR + single pass screen space PSF

slide-82
SLIDE 82

Point Clouds

Stochastically hashed particle maps

on hash collision, keep the particle with the smallest random number and increment the cell counter

issue of wide bit-width memory access on the GPU

hash table size vs. quantization vs. uniformity of the hash function
  – large hash table: fewer collisions, but totally divergent memory access
  – small hash table: more collisions, automatically thinning particles in dense regions

slide-83
SLIDE 83

Generic material model to reduce divergence

  • one BSDF model fed by a parameter point cloud
slide-84
SLIDE 84

Ray Traced Global Illumination for Games

Building blocks

massively parallel path space filtering

efficient light transport simulation by reinforcement learning

photon-guided shadow rays

◮ S9900 - Irradiance Fields: RTX Diffuse Global Illumination for Local and Cloud Graphics
◮ Gradient estimation for real-time adaptive temporal filtering
◮ Massively parallel path space filtering