A RAY TRACING DEEP DIVE: Holger Gruen (NVIDIA), Jon Story (NVIDIA), Michiel Roza (Nixxes)



slide-1
SLIDE 1

www.nvidia.com/GTC

Holger Gruen (NVIDIA), Jon Story (NVIDIA), Michiel Roza (Nixxes) 03/19/2019

“SHADOWS” OF THE TOMB RAIDER – A RAY TRACING DEEP DIVE

slide-2
SLIDE 2

2

AGENDA

  • “Shadows” of the Tomb Raider
  • Shadow of the Tomb Raider
  • Why ray traced shadows?
  • DXR shaders (ray generation)
  • GameWorks spatial denoiser
  • DXR acceleration structure
  • Integration in render pipeline
  • Results
  • Future work

slide-4
SLIDE 4

4

slide-6
SLIDE 6

6

Shadow Mapped

slide-7
SLIDE 7

7

Shadow Mapped

slide-8
SLIDE 8

8

Shadow Mapped

slide-9
SLIDE 9

9

Raytraced

slide-10
SLIDE 10

10

WHY RAY TRACED SHADOWS?

  • Pixel perfect shadows
  • Translucent shadows
  • Point lights (currently faked by using two spot lights)
  • Area lights

There’s so much that shadow mapping can’t do!

slide-12
SLIDE 12

12

DXR SHADERS

  • Noise / random number generation
  • Ray generation
  • Hit shaders
  • Adaptive raytracing
  • Translucency
  • TAA and jittering

Mini Agenda

slide-13
SLIDE 13

13

DXR SHADERS

DXR Mini intro

  • Acceleration Structures: one TLAS referencing several BLASes
  • Shaders: Raygen(), Anyhit(), ClosestHit()

slide-15
SLIDE 15

15

DXR SHADERS

Noise / random number generation

Penumbra Umbra

Ideally we want to trace many rays to find out how much of the light source a point can see

slide-16
SLIDE 16

16

DXR SHADERS

Noise / random number generation

  • For great performance we want to shoot only 1 ray per pixel
  • So instead of one pixel shooting many rays, a neighborhood of pixels samples ‘enough’ random positions on the light source

[Figure: neighborhood of pixels numbered 1 to 9]

slide-25
SLIDE 25

25

DXR SHADERS

  • Random positions on the light are based on generating pseudo random numbers
  • The trick is to choose the right random seed for the generator
  • We use a seed that is based on the 2D position of the pixel

Noise / random number generation

    Seed( ( pixel_2d_pos ) % TILE_SIZE_2D );

http://www.reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/
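
The tiled seeding idea can be sketched on the CPU as follows. This is only an illustration, not the shipped shader: the Wang hash and xorshift steps are the generators discussed in the linked blog post, while seedForPixel, toUnitFloat and the tileSize parameter are hypothetical names introduced here. Whether the game uses exactly these generators is not stated on the slide.

```cpp
#include <cassert>
#include <cstdint>

// Wang hash: scrambles a small integer into a well-mixed 32-bit seed.
inline uint32_t wangHash(uint32_t seed) {
    seed = (seed ^ 61u) ^ (seed >> 16);
    seed *= 9u;
    seed ^= seed >> 4;
    seed *= 0x27d4eb2du;
    seed ^= seed >> 15;
    return seed;
}

// One xorshift32 step: takes the current state and returns the next one.
inline uint32_t xorshift32(uint32_t state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Assumption: the seed depends only on the pixel position wrapped into a
// square tile, mirroring Seed( pixel_2d_pos % TILE_SIZE_2D ) on the slide.
inline uint32_t seedForPixel(uint32_t x, uint32_t y, uint32_t tileSize) {
    assert(tileSize > 0);
    const uint32_t tx = x % tileSize;
    const uint32_t ty = y % tileSize;
    return wangHash(ty * tileSize + tx);
}

// Maps a 32-bit state to a float in [0, 1) using its top 24 bits.
inline float toUnitFloat(uint32_t state) {
    return static_cast<float>(state >> 8) * (1.0f / 16777216.0f);
}
```

Two calls such as toUnitFloat(xorshift32(seed)) and a second step on the new state would give the 2D coordinates of a sample position on the light. Pixels one tile apart share a seed, so the noise pattern repeats per tile.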

slide-26
SLIDE 26

26

Noisy shadows

slide-27
SLIDE 27

27

DXR SHADERS

Noise / random number generation

slide-29
SLIDE 29

29

We use specialized raygen shaders for each light type for optimal performance

DXR SHADERS

Ray generation

  • Directional light source with an angular extent
  • Area cone light
  • Point light with a spherical area
  • Rectangular area cone light

slide-30
SLIDE 30

30

All light types

DXR SHADERS

Ray generation

[Figure labels: z-buffer, WS pixels, normal N]

Move some small distance along the normal to prevent self-shadowing!

slide-31
SLIDE 31

31

DXR SHADERS

All light types

Ray generation

[Figure labels: z-buffer, WS pixel, normal N]

No rays for pixels that face away from the current light!
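
The two rules shared by all light types (offset the ray origin along the normal, skip pixels facing away from the light) might look like the sketch below. It is CPU-side C++ rather than the raygen shader itself, and Vec3, facesLight, shadowRayOrigin and the bias parameter are hypothetical names; the slides do not say how large the offset is.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// A pixel whose normal points away from the light cannot be lit by it,
// so no shadow ray is needed for that pixel.
inline bool facesLight(Vec3 normal, Vec3 dirToLight) {
    return dot(normal, dirToLight) > 0.0f;
}

// Moves the reconstructed world-space position a small distance along the
// normal to avoid self-shadowing. The bias value is an assumed tuning knob.
inline Vec3 shadowRayOrigin(Vec3 worldPos, Vec3 normal, float bias) {
    assert(bias >= 0.0f);
    return add(worldPos, scale(normal, bias));
}
```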

slide-32
SLIDE 32

32

Directional lights

DXR SHADERS

Ray generation z-buffer

WS Pixels

slide-33
SLIDE 33

33

DXR SHADERS

Spot lights

Ray generation z-buffer

WS Pixels

Rays only get generated for pixels:

  • Inside the cone of the light
  • Within reach of the light
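
As a rough sketch of the spot light test, the check below rejects pixels outside the light's range or outside its cone. The SpotLight layout and the needsShadowRay name are assumptions made for this example, and the light direction is assumed to be normalized; the actual shader may cull differently.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical light description: direction must be unit length.
struct SpotLight {
    Vec3 position;
    Vec3 direction;
    float range;         // reach of the light
    float cosHalfAngle;  // cosine of half the cone opening angle
};

// Returns true only for pixels within reach of the light and inside its cone.
inline bool needsShadowRay(const SpotLight& light, Vec3 worldPos) {
    assert(light.range > 0.0f);
    const Vec3 toPixel = sub(worldPos, light.position);
    const float distSq = dot(toPixel, toPixel);
    if (distSq > light.range * light.range) return false;  // out of reach
    if (distSq == 0.0f) return true;                        // at the light itself
    const float cosAngle = dot(toPixel, light.direction) / std::sqrt(distSq);
    return cosAngle >= light.cosHalfAngle;                  // inside the cone
}
```

For a point light, the same function without the cone comparison would cover the "within reach" rule.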
slide-34
SLIDE 34

34

Point lights

DXR SHADERS

Ray generation z-buffer

WS Pixels

Rays only get generated for pixels:

  • Within reach of the light
slide-35
SLIDE 35

35

Rectangular lights

WS Pixels

DXR SHADERS

Ray generation z-buffer

Rays only get generated for pixels:

  • Inside the cone of the light
  • Within reach of the light

slide-37
SLIDE 37

37

DXR SHADERS

  • We don’t use the closest hit along the ray
  • Instead we use the first opaque one that gets reported to anyhit()
  • Translucency for DXR ultra quality is an exception
  • We fetch texture coords for alpha tested prims to carry out the alpha test

Hit shaders

slide-38
SLIDE 38

38

DXR SHADERS

  • Opaque Geometry
  • Alpha-tested Geometry

Hit Shaders

    void OpaqueClosestHit(…)
    {
        payload.hitT = RayTCurrent();
        payload.visibility = 0.0f;
    }

    void OpaqueAnyHit(…)
    {
        AcceptHitAndEndSearch();
    }

    void AlphaClosestHit(…)
    {
        payload.hitT = RayTCurrent();
    }

    void AlphaAnyHit(…)
    {
        float alpha = GetHitAlpha(bary);
        if( alpha < g_fAlphaThreshold )
            IgnoreHit();
        else
        {
            payload.visibility = 0.0f;
            AcceptHitAndEndSearch();
        }
    }

slide-40
SLIDE 40

40

DXR SHADERS

  • Adaptive raytracing is only used for ultra DXR quality
  • In this mode we cast more than 1 ray for some pixels
  • Let’s dive into the details …

Adaptive raytracing

slide-41
SLIDE 41

41

Phase 1 – Cast 1 ray per pixel
Phase 2 – Cast up to 2 more rays ‘where necessary’

DXR SHADERS

Adaptive raytracing

Visibility hitT Neighborhood of pixel

Visibility:

  • Black => can’t see the light
  • White => can see the light

hitT:

  • Distance to blocker along ray
  • White means ‘infinity’
  • Dark means close-by blocker
slide-42
SLIDE 42

42

Phase 1 – Cast 1 ray per pixel
Phase 2 – Cast up to 2 more rays ‘where necessary’

DXR SHADERS

Adaptive raytracing

Visibility hitT Neighborhood of pixel

No additional rays

slide-43
SLIDE 43

43

Phase 1 – Cast 1 ray per pixel
Phase 2 – Cast up to 2 more rays ‘where necessary’

DXR SHADERS

Adaptive raytracing

Visibility hitT Neighborhood of pixel

No additional rays

slide-44
SLIDE 44

44

Phase 1 – Cast 1 ray per pixel
Phase 2 – Cast up to 2 more rays ‘where necessary’

DXR SHADERS

Adaptive raytracing

Visibility hitT Neighborhood of pixel

1 more ray

Visibility = (Visibility0+Visibility1)/2

slide-45
SLIDE 45

45

Phase 1 – Cast 1 ray per pixel
Phase 2 – Cast up to 2 more rays ‘where necessary’

DXR SHADERS

Adaptive raytracing

Visibility hitT Neighborhood of pixel

2 more rays

Visibility = (Visibility0+Visibility1+Visibility2)/3
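
The averaging in phase 2 can be sketched as below. The combineVisibility function follows the formulas on the slides. The needsMoreRays test is only a guess at what ‘where necessary’ could mean (a neighborhood whose phase 1 visibility is not uniform); the slides do not spell out the criterion, nor how the choice between one and two extra rays is made, so that part is left out.

```cpp
#include <cassert>
#include <cstddef>

// Averages the visibility of all rays cast for one pixel, e.g.
// (Visibility0 + Visibility1) / 2 or (Visibility0 + Visibility1 + Visibility2) / 3.
inline float combineVisibility(const float* samples, std::size_t count) {
    assert(count > 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) sum += samples[i];
    return sum / static_cast<float>(count);
}

// ASSUMPTION, not from the talk: request extra rays only when the phase 1
// visibility values in the pixel's neighborhood disagree, which suggests
// the pixel lies in a penumbra.
inline bool needsMoreRays(const float* neighborhood, std::size_t count) {
    assert(count > 0);
    for (std::size_t i = 1; i < count; ++i) {
        if (neighborhood[i] != neighborhood[0]) return true;
    }
    return false;
}
```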

slide-47
SLIDE 47

47

DXR SHADERS

  • Ultra DXR quality also features translucent shadows
  • We support up to 3 layers of translucency
  • Mainly to keep performance at acceptable levels
  • Translucency should be straightforward with DXR, right?
  • Want to keep on using anyhit() instead of iterated closesthit()
  • Let’s look at the details …

Translucency

slide-48
SLIDE 48

48

WS Pixel

DXR SHADERS

Translucency

anyhit() order non-deterministic

1 3 2

slide-49
SLIDE 49

49

WS Pixel

DXR SHADERS

Translucency

anyhit() order non-deterministic

1 2 3

slide-50
SLIDE 50

50

WS Pixel

DXR SHADERS

Translucency

Subtraction is order independent:

  ➢ Let each layer subtract 1/3 of the light
  ➢ Pixel in full shadow after 3 order independent hits

slide-51
SLIDE 51

51

DXR SHADERS

Translucency

    void TranslucentAnyHit(…)
    {
        float alpha = GetHitAlpha(bary, PrimID);
        if( alpha >= g_fAlphaThreshold )
            payload.visibility -= ( 1.0f / 3.0f );
        if( payload.visibility < 0.01f )
        {
            payload.visibility = 0.0f;
            AcceptHitAndEndSearch();
        }
        else
            IgnoreHit();
    }
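
To see why the subtraction scheme does not care about anyhit() ordering, the shader logic can be mirrored on the CPU. This is a simulation written for illustration only: ShadowPayload, translucentAnyHit and traceThroughLayers are hypothetical names, and the searchEnded flag stands in for AcceptHitAndEndSearch().

```cpp
#include <cassert>
#include <vector>

struct ShadowPayload {
    float visibility = 1.0f;   // 1 = fully lit, 0 = fully shadowed
    bool searchEnded = false;  // stands in for AcceptHitAndEndSearch()
};

// CPU mirror of TranslucentAnyHit: each hit above the alpha threshold
// removes one third of the light, regardless of the order of the hits.
inline void translucentAnyHit(ShadowPayload& payload, float alpha, float alphaThreshold) {
    if (payload.searchEnded) return;
    if (alpha >= alphaThreshold) payload.visibility -= 1.0f / 3.0f;
    if (payload.visibility < 0.01f) {
        payload.visibility = 0.0f;
        payload.searchEnded = true;
    }
}

// Feeds the hits to the any-hit logic in the given (arbitrary) order.
inline float traceThroughLayers(const std::vector<float>& alphas, float alphaThreshold) {
    ShadowPayload payload;
    for (float alpha : alphas) translucentAnyHit(payload, alpha, alphaThreshold);
    return payload.visibility;
}
```

Passing the same set of alpha values in a different order yields the same visibility, and three hits above the threshold drive the pixel into full shadow.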

slide-52
SLIDE 52

52

Opaque raytraced shadows

slide-53
SLIDE 53

53

Translucent raytraced shadows

slide-55
SLIDE 55

55

DXR SHADERS

  • Like many games SotTR uses jittered TAA
  • Each frame adds a ‘random’ subpixel offset to all geometry
  • Surprisingly this creates problems with flickering shadows!

TAA + Jittering

slide-56
SLIDE 56

56

DXR SHADERS

TAA + Jittering

a 2x2 pixel grid

  • The red dots are the pixel centers
  • This is where rasterized geometry is sampled

slide-57
SLIDE 57

57

DXR SHADERS

TAA + Jittering

Rasterizing a 3D quadrangle

The intermediate positions and the grid are shown to help understand how 3D positions change across the quad

slide-58
SLIDE 58

58

DXR SHADERS

TAA + Jittering

Jitter somewhat …

slide-59
SLIDE 59

59

DXR SHADERS

  • Jittering changes the WS position that is sampled at pixel centers
  • It also changes the depth values at the pixel centers
  • Jittering changes the reconstructed world space positions

TAA + Jittering

Shadow ray origins jitter as well

slide-60
SLIDE 60

60

DXR SHADERS

Jittered ray positions are not problematic in general, but:

  • We typically shoot only one ray per pixel
  • Which is equivalent to ‘point sampling’ of the visibility signal
  • Large areas of flat ground are problematic
  • Vertical jittering leads to large differences in WS positions
  • Also visible with shadow maps but less because of SM filtering

TAA + Jittering

slide-61
SLIDE 61

61

VIDEO

slide-62
SLIDE 62

62

DXR SHADERS

Solutions:

  • 1. Currently we render an extra depth pass without jittering
      • Use non-jittered depth to reconstruct WS ray origins
  • 2. Future: Render ddx/ddy(1/z_buffer_depth) with depth pass
      • Reconstruct non-jittered depth
      • Use non-jittered depth to construct WS ray origins
      • 1/z-buffer-depth is linear in screen space

TAA + Jittering
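
The second (future) solution relies on a quantity that varies linearly in screen space, so its value at the unjittered pixel center can be extrapolated from the jittered sample and its screen-space derivatives. The sketch below shows that extrapolation in isolation. The function names and the sign convention of the jitter offset (sample position = pixel center + jitter, in pixels) are assumptions, and q stands for whatever linear quantity is stored (the slide names 1/z_buffer_depth).

```cpp
#include <cassert>

// q was sampled at (pixel center + jitter). Since q is assumed to be linear
// in screen space, stepping back by the jitter along its derivatives
// recovers the value at the pixel center.
inline float removeJitter(float qJittered, float dqdx, float dqdy,
                          float jitterX, float jitterY) {
    return qJittered - jitterX * dqdx - jitterY * dqdy;
}

// Converts the reconstructed inverse depth back to a depth value.
inline float depthFromInverse(float inverseDepth) {
    assert(inverseDepth != 0.0f);
    return 1.0f / inverseDepth;
}
```

The reconstructed depth would then feed the same world-space reconstruction that the extra non-jittered depth pass feeds today.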

slide-64
SLIDE 64

64

INPUT / OUTPUT

GameWorks Spatial Denoiser (Edward Liu & Jon Story)

  • Visibility
  • HitT
  • Normals
  • Depth
  • Light Desc
  • Params

slide-65
SLIDE 65

65

ISOTROPIC KERNEL

slide-66
SLIDE 66

66

ANISOTROPIC KERNEL

slide-67
SLIDE 67

67

OVERLAPPING PENUMBRA #1

slide-68
SLIDE 68

68

OVERLAPPING PENUMBRA #2

slide-69
SLIDE 69

69

Need to detect depth boundaries

BLEEDING ARTIFACTS

slide-70
SLIDE 70

70

CUSTOMIZED BOUNDARY DETECTION

slide-71
SLIDE 71

71

COULD WE DO LESS WORK?

slide-72
SLIDE 72

72

PENUMBRA MASK

slide-73
SLIDE 73

73

IMPORTANT FEATURES

  • Half resolution denoising
      • Drastically improves performance
      • SOTTR uses this mode for ALL light types
  • MSAA input Depth & Normal buffers supported
      • Still only requires single sample Visibility & HitT buffers
      • Produces MSAA shadow mask
  • Sub-viewports supported for local light sources
      • Just need to figure out screen area affected

slide-75
SLIDE 75

75

BLAS

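“a mesh”

  • Vertex and index buffers for each geometry
  • Straightforward for static geometry

    // Abbreviated as on the slide. The full SDK struct also has a Transform3x4
    // member, and its VertexBuffer is a D3D12_GPU_VIRTUAL_ADDRESS_AND_STRIDE.
    struct D3D12_RAYTRACING_GEOMETRY_TRIANGLES_DESC
    {
        DXGI_FORMAT IndexFormat;
        DXGI_FORMAT VertexFormat;
        UINT IndexCount;
        UINT VertexCount;
        D3D12_GPU_VIRTUAL_ADDRESS IndexBuffer;
        D3D12_GPU_VIRTUAL_ADDRESS VertexBuffer;
    };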

slide-76
SLIDE 76

76

BLAS

What about skinned objects and vertex animations?

  • Each vertex needs to be fully transformed!
  • Foundation Engine uses shader graphs
  • Added a shader permutation in VS template for exporting a transformed vertex buffer
  • Run a pass for all dynamic objects before building

    #if ExportVertexBuffer
    RWStructuredBuffer<float3> OutVertexBuffer;
    #endif

    VertexOutput main( in VertexInput vi, uint vertID : SV_VertexID)
    {
        VertexOutput vo;
        %ShaderGraph%
    #if ExportVertexBuffer
        OutVertexBuffer[vertID] = vo.OutPosition;
    #endif
        return vo;
    }

slide-77
SLIDE 77

77

Skinning gone wrong: Inner demon ☺

slide-78
SLIDE 78

78

BLAS

Lara’s hair

  • PureHair, an evolution of TressFX
      • Simulates control points
      • Renders strands of hair as camera facing quads
  • Everything needs to be actual geometry in the AS
      • Make the simplest cylinder possible for every strand

slide-79
SLIDE 79

79

BLAS

Two modes of updating dynamic BLASes in DXR:

  • Rebuild, essentially “replacing” the old one (~100M tris/sec)
  • Refit, for “small” model changes (~1000M tris/sec, 10x as fast!)

Catch: ray trace performance might degrade!

Top refitting throughput only for large enough workloads

We chose to always refit BLAS unless the number of vertices changes

Rebuild/refit strategy
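
The stated policy and the throughput figures can be turned into a small back-of-the-envelope helper. The BlasUpdate enum and both function names are invented for this sketch; the throughput constants are the approximate numbers quoted on the slide, so the resulting times are rough estimates rather than measurements.

```cpp
#include <cassert>
#include <cstdint>

enum class BlasUpdate { Refit, Rebuild };

// Policy from the slide: always refit unless the vertex count changed.
inline BlasUpdate chooseBlasUpdate(uint32_t previousVertexCount,
                                   uint32_t currentVertexCount) {
    return previousVertexCount == currentVertexCount ? BlasUpdate::Refit
                                                     : BlasUpdate::Rebuild;
}

// Rough cost in milliseconds using the quoted throughputs:
// ~100M tris/sec for a rebuild, ~1000M tris/sec for a refit.
inline double estimateUpdateMs(BlasUpdate mode, uint64_t triangleCount) {
    const double trisPerSecond = (mode == BlasUpdate::Refit) ? 1000.0e6 : 100.0e6;
    return static_cast<double>(triangleCount) / trisPerSecond * 1000.0;
}
```

With these figures, one million triangles come out at roughly 10 ms for a rebuild and roughly 1 ms for a refit, provided the workload is large enough to reach top refitting throughput.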

slide-80
SLIDE 80

80

TLAS

“A scene”

  • Static BLASes can be instanced
  • Always rebuilding TLAS seems to be fast enough (<1ms)

    // Abbreviated as on the slide. The full SDK struct also has Type, Flags
    // and DescsLayout members.
    struct D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS
    {
        UINT NumDescs;
        D3D12_GPU_VIRTUAL_ADDRESS InstanceDescs;
    };

slide-81
SLIDE 81

81

ACCELERATION STRUCTURE

About LODs of meshes

  • Every LOD level is stored in a separate BLAS
  • Using LOD 0 for everything caused self-shadowing artifacts!
  • Just use the same LOD we use for rendering
  • What about LOD fading?
      • Use “most visible” LOD

slide-83
SLIDE 83

83

RENDER PIPELINE

Depth pass Shadow map pass Shadow resolve Forward opaque pass

Forward+ renderer

slide-84
SLIDE 84

84

RENDER PIPELINE

Vertex transform Build AS Depth pass Jittered/ Non jittered Shadow map pass (Ray traced) Shadow resolve Forward opaque pass

Now with ray traced shadows!

slide-85
SLIDE 85

85

RENDER PIPELINE

Vertex transform Build AS Depth pass Jittered/ Non jittered Shadow map pass (Ray traced) Shadow resolve Forward opaque pass

Now with ray tracing!

4ms 0.5ms 2ms 3ms 5ms 3ms

slide-86
SLIDE 86

86

RENDER PIPELINE

Vertex transform Build AS Depth pass Jittered/ Non jittered Shadow map pass (Ray traced) Shadow resolve Forward opaque pass

4ms 0.5ms 2ms 3ms 5ms 3ms Can run async with depth and shadow map passes! ☺

slide-87
SLIDE 87

87

Vertex transform

Depth pass Jittered/ Non jittered

Shadow map pass (Ray traced) Shadow resolve Forward opaque pass

3ms

RENDER PIPELINE

Async compute

Build AS

4ms completely hidden! 0.5ms 2ms 5ms 3ms

slide-88
SLIDE 88

88

WHY DO WE STILL NEED SHADOW MAPPING?

  • Translucent rendering has no depth write
  • Can’t use shadow resolve pass!
  • We cannot shoot rays from pixel shaders

Translucent rendering

slide-89
SLIDE 89

89

WHY DO WE STILL NEED SHADOW MAPPING?

  • Updating entire scene full of dynamic objects costs up to 20ms of BLAS refits
  • AS culling using existing shadow map culling

Performance!

slide-90
SLIDE 90

90

WHY DO WE STILL NEED SHADOW MAPPING?

For directional lights:

Replace only nearest cascades with ray traced shadows

For local lights:

Distance based fade to shadow map

How do we choose these distances?

Performance!

slide-91
SLIDE 91

91

ARTIST TOOLS

Let lighting artists decide!
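
A distance based fade for local lights could be as simple as the linear blend sketched below. Everything here is an assumption made for illustration: the fadeStart and fadeEnd parameters stand in for whatever distances the lighting artists set, the distance is assumed to be measured from the camera, and the talk does not say which fade curve the game uses.

```cpp
#include <algorithm>
#include <cassert>

// Weight of the ray traced result: 1 up to fadeStart, falling linearly
// to 0 at fadeEnd, where the shadow map takes over completely.
inline float rayTracedWeight(float distance, float fadeStart, float fadeEnd) {
    assert(fadeEnd > fadeStart);
    const float t = (distance - fadeStart) / (fadeEnd - fadeStart);
    return 1.0f - std::min(std::max(t, 0.0f), 1.0f);
}

// Blends the two shadow terms with the weight computed above.
inline float blendShadow(float rayTraced, float shadowMapped, float weight) {
    return weight * rayTraced + (1.0f - weight) * shadowMapped;
}
```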

slide-93
SLIDE 93

93

No point light shadows

slide-94
SLIDE 94

94

Raytraced point light shadows

slide-95
SLIDE 95

95

Shadow mapped area light shadows

slide-96
SLIDE 96

96

Raytraced area light shadows

slide-97
SLIDE 97

97

Shadow mapped sun light shadows

slide-98
SLIDE 98

98

Raytraced sun light shadows

slide-100
SLIDE 100

100

FUTURE WORK

  • Reconstruct non-jittered depth
  • Ray traced shadows on translucent geometry
  • Tessellation
  • Content authoring with ray tracing in mind
  • Use vertex transform pass for rasterization as well
  • GI / Reflections / AO / … ?

slide-101
SLIDE 101

www.nvidia.com/GTC

Holger Gruen (hgruen@nvidia.com) Jon Story (jons@nvidia.com) Michiel Roza (mroza@nixxes.com) @Paramike86

QUESTIONS