Good Times with a Pile of GPUs Elizabeth Baumel About me Unity - - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Good Times with a Pile of GPUs

Elizabeth Baumel

slide-2
SLIDE 2

About me

  • Unity - DOTS Team
  • Disbelief - Unreal games

○ Gears of War 4 - D3D12 multi-GPU support on UWP
○ Other, more exotic D3D12 multi-GPU stuff

  • Sony - PS4 graphics dev support
  • Intel - DirectX 11 drivers
slide-3
SLIDE 3

Haha, what?

  • A pile implies more than 1
  • Multiple GPUs

○ ...you might say……..

  • multi-GPU :)
slide-4
SLIDE 4

What is Multi-GPU??

  • Using multiple GPUs in a single machine to do more work.

○ Rendering
○ Compute
○ Games/real-time rendering and simulation

  • NOT about:

○ GPU supercomputing clusters
○ Cryptocurrency mining

slide-5
SLIDE 5

History of Multi-GPU

  • 1990
  • Silicon Graphics SkyWriter

○ Dual pipeline
    ■ Dual screen
○ Hyperpipeline
    ■ 2 GPUs, 1 display

  • Alternate frame rendering

○ the OG AFR!!

Source: https://web.archive.org/web/20110715174342/http://www.reputable.com/~skywriter/skywriter/techreport/6.html

slide-6
SLIDE 6

History of Multi-GPU

  • 1998
  • 3dfx Voodoo2
  • SLI

○ Scan-Line Interleave
○ Each card rendered alternating scan lines

  • Higher max resolution

○ 1024x768 on 2 cards
○ 800x600 on 1 card

Source: https://en.wikipedia.org/wiki/Scan-Line_Interleave#/media/File:STBVoodoo2SLIcards.jpg

slide-7
SLIDE 7

History of Multi-GPU

  • 2002
  • ATI Multi-Rendering
  • Super Tiling

○ Tiled rendering across N GPUs
○ Where N is “dozens”

  • Used by Evans and Sutherland

Source: https://hothardware.com/reviews/ati-crossfire-multigpu-technology-preview?page=2

slide-8
SLIDE 8

History of Multi-GPU

  • 2004
  • Nvidia SLI

○ Now “Scalable Link Interface”
○ Custom PCB, links 2 identical GPUs

  • Split-frame Rendering (SFR)

○ Load balanced

  • Alternate frame Rendering (AFR)
  • DirectX 9

○ mGPU handled by driver
○ Game-specific profiles

Source: https://www.hexus.net/tech/reviews/graphics/916-nvidias-sli-an-introduction/?page=6

slide-9
SLIDE 9

History of Multi-GPU

  • 2005
  • ATI CrossFire

○ Dual-link DVI Y-dongle
○ Links cards in same family

  • Modes:

○ SuperTiling
○ Scissor (SFR)
○ AFR
○ Super AA

slide-10
SLIDE 10

History of Multi-GPU

  • 2006

○ Nvidia
    ■ Quadro Plex up to 8 GPUs
    ■ GeForce SLI up to 4 GPUs
○ ATI
    ■ CrossFire -> bridge
    ■ Bought by AMD

  • 2007

○ CrossFireX up to 4 GPUs

slide-11
SLIDE 11

History of Multi-GPU

  • 2008

○ AMD Hybrid CrossFire
    ■ 780G/V chipset’s Radeon HD3200 integrated GPU
    ■ Radeon HD3450 discrete GPU
○ Lucid Logix Hydra Engine

  • 2009

○ DirectX 11

  • 2011

○ AMD Llano APU Dual Graphics
    ■ SoC IGP + discrete GPU

slide-12
SLIDE 12

History of Multi-GPU

  • 2013

○ AMD Mantle

■ Explicit multi-GPU support!!

  • 2015

○ DirectX 12

  • 2018

○ Vulkan 1.1

slide-13
SLIDE 13

Implicit vs Explicit Multi-GPU

Implicit

  • Drivers manage resources
  • IHV implements
  • Game only sees 1 GPU
  • AFR only
  • Vendor-specific APIs let you give the driver hints
  • Surgery while wearing oven mitts!

Explicit

  • Engine manages resources
  • Developer implements
  • Game can see all GPUs
  • Flexible rendering modes
  • No driver overhead
  • You do the malpractice yourself!!

slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

Why do Multi-GPU?

  • Performance
  • *extremely IHV voice*: to sell more GPUs
  • Heterogeneous multi-GPU setups common now
  • Why not?
slide-17
SLIDE 17

Games that use explicit multi-GPU

  • Ashes of the Singularity
  • Gears of War 4
  • Deus Ex Mankind Divided
  • Strange Brigade
  • Rise of the Tomb Raider
  • Shadow of the Tomb Raider
  • Civilization Beyond Earth
  • Civilization VI
  • Hitman (2016)
  • Battlefield 1
  • Sniper Elite 4
slide-18
SLIDE 18

Games that use explicit multi-GPU

  • Ashes of the Singularity

○ Broad support for 2+ mixed adapters

  • Gears of War 4

○ AFR on 2 linked adapters

  • Civilization Beyond Earth

○ SFR

slide-19
SLIDE 19

What you can do with Explicit Multi-GPU!

  • Hardware Configurations

○ Linked Device Adapters
○ Heterogeneous multi-GPU aka Mixed Device Adapters

  • Rendering/Work Distribution Modes

○ Alternate Frame Rendering (AFR)
○ Split Frame Rendering (SFR)
○ Tiled
○ Frame Pipelining
○ Asymmetric

slide-20
SLIDE 20

Linked Adapter Multi-GPU

  • Pros

○ Fast cross-GPU copies
○ Same cards, easy scaling
○ HUGE resolutions, e.g. Nvidia Mosaic

  • Cons

○ $$$$$$$$$$$$$$$$$

slide-21
SLIDE 21

Linked Adapter Multi-GPU

[Diagram: a single IDXGIAdapter]

slide-22
SLIDE 22

Heterogeneous Multi-GPU

  • Pros

○ Use any GPUs you have!

Compute Units

  • Cons

○ Can’t assume GPUs support the same texture layouts.
○ May have vastly different specs/feature support.

slide-23
SLIDE 23

Heterogeneous Multi-GPU

[Diagram: three separate IDXGIAdapters]

slide-24
SLIDE 24

Alternate Frame Rendering

  • Good when you have beefy GPUs and few inter-frame dependencies
  • Well-understood
  • Issues

○ Need basically the same GPUs for this to work well
○ Frame pacing
○ Input latency
○ Syncing double-buffered stuff
○ Temporal effects
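A minimal sketch of the bookkeeping AFR implies, assuming a linked adapter where each GPU is a numbered node and frames are handed out round-robin. The function names are hypothetical; the only D3D12 convention relied on is that a node mask carries one bit per GPU node. It also shows why temporal effects are on the issues list: whatever the previous frame produced lives on a different node.

```cpp
#include <cassert>
#include <cstdint>

// Round-robin AFR: frame N is rendered by node N % nodeCount.
inline std::uint32_t afrNodeForFrame(std::uint64_t frameIndex, std::uint32_t nodeCount) {
    assert(nodeCount > 0);
    return static_cast<std::uint32_t>(frameIndex % nodeCount);
}

// D3D12-style node mask: bit i set means "GPU node i".
inline std::uint32_t nodeMask(std::uint32_t nodeIndex) {
    assert(nodeIndex < 32);
    return 1u << nodeIndex;
}

// History written by the previous frame lives on the previous frame's node,
// so reading it needs a cross-node copy (or cross-node access) whenever the
// two frames ran on different nodes.
inline bool historyNeedsCrossNodeCopy(std::uint64_t frameIndex, std::uint32_t nodeCount) {
    if (frameIndex == 0) {
        return false; // the first frame has no history yet
    }
    return afrNodeForFrame(frameIndex, nodeCount) !=
           afrNodeForFrame(frameIndex - 1, nodeCount);
}
```

With two nodes, every frame after the first reports that its history is on the other GPU, which is the cost a TAA-style effect would pay each frame under this scheme.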

slide-25
SLIDE 25

Split Frame Rendering

  • Split final frame into even parallel workloads
  • Great for VR!
  • Good for low input latency
  • Load balancing
  • Frame compositing
  • Not widely used in recent times
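One way the load-balancing bullet could be approached, as a sketch rather than how any particular engine does it: GPU 0 renders the top fraction of the frame, GPU 1 the rest, and the split moves toward the point where both would finish together. This assumes each GPU's cost scales linearly with its share of the frame, which real scenes only approximate; `rebalanceSplit` and its damping parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// `split` is the fraction of the frame (0..1) rendered by GPU 0.
// gpu0Ms / gpu1Ms are the measured times for the last frame's two parts.
inline double rebalanceSplit(double split, double gpu0Ms, double gpu1Ms,
                             double damping = 0.5) {
    assert(split > 0.0 && split < 1.0);
    assert(gpu0Ms > 0.0 && gpu1Ms > 0.0);
    const double cost0 = gpu0Ms / split;           // ms per unit of frame on GPU 0
    const double cost1 = gpu1Ms / (1.0 - split);   // ms per unit of frame on GPU 1
    const double ideal = cost1 / (cost0 + cost1);  // both GPUs finish together here
    const double next = split + damping * (ideal - split); // damp to avoid oscillation
    return std::min(0.9, std::max(0.1, next));     // never starve either GPU
}
```

If GPU 0 took 12 ms and GPU 1 took 6 ms on an even split, the undamped target gives GPU 0 one third of the frame; the damping just gets there over a few frames instead of jumping.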
slide-26
SLIDE 26

Tiled Rendering

  • Sorta like SFR but a lot more split up
  • Homogenize work all over your entire frame
  • Potentially lots of cross-GPU borders
  • Even rarer than SFR in recent times
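To make the "lots of cross-GPU borders" point concrete, here is a small sketch that assigns tiles to GPUs in an interleaved pattern and counts how many tile edges end up shared between different GPUs. The interleaving rule and both function names are assumptions for illustration, not a description of any shipping implementation.

```cpp
#include <cassert>
#include <cstdint>

// Interleave tiles across GPUs so expensive screen regions get spread out.
inline std::uint32_t tileOwner(std::uint32_t tileX, std::uint32_t tileY,
                               std::uint32_t gpuCount) {
    assert(gpuCount > 0);
    return (tileX + tileY) % gpuCount;
}

// Count edges between neighbouring tiles that are owned by different GPUs.
inline std::uint32_t crossGpuBorders(std::uint32_t tilesX, std::uint32_t tilesY,
                                     std::uint32_t gpuCount) {
    std::uint32_t borders = 0;
    for (std::uint32_t y = 0; y < tilesY; ++y) {
        for (std::uint32_t x = 0; x < tilesX; ++x) {
            const std::uint32_t owner = tileOwner(x, y, gpuCount);
            if (x + 1 < tilesX && owner != tileOwner(x + 1, y, gpuCount)) {
                ++borders; // right-hand neighbour is on another GPU
            }
            if (y + 1 < tilesY && owner != tileOwner(x, y + 1, gpuCount)) {
                ++borders; // neighbour below is on another GPU
            }
        }
    }
    return borders;
}
```

A 4x4 checkerboard over two GPUs has 24 cross-GPU edges, where a two-way SFR split of the same frame has a single border. Any effect that samples across tile edges pays for that difference.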
slide-27
SLIDE 27

Frame Pipelining

  • Copy intermediate steps to the next GPU
  • Works better with temporal techniques

Source: https://developer.nvidia.com/explicit-multi-gpu-programming-directx-12-part-2
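A back-of-the-envelope model of the trade-off in a two-GPU pipeline, under stated assumptions: GPU 0 runs the first part of the frame, a copy moves the intermediate results, GPU 1 finishes the frame, and all three overlap perfectly across consecutive frames. The struct and function are hypothetical, and real pipelines will not overlap this cleanly.

```cpp
#include <algorithm>
#include <cassert>

struct PipelineTiming {
    double frameIntervalMs; // steady-state time between finished frames
    double latencyMs;       // time for one frame to pass through both GPUs
};

// Throughput is limited by the slowest stage; latency is the sum of all stages.
inline PipelineTiming twoStagePipeline(double stage0Ms, double copyMs, double stage1Ms) {
    assert(stage0Ms >= 0.0 && copyMs >= 0.0 && stage1Ms >= 0.0);
    return PipelineTiming{
        std::max({stage0Ms, copyMs, stage1Ms}),
        stage0Ms + copyMs + stage1Ms
    };
}
```

With 6 ms on GPU 0, a 2 ms copy, and 8 ms on GPU 1, this model delivers a frame every 8 ms instead of every 14 ms on one GPU, but each frame takes 16 ms end to end because the copy is added to the latency.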

slide-28
SLIDE 28

Asymmetric Multi-GPU

  • Weak baby integrated GPU and RIPPED DISCRETE GPU?
  • As long as you got compute units, you can do Something
  • Short trip between iGPU and CPU, save PCIe bandwidth
slide-29
SLIDE 29

How do you actually do this

  • Enumerate adapters

○ Neat sample that shows both D3D and Vulkan:

■ https://github.com/GPUOpen-LibrariesAndSDKs/VkD3DDeviceMapping

  • Find out what features the GPUs support
  • Figure out where your resources will live
  • Figure out what needs to be synced

○ USE D3DDEBUG/VULKAN VALIDATION

  • Figure out what needs to be copied

○ Do copies on the COPY QUEUE!!!!!!!
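The first two steps can be sketched without any graphics API by treating the enumeration results as plain data. `AdapterInfo` and `classifySetup` are hypothetical; on Windows the fields would presumably be filled from DXGI adapter enumeration (software adapters are flagged there) and from `ID3D12Device::GetNodeCount`, but this snippet deliberately stays portable and does not call either.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, platform-neutral summary of one enumerated adapter.
struct AdapterInfo {
    std::string name;
    bool isSoftware;         // software rasterizer: not useful for multi-GPU
    std::uint32_t nodeCount; // >1 means several linked GPUs behind one adapter
};

enum class MultiGpuSetup { SingleGpu, LinkedNodes, MixedAdapters };

// Decide which hardware configuration the engine is looking at.
inline MultiGpuSetup classifySetup(const std::vector<AdapterInfo>& adapters) {
    std::uint32_t hardwareAdapters = 0;
    bool anyLinked = false;
    for (const AdapterInfo& adapter : adapters) {
        if (adapter.isSoftware) {
            continue; // skip software adapters entirely
        }
        assert(adapter.nodeCount >= 1);
        ++hardwareAdapters;
        if (adapter.nodeCount > 1) {
            anyLinked = true;
        }
    }
    if (anyLinked) return MultiGpuSetup::LinkedNodes;
    if (hardwareAdapters > 1) return MultiGpuSetup::MixedAdapters;
    return MultiGpuSetup::SingleGpu;
}
```

Preferring the linked case when both are present is a design choice made here for simplicity; an engine could just as well use a linked pair plus an integrated GPU at the same time.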

slide-30
SLIDE 30

Challenges to anticipate

  • SYNCHRONIZATION

○ Cross-node, cross-adapter, CPU/GPUs, AFR frame sets…..etc…….

  • Bandwidth limitations

○ Them texture copies ain’t free

  • If you have a Finished™ engine

○ Fixing all the places you assumed you had 1 GPU (heaps, command lists, basically everything)

  • Tools?

○ lmao
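To put a rough number on the bandwidth point, here is a best-case estimate of what one texture copy costs. The 15.75 GB/s figure used in the usage note is the theoretical PCIe 3.0 x16 rate and is an assumption about the link; real copies see less, plus latency and contention. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Size of an uncompressed 2D texture with no padding or mips.
inline std::uint64_t textureBytes(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t bytesPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// Best-case transfer time in milliseconds for a given link bandwidth.
inline double copyMilliseconds(std::uint64_t bytes, double gigabytesPerSecond) {
    assert(gigabytesPerSecond > 0.0);
    return static_cast<double>(bytes) / (gigabytesPerSecond * 1e9) * 1e3;
}
```

A 3840x2160 target at 4 bytes per pixel is 33,177,600 bytes, so `copyMilliseconds(textureBytes(3840, 2160, 4), 15.75)` comes out at roughly 2.1 ms. That is a noticeable slice of a 16.7 ms frame before any real-world overhead.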

slide-31
SLIDE 31

Tools...?

  • GPUView

○ Windows only :’)

  • roll your own
  • pixel shader printf debugging
slide-32
SLIDE 32

GPUView

  • Let’s profile!!
  • Download here:

○ https://docs.microsoft.com/en-us/windows-hardware/get-started/adk-install
○ Part of the Windows Performance Toolkit

  • Using D3D12HeterogeneousMultiadapter sample

○ https://github.com/Microsoft/DirectX-Graphics-Samples

slide-33
SLIDE 33

GPUView

  • [capture walkthrough]
slide-34
SLIDE 34

GPUView

  • [capture walkthrough]
slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38
slide-39
SLIDE 39

Conclusion

  • It’s cool you should try it!!!!
  • Wide open, lots of space for creativity
  • BIG CHALLENGE
  • If you ever wanted a project to really force you to think about your hardware, here u go

slide-40
SLIDE 40

Questions?

  • @Icetigris
slide-41
SLIDE 41

ADDENDA

slide-42
SLIDE 42

Dual GPU Cards

  • 1999
  • Quantum3D

○ 2x Voodoo2 SLI board

Source: https://en.wikipedia.org/wiki/Scan-Line_Interleave#/media/File:Quantum3D_Obsidian_X24_SLI_PCI.png

slide-43
SLIDE 43

Dual GPU Cards

  • 2008

○ AMD Radeon HD 3850 and 3870 X2

slide-44
SLIDE 44

Some PERF NUMBERS

  • Gears of War 4

○ 15.2 ms -> 8.6 ms
○ single GeForce GTX 980 Ti -> AFR

  • Ashes of the Singularity

○ 17.1 ms
○ Radeon R9 Fury + GeForce GTX 980
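For reading the Gears of War 4 numbers, a few hypothetical helpers that turn frame times into frame rate, speedup, and scaling efficiency. The GPU count of 2 in the usage note is taken from the earlier slide listing Gears of War 4 as AFR on 2 linked adapters.

```cpp
#include <cassert>

// Frames per second for a given frame time in milliseconds.
inline double framesPerSecond(double frameMs) {
    assert(frameMs > 0.0);
    return 1000.0 / frameMs;
}

// How many times faster the multi-GPU frame time is than the single-GPU one.
inline double speedup(double singleGpuMs, double multiGpuMs) {
    assert(singleGpuMs > 0.0 && multiGpuMs > 0.0);
    return singleGpuMs / multiGpuMs;
}

// Fraction of ideal linear scaling achieved (1.0 would be perfect).
inline double scalingEfficiency(double singleGpuMs, double multiGpuMs, int gpuCount) {
    assert(gpuCount > 0);
    return speedup(singleGpuMs, multiGpuMs) / gpuCount;
}
```

Going from 15.2 ms to 8.6 ms is about 66 fps to about 116 fps, a speedup of roughly 1.77x, or around 88% of perfect scaling on two GPUs.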

slide-45
SLIDE 45
slide-46
SLIDE 46

GPUView

  • [capture walkthrough]
slide-47
SLIDE 47

GPUView

  • [capture walkthrough]
slide-48
SLIDE 48

GPUView

  • [capture walkthrough]
slide-49
SLIDE 49

GPUView

  • [capture walkthrough]