Superinstructions and Replication in the Cacao JVM interpreter

SLIDE 1

Superinstructions and Replication in the Cacao JVM interpreter

M. Anton Ertl, Christian Thalinger, Andreas Krall

TU Wien

SLIDE 2

Why interpreters?

[Figure: Execution Time plotted against Porting/Retargeting Effort (d) for Interpreters and JITs]

Architecture  Mono  Cacao
Alpha  interp.  JIT
AMD64  JIT  JIT
ARM  JIT  JIT
HP-PA  interp.
IA32  JIT  JIT
IA64  JIT
MIPS  JIT
MIPS64  JIT
PowerPC  JIT  JIT
PowerPC64  interp.
s390  JIT
s390x  JIT
SPARC
SPARC64

SLIDE 3

Threaded Code

  • VM Code: imul, iadd, iadd, ...
  • VM instruction routines:
      iadd: Machine code for iadd, then Dispatch next instruction
      imul: Machine code for imul, then Dispatch next instruction
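The threaded-code layout on this slide can be sketched in C. This is a minimal illustration, not Cacao's interpreter: it assumes the GNU C "labels as values" extension (GCC and Clang, not ISO C), and the opcode set (`OP_LIT`, `OP_IADD`, `OP_IMUL`, `OP_HALT`), the function name `run_threaded` and the limit `MAX_CELLS` are hypothetical names chosen for the example. The VM code is first translated into an array of routine addresses, and every routine ends with its own "dispatch next" jump.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode set for this sketch. */
enum { OP_LIT, OP_IADD, OP_IMUL, OP_HALT };

#define MAX_CELLS 64

/* Translates ops[0..n-1] into direct-threaded code and runs it.
 * Assumes a well-formed program (valid opcodes, no stack underflow,
 * terminated by OP_HALT). Returns the top of stack at OP_HALT. */
long run_threaded(const long *ops, size_t n)
{
    /* One entry per opcode: the address of its routine (GNU C extension). */
    static void *const routine[] = { &&lit, &&iadd, &&imul, &&halt };
    void *code[MAX_CELLS];   /* threaded VM code: routine addresses + operands */
    long stack[MAX_CELLS];
    long *sp = stack;
    void **ip;
    size_t i;

    if (n == 0 || n > MAX_CELLS)
        return 0;

    /* Translation: replace each opcode by the address of its routine.
     * OP_LIT carries one inline operand, which is copied unchanged. */
    for (i = 0; i < n; i++) {
        long op = ops[i];
        code[i] = routine[op];
        if (op == OP_LIT && i + 1 < n) {
            i++;
            code[i] = (void *)(intptr_t)ops[i];
        }
    }

    ip = code;
#define NEXT goto **ip++          /* "Dispatch next instruction" */
    NEXT;

lit:  *sp++ = (long)(intptr_t)*ip++;  NEXT;
iadd: sp--; sp[-1] += sp[0];          NEXT;
imul: sp--; sp[-1] *= sp[0];          NEXT;
halt: return sp[-1];
#undef NEXT
}
```

Running the program `OP_LIT 2, OP_LIT 3, OP_IADD, OP_LIT 4, OP_IMUL, OP_HALT` through `run_threaded` computes (2 + 3) * 4. Note that each routine has its own copy of the dispatch jump, in contrast to a `switch`-based interpreter with one shared dispatch.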

SLIDE 4

Dynamic Superinstructions

  • threaded VM Code (data segment): iload b, iload c, isub, istore a
  • VM routine template (code segment): Machine code for iload + Dispatch next; Machine code for isub + Dispatch next; Machine code for istore + Dispatch next; ...
  • dyn. superinst code (data segment): Machine code for iload, Machine code for iload, Machine code for isub, Machine code for istore, ..., Dispatch next
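The construction shown in the diagram can be modelled in a few lines of C. This is only a model under stated assumptions: strings stand in for the machine code of each VM routine, and the type `RoutineTemplate` and function `build_superinstruction` are hypothetical names. A real implementation copies actual machine code into executable memory, which is not attempted here. The point illustrated is that the bodies are concatenated and only one dispatch remains at the end.

```c
#include <stddef.h>
#include <string.h>

/* Model only: strings stand in for machine code. */
typedef struct {
    const char *body;      /* the work of the VM instruction */
    const char *dispatch;  /* the "Dispatch next" code ending the routine */
} RoutineTemplate;

/* Builds the code of one dynamic superinstruction for seq[0..n-1] in out:
 * all bodies are copied back to back and only the last routine's dispatch
 * is kept. Returns the number of bytes written (excluding the terminating
 * '\0'), or 0 if n == 0 or out is too small. */
size_t build_superinstruction(const RoutineTemplate *seq, size_t n,
                              char *out, size_t cap)
{
    size_t used = 0, len, i;

    if (n == 0)
        return 0;
    for (i = 0; i < n; i++) {
        len = strlen(seq[i].body);
        if (used + len >= cap)
            return 0;
        memcpy(out + used, seq[i].body, len);  /* copy body, drop dispatch */
        used += len;
    }
    len = strlen(seq[n - 1].dispatch);
    if (used + len >= cap)
        return 0;
    memcpy(out + used, seq[n - 1].dispatch, len);  /* single dispatch at end */
    used += len;
    out[used] = '\0';
    return used;
}
```

For the sequence iload, iload, isub, istore, the model yields the four bodies followed by one dispatch, where plain threaded code would execute four dispatches.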

SLIDE 5

Replication

No Replication:

  • VM code: iload b, iload c, isub, istore a, ..., iload e, iload f, isub, istore d, ...
  • One copy of the superinstruction code: Machine code for iload, Machine code for iload, Machine code for isub, Machine code for istore, ..., Dispatch next

Replication:

  • VM code: iload b, iload c, isub, istore a, ..., iload e, iload f, isub, istore d, ...
  • Two copies of the superinstruction code, one per sequence: Machine code for iload, Machine code for iload, Machine code for isub, Machine code for istore, ..., Dispatch next

+ Increases BTB prediction accuracy
+ Simpler
− Increases code size
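Why replication helps the BTB can be illustrated with a small simulation. The sketch below assumes an idealized BTB that remembers, for every indirect branch, only its last target, with no capacity or aliasing effects; the opcode names and the function `count_mispredictions` are hypothetical. Without replication all instances of a VM instruction share one dispatch branch; with replication every position in the VM code has its own.

```c
#include <stddef.h>

enum { ILOAD, ISUB, ISTORE, NUM_OPS };  /* hypothetical opcodes */

#define MAX_CODE 64

/* Counts mispredicted dispatches while executing code[0..n-1] as a loop
 * for the given number of iterations. The dispatch at the end of
 * instruction i jumps to instruction (i + 1) % n.
 *   replicate == 0: one dispatch branch per opcode (shared routine)
 *   replicate != 0: one dispatch branch per VM code position (own copy)
 * Model assumption: the BTB predicts "same target as last time". */
unsigned count_mispredictions(const int *code, size_t n,
                              unsigned iterations, int replicate)
{
    int last_target[MAX_CODE];  /* last target per branch, -1 = no entry */
    unsigned misses = 0, it;
    size_t i;

    if (n == 0 || n > MAX_CODE)
        return 0;
    for (i = 0; i < n; i++)
        if (code[i] < 0 || code[i] >= NUM_OPS)
            return 0;
    for (i = 0; i < MAX_CODE; i++)
        last_target[i] = -1;

    for (it = 0; it < iterations; it++) {
        for (i = 0; i < n; i++) {
            size_t next = (i + 1) % n;
            size_t branch = replicate ? i : (size_t)code[i];
            int target = replicate ? (int)next : code[next];
            if (last_target[branch] != target)
                misses++;
            last_target[branch] = target;
        }
    }
    return misses;
}
```

For the loop body iload, iload, isub, istore, the shared dispatch of iload alternates between the targets iload and isub and is mispredicted every time in this model, whereas with replication each copy always jumps to the same successor and only the first iteration mispredicts.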

SLIDE 6

JVM and .NET problems

  • Quickening
  • Potential exception-throwing instructions
  • How much benefit?
SLIDE 7

Quickable Instructions

ACONST, ARRAYCHECKCAST, CHECKCAST,
GETFIELD_CELL, GETFIELD_INT, GETFIELD_LONG,
GETSTATIC_CELL, GETSTATIC_INT, GETSTATIC_LONG,
INSTANCEOF, INVOKEINTERFACE, INVOKESPECIAL, INVOKESTATIC, INVOKEVIRTUAL,
MULTIANEWARRAY, NATIVECALL,
PUTFIELD_CELL, PUTFIELD_INT, PUTFIELD_LONG,
PUTSTATIC_CELL, PUTSTATIC_INT, PUTSTATIC_LONG

SLIDE 8

Simple Solution

  • threaded VM Code (data segment): iload b, getfield_quick Example.i / offset, istore a
  • VM routine template (code segment): Machine code for iload + Dispatch next; Machine code for getfield + Dispatch next; Machine code for istore + Dispatch next; ...; Machine code for getfield_quick + Dispatch next
  • dyn. superinst code (data segment): Machine code for iload, Dispatch next; Machine code for istore, ..., Dispatch next
  • Diagram labels: before executing getfield, after executing getfield
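The rewrite that the "before executing getfield" and "after executing getfield" labels refer to is quickening itself: on first execution the instruction resolves its symbolic operand and patches the VM code in place. A minimal sketch follows. It uses a `switch`-based loop for brevity rather than threaded code, and the opcodes, the resolver `resolve_field`, its field table and the counter `resolve_calls` are hypothetical stand-ins, not Cacao's data structures.

```c
#include <stdint.h>
#include <string.h>

enum { OP_GETFIELD, OP_GETFIELD_QUICK, OP_HALT };  /* hypothetical opcodes */

static int resolve_calls;  /* counts how often the slow path runs */

/* Hypothetical resolver: maps a symbolic field name to an index into the
 * object's field array. */
static intptr_t resolve_field(const char *name)
{
    resolve_calls++;
    return strcmp(name, "Example.i") == 0 ? 1 : 0;
}

/* Runs code against the object whose fields are given and returns the
 * last value loaded (or -1 on an unknown opcode). OP_GETFIELD rewrites
 * itself into OP_GETFIELD_QUICK, so code must be writable. */
long run_quickening(intptr_t *code, const long *fields)
{
    intptr_t *ip = code;
    long value = 0;

    for (;;) {
        switch (*ip) {
        case OP_GETFIELD:
            /* Slow path: replace the symbolic reference by the offset,
             * patch the opcode, then re-execute as the quick version. */
            ip[1] = resolve_field((const char *)ip[1]);
            ip[0] = OP_GETFIELD_QUICK;
            break;
        case OP_GETFIELD_QUICK:
            value = fields[ip[1]];
            ip += 2;
            break;
        case OP_HALT:
            return value;
        default:
            return -1;
        }
    }
}
```

After the first run the code array holds `OP_GETFIELD_QUICK` with a numeric offset, and later runs never call the resolver again. This in-place patching is what makes quickable instructions awkward for dynamic superinstructions, whose machine code is copied before the quick version is known.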

SLIDE 9

SableVM’s Sophisticated Solution

  • threaded VM Code (data segment): iload b, getfield_quick offset, istore a
  • VM routine template (code segment): Machine code for iload + Dispatch next; Machine code for getfield + Dispatch next; Machine code for istore + Dispatch next; ...; Machine code for getfield_quick + Dispatch next; Machine code for goto + Dispatch next; Machine code for replace + Dispatch next
  • dyn. superinst code (data segment): Machine code for skip_operand, Machine code for iload, Machine code for getfield_quick, Machine code for istore, ..., Dispatch next
  • Further VM code in the diagram: super|goto prepseq, iload b, getfield Example.i p-ptr, istore a, ...
  • Diagram labels: before executing prepseq, after executing prepseq, replace super inst-ptr, goto behind, unused slot

SLIDE 10

Cacao’s Sophisticated Solution

  • threaded VM Code (data segment): super|iload b, getfield Example.i / offset, istore a
  • VM routine template (code segment): Machine code for iload + Dispatch next; Machine code for getfield + Dispatch next; Machine code for istore + Dispatch next; ...; Machine code for getfield_quick + Dispatch next
  • dyn. superinst code (data segment): Machine code for iload, Machine code for getfield_quick, Machine code for istore, ..., Dispatch next
  • superstart table: last quickable inst, threaded code start, real-machine code
  • Diagram labels: before executing getfield, after executing getfield

SLIDE 11

Potential Exception-Throwing Instructions

IALOAD, LALOAD, AALOAD, BALOAD, CALOAD, SALOAD,
IASTORE, LASTORE, BASTORE, CASTORE,
IDIV, IREM,
GETFIELD_CELL, GETFIELD_INT, GETFIELD_LONG,
PUTFIELD_CELL, PUTFIELD_INT, PUTFIELD_LONG,
INVOKEVIRTUAL, INVOKESPECIAL, INVOKEINTERFACE,
ARRAYLENGTH, CHECKNULL

SLIDE 12

Problem and Solution

Problem:

    getfield_cell:
        mov  (%edi),%eax
        add  $0x8,%edi
        test %ebp,%ebp
        je   throw
        add  $0x4,%edi
        mov  (%eax,%ebp,1),%ebp
        jmp  *-4(%edi)

Solution:

    getfield_cell:
        mov  (%edi),%eax
        add  $0x8,%edi
        test %ebp,%ebp
        jne  no_throw
        jmp  *0x2a0(%esp)
    no_throw:
        add  $0x4,%edi
        mov  (%eax,%ebp,1),%ebp
        jmp  *-4(%edi)
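The relocation issue behind the two listings comes down to address arithmetic, which can be checked with a short C sketch. The addresses, the 6-byte instruction length and the function names `relative_target` and `indirect_target` are illustrative assumptions, not values taken from Cacao. An x86 relative jump encodes a displacement from the end of the jump instruction, so copying the routine elsewhere moves the target with it; an indirect jump through a memory slot keeps pointing at whatever address the slot holds.

```c
#include <stdint.h>

/* Target of a relative jump: address of the following instruction plus
 * the encoded displacement. */
uint32_t relative_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}

/* Target of an indirect jump through a memory slot: the slot's contents,
 * independent of where the jump instruction itself is located. */
uint32_t indirect_target(const uint32_t *slot)
{
    return *slot;
}
```

With a hypothetical `throw` routine at 0x08050000 and a 6-byte `je` at 0x08048100, the displacement is 0x7EFA. Copied unchanged to 0x09000000, the same bytes jump to 0x09007F00, which is no longer `throw`. The `jne no_throw` in the solution is still relative, but its target lies inside the copied fragment and moves along with it, while the jump that leaves the fragment has become indirect.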

SLIDE 13

Speedup over plain threaded code

[Figure: speedup on a Pentium 4 for the benchmarks compress, jess, db, javac, mpegaudio, mtrt and jack. Variants: plain threaded code; -throw simple -repl; -throw soph -repl; -throw simple +repl; -throw soph +repl; +throw simple -repl; +throw soph -repl; +throw simple +repl; +throw soph +repl]

SLIDE 14

Speedup of various JVMs over Cacao with Superinstructions

[Figure: speedup on a Pentium 4 for the benchmarks compress, jess, db, javac, mpegaudio, mtrt and jack. JVMs: Kaffe int, JamVM, gij, HotSpot int, J9 int, SableVM, cacao threaded, cacao +throw soph +repl, cacao jit, HotSpot mixed, J9 mixed, Jikes RVM, jrockit, kaffe jit]

SLIDE 15

Conclusion

  • Superinstructions can provide big speedups
  • Replication has little impact
  • Quickening: new sophisticated solution, but the simple solution performs well in a JIT setting
  • Relocatability of throwing VM instructions: big performance impact; solution: replace relative with indirect jumps