

SLIDE 1

SFIO progress on Swiss-Tx

SCS meeting on Frangipani: a scalable distributed file system to Linux
December 1, 2000

Emin Gabrielyan
EPFL, Computer Science Dept., Peripheral Systems Lab.
Emin.Gabrielyan@epfl.ch

SFIO library architecture
SFIO on top of MPICH and on top of MPIFCI, performance on T1
Performance of SFIO on top of MPIFCI on T1. Very large files, no cache effect.
Swiss-T1’s topology. Possible influence on the SFIO performance
Conclusion
Future work

SLIDE 2

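[Figure: SFIO library architecture, showing a Compute Node and an I/O Node connected through MPI.]

Functions labelled in the diagram: mread, mwrite, mreadc, mwritec, mreadb, mwriteb, mrw, sfp_read, sfp_write, sfp_readc, sfp_writec, sfp_rdwrc, sfp_readb, sfp_writeb, sfp_rflush, sfp_wflush, sfp_waitall, flushcache, sortcache, mkbset, bkmerge

Commands labelled in the diagram: SFP_CMD_WRITE, SFP_CMD_READ, SFP_CMD_BREAD, SFP_CMD_BWRITE

Stages labelled in the diagram: cyclic distribution, requests caching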

SLIDE 3

SFIO all-to-all concurrent write access from all compute nodes to all I/O nodes.

Global file size is 2000 MByte for both MPICH and MPIFCI.

Stripe unit size is only 200 bytes.

[Figure: nodes tonep0, tonep1, tonep2 and tonep3, each shown with an I/O process and a Compute process, connected by the Network.]
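The cyclic distribution with a 200-byte stripe unit can be illustrated with a minimal sketch. This is not SFIO code: the helper names `locate` and `split_request` are hypothetical, and the sketch assumes plain round-robin striping of fixed-size units over the I/O nodes.

```python
from collections import defaultdict

STRIPE_UNIT = 200  # bytes, the stripe unit size quoted on the slide


def locate(offset, num_io_nodes, stripe_unit=STRIPE_UNIT):
    """Map a global file offset to (I/O node index, offset in that node's local file)."""
    unit = offset // stripe_unit        # index of the stripe unit holding this byte
    node = unit % num_io_nodes          # units are dealt out round-robin (assumption)
    local_unit = unit // num_io_nodes   # units this node holds before this one
    return node, local_unit * stripe_unit + offset % stripe_unit


def split_request(offset, length, num_io_nodes, stripe_unit=STRIPE_UNIT):
    """Split a contiguous global request into per-node (local_offset, length) fragments."""
    fragments = defaultdict(list)
    end = offset + length
    while offset < end:
        # A fragment never crosses a stripe-unit boundary.
        chunk = min(end - offset, stripe_unit - offset % stripe_unit)
        node, local = locate(offset, num_io_nodes, stripe_unit)
        fragments[node].append((local, chunk))
        offset += chunk
    return dict(fragments)


if __name__ == "__main__":
    # A 1000-byte write at global offset 150, striped over 4 I/O nodes.
    for node, frags in sorted(split_request(150, 1000, 4).items()):
        print(node, frags)
```

With such a small stripe unit, even a modest contiguous request touches every I/O node, which is what makes the access pattern all-to-all.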

SLIDE 4

Superlinear speedup of SFIO over MPI-FCI, due to the cache effect growing as the number of I/O nodes increases (the global file size is fixed, so each I/O node holds a smaller share of it).

[Figure: SFIO all-to-all I/O performance on Swiss-T1’s Fast Ethernet and Tnet. Two panels: SFIO on top of MPICH (performance axis 10 to 70 MB/s) and SFIO on top of MPI-FCI (performance axis 100 to 800 MB/s). Horizontal axis in both panels: number of compute and I/O nodes, 1 to 31.]

SLIDE 5

To avoid the cache effect, the total size of the SFIO files increases as the number of I/O nodes grows.

[Figure: SFIO all-to-all performance on T1 (1 GB to 31 GB file size, 200-byte chunk, 53 measurements). Series: write maximum, write average, read maximum, read average. Horizontal axis: number of I/O nodes, 1 to 31. Vertical axis: throughput, 50 to 400 MB/s.]

SLIDE 6

[Figure: Swiss-T1 TNet interconnection and routing topology. Processors are labelled 00 to 63. Legend: processor, TNet 12-port full crossbar switch, TNet connection, logical routing.]

SLIDE 7

[Figure: Swiss-T1 SFIO over TNet topology. Nodes are labelled 00 to 63. Legend: I/O node, compute node, TNet 12-port full crossbar switch, TNet connection, logical routing.]

SLIDE 8

connection loads

[Figure: four panels, one per processor count, each showing the Tnet switches (labelled 1 to 7), a value on every compute and I/O node, and the connection load values.]

36 Pr.: value 56 on each compute and I/O node; connection loads 100, 100, 75, 75, 75, 50, 50, 25
38 Pr.: value 53 on each compute and I/O node; connection loads 89, 89, 78, 100, 78, 44, 44, 33
40 Pr.: value 42 on each compute and I/O node; connection loads 67, 67, 67, 100, 67, 33, 33, 33
42 Pr.: value 44 on each compute and I/O node; connection loads 67, 75, 67, 100, 8, 8, 67, 17, 33, 42, 33, 8

SLIDE 9

Theoretical throughput

[Figure: throughput in percentage (10 to 70) against number of nodes (1 to 29). Series: max, aver, CODINE, min.]

Theoretical throughput of the Swiss-T1 network as a percentage of the ideal throughput of a full crossbar switch, from 1,008,000 Monte Carlo events of parallel simulation. The min curve represents the worst topology and the max curve the best.
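The idea of such a Monte Carlo estimate can be sketched as follows. This is not the simulation behind the slide. It assumes a hypothetical toy topology (a chain of 4 switches with 4 nodes each, not the Swiss-T1 layout), equal capacity on every link, all-to-all traffic with one unit flow per ordered node pair, and throughput limited by the busiest link. All names are illustrative.

```python
import random
from collections import Counter

SWITCHES = 4           # hypothetical toy topology, not the Swiss-T1 layout
NODES_PER_SWITCH = 4


def switch_of(node):
    """Switch to which a node is attached (nodes are numbered consecutively)."""
    return node // NODES_PER_SWITCH


def relative_throughput(nodes):
    """All-to-all throughput of the allocated nodes as a fraction of an ideal crossbar."""
    n = len(nodes)
    if n < 2:
        return 1.0
    load = Counter()  # flows per directed inter-switch link (s, s + 1) or (s, s - 1)
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            a, b = switch_of(src), switch_of(dst)
            step = 1 if b > a else -1
            for s in range(a, b, step):   # empty when both nodes share a switch
                load[(s, s + step)] += 1
    # On an ideal crossbar the busiest link is a node's own link, carrying n - 1 flows.
    ideal = n - 1
    worst = max(load.values(), default=0)
    return ideal / max(ideal, worst)


def monte_carlo(n, events, seed=0):
    """Return (min, average, max) relative throughput over random allocations of n nodes."""
    rng = random.Random(seed)
    all_nodes = range(SWITCHES * NODES_PER_SWITCH)
    samples = [relative_throughput(rng.sample(all_nodes, n)) for _ in range(events)]
    return min(samples), sum(samples) / len(samples), max(samples)


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        low, avg, high = monte_carlo(n, 1000)
        print(n, round(100 * low), round(100 * avg), round(100 * high))
```

In this sketch the min and max over the random allocations play the role of the worst and best placement of the nodes on the network.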

SLIDE 10

Conclusion

SFIO is portable, highly scalable, and ready for distribution.

SLIDE 11

Future work

SFIO performance benchmarking on the large supercomputer of Sandia National Laboratories.

Performance measurements of MPI-I/O interfaced to SFIO through MPICH/ADIO.

Possibly, creation of a portable MPI-I/O interface library to SFIO.