Efficient unpacking of required software from CernVM-FS
Samuel Teuber • EP-SFT Openlab Summer Student
Nicholas Hazekamp, Jakob Blomer, Gerardo Ganis
13.08.2018
Why is this necessary? (1)
Challenges faced in some HPC environments (e.g. NERSC)
Lack of internet connection on compute nodes
Why is this necessary? (2)
Challenges faced in Benchmarking
time
(e.g. internet connection)
How are these challenges tackled today?
No hard disk: cache on the HPC file system
No internet connection: cvmfs_preload (prepopulate the CVMFS cache)
No FUSE client: uncvmfs (download entire CVMFS repositories, filter afterwards)
Benchmarking: ?
Shrink Wrapping
A method for efficiently packaging required software from CVMFS into standalone images
Specification: build a specification describing the necessary files for a software run
Export: export with cvmfs_shrinkwrap (tar, squashfs, docker, ...)
Run Independently
^/bar/etc/*
/bar/Modules/setup.sh
/foo/Packages/ROOT/*
^/foo/Packages/AliRoot/*
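How such a specification might select files can be sketched in a few lines of Python. The semantics here are assumptions read off the example entries, not the documented behaviour of cvmfs_shrinkwrap: a trailing `/*` is taken to include everything below a directory, a leading `^` is taken to mark an exclusion, and exclusions are assumed to take precedence. The function names are hypothetical.

```python
def parse_specification(text):
    """Split specification text into include and exclude entries.

    Assumption: a leading '^' marks an exclusion.
    """
    includes, excludes = [], []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        if entry.startswith("^"):
            excludes.append(entry[1:])
        else:
            includes.append(entry)
    return includes, excludes


def _matches(entry, path):
    # Assumption: 'dir/*' matches every path below 'dir/'.
    if entry.endswith("/*"):
        return path.startswith(entry[:-1])
    return path == entry


def is_selected(path, includes, excludes):
    """Return True if the path would be part of the exported image."""
    if any(_matches(entry, path) for entry in excludes):
        return False
    return any(_matches(entry, path) for entry in includes)
```

With the four entries shown above, `/foo/Packages/ROOT/bin/root` and `/bar/Modules/setup.sh` would be selected under these assumptions, while anything below `/bar/etc/` or `/foo/Packages/AliRoot/` would be left out.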
Application design
Image architecture
/cvmfs/
  .provenance/: information for image reproducibility
  repo.cern.ch/: exported repository structure
  .data/ (00/ … ff/): content-addressed file links
  .garbage/: garbage collection information
Hardlinks
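The layout above suggests that file contents are stored once under `.data/`, in a subdirectory named after the first byte of a content hash, and hardlinked into the exported repository tree. A minimal sketch of that idea follows. It is not the cvmfs_shrinkwrap implementation: the function name, the use of SHA-1, and the exact file naming inside `.data/` are assumptions made for illustration.

```python
import hashlib
import os
import shutil


def export_file(source, image_root, repo_name, rel_path):
    """Store a file by content hash and hardlink it into the repository tree.

    Assumptions: SHA-1 as the content hash, and '.data/<first two hex
    digits>/<remaining digits>' as the storage path.
    """
    with open(source, "rb") as handle:
        digest = hashlib.sha1(handle.read()).hexdigest()

    data_path = os.path.join(image_root, ".data", digest[:2], digest[2:])
    os.makedirs(os.path.dirname(data_path), exist_ok=True)
    if not os.path.exists(data_path):
        # The content is copied only once, however many paths refer to it.
        shutil.copyfile(source, data_path)

    link_path = os.path.join(image_root, repo_name, rel_path)
    os.makedirs(os.path.dirname(link_path), exist_ok=True)
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.link(data_path, link_path)  # hardlink, not a copy
    return data_path
```

Two files with identical contents end up as two hardlinks to the same entry in `.data/`, so duplicated files cost no extra space in the image.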
FS Traversal
lookup)
specification
hardlinks
interfaces (extendible to other fs architectures)
Thread pool
Docker Injection
Replacing CVMFS docker layers
Image layers: OS Base Layer, Custom Container Layers 1, CVMFS Layer
1. Identify CVMFS layer: hash of layer as image label
2. Download “old” layer version
3. Update through shrinkwrap utility
4. Upload “new” layer version
5. Update image labels & manifest
Image layers: OS Base Layer, Custom Container Layers 1, CVMFS Layer, New Custom Container Layers 2
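The bookkeeping in steps 1 and 5 can be sketched on plain dictionaries. This is an illustration only: it assumes a manifest shaped like an OCI image manifest (a `layers` list whose entries carry `digest` and `size`), and the label key `cvmfs_layer_digest` and the function name are hypothetical. Downloading, re-wrapping and uploading the layer (steps 2 to 4) are left out, as are the changes a real image would also need in its config.

```python
import copy


def inject_cvmfs_layer(manifest, labels, new_digest, new_size,
                       label_key="cvmfs_layer_digest"):
    """Swap the CVMFS layer entry in a manifest and update the label.

    The label (hypothetical key) stores the digest of the current CVMFS
    layer, which is how the layer is identified among the others.
    """
    old_digest = labels[label_key]
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched

    for layer in updated["layers"]:
        if layer["digest"] == old_digest:
            layer["digest"] = new_digest
            layer["size"] = new_size
            break
    else:
        raise ValueError("CVMFS layer not found in manifest")

    new_labels = dict(labels)
    new_labels[label_key] = new_digest
    return updated, new_labels
```

Because the layer is located by its digest rather than by its position, the same lookup works when further custom layers have been stacked on top of the CVMFS layer.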
Example & Evaluation
From a vanilla docker image...
FROM centos:7
...
ADD HEP_OSlibs.repo /etc/yum.repos.d/HEP_OSlibs.repo
RUN yum install -y HEP_OSlibs
$ cvmfs_shrinkwrap oci ROOT/root-demo -c hub.docker.com.conf*
Making image CVMFS injectable (injecting empty CVMFS layer)...
Generating local copy of specified cvmfs repository subset...
Packing tar layer...
Compressing to gzip...
Injecting updated cvmfs layer into hub.docker.com/ROOT/root-demo...
* Command line interface interaction is still subject to change
...to a CVMFS injected image
That can run ROOT demos
Export data rate with warm cache from CVMFS to POSIX folder
Tracing & Specification Building
A method for automated image specification
Trace: trace by enabling CVMFS_TRACEFILE during the workflow
Specification: automated specification building based on the trace
Export: export with cvmfs_shrinkwrap (tar, squashfs, docker, ...)
Run independently
Trace of >50k lines reduced to a specification of O(1) k lines
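One way a long trace could shrink to a much shorter specification is to collapse directories with many accessed files into a single wildcard entry. The sketch below shows that idea only. It is not the algorithm used in this work: the function name and the threshold are hypothetical, and it assumes the accessed paths have already been extracted from the trace file.

```python
from collections import defaultdict
import posixpath


def build_specification(traced_paths, threshold=3):
    """Collapse traced file paths into a shorter list of specification entries.

    Directories with at least `threshold` distinct traced files become one
    'dir/*' entry; other files are listed individually.
    """
    by_dir = defaultdict(set)
    for path in traced_paths:
        by_dir[posixpath.dirname(path)].add(path)

    spec = []
    for directory in sorted(by_dir):
        files = by_dir[directory]
        if len(files) >= threshold:
            spec.append(directory.rstrip("/") + "/*")
        else:
            spec.extend(sorted(files))
    return spec
```

A wildcard entry over-approximates: it also pulls in files of that directory that the traced run never touched. That trades image size for a shorter specification and some tolerance to runs that access slightly different files.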
Future Work
Improve shrink wrapping workflow
Understand exact use cases and optimize system based on these needs
Improve automated specification building
Make use of traces from multiple software runs to build more reliable specifications
Direct exports to other formats than POSIX?
Might be more efficient to avoid the “middleman”