SLIDE 1 Generating Neural Networks for Microscopy Segmentation
Matthew Guay
November 1, 2017
SLIDE 2
ELECTRON MICROSCOPY
Electron microscopes (EM) produce nanometer-scale images.
SLIDE 3 SERIAL BLOCK-FACE IMAGING
Serial block-face scanning electron microscopy (SBF-SEM): Image huge 3D samples by repeated cutting and scanning.
(Denk, Horstmann 2004)
SLIDE 4 SBF-SEM APPLICATIONS
SBF-SEM images provide new insight into the organization of complex biological systems.
Connectomics (vimeo.com/101018819)
Systems biology (Pokrovskaya et al., 2016)
SLIDE 5
IMAGE SEGMENTATION
Image segmentation: Partition image pixels into labeled regions corresponding to image content. [Figure: natural image and EM image examples]
SLIDE 6
IMAGE SEGMENTATION
Image segmentation: Partition image pixels into labeled regions corresponding to image content. [Figure: natural image and EM image examples, with segmentations]
SLIDE 7
AUTOMATING EM SEGMENTATION
Manual segmentation is infeasible for large SBF-SEM images. Automated segmentation: algorithmically classify each pixel, then correct errors manually. A segmentation algorithm is practical when manually correcting its output is much faster than manual segmentation.
SLIDE 8
BIOLOGICAL SEGMENTATION CHALLENGES
A practical segmentation algorithm requires high (> 99.9%) accuracy despite:
- Noise + small objects
- Difficult label assignment
SLIDE 9 DEEP LEARNING FOR SEGMENTATION
Sliding window network: Convolutional neural network classifies one pixel at a time.
(Ciresan et al., 2012)
Encoder-decoder network (U-net): Convolutional encoding and decoding paths classify large image patches at once.
SLIDE 10 ENCODER-DECODER NETWORKS
Encoding path: Convolution, pooling operations decompose an image into a multiscale collection of features. Decoding path: Convolution, transposed convolution operations synthesize a new image from encoder features.
(Ronneberger et al., 2015)
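For concreteness, a minimal encoder-decoder sketch in tf.keras (not the genenet API; layer counts, kernel counts, and the number of classes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_encoder_decoder(input_shape=(256, 256, 1), n_classes=7, n_kernels=32):
    inputs = layers.Input(shape=input_shape)

    # Encoding path: convolution + pooling decompose the image into
    # a multiscale collection of features.
    e0 = layers.Conv2D(n_kernels, 3, padding='same', activation='relu')(inputs)
    p0 = layers.MaxPooling2D(2)(e0)
    e1 = layers.Conv2D(2 * n_kernels, 3, padding='same', activation='relu')(p0)
    p1 = layers.MaxPooling2D(2)(e1)

    # Bottom of the "U".
    b = layers.Conv2D(4 * n_kernels, 3, padding='same', activation='relu')(p1)

    # Decoding path: transposed convolutions synthesize a new image from
    # encoder features; skip connections carry features across the "U".
    d1 = layers.Conv2DTranspose(2 * n_kernels, 2, strides=2, padding='same')(b)
    d1 = layers.Conv2D(2 * n_kernels, 3, padding='same', activation='relu')(
        layers.concatenate([d1, e1]))
    d0 = layers.Conv2DTranspose(n_kernels, 2, strides=2, padding='same')(d1)
    d0 = layers.Conv2D(n_kernels, 3, padding='same', activation='relu')(
        layers.concatenate([d0, e0]))

    # 1x1 convolution produces per-pixel class probabilities.
    outputs = layers.Conv2D(n_classes, 1, activation='softmax')(d0)
    return tf.keras.Model(inputs, outputs)
```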
SLIDE 11 BUILDING ENCODER-DECODER NETWORKS
Many design choices required for building encoder-decoder network architectures.
- Convolution kernel size
- Convolution kernels per layer
- Convolution layers per stack
- Use dropout?
- Use batch normalization?
- Convolution layer regularization
Design choices can be represented as numeric hyperparameters (HPs). Architecture design ⇔ HP space search.
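For example, the design choices above might be encoded numerically like this (hypothetical ranges; n_convkernels and n_convlayers echo HP names that appear later in these slides):

```python
# Hypothetical encoding of architecture design choices as numeric HPs.
# Architecture design then becomes a search over this space.
hp_space = {
    'conv_kernel_size': [3, 5, 7],        # discrete choice
    'n_convkernels':    (16, 128),        # integer range
    'n_convlayers':     (1, 6),           # integer range
    'use_dropout':      [0, 1],           # boolean as numeric
    'use_batchnorm':    [0, 1],
    'log_l2_scale':     (-6.0, -2.0),     # continuous regularization HP
}
```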
SLIDE 12 ALGORITHMIC NETWORK DESIGN
Two optimization problems arise when applying neural networks to a problem domain.
Learning: Optimize network weight parameters. Parameter ranges are continuous, the objective function is (sub)differentiable, and evaluation is cheap: optimize with backpropagation.
Architecture design: Optimize network HPs. Ranges are a mix of continuous and discrete, the objective function is not differentiable, and evaluation is expensive: optimization is an unstructured search.
SLIDE 13
THE GENENET LIBRARY
genenet: Build, train, and deploy encoder-decoder networks for segmentation using Python and TensorFlow. Goal: simple network design for humans and algorithms. Build computation graphs from Gene graphs.
SLIDE 14
THE GENE GRAPH
Computation graph: sequence of functions mapping network input to output. Gene graph: A height-n tree of Genes that builds a computation graph. Gene: Gene graph node. Each builds a subgraph (module) in the computation graph.
SLIDE 15
THE GENE GRAPH
Leaf Genes (height 0) build small modules. Internal Genes (height i > 0) assemble their child Gene constructions into larger modules. Root Gene (height n) assembles a full computation graph in TensorFlow.
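One hypothetical way to sketch this recursive construction in Python (illustrative names, not genenet's actual API):

```python
class Gene:
    """Node in a Gene graph; builds a module in the computation graph."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.deltas = {}                  # per-Gene HP deltas (next slides)
        self.children = list(children)
        for child in self.children:
            child.parent = self

    @property
    def height(self):
        # Leaves have height 0; internal Genes sit one above their tallest child.
        return 0 if not self.children else 1 + max(c.height for c in self.children)

    def build(self, inputs):
        # Leaf Genes build small modules; internal Genes chain their
        # children's constructions into larger modules. Calling build()
        # on the root Gene assembles the full computation graph.
        if not self.children:
            return self.build_module(inputs)
        outputs = inputs
        for child in self.children:
            outputs = child.build(outputs)
        return outputs

    def build_module(self, inputs):
        # Overridden by leaf Gene types, e.g. a conv-layer Gene would
        # add convolution ops to the TensorFlow graph here.
        raise NotImplementedError
```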
SLIDE 16
[Figure: A Gene graph (left) and the computation graph it builds (right): net, encode/decode paths, encode/decode stacks, layers, and edges]
SLIDE 17
[Figure: The same Gene graph, highlighting its EdgeGenes]
SLIDE 18
[Figure: The same Gene graph, highlighting its ConvLayerGenes]
SLIDE 19
[Figure: The same Gene graph, highlighting its StackGenes]
SLIDE 20
[Figure: The same Gene graph, highlighting its PathGenes]
SLIDE 21
[Figure: The same Gene graph, highlighting its root NetGene]
SLIDE 22
HP CALCULATION WITH GENENET
A height-i Gene g_i has ancestors g_{i+1}, …, g_n. An HP h (e.g. n_convlayers) has value h_i at Gene g_i. Each g_i tracks a delta value ∆h_i, and h_i = ∆h_i + ∆h_{i+1} + ⋯ + ∆h_n.
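Reusing the hypothetical Gene sketch from slide 15, this delta-sum calculation could be implemented as (a sketch, not genenet's actual code):

```python
def hp_value(gene, name):
    """h_i = ∆h_i + ∆h_{i+1} + ... + ∆h_n: sum deltas up the ancestor chain."""
    value = 0
    while gene is not None:
        value += gene.deltas.get(name, 0)
        gene = gene.parent
    return value
```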
SLIDE 23
RANDOM NETWORK GENERATION
Changing ∆h_n affects h for the whole Gene graph; changing ∆h_i at a lower Gene g_i affects h only within the subtree rooted at g_i. This allows for easy random network generation: choose feasible regions for the HP deltas (one for height n, another for heights i < n), then sample delta values from height n downward.
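Continuing the same hypothetical sketch, top-down sampling might look like this (feasible regions and HP names are illustrative):

```python
import random

ROOT_REGION  = {'n_convlayers': (2, 4)}    # feasible deltas at height n
LOWER_REGION = {'n_convlayers': (-1, 1)}   # feasible deltas at heights i < n

def sample_deltas(gene, is_root=True):
    """Sample HP deltas from height n downward through the Gene graph."""
    region = ROOT_REGION if is_root else LOWER_REGION
    for name, (lo, hi) in region.items():
        gene.deltas[name] = random.randint(lo, hi)
    for child in gene.children:
        sample_deltas(child, is_root=False)
```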
SLIDE 24
ENSEMBLE SEGMENTATION ALGORITHMS
Classifier ensemble: Take several classifiers and average their predictions. For EM segmentation, form an ensemble from high-performing neural networks. Diverse network architectures contribute to high ensemble ambiguity, improving performance (Krogh, Vedelsby 1995).
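A minimal sketch of the averaging step, assuming hypothetical Keras-style models that map an image to per-pixel class probabilities:

```python
import numpy as np

def ensemble_segment(models, image):
    """Average per-pixel class probabilities over an ensemble, then argmax."""
    probs = np.mean([m.predict(image[np.newaxis])[0] for m in models], axis=0)
    return np.argmax(probs, axis=-1)   # per-pixel label map
```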
SLIDE 25
PRELIMINARY SBF-SEM RESULTS
Our lab imaged a human platelet sample with a Gatan 3View. Goal: Segment cells and 5 organelle types in a 250 × 2000 × 2000 volume.
SLIDE 26
TRAINING ON BIOWULF
Biowulf: NIH high-performance computing cluster. Train networks on NVIDIA K80 GPUs. Create training jobs with Bash, load Singularity containers on Biowulf nodes, run genenet scripts.
SLIDE 27 SEGMENTATION NETWORK TRAINING
Lab members manually segmented a 50 × 800 × 800 subvolume. We trained 80 random networks for 100,000 iterations on Biowulf. Mutable HPs:
- n_convkernels
- n_stacks
- n_convlayers
- input_size
- log_learning_rate
- Regularization HPs
SLIDE 28
[Figure: Four randomly generated network architectures]
- network 1: 259,447 params, 49 layers, input shape [150, 150], 30 kernels, log learning rate -2.79
- network 22: 352,510 params, 22 layers, input shape [48, 48], 53 kernels, log learning rate -5.2
- network 40: 16,946,302 params, 62 layers, input shape [254, 254], 87 kernels, log learning rate -4.39
- 0905_46: 534,717 params, 13 layers, input shape [48, 48], 110 kernels, log learning rate -3.88
SLIDE 29
RANDOM NETWORK PERFORMANCE
Comparison of random network validation performance (adjusted Rand score) against the original U-net (Ronneberger et al., 2015): nine random networks outperformed the original U-net.
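One way such a comparison could be computed, assuming scikit-learn and per-pixel ground-truth/predicted label maps (a sketch, not our exact evaluation code):

```python
from sklearn.metrics import adjusted_rand_score

def segmentation_score(true_labels, pred_labels):
    """Adjusted Rand score between two per-pixel label maps."""
    return adjusted_rand_score(true_labels.ravel(), pred_labels.ravel())
```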
SLIDE 30
RANDOM NETWORK PERFORMANCE
SLIDE 31 ENSEMBLE PERFORMANCE
Strategy: Form an ensemble of the best N networks, evaluate on validation data. N = 4 performs best.
SLIDE 32
PLATELET SEGMENTATION SPEEDUP
The key question: Is correcting the algorithm's output faster than manual segmentation? First ensemble segmentation correction: ∼ 2× speedup for a 10 × 800 × 800 volume. Second ensemble segmentation correction: ∼ 3× speedup for a 20 × 800 × 800 volume. A good start, but much more is needed.
SLIDE 33
FUTURE WORK
Use 3D encoder-decoder nets instead of 2D. Train more networks for longer. Explore evolutionary strategies for network design, instead of random sampling. The big challenge: Robust segmentation that generalizes between tasks.