Source: Improving Connectomics by an Order of Magnitude from Google Research
Posted by Viren Jain, Research Scientist and Technical Lead and Michal Januszewski, Software Engineer, Connectomics at Google
The field of connectomics aims to comprehensively map the structure of the neuronal networks that are found in the nervous system, in order to better understand how the brain works. This process requires imaging brain tissue in 3D at nanometer resolution (typically using electron microscopy), and then analyzing the resulting image data to trace the brain’s neurites and identify individual synaptic connections. Due to the high resolution of the imaging, even a cubic millimeter of brain tissue can generate over 1,000 terabytes of data! When combined with the fact that the structures in these images can be extraordinarily subtle and complex, the primary bottleneck in brain mapping has been automating the interpretation of these data, rather than acquisition of the data itself.
Today, in collaboration with colleagues at the Max Planck Institute of Neurobiology, we published “High-Precision Automated Reconstruction of Neurons with Flood-Filling Networks” in Nature Methods, which shows how a new type of recurrent neural network can improve the accuracy of automated interpretation of connectomics data by an order-of-magnitude over previous deep learning techniques. An open-access version of this work is also available from biorXiv (2017).
3D Image Segmentation with Flood-Filling Networks
Tracing neurites in large-scale electron microscopy data is an example of an image segmentation problem. Traditional algorithms have divided the process into at least two steps: finding boundaries between neurites using an edge detector or a machine-learning classifier, and then grouping together image pixels that are not separated by a boundary using an algorithm like watershed or graph cut. In 2015, we began experimenting with an alternative approach based on recurrent neural networks that unifies these two steps. The algorithm is seeded at a specific pixel location and then iteratively “fills” a region using a recurrent convolutional neural network that predicts which pixels are part of the same object as the seed. Since 2015, we have been working to apply this new approach to large-scale connectomics datasets and rigorously quantify its accuracy.
|A flood-filling network segmenting an object in 2d. The yellow dot is the center of the current area of focus; the algorithm expands the segmented region (blue) as it iteratively examines more of the overall image.|
Measuring Accuracy via Expected Run Length
Working with our partners at the Max Planck Institute, we devised a metric we call “expected run length” (ERL) that measures the following: given a random point within a random neuron in a 3d image of a brain, how far can we trace the neuron before making some kind of mistake? This is an example of a mean-time-between-failure metric, except that in this case we measure the amount of space between failures rather than the amount of time. For engineers, the appeal of ERL is that it relates a linear, physical path length to the frequency of individual mistakes that are made by an algorithm, and that it can be computed in a straightforward way. For biologists, the appeal is that a particular numerical value of ERL can be related to biologically relevant quantities, such as the average path length of neurons in different parts of the nervous system.
|Progress in expected run length (blue line) leading up to the results shared today in Nature Methods. The red line shows progress in the “merge rate,” which measures the frequency with which two separate neurites were erroneously traced as a single object; achieving a very low merge rate is important for enabling efficient strategies for manual identification and correction of the remaining errors in the reconstruction.|
We used ERL to measure our progress on a ground-truth set of neurons within a 1-million cubic micron zebra finch song-bird brain imaged by our collaborators using serial block-face scanning electron microscopy and found that our approach performed much better than previous deep learning pipelines applied to the same dataset.
|Our algorithm in action as it traces a single neurite in 3d in a songbird brain.|
We segmented every neuron in a small portion of a zebra finch song-bird brain using the new flood-filling network approach, as depicted here:
|Reconstruction of a portion of zebra finch brain. Colors denote distinct objects in the segmentation that was automatically generated using a flood-filling network. Gold spheres represent synaptic locations automatically identified using a previously published approach.|
By combining these automated results with a small amount of additional human effort required to fix the remaining errors, our collaborators at the Max Planck Institute are now able to study the songbird connectome to derive new insights into how zebra finch birds sing their song and test theories related to how they learn their song.
We will continue to improve connectomics reconstruction technology, with the aim of fully automating synapse-resolution connectomics and contributing to ongoing connectomics projects at the Max Planck Institute and elsewhere. In order to help support the larger research community in developing connectomics techniques, we have also open-sourced the TensorFlow code for the flood-filling network approach, along with WebGL visualization software for 3d datasets that we developed to help us understand and improve our reconstruction results.
We would like to acknowledge core contributions from Tim Blakely, Peter Li, Larry Lindsey, Jeremy Maitin-Shepard, Art Pope and Mike Tyka (Google), as well as Joergen Kornfeld and Winfried Denk (Max Planck Institute).