TractoInferno - A large-scale, open-source, multi-site database for machine learning dMRI tractography

Sci Data. 2022 Nov 25;9(1):725. doi: 10.1038/s41597-022-01833-1.

Abstract

TractoInferno is the world's largest open-source multi-site tractography database, including both research- and clinical-like human acquisitions, aimed specifically at machine learning tractography approaches and related ML algorithms. It provides 284 samples acquired from 3 T scanners across 6 different sites. Available data include T1-weighted images, single-shell diffusion MRI (dMRI) acquisitions, spherical harmonics fitted to the dMRI signal, fiber ODFs, and reference streamlines for 30 delineated bundles generated using 4 tractography algorithms, as well as the masks needed to run tractography algorithms. Manual quality control (QC) was additionally performed at multiple steps of the pipeline. We showcase TractoInferno by benchmarking the learn2track algorithm and 5 variations of the same recurrent neural network architecture. Creating the TractoInferno database required approximately 20,000 CPU-hours of processing power, 200 man-hours of manual QC, 3,000 GPU-hours of training baseline models, and 4 TB of storage, to produce a final database of 350 GB. By providing a standardized training dataset and evaluation protocol, TractoInferno is an excellent tool to address common issues in machine learning tractography.

Publication types

  • Dataset