fMRIPrep Tutorial #2: Running the Analysis


Background

This chapter closely follows the steps written in Daniel Levitas’s tutorial on fMRIPrep, which provides the background on what fMRIPrep is and how to install it. We will be following the second option, which is to use fMRIPrep through Docker. Once you have installed the appropriate version for your operating system, you also need to register on the FreeSurfer website here and download the license.txt file. When it has been downloaded, move it to the derivatives folder of the Flanker directory by typing:

mv ~/Downloads/license.txt ~/Desktop/Flanker/derivatives

Once you have all of these elements, you are ready to run fMRIPrep.
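Before moving on, it can help to confirm that the license file actually landed where the script will look for it. The following is a minimal sketch, assuming the paths used in this tutorial; the function name check_license is hypothetical and is not part of fMRIPrep or FreeSurfer.

```shell
# Hypothetical helper: succeed only if the license file exists and is non-empty.
check_license() {
  if [ -s "$1" ]; then
    echo "Found license: $1"
  else
    echo "Missing or empty license: $1" >&2
    return 1
  fi
}

# Path assumed from the mv command above.
license_file="$HOME/Desktop/Flanker/derivatives/license.txt"
check_license "$license_file" || echo "Move license.txt into place before running fMRIPrep."
```

If the second message appears, rerun the mv command above and check that the download finished.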

The Docker App

The Docker App allows you to download a package of software programs that are needed to analyze a dataset. This package of programs is called a container. For fMRI data, for example, we can use the terminal to install or upgrade fmriprep-docker, the wrapper that runs the fMRIPrep container:

python -m pip install --user --upgrade fmriprep-docker

This command installs fmriprep-docker, a wrapper script; the container it runs bundles all of the programs used by fMRIPrep - for example, tools from software packages such as FSL and ANTs that assist with normalization and denoising the data. Before executing the code that will perform fMRIPrep on the data, you need to have Docker running.
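One way to confirm that Docker is running is to ask the Docker client whether it can reach the daemon. This is a small sketch rather than an official check; the variable name docker_status is only for illustration.

```shell
# Report whether the Docker client is installed and the daemon is reachable.
if ! command -v docker >/dev/null 2>&1; then
  docker_status="not installed"
elif docker info >/dev/null 2>&1; then
  docker_status="running"
else
  docker_status="not running"
fi
echo "Docker is $docker_status"
```

If the result is "not running", start the Docker App and try again before launching fMRIPrep.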

Note

If you try running the command above, you may get the following error: ImportError: cannot import name md5. This can happen sometimes with Python version 2.7; to fix this error, install a more recent version of Python, and then rerun the command:

python3 -m pip install --user --upgrade fmriprep-docker

Contents of the fMRIPrep Script

To run fMRIPrep, download the code from this GitHub page. Then, navigate to your Flanker directory and create a new sub-directory by typing mkdir code. We will place the script in this folder to keep our files organized:

mv ~/Downloads/fmriprep.sh code

Let’s take a look at what the code does by typing cat code/fmriprep.sh. You should see something like this:

#User inputs:
bids_root_dir=$HOME/Desktop/Flanker
subj=08
nthreads=4
mem=20 #gb
container=docker #docker or singularity

#Begin:

#Convert virtual memory from gb to mb
mem=`echo "${mem//[!0-9]/}"` #remove gb at end
mem_mb=`echo $(((mem*1000)-5000))` #reduce some memory for buffer space during pre-processing

export FS_LICENSE=$HOME/Desktop/Flanker/derivatives/license.txt

#Run fmriprep
if [ $container == singularity ]; then
  unset PYTHONPATH; singularity run -B $HOME/.cache/templateflow:/opt/templateflow $HOME/fmriprep.simg \
    $bids_root_dir $bids_root_dir/derivatives \
    participant \
    --participant-label $subj \
    --skip-bids-validation \
    --md-only-boilerplate \
    --fs-license-file $HOME/Desktop/Flanker/derivatives/license.txt \
    --fs-no-reconall \
    --output-spaces MNI152NLin2009cAsym:res-2 \
    --nthreads $nthreads \
    --stop-on-first-crash \
    --mem_mb $mem_mb \
    -w $HOME
else
  fmriprep-docker $bids_root_dir $bids_root_dir/derivatives \
    participant \
    --participant-label $subj \
    --skip-bids-validation \
    --md-only-boilerplate \
    --fs-license-file $HOME/Desktop/Flanker/derivatives/license.txt \
    --fs-no-reconall \
    --output-spaces MNI152NLin2009cAsym:res-2 \
    --nthreads $nthreads \
    --stop-on-first-crash \
    --mem_mb $mem_mb \
    -w $HOME
fi

Warning

Thomas Ernst has made the following comment that is particularly important for Ubuntu users: “[In this script,] the temporary eval dir is set to be the $HOME dir. That is bad for two reasons: Firstly, at least on Ubuntu, fmriprep will not clean up the temp dir, easily leading to an overfull home dir/main disk and stopping eval after a few subjects. Secondly, if you select the --clean-workdir option this will delete the entire content of the $HOME dir before crashing.”

Furthermore, if you are using a supercomputer cluster, you may want to use the /tmp/scratch directory to store the output. Bennet Fauber of the University of Michigan recommends setting the following variables for the directories:

BIDS_DIR=/tmp/workflow_${SUB}/BIDS
OUTPUT_DIR=/tmp/workflow_${SUB}/derivatives
WORK_DIR=/tmp/workflow_${SUB}/work

Bennet: “It’s almost certainly not a good idea to make WORK_DIR the home on a cluster, as home is likely to have a small quota, be NFS, and be slow. There’s almost always some kind of /scratch for that, or, as we do, /tmp. If using /tmp, it’s a good idea to have code to remove work directories after the job finishes, unless debugging.”
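The advice in both comments can be sketched as follows: create a dedicated work directory outside of $HOME and remove it when the job ends. This is only an illustration; the directory prefix fmriprep_work_ and the variable scratch_root are assumptions, and your cluster may provide its own scratch location.

```shell
# Create a unique work directory under the scratch area (falls back to /tmp).
scratch_root="${TMPDIR:-/tmp}"
WORK_DIR=$(mktemp -d "$scratch_root/fmriprep_work_XXXXXX")
echo "Intermediate files will go to $WORK_DIR"

# Remove the work directory when the script exits; skip this line when debugging.
trap 'rm -rf "$WORK_DIR"' EXIT

# The directory would then be passed to fMRIPrep with: -w "$WORK_DIR"
```

Because the directory is created fresh for each run, cleaning it up cannot touch anything else in your home directory.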

The first block of code, “User Inputs”, sets the path to where the data is, as well as which subject to analyze. nthreads specifies the number of processors to use, and mem specifies the amount of memory to use, in gigabytes. The variable container can be set to either docker or singularity; the latter, which refers to a container typically used on supercomputing clusters, will be covered in a later tutorial. For now, we will set it to docker. The second block of code strips any non-numeric characters (such as a gb suffix) from the mem variable, converts the value from gigabytes to megabytes, and subtracts 5000 MB as a buffer, so that the result can be passed to fMRIPrep.
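To make the memory arithmetic concrete, here is a worked sketch of the same conversion. It uses tr as a portable stand-in for the pattern substitution in the script, and the starting value 20gb is an assumed example of what a user might type.

```shell
mem="20gb"                                   # value as a user might type it
mem=$(printf '%s' "$mem" | tr -cd '0-9')     # keep digits only: 20
mem_mb=$(( mem * 1000 - 5000 ))              # 20000 MB minus a 5000 MB buffer
echo "fMRIPrep would receive --mem_mb $mem_mb"
```

With mem set to 20, fMRIPrep is therefore given 15000 MB, leaving the remainder free for the rest of the system.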

Next we come to the last half of the code: an if/else statement that executes code depending on whether you chose docker or singularity. Since we chose docker, the second part of the statement will be run. Within that section, we will supply both the root directory containing the data - in other words, the Flanker directory - and the directory where the output will be stored, which we will place in the derivatives subfolder.
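Since the script skips BIDS validation, a quick check of the two directories before launching can save a failed run. The sketch below assumes the tutorial's paths; check_bids_inputs is a hypothetical helper, not an fMRIPrep command.

```shell
# Hypothetical helper: verify the BIDS root and subject folder, then create the output folder.
check_bids_inputs() {
  bids_root=$1
  label=$2
  [ -f "$bids_root/dataset_description.json" ] || { echo "No dataset_description.json in $bids_root" >&2; return 1; }
  [ -d "$bids_root/sub-$label" ] || { echo "No sub-$label folder in $bids_root" >&2; return 1; }
  mkdir -p "$bids_root/derivatives"
}

if check_bids_inputs "$HOME/Desktop/Flanker" 08; then
  echo "Inputs look ready for fMRIPrep"
else
  echo "Fix the paths above before running the script"
fi
```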

The other lines in this block mostly contain options for your analysis, which we will explore later. For now, we will run a relatively simple analysis which does the standard preprocessing steps of coregistration, normalization, and physiological component extraction. The last two lines, --mem_mb and -w, use variables to specify the amount of memory to be used, and the working directory where intermediate results will be stored.

Running the Script

To run the script, simply navigate to the code directory and type the following:

bash fmriprep.sh

This will begin preprocessing the data for subject #8 - which, you may recall, was one of the first subjects we analyzed in the fMRI tutorials on SPM, AFNI, and FSL. Our goal here will be to compare the output from those processing pipelines with what is generated by fMRIPrep, in order to see the relative advantages and disadvantages of each.

Using the barebones analysis pipeline that we specified above, this should take about one or two hours to process. When it has finished, click the Next button.
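fMRIPrep writes an HTML report named after the subject, so one way to tell whether the run has finished is to look for that file. The exact location depends on the fMRIPrep version, so this sketch searches up to two levels below the output folder; find_report is a hypothetical helper name.

```shell
# Hypothetical helper: look for the subject's HTML report below the output folder.
find_report() {
  [ -d "$1" ] || return 0
  find "$1" -maxdepth 2 -name "sub-$2.html"
}

report=$(find_report "$HOME/Desktop/Flanker/derivatives" 08)
if [ -n "$report" ]; then
  echo "Report found: $report"
else
  echo "No report yet; fMRIPrep may still be running or may have stopped early"
fi
```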

Running Singularity on a Supercomputing Cluster

The following is sample code that will be updated in the future:

#!/bin/bash

#SBATCH --job-name=fmriprep
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16g
#SBATCH --mail-type=NONE
#SBATCH --partition=week-long
#SBATCH --output=/home/%u/slurm/%x-%j.log
#SBATCH --time=72:00:00

hostname -s
uptime
source /home/sw/spack/share/spack/setup-env.sh
spack load singularity

SUBJ=$1
FMRIPREP=/home/data/fmriprep-20.2.1.simg
SURF_LICENSE=/home/sw/freesurfer/license.txt

BIDS_DIR=~/Desktop/Flanker
OUTPUT_DIR=~/Desktop/Flanker/derivatives
WORK_DIR=~  # consider a scratch directory instead; see the Warning above

singularity run \
    $FMRIPREP      \
    $BIDS_DIR $OUTPUT_DIR participant \
    --n_cpus $SLURM_CPUS_PER_TASK        \
    --omp-nthreads $SLURM_CPUS_PER_TASK \
    --fs-license-file=$SURF_LICENSE         \
    --participant-label=$SUBJ \
    --skip_bids_validation --ignore slicetiming \
    --dummy-scans 12 \
    --output-spaces anat fsnative MNI152NLin2009cAsym:res-2 fsaverage:den-10k fsLR \
    --cifti-output \
    -w $WORK_DIR
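Because the batch script reads the subject label from its first argument (SUBJ=$1), it can be submitted once per subject. The following dry run only prints the sbatch commands instead of submitting them; the file name fmriprep_slurm.sh and the list of subject labels are assumptions for illustration.

```shell
# Hypothetical file name for the batch script shown above.
job_script=fmriprep_slurm.sh

# Build one sbatch command per subject; the label becomes $1 (SUBJ) in the script.
submit_cmds=""
for subj in 01 02 08; do
  cmd="sbatch $job_script $subj"
  echo "$cmd"          # dry run: print the command instead of submitting it
  submit_cmds="$submit_cmds$cmd;"
done
```

Once the printed commands look right, they can be run on the cluster's login node to queue the jobs.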

Video

For a video demonstration of how to set up the fmriprep.sh script, click here.