Memory-Efficient Learning for High-Dimensional MR Reconstruction
Ke Wang1, Michael Kellman2, Christopher M. Sandino3, Kevin Zhang1, Shreyas S. Vasanawala4, Jonathan I. Tamir5, Stella X. Yu6, and Michael Lustig1
1Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, United States, 2Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, United States, 3Electrical Engineering, Stanford University, Stanford, CA, United States, 4Radiology, Stanford University, Stanford, CA, United States, 5Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States, 6International Computer Science Institute, University of California, Berkeley, Berkeley, CA, United States
We present a memory-efficient learning (MEL) framework for high-dimensional MR reconstruction to enable training larger and deeper unrolled networks on a single GPU. We demonstrate improved image quality with learned reconstruction enabled by MEL for 3D MRI and 2D cardiac cine applications.
Figure 1. GPU memory limitations for high-dimensional unrolled DL recons: a) Conventionally, to reconstruct a 3D volume, each slice along the readout dimension is independently passed through a 2D unrolled network and then re-joined into a 3D volume. In contrast, 3D unrolled networks require more memory but operate on 3D slabs during training, which leverages more data redundancy. b) Cardiac cine DL recons are typically performed with a small number of unrolls due to memory limitations; more unrolls can better learn the finer spatial and temporal textures.
Figure 2. a) Unrolled networks are formed by unrolling the iterations of an image reconstruction optimization. Each unroll consists of a CNN-based regularization layer and a data-consistency (DC) layer. Conventionally, gradients of all layers are computed simultaneously, which incurs a significant GPU memory cost. b) Memory-efficient learning procedure for a single layer: 1) Recalculate the layer's input $$$\mathbf{x}^{(n-1)}$$$ from the known output $$$\mathbf{x}^{(n)}$$$. 2) Re-form the AD graph for that layer. 3) Backpropagate gradients $$$q^{(n)}$$$ through the layer's AD graph.
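To make the per-layer procedure in Figure 2b concrete, the following is a minimal PyTorch-style sketch of memory-efficient backpropagation through a single unroll. It assumes a hypothetical layer object exposing forward(x) and an inverse(x) that recovers the layer's input from its output; these names and the routine itself are illustrative assumptions, not the authors' released implementation.

```python
import torch

def mel_backward_single_layer(layer, x_n, q_n):
    """Sketch of MEL backprop through one unroll (assumes `layer` is invertible).

    layer : module with hypothetical .forward(x) and .inverse(x) methods
    x_n   : layer output x^(n) saved from the forward pass (no graph attached)
    q_n   : gradient of the loss with respect to x^(n)
    Returns (x^(n-1), q^(n-1)); parameter gradients accumulate in layer.parameters().
    """
    # 1) Recalculate the layer's input x^(n-1) from the known output x^(n).
    with torch.no_grad():
        x_prev = layer.inverse(x_n)

    # 2) Re-form the autodiff (AD) graph for this layer only.
    x_prev = x_prev.detach().requires_grad_(True)
    x_out = layer.forward(x_prev)

    # 3) Backpropagate q^(n) through the layer's local AD graph;
    #    x_prev.grad then holds q^(n-1) for the preceding unroll.
    x_out.backward(q_n)
    return x_prev.detach(), x_prev.grad
```

In training, such a routine would be called layer by layer from the last unroll back to the first, so only one layer's AD graph resides in GPU memory at any time rather than the full unrolled network.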