On the Temporal Fidelity of Nonlinear Inverse Reconstructions for Real-Time MRI – The Motion Challenge

Purpose: To evaluate the temporal accuracy of a self-consistent nonlinear inverse reconstruction method (NLINV) for real-time MRI using highly undersampled radial gradient-echo sequences and to present an open-source framework for the motion assessment of real-time MRI methods. Methods: Serial image reconstructions by NLINV combine a joint estimation of individual frames and corresponding coil sensitivities with temporal regularization to a preceding frame. The temporal fidelity of the method was determined with a phantom consisting of water-filled tubes rotating at defined angular velocity. The conditions tested correspond to real-time cardiac MRI using SSFP contrast at 1.5 T (40 ms resolution) and T1 contrast at 3.0 T (33 ms and 18 ms resolution). In addition, the performance of a post-processing temporal median filter was evaluated. Results: NLINV reconstructions without temporal filtering yield accurate estimations as long as the displacement of a small moving object during the acquisition of a single frame is smaller than the object itself. Faster movements may lead to geometric distortions. For small objects moving at high velocity, a median filter may severely compromise the spatiotemporal accuracy. Conclusion: NLINV reconstructions offer excellent temporal fidelity as long as the image acquisition time is short enough to adequately sample (“freeze”) the object movement. Temporal filtering should be applied with caution. The motion framework emerges as a valuable tool for the evaluation of real-time MRI methods.


INTRODUCTION
While conventional magnetic resonance images represent direct reconstructions of the acquired data by inverse Fourier transformation, iterative reconstruction techniques only result in optimized estimations of the true spin-density distribution of an object. In recent years, such mathematical approaches have found increasing use for MRI reconstructions from undersampled datasets, with applications ranging from parallel imaging, e.g. [1][2][3][4], to real-time imaging, e.g. [5][6][7][8], and model-based reconstructions of parametric maps, e.g. [9][10][11]. In all cases, the reconstruction is defined as the solution of a linear or nonlinear inverse problem with increasing degree of mathematical complexity. In general, therefore, important questions arise about the validity of the algorithms and the reliability of the estimated images.
*Address correspondence to this author at the Biomedizinische NMR Forschungs GmbH am Max-Planck-Institut für biophysikalische Chemie, 37070 Göttingen, Germany; Tel: +49-551-201-1721; Fax: +49-551-201-1307; E-mail: jfrahm@gwdg.de
Here, we specifically address the temporal fidelity of iterative reconstructions developed for real-time MRI. In particular, we determine the temporal acuity of the NLINV method, which relies on nonlinear inverse reconstructions with temporal regularization from highly undersampled radial gradient-echo acquisitions [8]. Rather than detailing the general mathematical properties of regularized iterative optimization problems, which for MRI applications are often ill-conditioned and not even convex, this work aims to provide experimental evidence for conditions that correspond to a realistic clinical imaging scenario, namely cardiovascular MRI in real time.
Because of the absence of a gold standard for in vivo measurements, this work focuses entirely on an experimental framework which comprises a motion phantom designed to control the velocities of multiple moving objects independently within the imaging field-of-view [12]. The chosen parameters resemble the conditions previously introduced for real-time cardiac MRI using T1 contrast (3.0 T) at 18 ms and 33 ms resolution [13] as well as SSFP contrast (1.5 T) at 40 ms resolution [14].

Nonlinear Inverse Reconstruction
Real-time MRI of a dynamic process links serial acquisitions of consecutive datasets with serial reconstructions of respective images that constitute the frames of an MRI movie. In order to achieve high temporal resolution, i.e. a minimum acquisition time for individual frames, the use of pronounced data undersampling is the common method of choice. Hence, the task to be accomplished is a self-contained image reconstruction from the dataset that defines an individual frame. In this respect, self-consistency and temporal fidelity refer to the fact that all reconstructions, i.e. estimations of the actual image and its coil sensitivities, have to be performed by exclusively using information which defines the actual time point. This information comprises only a very limited number of k-space lines for a limited number of radiofrequency coils.
Unfortunately, and regardless of the chosen mathematical approach, such a reconstruction in general does not lead to sufficient image quality, or in turn precludes the use of a sufficiently high degree of undersampling. For example, for the case of radial MRI, violation of this limitation causes the occurrence of streaking artifacts. Stabilization of the ill-conditioned mathematical problem therefore relies on additional information. Thus, the crucial question for any real-time MRI method arises: Where is this additional information coming from?
First of all, it seems clear that extra data cannot be taken from a high-quality (e.g., fully sampled) calibration scan because the need for a preceding measurement (which would be required for any new image orientation or any other altered parameter) would make no sense for a real-time imaging scenario. Secondly, such information should also not be derived from a "composite dataset" obtained by averaging the dynamic acquisitions of a moving object. Apart from the fact that this strategy requires the reconstruction to wait until the end of the entire acquisition, it bears the risk of compromising the temporal acuity of the single-frame reconstructions, despite the fact that such reconstructions may result in qualitatively "pleasing" images. In fact, for the question asked here, the lack of visible image artifacts should not be confused with temporal fidelity. It is therefore advisable to search for complementary information in the data acquired immediately before or after the actual frame. For these reasons, radial "through-time" GRAPPA [15] was recently extended to use self-calibration, but was found to result in an insufficient number of GRAPPA kernels for restoring adequate image quality, so that respective cardiac applications lead to partial blurring of systolic frames [16].
The real-time MRI setting studied here [8] is based on the original NLINV method for parallel imaging [4]. This choice was motivated by the desire to start with an iterative reconstruction technique that takes advantage of all available data by simultaneously estimating both the image and its coil sensitivities. The price to be paid is the fact that, in contrast to conventional linear approaches with undersampling factors of only 2 to 3, such a strategy can no longer avoid the true reconstruction problem of parallel MRI, which emerges as the iterative solution to a nonlinear inverse problem. However, independent of the high computational demand, the method makes optimum use of any amount of data in a self-consistent manner and therefore promises to yield the best possible estimate. For real-time MRI the NLINV method inherently accounts for putative changes of coil sensitivities during scanning. In addition, real-time NLINV exploits the temporal continuity of a movement in a serial acquisition by constraining the iterative solution to the nonlinear inverse problem by a temporal regularization to the preceding frame. Of note, rather than adding the data of neighboring frames into the actual estimation, the temporal regularization selects, from a range of possible solutions which match the data of the current frame, the solution that is closest to the preceding frame. This constraint increases the possible degree of undersampling while exactly conforming to the actual data.
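The selection principle described above can be illustrated with a deliberately simplified linear toy problem. The sketch below is not the NLINV algorithm, which is nonlinear and jointly estimates the image and the coil sensitivities. It only assumes a hypothetical underdetermined linear forward model `A` and shows how, among all solutions that match the data of the current frame exactly, the one closest to the preceding frame may be picked. All names are illustrative.

```python
import numpy as np

def closest_consistent_solution(A, y, x_prev):
    """Among all x with A @ x == y, return the one nearest to x_prev.

    Linear toy analogue of a temporal regularization to the preceding frame.
    A is assumed to have full row rank (fewer measurements than unknowns).
    """
    residual = y - A @ x_prev                   # data mismatch of the preceding frame
    correction = np.linalg.pinv(A) @ residual   # minimum-norm update removing the mismatch
    return x_prev + correction

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 10))                # hypothetical undersampled forward model
x_true = rng.standard_normal(10)                # current frame (unknown in practice)
x_prev = x_true + 0.1 * rng.standard_normal(10) # slightly different preceding frame
y = A @ x_true                                  # data of the current frame only

x_hat = closest_consistent_solution(A, y, x_prev)
```

In this toy setting `x_hat` reproduces the current data exactly, and no other data-consistent solution (including `x_true`) lies closer to `x_prev`. The data of the preceding frame never enter the estimate.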
Because in vivo applications of the real-time NLINV method may occasionally lead to residual streaking artifacts, previous implementations forced these undersampling streaks to "flicker" in 5 sequential frames by using complementary sets of radial spokes [13,14]. The strategy allows for an efficient removal of such artifacts by a post-processing temporal median filter that extends over the same range of 5 frames. The filter preserves the signal of a moving object if it appears for at least three frames in a specific pixel, i.e. if the positional displacement of the object during a single frame does not exceed one third of its length. For faster movements with larger displacements relative to the size of the object (or lower temporal resolution), the temporal accuracy may be degraded. This study therefore evaluated the performance of NLINV reconstructions with and without temporal median filtering.
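The one-third rule can be checked numerically. The following is a minimal sketch, assuming an idealized one-dimensional binary object and a plain sliding median over 5 frames. The function names and the handling of the first and last two frames (left unfiltered here) are illustrative choices, not details taken from the NLINV implementation.

```python
import numpy as np

def moving_object(n_frames, n_pixels, length, step):
    """Binary 1D object of `length` pixels shifted by `step` pixels per frame."""
    frames = np.zeros((n_frames, n_pixels))
    for t in range(n_frames):
        start = t * step
        frames[t, start:start + length] = 1.0
    return frames

def temporal_median(frames, width=5):
    """Pixelwise median over `width` consecutive frames (edge frames unfiltered)."""
    half = width // 2
    out = frames.copy()
    for t in range(half, len(frames) - half):
        out[t] = np.median(frames[t - half:t + half + 1], axis=0)
    return out

# Displacement of one third of the object length per frame
slow = moving_object(n_frames=12, n_pixels=80, length=9, step=3)
# Displacement of more than half the object length per frame
fast = moving_object(n_frames=12, n_pixels=80, length=9, step=5)

slow_filtered = temporal_median(slow)
fast_filtered = temporal_median(fast)
```

With a displacement of one third of the object length per frame, the interior frames pass the filter unchanged. With a displacement above half the object length, no pixel is occupied in three of any five consecutive frames, so the object vanishes from the filtered frames.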

Real-time MRI
Real-time MRI measurements of a motion phantom were performed on two different MRI systems (Siemens Healthcare, Erlangen, Germany) operating at 1.5 T (TIM Symphony) and 3.0 T (TIM Trio). In either case, radiofrequency excitation was accomplished with a body coil, while signal reception employed the standard 8-channel and 32-channel head coil, respectively. Dynamic MRI acquisitions relied on highly undersampled radial gradient-echo sequences with either spoiled conditions at 3.0 T [13] or fully balanced gradients at 1.5 T [14] as previously described for cardiac MRI. Experimental details are summarized in Table 1. In all applications shown here, image reconstructions were performed without any spatial filtering. At 3.0 T (TIM Trio) NLINV reconstructions were performed online and without user interference at > 20 frames per second. This was accomplished by running a parallelized version of the algorithm on a computer equipped with 8 graphics processing units that bypassed the conventional image reconstruction pipeline [17,18]. At 1.5 T (TIM Symphony) NLINV reconstructions were done offline. In this case online monitoring was enabled by a sliding-window technique that relied on gridding and inverse FFT of a composite dataset of 5 consecutive radial acquisitions (i.e., 65 spokes) as described [19].
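The composite dataset of 5 x 13 = 65 spokes can be pictured with a small sketch of complementary spoke sets. The exact angular scheme is not given in this section, so the code assumes, purely for illustration, that the spokes of each frame are spread uniformly over 360 degrees and that successive frames are rotated by one fifth of the angular gap between neighboring spokes. The function name is hypothetical.

```python
import numpy as np

def spoke_angles(n_spokes, n_turns, turn):
    """Spoke angles (rad) of one undersampled frame in an interleaved scheme.

    Assumption: uniform coverage of 360 degrees, each turn rotated by
    1/n_turns of the angular gap between neighboring spokes.
    """
    gap = 2.0 * np.pi / n_spokes
    return (np.arange(n_spokes) + turn / n_turns) * gap

n_spokes, n_turns = 13, 5   # 5 frames of 13 spokes give 65 spokes in total
frames = [spoke_angles(n_spokes, n_turns, t) for t in range(n_turns)]
composite = np.sort(np.concatenate(frames))   # angles of the sliding-window dataset
```

Under this assumption the five frames never repeat a spoke angle, and their union covers the circle with 65 evenly spaced spokes. The changing angles are also what makes residual streaks change orientation from frame to frame.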

Motion Phantom
As shown in Fig. (1), a metal-free motion phantom based on a pneumatic rotary motor was constructed from acrylic glass and polyacetal [12]. Respective construction details are available at [20]. During operation two nozzles pipe pressurized air (100 to 300 kPa) across a slotted rotor mounted to a drive shaft. Four gear wheels transmit the fast rotary motion to a specimen holder which contained either glass tubes (9.7 mm i.d., aqueous solution of 1 mM Gd-DTPA) at different distances from the center or a petri dish (90 mm i.d.). In the latter case agarose gel with multiple acrylic glass placeholders at different distances from the center served to form an "inverse" phantom to the rotating water-filled tubes, yielding "holes" in an otherwise homogeneous signal distribution.
Rotational speed is set by varying the air pressure with a pressure-reducing valve and a fine-adjustment needle valve. The rotational frequency of the phantom inside the head coil of the MRI scanner is monitored with a video camera. An image-processing algorithm uses the real-time video feed of the camera to determine the rotational speed by detecting two red markers placed on the rotating disc. The source code of the algorithm is available at [20]. The rotational frequency is recorded and, for online monitoring, visualized with a live data graph. Based on rotational frequencies of up to 2.0 Hz, the angular velocities of the rotating tubes are calculated according to their distances from the center. The two chosen ranges of velocities, i.e. up to 10 cm s⁻¹ and 30 cm s⁻¹, extend to the fastest movements of the myocardial wall during postsystolic expansion and of the tongue tip during rapid speaking [21], respectively.
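The conversion from a measured rotational frequency f to the speed of a tube at distance r from the center follows v = 2πfr. The sketch below is not the marker-detection code referenced at [20]. It assumes that a marker angle per video frame is already available, and the video frame rate and the tube radius are hypothetical values chosen for illustration.

```python
import numpy as np

def rotational_frequency(marker_angles, video_fps):
    """Estimate rotations per second from per-frame marker angles (rad)."""
    unwrapped = np.unwrap(marker_angles)          # remove 2*pi jumps
    t = np.arange(len(unwrapped)) / video_fps     # frame times in seconds
    slope = np.polyfit(t, unwrapped, 1)[0]        # angular velocity in rad/s
    return abs(slope) / (2.0 * np.pi)

def tangential_speed(frequency_hz, radius_cm):
    """Speed of a tube along its circular path, v = 2*pi*f*r, in cm/s."""
    return 2.0 * np.pi * frequency_hz * radius_cm

# Synthetic check: a marker rotating at 2.0 Hz filmed at 30 frames per second
video_fps = 30.0                                  # hypothetical camera frame rate
t = np.arange(60) / video_fps
angles = np.angle(np.exp(1j * 2.0 * np.pi * 2.0 * t))   # wrapped to (-pi, pi]
f_est = rotational_frequency(angles, video_fps)
v_est = tangential_speed(f_est, radius_cm=2.4)    # hypothetical tube radius
```

For these assumed values, a tube at 2.4 cm from the center rotating at 2.0 Hz moves at roughly 30 cm s⁻¹.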

Fig. (2) depicts selected frames of MRI movies of the motion phantom (Supplementary Movies 1 to 6) with angular velocities of 5.0, 7.5 and 10 cm s⁻¹ (inner to outer tube). The results were obtained at 1.5 T using 40 ms temporal resolution (SSFP contrast) as well as at 3.0 T using 33 and 18 ms resolution (T1 contrast). It turns out that NLINV reconstructions without temporal filtering (top row) offer good to excellent temporal acuity in all cases. On the other hand, the use of a temporal median filter, which extends over 5 consecutive frames and in human cardiac studies helped to remove residual streaking artifacts, partly compromises the correct geometrical structure. This particularly applies to high velocities (7.5 and 10 cm s⁻¹) at the lowest temporal resolution (40 ms), so that the corresponding tubes exhibit slightly disturbed shapes along the direction of the movement (bottom left of Fig. 2).
In general, similar observations are made for movements at higher angular velocities of 15, 22, and 30 cm s⁻¹ shown in Fig. (3) (Supplementary Movies 7 to 12). Without temporal filtering (top row) NLINV reconstructions provide acceptable spatiotemporal estimates, in particular if the spatial displacement during the acquisition of a single frame remains smaller than the object itself. For 40 ms resolution this applies to a maximum velocity of about 10 cm s⁻¹, i.e. a displacement of about 4 mm (outermost tube in Fig. (2), upper left), while all reconstructions at higher velocities are affected by an increasing degree of geometric distortion (Fig. 3, upper left). On the other hand, real-time acquisitions at 33 ms resolution may capture objects moving at velocities of up to 15 cm s⁻¹, which corresponds to a 5 mm displacement per frame (innermost tube in Fig. (3), upper middle). For a twofold higher velocity of 30 cm s⁻¹ (outermost tube) the displacement by 10 mm (i.e., the diameter of the tube) distributes the intensity of the object along the direction of the movement, though without generating other artifacts. The best performance is achieved at the highest temporal resolution of 18 ms, which reveals only small structural deformations even at the highest velocity of 30 cm s⁻¹. However, this acquisition speed could only be achieved by lowering the in-plane resolution to 2.0 mm. Again, the application of a post-processing temporal filter may severely compromise the spatiotemporal accuracy. For high velocities and insufficient temporal sampling it may even completely destroy the object representation, as seen in the results for 40 and 33 ms resolution. Only for 18 ms acquisitions does the use of a temporal median filter preserve the reconstructions of a small object for velocities up to 15 cm s⁻¹ (Fig. 3, lower right).
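The displacements quoted above follow directly from the product of velocity and frame duration. As a short worked sketch, the helper below (a hypothetical name) tabulates the displacement per frame for the temporal resolutions and some of the velocities studied, and relates it to the 9.7 mm inner diameter of the tubes.

```python
def displacement_mm(speed_cm_s, frame_ms):
    """Distance travelled during one frame in mm (1 cm/s * 1 ms = 0.01 mm)."""
    return speed_cm_s * frame_ms / 100.0

TUBE_DIAMETER_MM = 9.7   # inner diameter of the glass tubes

for frame_ms in (40.0, 33.3, 18.0):
    for speed in (10, 15, 30):
        d = displacement_mm(speed, frame_ms)
        print(f"{frame_ms:5.1f} ms, {speed:2d} cm/s: {d:5.2f} mm "
              f"({d / TUBE_DIAMETER_MM:.2f} tube diameters)")
```

For example, 10 cm s⁻¹ at 40 ms gives 4 mm, 15 cm s⁻¹ at 33.3 ms gives about 5 mm, and 30 cm s⁻¹ at 33.3 ms gives about 10 mm, i.e. roughly one tube diameter.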
Additional analyses are shown in Fig. (4), which compares temporal signal intensity profiles at 33 and 18 ms resolution for the outermost tubes in Figs. (2) and (3) rotating at 10 and 30 cm s⁻¹, respectively. These profiles represent horizontal lines (single pixel, central half of the FOV) that correspond to the center of the respective tube at its top position. These horizontal profiles are vertically displayed in Fig. (4) as a function of time for a total duration of 900 ms in all cases. With reference to the movies underlying Figs. (2) and (3), the object enters the selected line from the left-hand side, stays within the range for a period that depends on its diameter and the angular speed while continuously moving to the right-hand side (upwards in Fig. 4), and finally leaves the line position when rotating on. Without temporal filtering the results appear largely as "frozen" snapshots at the respective temporal sampling density and show little or no spatiotemporal blurring. This situation is clearly altered when using a temporal median filter: While objects with a velocity of 10 cm s⁻¹ are faithfully reproduced at 18 ms temporal resolution, slower 33 ms acquisitions lead to frames with visible temporal blurring despite the correct identification of stepwise displacements. Median filtering of frames covering velocities as high as 30 cm s⁻¹ eliminates the true object representations because of pronounced image inconsistencies (i.e., displacements) in a series of 5 consecutive frames.
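Such intensity versus time profiles can be extracted from a reconstructed image series with a few lines of array indexing. The following is a minimal sketch, assuming the series is held as an array of shape (frames, rows, columns). The matrix size, the selected row, and the function name are hypothetical.

```python
import numpy as np

def time_profile(movie, row, frame_ms, duration_ms=900.0):
    """Stack the central half of one image row over time.

    movie: array of shape (frames, rows, columns). Returns an array of shape
    (columns // 2, n_frames), so that time runs along the horizontal axis.
    """
    n_frames = min(movie.shape[0], int(round(duration_ms / frame_ms)))
    n_cols = movie.shape[2]
    start = n_cols // 4                        # central half of the FOV
    line = movie[:n_frames, row, start:start + n_cols // 2]
    return line.T

movie = np.zeros((60, 128, 128))               # placeholder for a reconstructed series
profile_33 = time_profile(movie, row=40, frame_ms=33.3)
profile_18 = time_profile(movie, row=40, frame_ms=18.0)
```

A duration of 900 ms corresponds to 27 frames at 33.3 ms and 50 frames at 18 ms resolution, which sets the temporal sampling density visible in the profiles.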
Finally, all motion experiments were also carried out with an "inverse" phantom (not shown) to investigate the performance of the reconstruction algorithm and its temporal fidelity for "black holes" in a homogeneous spin-density distribution rather than bright objects in empty space.The results demonstrate an almost identical behavior in all cases.

DISCUSSION
This work evaluates the temporal fidelity of the NLINV reconstruction method for real-time MRI, which has been demonstrated to allow for a wide range of medical applications at hitherto unsurpassed image quality and spatiotemporal resolution. Examples of human studies include examinations of cardiac function using T1 contrast at 3.0 T [8,13] as well as SSFP contrast at 1.5 T [14], cardiovascular blood flow [22,23], swallowing [24], and speaking [21]. Due to the absence of an in vivo gold standard, all present evaluations are based on a special-purpose motion phantom built to rotate positive (or negative) objects at defined angular velocity. Small object sizes were chosen to emphasize putative inconsistencies or errors, which are expected to occur mostly at structural borders or other features with high spatial frequencies. In other words, while homogeneous objects will apparently be more tolerant to temporal blurring in terms of visible artifacts, the present experimental design aimed at testing a worst-case scenario.
In general, without temporal filtering the experimental results demonstrate high-quality NLINV estimations with surprising temporal acuity. This holds true as long as the speed of a small object leads to a spatial displacement during the acquisition of a single frame which is smaller than the size of the object itself (here less than half its diameter). Reconstructions of even faster moving objects that cause larger inconsistencies in single-frame data must fail. For NLINV without temporal filtering such situations lead to geometric distortions such as elongated shapes along the motion direction, but no additional or widespread artifacts.
With respect to previous NLINV applications to real-time cardiac MRI [13,14], a resolution of 33.3 ms (30 fps, 3.0 T, T1 contrast) turns out to adequately cover typical myocardial motions of up to 10 cm s⁻¹ even when using a temporal median filter, while higher velocities of 15 to 20 cm s⁻¹ are only accessible without temporal filtering. On the other hand, a resolution of 40 ms (25 fps, 1.5 T, SSFP contrast) appears to be the lower limit for cardiac MRI when using a temporal median filter, as such conditions may already cause some geometric distortions for small structures moving at higher velocities. Of course, the use of a better temporal resolution of, for example, 18 ms (55 fps, 3.0 T, T1 contrast) allows NLINV reconstructions to cover even motions up to 30 cm s⁻¹. In this case the application of the post-processing median filter reduces the temporal fidelity of respective reconstructions to 15 cm s⁻¹.
Limitations of the present work are the restrictions to a specific nonlinear inverse reconstruction method and to a particular temporal filter. However, the primary duty to report details about the truly accessible temporal acuity of a real-time MRI method lies, of course, with the respective developers. While a comparison with other methods is outside the scope of this study, the present work defines the underlying "motion challenge" and invites comparative studies by providing an easy-to-use open-source framework for the assessment of motion effects [20]. The experimental approach relies on a specially designed phantom and includes worst-case situations (i.e., small objects at high velocity measured at low temporal resolution) while at the same time mimicking the conditions of a realistic clinical imaging scenario as established for cardiac MRI.
The key problem at this stage is not the computational accuracy of an iterative estimation from highly undersampled data (at least not for the NLINV method) but the use of a post-processing temporal median filter, which has previously been introduced to alleviate residual streaking artifacts occasionally observed in vivo. While temporal median filtering of NLINV reconstructions may still be acceptable for the velocity range typically seen in cardiac MRI, it would be preferable to completely avoid such filtering for more general applications and to find better ways to remove remaining reconstruction errors if present. While more suitable filters such as the non-local means algorithm [25] might offer one possible solution, the incorporation of aggregated motion estimations from preceding reconstructions for a refined estimation of serial frames may be another alternative [26].

Fig. (1). The motion phantom comprises either (left and middle) three water-filled tubes mounted on a rotating disk at different distances from the center or (right) an "inverse" arrangement of three areas of no signal (acrylic glass) in a petri dish filled with agarose. For online video analysis and calculation of the rotational frequency the disk carries red markers. For further details see text.

Fig. (2). NLINV reconstructions at 40.0 ms resolution (1.5 T, SSFP contrast) as well as 33.3 ms and 18.0 ms resolution (3.0 T, T1 contrast) for water-filled tubes rotating at angular velocities of 5.0, 7.5, and 10 cm s⁻¹ (top) without and (bottom) with a temporal median filter extending over 5 frames. The images represent selected frames of corresponding movie recordings (see Supplementary Movies 1 to 6). For further experimental parameters see Table 1.

Fig. (3). NLINV reconstructions at 40.0 ms, 33.3 ms and 18.0 ms resolution for tubes rotating at 15, 22, and 30 cm s⁻¹ (top) without and (bottom) with a temporal median filter. The images represent selected frames of corresponding movie recordings (see Supplementary Movies 7 to 12). For further details see Fig. (2).

Fig. (4). Intensity vs time profiles (duration 900 ms) for NLINV reconstructions of the outermost tubes in Figs. (2) and (3) (single pixel, central half of a horizontal line cutting through the respective top position) rotating at (left) 10 cm s⁻¹ and (right) 30 cm s⁻¹. For further details see text.