Simulating the neural correlates of stuttering

For functional neuroimaging studies of stuttering, two challenges are (1) the elicitation of naturally stuttered versus fluent speech and (2) the separation of activation associated with abnormal motor execution from activation that reflects the cognitive substrates of stuttering. This paper reports on a proof-of-concept study, in which a single-subject approach was applied to address these two issues. A stuttering speaker used his insight into his own stuttering behavior to create a list of stutter-prone words versus a list of “fluent” words. He was then matched to a non-stuttering speaker, who imitated the specific articulatory and orofacial motor pattern of the stuttering speaker. Both study participants performed a functional MRI experiment of single word reading, each being presented with the same lexical items. Results suggest that the generally observed right-hemisphere lateralization appears to reflect a true neural correlate of stuttering. Some of the classically reported activation associated with stuttering appears to be driven more by nonspecific motor patterns than by cognitive substrates of stuttering, while anterior cingulate activation may reflect awareness of (upcoming) dysfluencies. This study shows that it is feasible to match stuttering speakers’ utterances more closely to simulated stutters for the investigation of neural correlates of real stuttering. Significant main effects and contrast effects were obtained for the differences between fluent and stuttered speech, and right-hemisphere lateralization associated with real stuttered speech was shown in a single subject.

The neural basis of stuttering has been of great interest for quite some time now, with a shift in emphasis from regarding stuttering as a mere articulatory deficit, or primarily a psychological problem, to regarding it as a planning disorder that has its origins in neural control over some level of the speech production process. Following very beneficial and emancipatory developments in the latter half of the twentieth century in the areas of behavioral stuttering treatment in combination with counseling on acceptance of oneself as a person who stutters (PWS), the more recent brain-directed approach has the potential to offer new and exciting insights into the nature of stuttering and related aspects of the language and speech system. These, in turn, are hoped to inspire new intervention approaches that target the neurophysiological origins of stuttering.
Results from functional and structural neuroimaging experiments are not homogeneous, though some recurring patterns are notable. A general finding is of hyperactivity of right primary and premotor cortex (Bhatnagar & Buckingham, 2010; Brown, Ingham, Ingham, Laird, & Fox, 2005; De Nil, Kroll, Kapur, & Houle, 2000; Fox et al., 1996), not just in speech tasks, but also reported in nonspeech tasks for PWS (Chang, Kenney, Loucks, & Ludlow, 2009). This activation pattern in PWS is often reported to coincide with increased cerebellar activation (Brown et al., 2005; De Nil, Kroll, & Houle, 2001; Fox et al., 1996) and with reduced left-hemisphere activation of the auditory cortex (Bhatnagar & Buckingham, 2010; Brown et al., 2005; De Nil et al., 2008; Fox et al., 1996). This latter finding has also been related to reduced gray and white matter density in auditory cortex (Beal, Gracco, Lafaille, & De Nil, 2007; Lu et al., 2010). Some studies account for stuttering as a malfunction in the thalamus, causing a transient asynchronization in the balancing and sequencing of multiple impulses (Allert, Kelm, Blahak, Capelle, & Krauss, 2010; Bhatnagar & Andy, 1989; Bhatnagar & Buckingham, 2010; Lu et al., 2010), and this may be related to an observed correlation between basal ganglia activation and stuttering severity (Giraud et al., 2008). Compared to people who do not stutter (PWNS), PWS have also been found to show increased activation levels in right-hemisphere inferior frontal cortex (De Nil et al., 2008), specifically in the frontal operculum (Neumann et al., 2003). Another focal activation difference between PWS and fluent speakers that has been reported is an increased activation in left anterior cingulate cortex in PWS, which may be either a direct neural correlate underlying stuttering or reflective of cognitive anticipatory reactions to upcoming stutters (De Nil et al., 2000).
This latter observation, in fact, goes for all synchronically measured differences in functional activation or neurophysiology; these do not in themselves offer the possibility of establishing whether any neurological anomaly reflects cause or effect of developmental (life-long) stuttering. Such interpretations are the realm of theories about the relations among brain, mind, and body, which may possibly be aided by longitudinal studies, for example, mapping the development of at-risk children before stuttering onset. As a starting point, however, it will be important to establish the neural correlates of stuttering itself.
For functional neuroimaging studies of stuttering, two challenges are (1) the elicitation of naturally stuttered versus fluent speech and (2) the separation of activation associated with abnormal motor execution from activation that reflects the cognitive substrates of stuttering. For example, De Nil et al. (2008), who address these issues specifically, did not find differences between simulated stuttering and fluent speech in their nonstuttering group, but this is possibly because the simulated stuttering used in that study was not similar enough to the cognitive and motor patterns associated with real stuttering. They suggest, therefore, that a "true stuttering" condition be included in experiments such as theirs, but point to practical problems. One issue is that to compare neural activation during stuttering to activation during nonstuttering, it is necessary to have a sufficient number of data points in each "condition." It is often difficult to predict which items (words) will lead to problems for PWS, particularly in a group study, as such items will differ for each individual. This means that if the same stimuli are used for different study participants, they will all exhibit different stuttering patterns, as well as different power levels for the computation of the relevant contrasts. In general, if a participant stutters too severely, there will be too few fluent utterances, while on the other side of the spectrum is the experimental danger of too few dysfluencies. On top of that, it is known that natural stuttering is commonly attenuated in the scanner, possibly under the influence of the radio pulse noise and rhythm during functional data acquisition or the repetitive nature of most tasks.
As to the second challenge, most functional neuroimaging studies will be interested in the cognitive substrates of stuttering (what makes the stuttering brain functionally different from a nonstuttering brain?) rather than in the act of stuttering itself, let alone the associated orofacial motor behavior. For this reason, studies will turn to having fluent control participants mimic stuttering, so as to be able to subtract the overt stuttering behavior from the combination of overt stuttering behavior plus the neural substrate of real stuttering. Here, too, the problem is that the stuttering patterns and orofacial motor behavior differ between stuttering individuals, ranging from slight dysfluencies to extensive blocking, prolongations, face twitching, and eye movements.
This paper reports on a proof-of-concept study, in which we have tried to deal with the two issues mentioned earlier, through a single-subject approach. We made use of a speaker's insight into his own stuttering behavior to create a list of words on which he is likely to stutter versus a list of "fluent" words. In addition, a speech pathologist was trained to imitate the specific articulatory and orofacial motor pattern associated with this speaker's stuttering. Both performed a functional MRI (fMRI) experiment of single word reading, each being presented with the same lexical items.

Participants
Dysfluent participant DS was a left-handed male monolingual speaker of English, 73 yrs of age at the time of testing, with a history of developmental stuttering since age 4 and no other neurological problems. DS received 6 yrs of post-secondary education and worked as a university professor in the field of voice and fluency disorders at the time of testing. Like many people who stutter, DS was aware of specific words and phonological characteristics that might be problematic for him and was able to predict to a great extent which parts of a planned sentence utterance would invoke stutters. He also reported that the "list" of problematic words occasionally changed over time, with some periods being characterized by particular problems with words starting with /s/, for example, while other periods might be marked by problems with initial /d/.
Fluent participant FS was a left-handed male monolingual speaker of English, 46 yrs of age at the time of testing, with no history of speech deficits nor of other neurological problems. FS received 6 yrs of post-secondary education and worked as a university professor and licensed speech pathologist, specialized in fluency disorders, at the time of testing. As a long-time colleague, FS was familiar with the stuttering behavior of DS, both in terms of speech output and associated orofacial motor behavior.
Both participants were classified as "mixed left-handed," according to the Edinburgh Handedness Inventory (Oldfield, 1971), and scored very similarly, not only in terms of overall index (DS, -0.2; FS, -0.3), but also in terms of which specific activities were left and right, both having a strong left-handed preference for skills and a strong right-handed preference for sports. They agreed on 9/10 questions, the only difference between them being on the "scissors" question, which is biased because scissors are generally made to favor right-handers. Thus, the participants were considered to be identical on the pattern of their handedness.
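To illustrate how an overall handedness index on a scale from -1 to 1 can be obtained, the following minimal sketch uses the commonly cited laterality formula (R - L)/(R + L). The tallies are hypothetical and are not the participants' actual answers; the function name is an assumption made for illustration, and the sketch ignores the strength-of-preference marks that the inventory also allows.

```python
def laterality_index(left_count: int, right_count: int) -> float:
    """Return (R - L) / (R + L): -1 is fully left-handed, +1 fully right-handed."""
    total = left_count + right_count
    if total == 0:
        raise ValueError("at least one item must be answered")
    return (right_count - left_count) / total


# Hypothetical tallies (not the study data): 6 items answered "left", 4 "right".
example_index = laterality_index(left_count=6, right_count=4)
print(round(example_index, 2))  # -0.2
```

A negative value close to zero, as in this hypothetical case, corresponds to the kind of mixed left-handed profile described above.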
Both participants were academics specialized in the study of stuttering, so the "stuttered" productions by these individuals will have been shaped by their training and experience, just as any individual stuttering pattern is influenced by the speaker's experience. However, our starting assumption is that the stutters of a "stuttering expert" are not inherently different from those of "lay" PWS, in terms of their neuronal underpinnings. In addition, all individual PWS have varying strategies, sometimes based on specific treatment experiences, which lead to individual variation, particularly in motor behavior. One motivation for the present study was to explore to what extent it may be possible to mimic subject-specific surface stuttering patterns on an individual basis to provide a clearer window on the actual neurogenics of stuttering.

Materials
For the present experiment, DS created a list of words he knew were "stutter-prone," henceforth the nonfluent list, and a list of words he could utter without stuttering, the fluent list. From these two lists of words, we extracted two experimental lists of 24 fluent and nonfluent words, which were matched for frequency, length (in phonemes, graphemes, and syllables), and imageability, based on information from the MRC Psycholinguistic Database (Coltheart, 1981; Wilson, 1987). Each list was composed of 9 nouns, 6 adjectives, 2 verbs, 5 common names (of persons known to both DS and FS), and 2 place names. Fluent participant FS was trained to imitate DS' stuttering pattern during two 2-hour training sessions, covering spontaneous speech and word list reading.

fMRI data acquisition, procedure, and analysis
Scanning was carried out on a Siemens 3T TIM Trio scanner. A T1-weighted anatomical scan was made at the start of each session with the following parameters: TR = 2250 ms; TE = 4.52 ms; flip angle = 9°; image matrix = 176 × 256; FOV = 256 mm; voxel size = 1 × 1 × 1 mm³. For functional imaging (EPI), a sparse-scanning design was used, with a TR of 10 seconds and an 8-second delay; TE = 30 ms; flip angle = 90°; image matrix = 64 × 64; FOV = 208 mm; voxel size = 3.25 × 3.25 × 3.2 mm³; 33 slices. Sparse scanning allowed both the unimpeded recording of the participants' utterances and the avoidance of artifacts caused by articulatory motion during the 2 seconds of data acquisition in each trial (Gracco, Tremblay, & Pike, 2005).
Participants read aloud words presented in capitals on a screen, with a control condition in which they were presented with a nonsense letter string (YYYXXXYY), not requiring a response. One whole brain volume was acquired 3 seconds after the end of each word presentation, with an acquisition time of 2 seconds, for a total of 144 acquisitions over 2 runs (stimuli were repeated once, across the runs, for 48 trials per condition). Words remained on screen for 3 seconds, during which the participants were allowed to speak. The participants were instructed to stop speaking once the words left the screen, even if the attempt was unsuccessful because of stuttering (see Figure 1). As the BOLD response to neural activation was estimated to peak at 6 seconds (Aguirre, Zarahn, & D'Esposito, 1998), the acquired EPI volumes should capture the peak associated with the stimulus onset and, in the case of word trials, the onset of the overt speech response. DS was scanned first and his recorded responses were coded offline to register whether his stuttering pattern did indeed match the anticipated pattern and to ensure that FS would pseudostutter on the same trials marked with real stuttering by DS. For participant FS, word trials were color coded on the presentation screen to achieve complete matching between his output and the real stutters of participant DS. Words presented in green were to be articulated fluently, while words in red were to be pseudostuttered.
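The trial timing described above can be laid out as a short worked sketch. It assumes that the 8-second delay is counted from the start of each 10-second TR and that the 144 acquisitions were split evenly over the two runs; neither detail is stated explicitly in the text, and the function and variable names are hypothetical.

```python
TR_S = 10.0                # repetition time per trial (from the text)
SILENT_DELAY_S = 8.0       # scanner-silent period (assumed to start each TR)
ACQUISITION_S = TR_S - SILENT_DELAY_S   # 2 s volume acquisition
WORD_ON_SCREEN_S = 3.0     # word display duration (from the text)
GAP_AFTER_WORD_S = 3.0     # gap between word offset and acquisition (from the text)


def trial_timeline(trial_index: int) -> dict:
    """Return event times in seconds, relative to the start of the run."""
    trial_start = trial_index * TR_S
    acquisition_start = trial_start + SILENT_DELAY_S
    word_offset = acquisition_start - GAP_AFTER_WORD_S
    word_onset = word_offset - WORD_ON_SCREEN_S
    return {
        "word_onset": word_onset,
        "word_offset": word_offset,
        "acquisition_start": acquisition_start,
        "acquisition_end": acquisition_start + ACQUISITION_S,
    }


first = trial_timeline(0)
onset_to_acquisition = first["acquisition_start"] - first["word_onset"]

trials_per_run = 144 // 2                  # assumes an even split over 2 runs
run_minutes = trials_per_run * TR_S / 60

print(first)
print(onset_to_acquisition, run_minutes)   # 6.0 12.0
```

Under these assumptions, each volume starts 6 seconds after word onset, which matches the estimated peak of the BOLD response mentioned above.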
Data preprocessing was performed with SPM8 (http://www.fil.ion.ucl.ac.uk/spm). Functional scans were realigned to a mean functional volume, with which the anatomical volume was then coregistered. After spatial normalization of the anatomical image to the Montreal Neurological Institute 152-subject template brain (ICBM, NIH P-20 project), the coregistered functional volumes were normalized using the same transformation parameters and resliced at a resolution of 3 × 3 × 3 mm³. Spatial smoothing was performed with an 8 mm (full width at half maximum) isotropic Gaussian kernel.
Data were analyzed separately for each participant in first-level analyses, using a Finite Impulse Response model, with the three conditions fluent, dysfluent, and letter strings and six motion regressors modeled for each of the two runs, based on the calculated realignment parameters. T-contrasts were evaluated at a significance level of p < .05, with a family-wise correction for multiple comparisons and a minimum cluster size of three contiguous voxels (81 mm³).
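The cluster-extent criterion can be illustrated with a minimal sketch that groups suprathreshold voxels into contiguous clusters and keeps only those with at least three voxels. This is not the SPM implementation: the definition of contiguity (face-adjacent voxels only) is an assumption, since the text does not specify it, and the voxel coordinates and function name are hypothetical.

```python
from collections import deque

VOXEL_EDGE_MM = 3.0        # resliced voxel edge length (from the text)
MIN_CLUSTER_VOXELS = 3     # minimum cluster size (from the text)

# Face-adjacent neighbours only (6-connectivity). This is an assumption;
# the text does not say how contiguity was defined.
NEIGHBOUR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def clusters_above_extent(voxels, min_size=MIN_CLUSTER_VOXELS):
    """Group voxel coordinates into contiguous clusters; keep the large ones."""
    remaining = set(voxels)
    kept = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    cluster.append(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_size:
            kept.append(sorted(cluster))
    return kept


# Hypothetical suprathreshold voxels: one 3-voxel cluster and one isolated voxel.
example_voxels = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 10, 10)]
surviving = clusters_above_extent(example_voxels)
min_volume_mm3 = MIN_CLUSTER_VOXELS * VOXEL_EDGE_MM ** 3

print(surviving)        # [[(0, 0, 0), (1, 0, 0), (2, 0, 0)]]
print(min_volume_mm3)   # 81.0
```

The last line reproduces the stated minimum cluster volume: three voxels of 3 × 3 × 3 mm each amount to 81 cubic millimetres.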
To investigate differences in lateralization of the contrasts fluent speech > letter strings and dysfluent speech > letter strings, we performed χ² tests on the number of voxels activated per hemisphere between the two participants, with α = .05.
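A lateralization comparison of this kind can be sketched as a Pearson chi-square test on a 2 × 2 table of participant by hemisphere. The voxel counts used here are hypothetical, and the absence of a continuity correction is an assumption, since the text does not say whether one was applied. For one degree of freedom, the p-value follows from the complementary error function.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # With df = 1, chi2 is a squared standard normal deviate.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical voxel counts (not the study data):
# rows = participants, columns = (right hemisphere, left hemisphere).
participant_a = (640, 360)   # 64% of active voxels in the right hemisphere
participant_b = (480, 520)   # 48% of active voxels in the right hemisphere

chi2, p = chi_square_2x2(*participant_a, *participant_b)
ALPHA = 0.05
print(f"chi2 = {chi2:.3f}, p = {p:.3g}, significant = {p < ALPHA}")
```

Because the statistic grows with the total number of voxels, even moderate differences in hemispheric proportions can yield very small p-values when thousands of voxels are counted.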
Because of different overall activation levels between participants and issues related to global normalization of the BOLD signal, we did not consider entering them in the same statistical model to be informative. Therefore, we limited the present proof-of-concept study to the statistical evaluation of within-subject effects, while between-subjects differences were only informally assessed.

RESULTS
DS stuttered on five "fluent" trials (10.4%) and was fluent on one "dysfluent" trial (2.1%), so these trials were marked accordingly for FS, so that FS could simulate the exact same behavioral pattern during his scanning session, which followed DS'. In total, then, both DS and FS were fluent on 44 trials and stuttered on 52 trials.
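The trial counts and percentages reported above follow from the design of 24 words per list, each presented twice. The short check below uses only numbers given in the text; the variable names are illustrative.

```python
WORDS_PER_LIST = 24
REPETITIONS = 2
planned_per_condition = WORDS_PER_LIST * REPETITIONS   # 48 trials per list

stuttered_on_fluent_list = 5      # "fluent" words that were stuttered
fluent_on_nonfluent_list = 1      # "dysfluent" words produced fluently

fluent_trials = (planned_per_condition
                 - stuttered_on_fluent_list
                 + fluent_on_nonfluent_list)
stuttered_trials = (planned_per_condition
                    - fluent_on_nonfluent_list
                    + stuttered_on_fluent_list)

pct_unexpected_stutters = round(100 * stuttered_on_fluent_list / planned_per_condition, 1)
pct_unexpected_fluent = round(100 * fluent_on_nonfluent_list / planned_per_condition, 1)

print(fluent_trials, stuttered_trials)                  # 44 52
print(pct_unexpected_stutters, pct_unexpected_fluent)   # 10.4 2.1
```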
As a measure of the amount of head movement by the participants during the scanning sessions, we evaluated the motion parameters calculated by SPM during image realignment, which were also entered as regressors in the statistical model. Relative to the first acquired volume, participant DS' movement was between -0.6 and 0.7 mm, and FS' movement was between -1 and 0.75 mm. These values are low, compared to a voxel size of 3 × 3 × 3 mm³, and show that participants were able to maintain a stable head position, even during an overt speech task.
The contrast fluent speech > letter strings should reveal the activation pattern associated with the participants' fluent speech, without the activation that is specific to looking at word-shaped stimuli. Both participants showed similar activation patterns that included cortical areas commonly associated with overt speech, though DS' activation pattern was more extensive (see Figures 2 and 3, red).
In DS, fluent speech was associated with activation in bilateral middle and superior temporal gyrus, including the rolandic operculum and extending posteriorly into the supramarginal gyrus and anteriorly to the superior temporal pole. Bilateral postcentral and precentral gyri and inferior frontal gyri were also activated, including the triangular and the opercular parts of the inferior frontal gyrus. Frontal activation also extended to the insula, bilaterally. Posteriorly, bilateral activation was observed in the cuneus, calcarine gyrus, lingual gyrus, and superior occipital gyrus. Left-hemisphere inferior parietal cortex, fusiform gyrus, and cerebellum were also activated (see Figure 2).
In FS, fluent speech was associated with similar bilateral middle to superior temporal activation, extending to supramarginal gyrus and superior temporal pole and including the rolandic operculum. Bilateral inferior frontal operculum was activated, while this activation extended to anterior insula in the right hemisphere and to the triangular part of the inferior frontal gyrus in the left hemisphere.
Differential activation for the contrast dysfluent speech > letter strings should capture the activation associated with real stuttering in DS and with the simulated stuttering in FS. In both participants, activation patterns largely overlapped with those for fluent speech (see Figures 2 and 3, purple), but more so for FS, whose activation for dysfluent speech could best be characterized as fluent speech plus more extensive motor activation (see Figure 3, blue). Stuttering participant DS, on the other hand, showed greater differences between the two speech conditions (see Figure 2, blue).
In DS, dysfluent speech was associated with activation in bilateral middle and superior temporal gyrus, including the rolandic operculum and extending to supramarginal gyrus and superiorly to the rolandic fissure into the postcentral and precentral gyri. Middle and superior frontal gyri were also bilaterally activated, as well as the opercular and triangular parts of the inferior frontal gyrus. Overall, activation appeared to be more extensive in the right hemisphere, to which we return below.
In FS, dysfluent speech was also associated, again, with bilateral middle and superior temporal activation, including the rolandic operculum and extending to supramarginal gyrus, superior temporal pole, postcentral gyrus, precentral gyrus, inferior frontal operculum, and insula. Posteriorly, there was activation in bilateral cuneus, calcarine gyrus, lingual gyrus, superior occipital gyrus, and cerebellum. In the left hemisphere, there was further activation in the triangular and orbital parts of the inferior frontal gyrus, as well as in anterior cingulate cortex.
There was no difference in lateralization of activation for the contrast fluent speech > letter strings between the two participants, both having an equal left-right distribution with 48% of differential activation in the right hemisphere (see Figure 4a). For the contrast dysfluent speech > letter strings, however, participant DS showed a right lateralization that was not present in participant FS, with 64% of activation in the right hemisphere versus 48% for FS (χ² = 82.369, p < .001; see Figure 4b).
Within the two participants, we also compared the activation associated with fluent versus dysfluent speech directly. Results are given in Tables 1 and 2. In DS, we see that, compared to stuttering, fluent speech is associated with stronger activation in the traditional "overt speech" areas, including temporal, inferior frontal, and middle frontal cortex. The opposite contrast, revealing activation specific to the stuttering condition, shows activation in right-hemisphere motor cortex and left-hemisphere middle and superior frontal cortex (see Figure 5, green).
In FS, differential activation between the two conditions is primarily seen in the contrast of dysfluent speech (see Figure 5, gold), with only some occipital and middle temporal activation in the reverse contrast. The simulated stuttering, then, is associated with extensive bilateral motor activation, but it also includes temporal and left-hemisphere inferior frontal cortex.
To summarize, both participants showed similar activation patterns associated with fluent speech, but control participant FS' simulated stutters were associated with much greater bilateral motor activation, whereas stuttering participant DS showed a right-lateralized pattern during natural dysfluent speech and stronger overall activation during fluent speech.

DISCUSSION
This study would not have been possible without access to an individual who stutters in some extremely reliable and predictable places. This allowed us to generate reliable stuttering during scanning and refined the task of training a fluent speaker for simulated stuttering. The behavioral effects, showing that DS was indeed able to predict with fair accuracy which words would elicit natural stuttering on his part, suggest that the present study was able to achieve sufficient power for the detection of differential activation patterns based on the distribution of stuttered (52) versus fluent speech trials (44) in the scanner. This was confirmed by the obtained results in both individual participants.
Before we enter the discussion of particular activation patterns, it is important to point out that one must remain cautious about the interpretation of single-subject fMRI activation data of subtle effects, that is, those associated with secondary cognitive faculties such as speech planning. The primary reason for the present study was to investigate whether this one-on-one matched-subjects approach, in terms of behavioral performance, might have the potential to reveal informative patterns when used in a group approach that allows for stronger statistical comparisons between subjects.
We suggest that the obtained results are indeed promising. Both participants showed activation in areas commonly associated with overt speech in the fluent and dysfluent conditions, including temporal cortex, primary motor cortex, and inferior frontal cortex, bilaterally (Huang, Carr, & Cao, 2001; Price, 2012). For the contrasts comparing fluent and dysfluent speech to viewing the letter strings, that is, the "main effects" of speech, it must be taken into account that the activation patterns will not only reflect speech planning and articulation, but also lexical access through reading, as well as self-monitoring of the participant's own overt speech output, which may account for the superior temporal activation often associated with auditory processing (Hickok & Poeppel, 2000). The observed middle temporal activation is commonly associated with lexical retrieval and semantic access (Fiez, Raichle, Balota, Tallal, & Petersen, 1996), while the occipital activation noted particularly for fluent speech in DS and for dysfluent speech in FS is likely related to the visual processing of lexical forms. Activation in the triangular part of the inferior frontal gyrus has been related to semantic reading (Friederici, Opitz, & von Cramon, 2000), while the inferior frontal opercular activity may play a role in grapheme-to-phoneme conversion, articulatory planning, and the sequencing of speech movements (Fiebach, Friederici, Muller, & von Cramon, 2002; Fiez, Raichle, Miezin, & Petersen, 1995; Guenther & Vladusich, 2012; Jobard, Crivello, & Tzourio-Mazoyer, 2003). The observed activation in the supramarginal gyrus may reflect its role in phonological encoding (Palumbo, Alexander, & Naeser, 1992) or its role in online error-checking via tactile and proprioceptive expectations (Guenther & Vladusich, 2012), and the cerebellar activation is likely to reflect motor planning and timing (Ghosh, Tourville, & Guenther, 2008).
As noted earlier, we did not perform a direct comparison between the activation patterns of DS and FS for lack of statistical validity, so this study cannot answer the question whether activation in any of these areas is significantly greater in either of the participants. Nevertheless, what is different between DS and FS is that DS appears to have a more qualitatively different pattern between his fluent speech and his dysfluent speech, while FS' dysfluent speech could be characterized as primarily "fluent speech plus more extensive primary motor activation" (see Figure 3). Unlike FS, DS also shows a significant right-hemisphere lateralization of his activation associated with stuttering. During simulated stuttering, then, FS shows generally increased motor activation, whereas during real stuttering, DS exhibits this increased motor activation particularly in the right hemisphere, in harmony with earlier published results (Fox et al., 1996). With all the caveats noted earlier, this suggests that the right lateralization is indeed an aspect that is specific to the stuttering brain and can be distinguished from the general motor activation associated with the excessive articulatory movements characteristic of stuttering.
The current study did not find evidence of the often reported reduced activation in PWS' auditory cortex, as both participants activated auditory cortex during both fluent and dysfluent speech. A direct comparison of activation levels, in a group neuroimaging study comparing effect sizes between PWS and PWNS, would be required to test this. Likewise, we did not observe anomalous (de-)activation of the thalamus or basal ganglia, as activation in these areas did not reach threshold for either of our two participants.
With respect to the anterior cingulate activation noted in previous studies (e.g., De Nil et al., 2000), we observed this activation in FS specifically during dysfluent speech (simulated stuttering), while in DS, it was stronger during fluent speech than during stuttering. De Nil et al. (2000) observed this increased anterior cingulate activation during silent reading in PWS and suggest that it reflects cognitive anticipatory reactions related to stuttering, in line with a role for anterior cingulate cortex underlying selective attention. We speculate that its presence in DS may indeed reflect the stuttering brain's continuous awareness of the possibility of upcoming stutters, even when speech turns out to be fluent, while its specific association with simulated stuttering in FS may reflect that speaker's anticipation of his own deliberate dysfluencies in those trials.
In contrast to De Nil et al.'s (2008) group results, we did observe significant differences between dysfluent and fluent speech in FS. This indicates that the method used here, of more directly and deliberately mimicking natural stuttering patterns, has the potential of revealing neural activations associated specifically with the motor behavior of stuttering, relative to the cognitive basis of stuttering. A group study, allowing the much-needed between-group comparisons, would entail finding a sufficient number of PWS with the same ability as DS to predict dysfluent versus fluent words, as well as control participants with the same ability as FS to simulate the specific articulatory and orofacial motor pattern of their stuttering match. Based on the present results, we are currently preparing such a study.
Finally, it is likely that there are qualitative and quantitative differences between stutters in spontaneous speech and those on a word list such as was used in the present study, so the present results do not necessarily generalize to natural stuttering in discourse. Our reason for using a word list was that this allowed greater experimental control and fewer potential motion and other noise artifacts. fMRI data are sensitive to articulatory movement and to neural activation patterns associated with other aspects of discourse and sentence production, which may be difficult to control across conditions. In particular, the sparse-scanning method could not be used for long discourse segments, as these would contain both stuttered and fluent segments, with varying timings. An advantage of our current approach is that we are simply dealing with data points related either to stuttered or nonstuttered events. In that respect, an interesting middle way is presented by Jiang, Lu, Peng, Zhu, and Howell (2012), who used sparse-scanning of a sentence completion task to elicit more naturalistic stutters. This design successfully allows for the post-hoc characterization of sentence-level stutters and could also be used in combination with the one-on-one speaker matching approach presented here.

CONCLUSIONS
Results suggest that some of the classically reported activation associated with stuttering is driven more by nonspecific motor patterns than by cognitive substrates of stuttering. Anterior cingulate activation may reflect awareness of (upcoming) dysfluencies, rather than a process that is specific to natural stuttering. Nevertheless, the generally observed right-hemisphere lateralization appears to reflect a true neural correlate of stuttering, whether due to neurogenetic, early developmental, or later adaptive behavioral factors.
Perhaps more importantly, this study shows that it is feasible to match stuttering speakers' utterances more closely to simulated stutters, for the investigation of neural correlates of real stuttering, by (1) eliciting words that are known to be "stutter prone" in individual speakers and (2) training nonstuttering speakers to imitate closely the individual motor behavior of PWS. Using this method, we were able to observe significant main effects and contrast effects of the differences between fluent and stuttered speech and even to replicate the right-hemisphere lateralization associated with real stuttered speech in a single subject. The specific neuroimaging results of the present single-subject study must be interpreted with caution, but a group study using stuttering and nonstuttering participants matched one-on-one is anticipated to shed more light on the neural basis of natural stuttering.