I remember back in my first term ever as a grad student at a different university, our prof made us spend an entire week thinking about (not doing) transcription and translation. For that quarter of "World Music Perspectives," transcription and translation had to go together because a culturally sensitive scholar of whatever breed would fret about the power dynamics involved in translating music, sounds and language into text. That whole term I was terrified to write as I tried to forget what I thought I knew about music. It was quite the hurdle to overcome.
Yet here I am, doing interview transcriptions and translations and musical transcriptions, all in the service of my dissertation. And again, I feel my own terror rising. I know we in musicology and ethnomusicology are a little tired of that music/language discussion. (How many times have you read "Music is the universal language" at the start of a sad undergraduate paper?) But I do think one area where the two converge in really interesting ways is transcription/translation. When I transcribe interviews, I make a number of choices. Do I represent the "pausing" words like "um" (or in Portuguese, "ou")? Do I include moments when people stutter? How about sentence breaks? These are all really important questions that fundamentally underscore the limits of translating spoken language to written communication. Likewise, when I transcribe popular music, the hardest things to represent are those that fall between the cracks of standard notation: especially timbre, but also microtones, extended techniques, and the use of stereophonics. In both cases I feel frustrated with the tools we have at our disposal, but they are all that we have.
I know that this process of transcription/translation is really important. When I transcribe, I listen to the same information repeatedly and I can really get inside my subject's speech patterns. I learn so much about a person by the filler they choose in their sentences. I can tell when they are thinking or when they are uncomfortable. It is a kind of intimacy that one cannot capture in conversation; most people don't hear speech in phonemes but rather in chunks of words. In general, we have been trained to grasp larger ideas and rhetoric, not the minutiae of sentence-level decisions. For me, speech transcription is similar to closely studying recorded or live music because in that moment of the first encounter, one isn't paying attention to every single detail. And doing analysis on a first listen is nearly impossible. Elisabeth LeGuin's essay "One Bar in Eight: Debussy and the Death of Description" immediately comes to mind here.* We capture moments of interest. We stay attentive, but we certainly don't analyze at the level we would were we to make a thorough, workable transcription. The same is true with transcribing speech.
I'm still terrified. After a few futile attempts at transcription/translation, my current strategy is to let the interview MP3s sit on my hard drive until I'm far away from Brazil. I'll rely on those vague ideas captured in the discussion instead. Music I can deal with, but this transcription/translation is just a little too much for now.
* Elisabeth LeGuin, "One Bar in Eight: Debussy and the Death of Description," in Beyond Structural Listening: Post-Modern Modes of Hearing (Berkeley: University of California Press, 2004), 233-251.