A Pen-Eye-Voice Approach Towards the Process of Note-Taking and Consecutive Interpreting: An Experimental Design

Interpreting is a cognitively demanding language-processing task. Investigating the process of interpreting helps to explicate what happens inside the black box of interpreters’ minds, with implications for how the human mind processes language under taxing conditions. Because the interpreting process involves multitasking, it is challenging to develop an experimental design to investigate it. Consecutive interpreting (CI) is particularly challenging in this respect because different methods need to be applied to tap into its two phases, which involve different combinations of sub-tasks. This paper advocates the use of a triangulation of pen recording, eye tracking and voice recording to investigate the process of note-taking and CI.


INTRODUCTION
Interpreting is an intriguing, challenging, and complex language-processing task. Ever since interpreting research began to be established as a field of study in its own right in the mid-1970s (Pöchhacker, 2004, p. 81), there has been a strong interest in uncovering what is happening in interpreters' minds while they perform this extraordinary task. Researchers with a background in psychology have attempted to shed light on how the human mind processes language under severe stress and while engaging in heavy multi-tasking by investigating the cognitive processes in interpreting (e.g. Barik, 1973; Christoffels, 2004; Christoffels & De Groot, 2004, 2005; De Groot, 1997; Gerver, 1974a, 1974b, 1976; Goldman-Eisler, 1972; Köpke & Signorelli, 2012). Researchers from within the field of interpreting, in turn, have approached the topic from an inter-disciplinary perspective that benefits from the theoretical and empirical findings in the cognitive sciences (e.g. Lambert, 1988; Moser-Mercer, 1997; Seeber, 2011, 2013; Shlesinger, 2000).
However, most of the process-oriented research approaching interpreting from a cognitive perspective focuses on simultaneous interpreting (SI), while consecutive interpreting (CI) is often neglected (Chen, 2017a). CI is an interesting activity from both a cognitive and a linguistic point of view. Similar to SI, it requires a high level of bilingual language processing and challenges the interpreter's cognitive system by requiring multi-tasking under strict time constraints. But CI also introduces a new challenge: note-taking (note 1). In addition to listening to the source speech and producing a target speech, CI requires the interpreter to perform the subtasks of note-writing and note-reading, making the process of note-taking and CI a particularly interesting topic of research.
A potentially important reason for the lack of research on the process of CI is the inadequacy of process-oriented methods in the field of interpreting studies. This paper introduces an experimental design to investigate note-taking and CI, with the aim of collecting detailed data on both the process (the two CI phases) and the product (the interpreting performance) of CI. The design triangulates pen recording (mainly targeting Phase I of CI), eye tracking (mainly targeting Phase II of CI) and voice recording (mainly targeting the interpreting performance).

IJCLTS 6(2):1-8

... 20th century. It gradually gave way to SI, which was made possible by the development of electronic equipment, in multilateral and multilingual conference settings. However, CI remains the preferred mode in the context of "bilateral interactions with only two languages involved and in settings where confidentiality, intimacy and directness of interaction are given priority over time efficiency", such as high-level diplomatic encounters, business negotiations, ceremonial speeches and press conferences (Dam, 2010, p. 76). CI remains an important component in most interpreter training programmes. Its significance is manifested in the large number of master's theses on the subject (note 2). Even in places where the market is largely dominated by SI, training in CI is believed to be a good way of preparing students for SI (Gile, 2001). Furthermore, CI is frequently introduced to language students as a way of reinforcing language skills (e.g. Henderson, 1976; Hill, 1979; Paneth, 1984).
Given the important role CI plays in the above contexts, there exists a considerable limitation in the literature in that process-oriented cognitive investigations have rarely been carried out on CI. CI is an interesting activity from both a cognitive and a linguistic point of view. Similar to SI, it requires a high level of bilingual language processing and challenges the interpreter's cognitive system by requiring multi-tasking under strict time constraints. But CI also introduces a new challenge: note-taking. In addition to listening to the source speech and producing a target speech, CI requires the interpreter to perform the tasks of note-writing and note-reading. In Phase I of CI, interpreters listen to and analyse the source speech, keep parts of the speech in their working memory, and write down notes. In Phase II, interpreters read back their notes, retrieve information from their working memory, and produce a target speech. Both phases depend heavily on note-taking, the unique and distinctive feature of CI.
Note-taking has been a topic of interest in interpreting research for over half a century (see Chen (2016) for a review). The well-developed body of literature on consecutive note-taking started with a series of books and articles introducing various note-taking systems and principles. They were published in different languages, each exerting a profound influence in its own country, and some reaching beyond it (e.g. Allioni, 1989; Becker, 1972; Gillies, 2005; Gran, 1982; Ilg, 1988; Kirchhoff, 1979; Matyssek, 1989; Rozan, 1956/2002). Recommendations were made on such skills as noting the idea and not the word, how to use symbols, how to use abbreviations, and how to note links, negations, and emphasis.
The note-taking systems seem to be well-developed, but when it comes to teaching and learning note-taking skills, both teachers and students find it challenging. A couple of studies (e.g. Alexieva, 1994; Gile, 1991) found that note-taking diverted students' attention and even led to a degradation in interpreting performance. Researchers who have approached the topic from cognitive and linguistic perspectives (Kirchhoff, 1979; Kohn & Albl-Mikasa, 2002; Seleskovitch, 1975) found that there was a concurrent storage of information in notes and in memory, and a competition for cognitive resources between note-taking and other activities in the interpreting process. This has motivated subsequent research to target more specific note-taking features and to examine them empirically. Some of the most important variables investigated are: the choice of form (e.g. Dai & Xu, 2007; Dam, 2004a), the choice of language (e.g. Abuín González, 2012; Dai & Xu, 2007; Dam, 2004a, 2004b; Szabó, 2006), and the relation between note-taking and interpreting performance (e.g. Cardoen, 2013; Dai & Xu, 2007; Dam, 2007; Dam, Engberg, & Schjoldager, 2005). The choice of form refers to the choice between language and symbol, and the choice between abbreviation and full word; the choice of language refers to the choice between source and target language, and the choice between native and non-native language (Dam, 2004a). The studies have contributed valuable empirical data for a deeper understanding of the topic. For example, there is a general preference for language over symbol (Dai & Xu, 2007; Dam, 2004a, 2004b; Lung, 2003), and a source language dominance in the notes taken by student interpreters (Abuín González, 2012; Andres, 2002; Dai & Xu, 2007; Lim, 2010; Lung, 2003). However, most of the studies are product-oriented, which means that they only look at the final product of note-taking (the notes produced), without an in-depth analysis of the interpreting process.
Nevertheless, some studies have taken a process-oriented approach to note-taking and CI; two examples are Andres (2002) and Orlando (2011). Andres (2002) used time-coded videos to analyse the time span between the moment a source speech unit was spoken and the moment it was noted down. It was the first study to record the note-taking process in detail. Orlando (2010) used the Livescribe Smartpen to record the process of note-taking. The questionnaire results he collected from students showed encouraging potential for the technology to be applied in teaching and learning. However, both methods have important limitations: video recording involves determining the start of note-taking by manually checking the video and its timestamp; the Smartpen does not report the moment-to-moment changes in pen position in fine detail (e.g. coordinates).
This paper attempts to revisit the topic of note-taking and CI and to propose an experimental design that could potentially address some of the limitations of previous research. The design combines product analysis with process investigation, drawing on the conjoint approaches of pen recording, eye tracking and voice recording.

A PEN-EYE-VOICE DESIGN
This paper introduces an exploratory design involving pen recording, eye tracking and voice recording with the purpose of gaining further insights into the process of note-taking and CI. Pen recording mainly targets Phase I of CI, in which interpreters listen to the source speech and write down notes. Eye tracking mainly targets Phase II of CI, in which interpreters read back their notes and produce a translated speech. Voice recording mainly serves to record the interpreting performance, while also documenting a retrospection on the note-taking process.

Pen Recording
The apparatus used for pen recording was the Wacom Cintiq 13HD (a 13-inch LCD tablet with a resolution set at 1366×768 pixels) and the Wacom Pro Pen. The system was chosen because it caters to graphic designers, who have very high requirements in terms of the precise control of the pen on the tablet surface. It is ergonomically designed to mimic natural writing and painting. Another reason for choosing this system is that it is compatible with the Eye and Pen software (note 3), one of the core software products powering the experiment. The software ran on a laptop computer which was linked to the pen recording apparatus. The software carried out three tasks: controlling the experiment, collecting the pen data, and processing the pen data.

Controlling the experiment
The experiment and its procedures were programmed into the software, which then controlled the progress of the experiment and interacted with the participant. For example, in Phase I of CI, when finishing one page of note-taking, the participant could use the pen to click on a button displayed on the tablet screen called "New Page" (Figure 1) and the software would create a new blank page for note-taking.
The participant could use as many pages as needed. When the listening and note-writing phase was finished, the participant only needed to click a button called "Begin Interpreting" (Figure 1) and the software would automatically turn to the first page of notes written by the participant.
Then the participant could read back the notes and produce a target speech. In this phase, new buttons such as "Turn Page" (which turns to the next page of written notes) and "Next Part" (which plays the next segment of the source speech) would appear on the screen, so that the participant could interact with the software to navigate through the pages of written notes. The tablet screen would only react to the tip of the digital pen, so the participant could write as naturally as possible and did not need to worry about triggering any buttons by touching the screen with their hands.

Collecting the pen data
The software collected the spatial and temporal data about the pen as it moved across the tablet surface. For example, data was recorded for each pen stroke in terms of the distance (how far the pen travelled across the surface), duration (for how long the pen was in touch with the tablet), and speed (how fast the pen was moving). Spatial data was reported in centimetres and temporal data was reported in milliseconds. The software also kept a session log for each trial, documenting the time every action took place during the recording (e.g. the source speech segment started playing, the participant started writing, etc.). This function was crucial for the calculation of one type of data, the ear-pen span, which is the time span between the moment a speech unit is heard and the moment it is written down in notes. The ear-pen span is a useful indicator of cognitive processing during note-taking.
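The stroke measures and the ear-pen span described above can be illustrated with a minimal Python sketch. It is not the Eye and Pen implementation: the sample format (x and y in centimetres, t in milliseconds), the function names, and the choice to measure the span from the onset of the speech unit to the onset of the note unit are all assumptions made for illustration.

```python
import math


def stroke_metrics(samples):
    """Summarise one pen stroke.

    `samples` is a hypothetical list of (x_cm, y_cm, t_ms) tuples recorded
    while the pen tip was in contact with the tablet.
    """
    # Distance: sum of straight-line steps between consecutive samples.
    distance_cm = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(samples, samples[1:])
    )
    # Duration: time between the first and last sample of the stroke.
    duration_ms = samples[-1][2] - samples[0][2]
    # Speed in cm/s; zero for a single-sample stroke.
    speed_cm_per_s = distance_cm * 1000 / duration_ms if duration_ms > 0 else 0.0
    return {
        "distance_cm": distance_cm,
        "duration_ms": duration_ms,
        "speed_cm_per_s": speed_cm_per_s,
    }


def ear_pen_span(segment_start_ms, unit_offset_ms, note_start_ms):
    """Ear-pen span in ms for one note unit (onset-to-onset, an assumption).

    segment_start_ms: session-log time at which the speech segment began playing
    unit_offset_ms:   position of the speech unit inside the segment audio
    note_start_ms:    session time at which the note unit was started
    """
    heard_at_ms = segment_start_ms + unit_offset_ms
    return note_start_ms - heard_at_ms
```

For instance, `stroke_metrics([(0.0, 0.0, 0), (3.0, 4.0, 500)])` describes a 5 cm stroke lasting 500 ms, and `ear_pen_span(2000, 1500, 5200)` gives a span of 1700 ms.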

Processing the pen data
The software has many functions for displaying and analysing the recorded pen data (Figure 2 is a screenshot of the software in the analysis mode). The most useful function for the current design is the "Word separation" tool, which semi-automatically separates the written texts into words (in the case of this design, note units). Although manual work was required to correct the separations, this function allowed very accurate data to be reported for each individual note unit (e.g. start and end time, duration, distance, speed, etc.). Labels could be created for each note unit so that qualitative data could be added to each note and exported for further analysis. For example, for note unit no. 13 (see bottom left of Figure 2), Texts 1 to 6 documented the form and language of the note unit as well as its content, meaning, and corresponding source speech unit. The labels indicated that this note unit was language ("L" in Text 1), in English ("E" in Text 2), and an abbreviation ("A" in Text 3). It contained three letters "svs" (Text 4), meaning "services" (Text 5) and corresponding to the word "services" (Text 6) in the source speech. In this way, the exported file contained both quantitative and qualitative data (Figure 3).
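As an illustration of how the six labels might sit alongside the quantitative measures in an exported file, the following Python sketch models one labelled note unit and writes it as a CSV row. The field names and the CSV layout are hypothetical and do not reproduce the actual Eye and Pen export format; only the label values for note unit no. 13 are taken from the example above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class NoteUnit:
    """One labelled note unit; field names are illustrative assumptions."""
    unit_no: int
    form: str         # Text 1: "L" for language
    language: str     # Text 2: "E" for English
    spelling: str     # Text 3: "A" for abbreviation
    content: str      # Text 4: what was written
    meaning: str      # Text 5: what the note stands for
    source_unit: str  # Text 6: corresponding source speech unit


def to_csv(units):
    """Serialise labelled note units to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(NoteUnit)])
    writer.writeheader()
    for unit in units:
        writer.writerow(asdict(unit))
    return buffer.getvalue()


# Note unit no. 13 from the example in the text.
unit_13 = NoteUnit(13, "L", "E", "A", "svs", "services", "services")
```

Calling `to_csv([unit_13])` yields a header line followed by one row per note unit, which can then be merged with the per-unit timing data for analysis.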

Eye Tracking
The apparatus
There were a few prerequisites for selecting the type of eye tracker to be used in this experimental design. First, the eye tracker needed to allow the interpreter to speak freely, thus eliminating the use of eye trackers that require chin rests. Second, the eye tracker needed to be usable in a handwriting situation. In particular, the eye camera(s) could not be masked by the participant's forearms in movement. Head-mounted eye trackers, which are mounted directly on the participants, could meet these requirements. Third, for the comfort of the participant and the ecological validity of the experiment, a lightweight eye tracker that could be attached to the participant easily was preferable.
The eye tracker used in this design was the SensoMotoric Instruments (SMI) Eye Tracking Glasses 2 (ETG2). It is a lightweight (47 g), head-mounted eye tracker in the shape of a pair of glasses. The eye tracker uses dark pupil tracking. It has a tracking accuracy of 0.5° over all distances and a sampling rate of 60 Hz. The eye tracker has a built-in high-definition camera for scene recording. This camera recorded both the video and the audio during the entire note-taking and interpreting process. The SMI software iView ETG and BeGaze were used with default settings for eye data recording and analysis respectively. The experiment took place in a sound-proof studio with constant artificial illumination to avoid any distractions or disruption to the recording of eye data.

Semantic Gaze Mapping
Semantic Gaze Mapping is an analysis function of the software BeGaze. It can map gaze data from videos to static pictures. The pages of notes taken by the interpreters were saved as pictures by the Eye and Pen software. These pictures were imported into BeGaze and used as reference images. The gaze data on the scene video (a video of what the participant saw during interpreting) were mapped onto the reference images. After all relevant eye data were mapped onto the images, areas of interest (AOIs) were drawn on the images instead of on the scene video, which could increase the accuracy and efficiency of analysis.
An AOI was drawn for each note unit and labelled according to the note's form and language (e.g. an English abbreviation). This allowed further analysis to be carried out when comparing the eye movement data between two note-taking choices (e.g. the dwell time on Chinese vs English notes).
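The comparison of dwell time between note-taking choices can be sketched in Python as follows. This is not BeGaze functionality: the input format (a list of fixations already mapped to AOI identifiers, plus a dictionary of AOI labels) and the function name are assumptions, and dwell time is taken here simply as the summed fixation duration per AOI.

```python
from collections import defaultdict


def mean_dwell_by_label(fixations, aoi_labels):
    """Mean dwell time (ms) per AOI label.

    fixations:  hypothetical list of (aoi_id, duration_ms) pairs, one per
                fixation that was mapped onto a reference image
    aoi_labels: dict mapping each aoi_id to a label such as "Chinese"
    """
    # Sum fixation durations per AOI; ignore fixations outside labelled AOIs.
    dwell_ms = defaultdict(float)
    for aoi_id, duration_ms in fixations:
        if aoi_id in aoi_labels:
            dwell_ms[aoi_id] += duration_ms

    # Group per-AOI dwell times by label; an AOI never looked at counts as 0.
    grouped = defaultdict(list)
    for aoi_id, label in aoi_labels.items():
        grouped[label].append(dwell_ms.get(aoi_id, 0.0))

    return {label: sum(values) / len(values) for label, values in grouped.items()}
```

Whether note units that were never fixated should count as zero dwell time or be excluded is a design decision; the sketch includes them as zero.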

Voice recording
Voice recording during the interpreting process was collected via the eye tracker (the audio files were extracted), and voice recording during retrospection was collected via a laptop computer. The voice data were used for several different purposes in this experimental design.
First and foremost, the interpreting performance was recorded. The audio recordings were later transcribed and provided to a group of raters for evaluation. This generated performance scores used for exploring the relationship between note-taking and interpreting performance.
Second, voice recording was used during cued retrospection. Immediately after the interpreting tasks, the participants were provided with their notes for cued retrospection. They were asked to provide as much information as they could remember about the note-taking process, including but not limited to: what each note unit was; what it stood for; whether it was symbol or language, and if language, whether it was abbreviation or full word, Chinese or English. This is an important step because note-taking in CI is highly individualised, and the handwriting of interpreters could sometimes be difficult for others to decipher.
Third, the source speech audio files were used together with the session logs kept by the Eye and Pen software to calculate the ear-pen span, an important indicator of cognitive processing.

Tasks
To make the data more generalizable, two CI tasks covering both directions of interpreting (between Chinese and English) were designed to account for both the source/target language status and the native/non-native language status. The two tasks, English to Chinese (E-C) and Chinese to English (C-E), were carefully created through a series of procedures to control for variance.
First, two English scripts (on similar topics) were created by the author and edited by an experienced university lecturer in Australia (a native English speaker) to make them: (1) as comparable as possible, and (2) suitable to be read out loud and recorded as a speech. The edited scripts were analysed using CPIDR, a computer programme that automatically determines propositional idea density. The results show that they were similar in terms of length (number of words and propositions) and idea density. One of the scripts was used to create the E-C task.
Second, the other script was translated by the author into her A language (Chinese), and refined by two Chinese-speaking editors at a local Chinese radio station to make it suitable to be read out loud and recorded as a speech. This script was used to create the C-E task.
Third, the edited English and Chinese scripts were recorded into audio by a native Australian English speaker (the university lecturer) and a native Mandarin Chinese speaker (a radio personality) respectively. The recordings were carried out in professionally soundproofed rooms. The speakers were required to record a natural speech at a steady pace. They were allowed to restart any sentence at any time when needed. These false starts were later edited (see step four).
Fourth, the recorded audio files were imported into Audacity, a sound-editing programme, for further refinement to address such issues as false starts, unfinished sentences, and background noises (e.g. page turning). After editing, both tasks were about five minutes long and divided into three segments each.

The Experimental Setup
The digital tablet was linked to a laptop computer powered by the Eye and Pen software, which controlled the experiment procedures, interacted with the participant, and recorded the pen data. The eye tracker (a pair of eye tracking glasses) was linked to another laptop which recorded the eye data. The Eye and Pen software controlled the playback of the source speech and the eye tracker recorded the sound during the entire interpreting process, so the two sets of data could be synchronised via the sound track. The experimental setup is shown in Figure 4.
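One way to picture synchronisation via the sound track is to look for the time shift at which the source speech audio best matches the audio captured by the eye tracker. The Python sketch below uses a plain, unnormalised cross-correlation for this. It is an illustrative assumption rather than the procedure used in the study: it presumes both signals have been reduced to amplitude sequences at the same sample rate, and that the eye tracker recording started before the source speech was played, so only non-negative lags are searched.

```python
def best_lag(reference, recording, max_lag):
    """Lag (in samples) at which `reference` best aligns inside `recording`.

    reference: amplitude samples of the source speech as played
    recording: amplitude samples of the eye tracker's audio track
    max_lag:   largest lag to try, in samples
    """
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlation score: sum of products over the overlapping samples.
        score = sum(
            value * recording[i + lag]
            for i, value in enumerate(reference)
            if i + lag < len(recording)
        )
        if score > best_score:
            best, best_score = lag, score
    return best


def lag_to_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return lag_samples * 1000 / sample_rate_hz
```

The resulting offset can be added to the session-log timestamps to place the pen data and the eye data on one timeline. For real recordings, a normalised correlation on a short excerpt would usually be more robust than this bare version.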

Procedures
The experiment consisted of four main stages: practice, task performance, retrospection and a post-experiment questionnaire. The practice session was designed to familiarise the participants with the experimental procedures and the apparatus, especially the digital pen and the eye tracker. The task performance session involved two CI tasks: C-E and E-C. The order of the tasks was randomised so that about half of the participants started with the C-E task and the other half started with the E-C task. Rest was allowed between tasks if needed. The retrospection session was cued by the written notes and participants were instructed to recall whatever they could remember about the note-taking process. This was mainly designed to help the researcher accurately identify the note units and to collect additional qualitative data for the interpretation of the results. The questionnaire was designed to collect such information as the participants' familiarity with the task topics, how they felt about using the digital pen and the eye tracker, and other feedback about the experiment.
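The randomised task order can be sketched as a simple counterbalanced assignment. The Python function below is an assumption about how such an assignment might be produced, not a description of the procedure actually followed; the function name and the seeding are illustrative.

```python
import random


def assign_task_orders(participant_ids, seed=None):
    """Randomly split participants so about half start with each task.

    Returns a dict mapping each participant id to a (first, second) tuple.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for position, participant in enumerate(shuffled):
        # First half of the shuffled list starts with C-E, the rest with E-C.
        orders[participant] = ("C-E", "E-C") if position < half else ("E-C", "C-E")
    return orders
```

With an odd number of participants, the two groups differ in size by one.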

DATA AND ANALYSIS
The main sources of data that were collected using the designed experiment are summarised in Table 1. Because note-taking in interpreting is a highly individualised activity, the number of note units produced in each category differed both within and between interpreters. Wherever applicable, the data should be standardised by calculating the mean. For example, if the number of Chinese notes written by a participant is n, then the ear-pen span of Chinese notes (EPS_c) of that participant is calculated as:

$$EPS_c = \frac{1}{n}\sum_{i=1}^{n} EPS_{c,i}$$

where EPS_{c,i} is the ear-pen span of the participant's i-th Chinese note. The standardisation could create paired samples with the same size. In this way, paired-samples t-tests can be used to compare between the note-taking choices in different forms (language vs. symbol; abbreviation vs. full word) and languages (Chinese vs. English). The Pearson's correlation can [...]

A pilot and a main study using this design have been reported elsewhere (Chen, 2017b, 2017c). The studies show that the ample empirical data collected during the experiment could reveal important traces of processing and effort in note-taking and CI. A unique array of indicators can be generated for estimating the physical, temporal and cognitive demands of note-writing and note-reading, indicators that could be useful in future studies on related topics.
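The standardisation by participant means and the paired-samples comparison described in this section can be illustrated with a short Python sketch. The data layout (one list of ear-pen spans per participant and category) and the function names are assumptions; the t statistic follows the standard paired-samples formula, and a statistics package would be needed to obtain the corresponding p-value.

```python
import math
from statistics import mean, stdev


def participant_means(values_by_participant):
    """Reduce each participant's list of values (e.g. ear-pen spans) to a mean."""
    return [mean(values) for values in values_by_participant]


def paired_t(sample_a, sample_b):
    """Paired-samples t statistic and degrees of freedom.

    sample_a and sample_b hold one standardised value per participant,
    in the same participant order (e.g. Chinese vs English notes).
    """
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

For example, `participant_means([[1000, 2000], [1500]])` returns one mean per participant, and two such lists (one per note-taking choice) can be passed to `paired_t`.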

DISCUSSION
This paper introduces an experimental design to cater to process research on note-taking and CI. CI is a special and complicated language processing task which involves the simultaneous sub-tasks of listening and writing in Phase I and reading and speaking in Phase II. Previous studies have so far mainly focused on the product of note-taking (written notes) and CI (the interpreting performance), without investigating the process. This design triangulates the methods of pen recording, eye tracking and voice recording to allow for a combined analysis of process and product. Detailed data can be collected during the process of note-taking and CI, particularly the pen movements (distance, duration and speed) and ear-pen span during note-writing, and the eye movements during note-reading.
It has to be admitted that the design has several limitations. First, there is a considerable amount of manual work involved in the initial processing of the pen data (see section 3.1.3) and the eye data (see section 3.2.2), as well as in preparing the data for analysis (see section 4). Second, the digital pen and tablet were selected in this design for data accuracy and precision, at the cost of ecological validity. Other apparatus, for example a digital pen with ink which can write on real paper, can be used to increase the ecological validity of the experiment, on condition that the data quality can be guaranteed. Third, the eye tracker selected in this design for reasons of ecological validity and availability is a low-speed one (it records at 60 Hz). In addition, this eye tracker is not supported by the Eye and Pen software, resulting in a post-hoc data synchronisation. Other eye trackers could be explored to see if they perform better than the selected one.
The pen-eye-voice experimental design points out some future directions for interpreting research, especially process-oriented research, and potentially contributes to language processing research in general. It is hoped that other researchers will join the effort to use a triangulation of methods to investigate the intriguing and challenging topic of interpreting.

END NOTES
1. In this paper, CI refers to long consecutive where systematic note-taking is used.
2. Interested readers can find the theses reported in various issues of the Conference Interpreting Research Information Network Bulletin (CIRIN Bulletin) at www.cirinandgile.com.
3. More detailed information about the software can be found on http://eyeandpen.net/en/.

Figure 1. A screenshot of the tablet in the recording mode

Figure 3. A sample data output of pen recording

Figure 4. The experimental set-up

Table 1. Data that can be collected using the experimental design in this paper