Transcription bank: John Potter

Where the transcript was published

Main publication, PhD Thesis:
Potter, J. (2009). Curating the self: Media literacy and identity in digital video production by young learners. London Knowledge Lab. London, University of London. PhD.

Articles drawn from data chapters e.g.:
Potter, J. (2010). “Embodied memory and curatorship in children’s digital video production.” Journal of English Teaching: Practice and Critique 9(1). pp. 22-35

Potter, J. (2012). Digital media and learner identity: The new curatorship. New York, Palgrave Macmillan.

Link to transcript: Long form video analysis template

How the transcript was made

The template was made by hand. The initial design phases preceded both software solutions and the handbooks of multimodal analysis published in the years which followed, so I was working more or less on my own on the design (but with the support of my supervisor, Andrew Burn). I proceeded through trial and error, adapting from Burn and Parker’s proposal and analysis of the “kineiconic” mode (2003).

Rationale for the design

I made the transcript for use in my thesis to analyse student-authored digital video pieces, not live action video. The idea was to look for evidence of design in the pieces produced by the children, according to the theoretical frames of the thesis itself, which were concerned with issues of habitus and learned ways of being in the setting. Because I was looking at evidence of cultural affiliation as well as the transient and anchored aspects of identity (cf. Merchant, 2005), I added elements which allowed me to explore the affective and aesthetic elements in the video pieces (Leander and Frank, 2006). As I wrote at the time: “I decided to use a hybrid frame which allowed for the analysis of all the distinctive multimodal elements through time, scene by scene, in each of the video texts made by the children but which also allowed for some commentary on the social, aesthetic and affective aspects of each element in the video.”

From my original rationale:

Scene: this was where the number and timing of each scene was logged.  In the space to the right there were thumbnails used to illustrate the start of each scene (in most cases; other sequences were sometimes indicated where relevant).

Scene description: some simple narrative was provided here which gave a basic overview of the scene.

Genre/direct media reference: this allowed media elements, direct quotations or non-specific genre parodies/appropriations to be identified and to contribute to intertextual analysis.

Element within video: this allowed for elements within the overall organising system of the video to be identified.

Camera/technical: this category was used to describe particular issues that arose in the shooting of the scene, identifying shot types or technical difficulties.

Action/gesture: Particular forms of gesture which accompanied the speech, sound and shot types were described under this category, including performance movements of particular kinds (elaborate or encircling movements, “street” gestures conveying particular sorts of cultural significance, anything which changed or underlined overall meaning in relation to the other modes).

Speech/sound: This was used to record all sound, diegetic and non-diegetic, including transcriptions of all speech in the productions.

Style/identity/ways of being: This was used as a way of recording performance and embodied meanings within the production, referring to what the performance revealed about the “habitus”, the way of being within the setting (after Bourdieu, 1986, p.170).  This became “memories/references” in the shorter version of the grid.

Transition to the next scene: In this section, any relevant transitions were noted, particularly in respect of what this revealed about competence or otherwise with the software.

In the data analysis chapters, shorter grids provide a single thumbnail per scene and a shorthand description, which together give a snapshot of each video production.
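
For readers who want to keep a similar grid in digital form, the categories above could be held as one record per scene. The following is only a minimal sketch in Python and is not part of the original hand-made template: the class name, the field names, the helper function and the example values are all hypothetical, chosen to mirror the category headings of the long-form grid.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SceneRow:
    """One row of a long-form grid; all names here are hypothetical."""
    scene: str                       # scene number and timing
    scene_description: str = ""      # simple narrative overview
    genre_media_reference: str = ""  # quotations, parodies, appropriation
    element_within_video: str = ""   # place in the video's organising system
    camera_technical: str = ""       # shot types, technical difficulties
    action_gesture: str = ""         # gesture and performance movements
    speech_sound: str = ""           # diegetic and non-diegetic sound, speech
    style_identity: str = ""         # style/identity/ways of being
    transition: str = ""             # transition to the next scene


def empty_categories(row: SceneRow) -> List[str]:
    """Return the names of categories still left blank for this scene."""
    return [f.name for f in fields(row) if getattr(row, f.name) == ""]


# Invented placeholder values, not data from the thesis.
row = SceneRow(
    scene="1 (00:00-00:12)",
    scene_description="Two children introduce the video.",
    speech_sound="Spoken greeting to camera.",
)
print(empty_categories(row))
```

A record like this makes it easy to check, scene by scene, which categories still need commentary, while leaving the interpretive work in the free-text fields.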


Bourdieu, P. (1986). Distinction: A social critique of the judgement of taste. London, Routledge.

Burn, A. and D. Parker (2003). Analysing Media Texts. London, Continuum.

Leander, K. and A. Frank (2006). “The Aesthetic Production and Distribution of Image/Subjects among Online Youth.” E-Learning 3(2): 185-206.

Purpose of the transcript

The transcript has been useful for looking at the ways in which meaning is made in modes other than text and speech. It has usually been used in conjunction with other methods of data analysis and with other forms of related data, as part of a hybrid, mixed methods approach. I found it flexible enough that meaning-making modes could be set alongside a commentary on anchored and transient affiliation, ready to be corroborated by, and contrasted with, other data collected in the setting. I have used it in two other projects with similar short video clips: the Learners and Technology project 7 – 11 (see Selwyn, N., Potter, J. and Cranmer, S. (2010). Primary ICT: Learning from Learner Perspectives. London, Continuum) and the Playground games project (see Potter, J. (2014, forthcoming). “Co-curating children’s play cultures.” In Burn, A. and Richards, C. (eds), Children’s games in the new media age: Childlore, media and the playground. Burlington VT, Ashgate).

Lievrouw, L. H. and S. Livingstone, Eds. (2006). The Handbook of New Media (updated student edition). London, Sage.

Merchant, G. (2005). “Electric involvement: Identity performance in children’s informal digital writing.” Discourse: Studies in the cultural politics of education 26(3): 301-314.