Sally Stockbridge. 'Intertextuality: Video Music Clips and Historical Film'. In T. O'Regan & B. Shoesmith eds. History on/and/in Film. Perth: History & Film Association of Australia, 1987. 153-8.


Sally Stockbridge

Films of the thirties and forties have found new "life" in a different format, the video music clip. Recent video clip productions have brought to the small screen segments of much earlier film footage from Lang's Metropolis (1926) to Busby Berkeley and Fred Astaire. But these clips aren't an entirely new form: the black jazz juke boxes (made up of film collections) were common in the twenties and thirties, and even earlier, Oskar Fischinger was producing avant-garde, though populist, film/music productions. Popular culture now evokes its cinematic and musical anterior corpus, though, in so doing, recreates it in a different way.

This paper has a dual purpose: to recognize this earlier work and to appraise the interconnections between this work and the video music clips produced today.

The arguments expressed here need to be contextualized in relation to recent critiques of the burgeoning clip industry and TV music programs. This criticism is extremely negative in tone and suggests significant misinterpretation of a central component of video music clips, their intertextuality. Therefore, the theoretical basis of my argument here involves a reappraisal of this concept in relation to the work of Kristeva and Bakhtin.[1] In a recent article on MTV, an American cable TV music video program, E. Ann Kaplan mounted the following critique of video clips. She argued that:

MTV reflects the lack of orienting boundaries in its preference for pastiche. Rock videos do not "quote" the texts they freely draw upon, as early modernists like Joyce would have done. Postmodernist artists rather incorporate other texts "to the point where the link between high art and commercial forms seems increasingly difficult to draw." To quote a text implies a position from which the text is seen; it implies a position taken by the "quoting" text, a difference that makes a point ... in the recent periods when the distinction between high and low culture was clear, one could isolate the classical / realist (popular / commercial) text from the high art "classical" modernist text. The two were set in direct opposition to each other, and paid careful attention to the positioning of the spectator as historical subject ... (now) tapes draw freely and uncritically on art forms from Surrealism to Hollywood. Images from German expressionism, French surrealism and Dada (Dali, Bunuel, Magritte, Artaud) are all mixed together, and noir, gangster, western and horror films freely pillaged at the same time.[2]

In his recent study of the "Texts of James Bond", Tony Bennett[3] referred to the necessity of ridding ourselves of the assumptions that the "text" is, or should be, a unified, fixed or finished product. What is important are the different appropriations of texts that result from their inscription in different social, institutional and ideological contexts. This goes for the so-called "modernist" texts indicated above as well as for the so-called "post-modernist" video clips. Kaplan appears both to misunderstand intertextuality as "quotation" and to desire a textual unity that cannot exist.

Video music clips are culturally activated and constituted to be viewed in particular ways which include framing by programs and presenters, interviews with clipmakers and bands, discussions in magazines or "Fanzines", their use in concerts, and in relation to the radio play of the music component. Intertextuality works backwards and forwards to provide altered, stylized, mutated, conflicting or contradictory meanings within each new context.

The process at work here in relation to visual systems, and written appropriations of them, is akin to the concept of writing as developed by Bakhtin. There writing is considered to be the reading of the anterior literary corpus and the text an absorption of and a reply to another text.[4] For Bakhtin, an author may utilize the speech act of another in pursuit of his own aims and in such a way impose a new intention on the utterance. Such utilization may involve "stylization", "parody", or "skaz". "Stylization" is to "conventionalize any such style" (unidirectional); "parody", in contrast, involves the opposing and clashing of one position with another, or one text with another (varidirectional); and, "skaz" - which can be stylized or parodic - depends upon the narrator's ability to manipulate someone else's speech act. Active variants of this form of combination play important roles in creating polyphonic structures.[5]

The argument here, then, is that popular culture, and in particular video music clips, is intertextual and that "stylization", "parody", or "skaz", and thus polyphony, can be the result. Following Kristeva, intertextuality is conceptualized as having nothing to do with matters of influence of one writer upon another, or with the sources of a literary work; instead it involves the components of a textual system (novel or film). It is defined as the transposition of one or more systems of signs into another, accompanied by a new articulation of the enunciative and denotative position (and therefore different levels of meaning). Any signifying practice is "a field in which various signifying systems undergo such a transposition."[6]

But it is not just the movement of elements from one system or context to another, it is also the presence of new viewers who operate differently as subjects that engenders intertextuality. Whereas Bakhtin and Kristeva's work here concentrates upon intertextuality in the "text", both de Lauretis[7] and Bennett refer also to the process of reading / interpretations in which "an intertextually organized reader meets an intertextually organized text."[8]

If we take Bakhtin's notion of parody, for example, whether the clash of opposing positions operates in any sense depends upon the reader's capability to recognize them as such. This outcome is by no means certain.[9]

Video music clips transpose systems of signs from other systems which may be recognised and recognisable; however, the meaning attributable to these systems undergoes a change. The presence of different viewers also means that the meaning may involve, and evolve with, their differences. It is not the intentions of the author or authors that control and determine meaning.

An example of this operation may be indicated in relation to music videos' potential self-referentiality. There are video music clips that use technology to refer back to their own commercial and creative constraints. Dire Straits' "Money For Nothing" becomes for Kaplan "a treatise on MTV and the rock star industry illusions of wealth without labour," and Phil Collins' "Don't Lose My Number" refers to the conflicts between clipmaker and musician over the determination of representational strategies. It is also possible to interpret Queen's "Radio Ga-Ga" as self-referential, although Kaplan describes the clip as:

(drawing) freely on Fritz Lang's Metropolis, and (using) the footage as a setting for the figures who have proto-fascist iconography ... there is a futuristic society and suggestions of fascism (by association with Fritz Lang) ... One is unclear whether the video is for or against fascism.[10]

This interpretation assigns a meaning, "fascism", to Fritz Lang's Metropolis, then transposes it from one context to another. First of all, there is the question of whether this film and its director can be aligned with fascism; secondly, the interpretation suggests that images can carry "essential" meanings - essentially radical or essentially reactionary.

Objects within films gain meaning from the context within which they are placed. A transposition does not eliminate prior meanings, but it does alter them. Both the new context and, concomitantly, the new viewers must be taken into account. In the case of Metropolis, which was made in 1926, the gap between the film and this clip is 58 years. Another interpretation may account for both the apparent similarities and ascertainable differences.

My initial interpretation of "Radio Ga-Ga" focused on the way in which Freddie Mercury, the lead singer of Queen, and its "star", is transformed into an icon for adulation, the robot from Metropolis, which, in that film, went on to entrance and manipulate the workers of the city. This could be a self-reflective acknowledgement by Freddie Mercury of his own position as "rock star". This reflectivity could be expanded into a critique of the music industry. And, since interpretation also depends upon the context and knowledge of the viewer, this could be expanded further given information about this band, apart from information about Metropolis.

Followers of Queen would know that the band's fame and following fell away to nothing at one stage, after having been at the top of the charts. It was revived with "Radio Ga-Ga". The clip traces these events utilising signifiers of blind belief (i.e. German fascism - the salute and mass rally) in order to represent the way they viewed the audience. Fritz Lang used Metropolis as a critique of mass dependency; the "masses", as he represented them, required leadership because of their inability to discern what was in their own best interest. In "Radio Ga-Ga" it is these representations of the audience as "mass" that are transposed. Metropolis's internal pessimism is recontextualised as pessimism about audience loyalty and as a critique of the band's position within the rock industry in general, as well as the instability of the rock world. "You made us feel that we could fly". The "Favourite Years" in the clip were when Queen was at its peak. The siege mentality in the video clip is engendered by the instability of success. This self-referentiality could indicate either of the Bakhtin categories - parody or stylization - depending upon the interpretative position of the viewer. But, whatever the case, this difference must be indicative of "polyphony".

Many video clips have also been associated with Hollywood musicals of the thirties, not because the transposed system has a similarity of content and aesthetics, but because of a similarity of structures. These musicals are equally, conceivably, self-reflective.

It's possible to find traces of the influence of all sorts of musical production numbers in rock videos, but it's with Busby Berkeley's extravaganzas of the thirties that the basic structure of the rock-video form can best be seen. In most musicals, songs and dances related directly to some element in the surrounding story. Berkeley's routines stood by themselves in sublime self-enclosure. While stories were often told within them (as in the "Shanghai Lil" number in Footlight Parade or "Lullaby of Broadway" in Gold Diggers of 1935), Berkeley's primary mode was pure abstract sound and image. "By a Waterfall" in Footlight Parade, and "I Only Have Eyes for You" in Dames are basically processions of women in shifting mathematical arrangements against fluid non-spatial, temporally specific backdrops. Repetition and circularity of action as a signifier of closure, are similarly a rock-vid obsession.[11]

Mellencamp described these self-enclosed segments as "spectacles" and suggested that they mirrored the structure of the film while disrupting it at the same time. Because of this they were said to be contradictory (or "Skaz").

Spectacles can be considered as excessively pleasurable moments in musicals, awakening the spectator to the fact of filmic illusion. Ironically, then, the moments of greatest fantasy and potentially greatest identification would coincide in musicals with the moments of maximum spectator alertness.[12]

Thus, musicals were described as self-reflective - acknowledging their own artifice, potentially disruptive of narrative realism, opening out instead a possibility for radical or oppositional interpretation, or cinema. The "spectacle" usually revolved around the presence of the "star". Gene Kelly, Fred Astaire, Cyd Charisse, Ginger Rogers etc., singing and dancing in the street or on a stage, signified in terms of their position as protagonist and as "star". The devices of segmentation within the narrative and placement within the frame reinforced this. This aspect may seem rather more mythologising than self-reflective - reinforcing an ideology of stardom - but its juxtaposition to the cinematic realism of the rest of the film, coupled with the deliberate "staging" of the sequence, perhaps foregrounded its constructed nature instead of increasing identification.

However, if music clips are considered in this context, the first thing that is apparent is the fact that they do not "disrupt" because they are not juxtaposed to, or within, a distinctly different context of representational strategies. It is arguable that if the spectacle is taken out of the narrative context it remains "fantasy". The subversion of its excessive nature is removed and what remains is the performance and the "star" as well as the possible abstraction, repetition and technology. In this case the clip is no longer like the segment; it cannot share the same quality.

Indeed, few mainstream rock videos appear to use their own fantasies self-reflectively. The repetition in Busby Berkeley becomes, in clips, either a purely formal strategy or the structure and pattern of the music itself. This is not always the case though. It is important to see how the clips make use of transposed systems. There is the possibility that self-reflectivity resides in a number of examples.

In the Eurythmics clip "Would I Lie To You" the audience is included as a device foregrounding the specially constructed performance in a manner similar to the inclusion of audiences in Hollywood musicals. The audience represents the "concept" of the audience of the entertainment industry, suggesting to the viewer the status of the performance as performance, "entertainment" and as pleasure. The "backstage" events are included as an exposé of this process. In most other clips the audience, if included, is there to reinforce the illusion of "live performance", part of the taken-for-granted nature of the spectacle rather than a reflective component. Video music clips that include "performance" segments within a framework usually include an "audience". Here, although described sometimes as "live", it is much the same representation of the audience as Feuer describes. The "theatrical audience within the film" provides "the point of identification for the audiences of the film. The audience in the film is there to express the adulation the number itself sought to arouse from the film's audience."[13]

However, transposition may also become parody - an oppositional device. Its use within Cyndi Lauper's clip "She Bop" would be a case in point. Busby Berkeley's song and dance segments were described above in terms of repetition and a circularity of action which are internal attributes of a segment. However, there is another form of repetition in the concept of the visual cliché.

Musicals of the thirties, like other genres, had distinctive features that enabled them to either be seen as the collected works of one director, or as emanating from the same studio, or as repeating a "style" or formula. All of these categorisations are based on observable repetition across a number of films rather than within them (stylization).

Right across the musicals of the thirties, but particularly in Busby Berkeley films, the visual motif of the elaborate (art-deco) staircase was repeated, complete with hundreds of male extras in top hats and tails carrying canes. Chorus numbers on or beneath these staircases, accompanied by the stars of the film, provided a climax or focal point, for example, in Gold Diggers of 1935. In Cyndi Lauper's 1985 film clip "She Bop" this motif occurs again, also as the climax or finale of her performance. But in this case the extras are both male and female, all in trousers with canes, parodying rather than simply repeating the fetishistic and patriarchal sexuality of Berkeley. Thus she transposes it into both a verbal and visual metaphorical "climax", part of a eulogization of auto-eroticism or masturbation. This climax is more directly "for" a female audience than Berkeley's films ever were!

Thus, musicals and video music clips are related intertextually. However, prior to the production of musicals there were various experiments with music and film which related them structurally, in relation to harmonies (now mainly produced through editing practices). These experiments sought to achieve this in a manner which would be "popular". Oskar Fischinger has been described as the artist who foreshadowed computer graphics by nearly 50 years.[14] Influenced by Walter Ruttmann's abstract filmwork and linked to Surrealism and the Dadaists, Fischinger commenced his experiments in 1922, inventing a wax-slicing machine which allowed successive slices of a block of mixed multi-coloured wax to be photographed frame-by-frame.

Fischinger's most important work started in the thirties, after he moved to Berlin from Munich in 1927 to commence work as a special-effects animator and designer. One of his first projects was Fritz Lang's 1928 science-fiction film The Woman in the Moon. He completed a dozen films called Studies between 1919 and 1933; all were black and white drawings on animation paper with charcoal or grey and black paints. All of these films are animated graphics synchronised with various music tracks, for example, jazz, Musorgsky's "Hungarian Rhapsody", Dukas' "Sorcerer's Apprentice". Malcolm Le Grice argues that the popular success of these films was probably based on their strict synchronisation to "relatively light popular, or pop-classical music." But Fischinger's biographer, William Moritz, described his ambivalence towards music:

Fischinger of course did not hate music, and consented to work out his B/W studies in close synchronisation with music because he believed that the immense achievement of music as an abstract art form, developed and elaborated, should be tapped and adapted. A composer like Richard Strauss or Paul Hindemith (two of his favourites) had a thousand years of tradition to build on - patterns and styles of harmony, counterpoint, melody, rhythm, phrase etc. - had been worked out and refined by generations of great composers ... Both visual and abstract art and the kinetic, temporal art of the cinema, had begun almost from scratch in the early years of the twentieth century, and Fischinger felt that he could use the advice and learned advantages to be gained by constructive analogy with auditory abstract space-time art: music.[15]

In December 1933 Fischinger completed a one-minute film called Kreise (Circles) which was the first full-colour film shown in Europe. The film is in two parts - the first synchronised to the "Venusberg" music from Wagner's Tannhäuser, the second to Grieg's "Huldigungsmarsch" (Homage March). It consists of hundreds of hypnotic circles of all conceivable types on a flat surface. The film ends with a series of titles stating "Tolirag reaches all circles (of society)." Ironically, but most suitably, it was a promotional film, though not for the music.

With advertising providing his income, Fischinger went into further personal experimentation. Composition in Blue (1935) is an experiment with the movement of three-dimensional objects within a prescribed space. The opening sequence introduces a number of red cubes on a blue 3-D background. Other coloured shapes expand into this space: panels, cylinders, discs and columns. There is a finale with circles of colour flying through space towards the viewers, rather like contemporary sci-fi space odyssey films, and some modern advertising using computer graphics. "The Merry Wives of Windsor" overture is the soundtrack. This film was awarded the Grand Prix at the 1935 Venice festival, but since abstract art had been banned by the Nazi regime, international recognition meant that Fischinger had to leave Germany. His work in Hollywood lasted only three years since he refused to make the compromises demanded of him there. Fantasia provides a good example here: his ideas were continually changed because Disney did not want a pure abstract film for fear that it would not be a commercial success. Disney also rejected a fundamental principle of Fischinger's style - having many movements occurring at once on the screen - and insisted in its stead upon only one action at a time being shown.

It is interesting to note that one of the more political and experimental bands today, Talking Heads, state that their preference is for animation videos rather than the naturalist, or other conventions, of advertising. "We made a couple of Tom Tom Club videos that were purely animation which I was really proud of, but they didn't get shown much because they didn't show the band."[16] The similarity between the Tom Tom Club videos and Fischinger's films is the relationship between their abstract visuals and their status as opposition, in each historical context, to the dominance of naturalism or other forms of visual representation. Differences exist in that the video clips are there for the promotion or popularisation of the music, not the visuals. In the current case the music is the "end"; for Fischinger it was the "means."

My purpose in this paper has been to query the supposition that the meaning of video music clips lies in the analysis of the texts that preceded them, and to argue that intertextuality, rather than indicating an "origin" of meaning, implicates texts within a multiple variability of meaning based on both the context of the video music clip as text and the context of the viewer. This position hasn't been fully developed as yet, only suggested via the assertion that similarity is always simultaneously traversed by difference.

What we understand as the contemporary video music clip began not with the inauguration of MTV in the USA in 1981, or with Countdown in 1975, but in the twenties and thirties through the experimental work of people like Fischinger and its relation to the more conventionalized context of the Hollywood musical. But these do not by any means exhaust the cultural anterior corpus that constitutes the signifying field of possibility.


1. Julia Kristeva, Desire in Language (Oxford: Blackwell, 1980); M.M. Bakhtin in V.N. Volosinov, Marxism and the Philosophy of Language (New York: Seminar Press, 1973).

2. E. Ann Kaplan, "A Postmodern Play of the Signifier? MTV: Advertising, Pastiche and Schizophrenia," unpublished conference paper, p. 9 and p. 14.

3. Tony Bennett, "The Bond Phenomenon: Theorising a Popular Hero," Southern Review, v. 16, no. 2 (1983), p. 209.

4. Bakhtin, p. 192.

5. Ibid., p. 198.

6. Kristeva, p. 15.

7. Teresa de Lauretis, Alice Doesn't (Bloomington: Indiana University Press, 1984), p. 21.

8. Bennett, p. 214.

9. See Paul Willemen, "Distanciation and Douglas Sirk," Screen, v. 12, no. 2 (1971).

10. Kaplan, p. 14.

11. Ehrenstein, "Pre-MTV," Film Comment, v. 19, no. 4 (1983), p. 41.

12. Patricia Mellencamp, "Spectacle and Spectator," Cine-tracts, no. 2.

13. Jane Feuer, "The Self-Reflective Musical and the Myth of Entertainment," Quarterly Review of Film Studies, August (1977), p. 323.

14. The contemporary use of computer graphics is foregrounded in the Dire Straits clip "Money For Nothing", which utilized the Bosch computer system and did so, also, to promote the possibility of its usage.

15. M. Le Grice, Abstract Film and Beyond (London: Studio Vista, 1977), p. 65.

16. Countdown Magazine (July, 1985), p. 14.

Html markup=Tom O'Regan, Garry Gillard 11 February, 2015