Continuum: The Australian Journal of Media & Culture
Vol. 5, No. 1, 1991
Edited by Alec McHoul

Conjunctive structure in documentary film and television

Theo van Leeuwen

Conjunction in the Audiovisual Text

Linguistic conjunction analysis is about the semantic relations between the clauses in a text, or, more precisely, the 'conjunctively relatable units' (CRUs) in a text.

To analyze the conjunctive relations in a text is to ask: how does the content of this clause logically relate to the content of the previous clause? Does it, for example, narrate an event that happened after the event narrated in the previous clause, or does it present a reason for what was stated in the previous clause, or something that contrasts with what was presented in the previous clause? An analysis of the conjunctive relations in a text, therefore, shows us something about how that text hangs together, and reveals one aspect of the cohesive structure of that text. At the same time it shows us something of what the text is trying to do to or for the reader or listener - of whether it narrates events, builds up a persuasive argument, lists items of information, or, perhaps, combines several of these and other functions.

Between the images in a sequence of images there can also be conjunctive relations. Subsequent images can narrate subsequent events, present contrasting content, similar content, and so on. This has been recognised in film theory from the 1920s onwards. Eisenstein, Pudovkin and Timoshenko already considered, to various degrees and in various ways, the logical relations between shots in sequence. Many of the distinctions in Metz's grande syntagmatique were conjunctive distinctions. But they did not always clearly distinguish between conjunctive relations and other kinds of cohesive relations (e.g. rhythm, repetition). They also tended to take a physical, technical unit, the shot, as their 'conjunctively relatable unit', rather than asking themselves whether shots always neatly coincide with visual 'conjunctively relatable units'. And they have, by and large, ignored that audiovisual texts interrelate visual and verbal conjunction into one conjunctive structure. This is, to some extent, understandable in the case of the earliest film theorists, who wrote in the period of the silent film, when the visual and the verbal elements (the titles) formed one track, rather than two simultaneous tracks. It is less excusable in the case of later writers, such as Metz. It is one of the aims of this paper to explore the role of what is here referred to as visual conjunction in film theory.

In audiovisual texts, then, there can be conjunctive relations not only within the verbal and the visual track, but also between these two tracks. It is another aim of this paper to explore the possibility of an integrated analysis of the verbal and visual conjunctive structures in a particular kind of audiovisual text, documentary film and television. To achieve this aim, it is necessary to have, as much as possible, one technical vocabulary, one metalanguage, to speak about both. This potentially entails the danger of projecting linguistic structures into the domain of the visual. However, if the metalanguage is a functional-semantic one, rather than one based on the forms of one or other semiotic system, this danger can be avoided. It is unreasonable, for instance, to speak of 'visual clauses' or 'verbal shots'. It is reasonable to expect that each semiotic, in its own way, can realise the kinds of logical relations which the culture generally, or which some socially defined domain within the culture, allows to be made between actions, propositions and so on. Perhaps some conjunctive relations can, within a given social and historical context, only be realised visually, others only verbally, again others in both semiotics. But there is one 'form of the content', to use the Hjelmslevian term, which, within a given social and historical context, is drawn on by different 'expression forms'.

Conjunction in language

A - necessarily brief - overview of conjunction in language will be given first. It is based on the work of Halliday & Hasan, Halliday, and Martin.

implicit and explicit conjunction

The conjunctive relations between clauses can either be implicit or explicit, that is, conjunction may be explicitly expressed or implicitly understood. The conjunctive relation between 1a and 1b below, for instance, is causal: in 1b Sir William provides the reason for his statement in 1a. But this relation is not explicitly expressed. The relation between 2a and 2b is also causal, but here it is explicitly expressed - by the conjunction because:

1a) The national President of the RSL, Sir William Keys, said it was insensitive of Gough Whitlam to launch his book today, knowing it was Armistice Day.

1b) Sir William said that the dismissal of the former Labour Government was insignificant in the context of the real meaning of the 11th. November.

2a) It would have been rather nice if he'd chosen some other day,

2b) because he would know that this would overshadow the particular events of this day.

Some linguists, Halliday among them, think that only explicit conjunction should be taken into account in the analysis of conjunction. However, in order to make sense of a text, the reader or listener must always provide some kind of conjunctive relation. Rather than being left unanalyzed, implicit conjunction should be analyzed as implicit, by testing which conjunctive expressions 'fit best'. Ambiguous or polyinterpretable cases should be marked as such, rather than resolved by the analyst in favour of one particular interpretation. This will bring out how much room for different interpretations a text or kind of text leaves to the reader or listener.

The explicit realisation of conjunction can take a number of forms. It can take the form of conjunctions, such as and, but, so, therefore, etc. It can take the form of conjunctive adjuncts, such as to illustrate, with this in view, and so on. It can also take the form of entire clauses. The same conjunctive relation relates examples 3a and 3b and examples 4a and 4b, for instance, but in 4a and 4b the conjunctive link is expressed by the clause as they do so, whereas in 3a and 3b it is expressed by the conjunction meanwhile.

3a) Waiters hurry from table to table.

3b) Meanwhile the band picks up a syncopation.

4a) Waiters hurry from table to table.

4b) As they do so the band picks up a syncopation.

The conjunctive meaning may also be lexicalised, that is, it may be incorporated in a verb, or noun, or adjective. The conjunctive relation which in 5 is expressed by previously, for instance, is in 6 expressed by the verb follow - to facilitate this the other verbs (move, meet) have been nominalised, turned into nouns:

5) Previously Parliament had met and (...)

6) His peace move followed a meeting of Parliament in which (...)

In the early days of the movies visual conjunction was often realised explicitly, either verbally (titles such as 'meanwhile...' 'later...'), or by means of a closed set of visual transitions such as the wipe and the dissolve. Today conjunction is more often implicit.

internal and external conjunction

Conjunction may be internal or external. In example 7 the conjunction later is internal: it does not provide a relation between the events represented in the text, but serves to organise the text itself. In example 8 the conjunction is external: it serves to relate the events represented in the text:

7) Later I will tell you about his trip to India.

8) Later he went to India.

Conjunctions such as firstly, secondly, finally, etc. are typically internal.

subordinating and non-subordinating conjunctions

Conjunction may be subordinating or non-subordinating, that is, it may link two equivalent, equal ranking clauses, or it may make one clause subordinate to another, so that the clause will function, for instance, as a circumstance of time, or of purpose, or of reason, with respect to the main clause. The relation between 9a and 9b, for instance, is non-subordinating, that between 10a and 10b is subordinating: 10a here becomes a circumstance of time with respect to 10b:

9a) Waiters hurry from table to table,

9b) and, on stage, the band picks up a syncopation.

10a) As the band, on stage, picks up a syncopation,

10b) waiters hurry from table to table.

Explicit subordinate conjunction makes use of a distinct set of conjunctions. Conjunctions like because, when, if, while, as, and so on, for instance, are always subordinating. Some conjunctions (e.g. as well as, but for, despite) are only used in nonfinite clauses, that is, in subordinate clauses which do not have a verb marked for tense, as in example 11:

11) Waiters hurry from table to table to get ready in time.

Here to get ready in time is not marked for tense (it cannot be changed into to got ready or to had gotten ready), whereas hurry is marked for tense (it can be changed into hurried, had hurried, and so on).

types of conjunctive relation

The overview of types of conjunctive relation in figure 1 does not include all the conjunctive options listed, for example, in Halliday and Martin, but is sufficiently detailed for the purposes of this paper. The meaning of most of the terms in the figure will be apparent from the examples of explicit conjunction given. A brief definition of the terms used to classify the different types of conjunctive relation follows:

In the case of 'elaborating conjunction', clause b, one way or another, restates the content of clause a. In the case of 'distilling conjunction', clause b does so by making the content of clause a more precise; in the case of 'presenting conjunction', clause b does so either by paraphrasing the content of clause a, or by exemplifying it.

'Extension' means that clause b, rather than restating the content of clause a, adds new content to the text. When the conjunction is 'qualifying' this is done in such a way as to actually specify the logical relation between the content of clause a and the content of clause b. The distinction between temporal qualification and spatial qualification on the one hand, and comparative, causal and conditional qualification on the other hand, is often quite important in text analysis, as this is one of the features which realise the distinction between narrative and expository texts (or parts of texts), between 'telling' and 'arguing'.

When, finally, the conjunction is 'expanding', clause b adds new content to the text, but does so without specifying the relation between the content of clause a and the content of clause b.
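
The classification defined above can be written down as a small tree. The following Python sketch is only an illustration: it encodes the categories named in the definitions above, not the full set of options in figure 1, and the names TAXONOMY, path_to and leaning are hypothetical. The grouping of temporal and spatial qualification under 'narrative', and of comparative, causal and conditional qualification under 'expository', follows the remark above about 'telling' and 'arguing'.

```python
# Hypothetical encoding of the categories defined in the text above;
# it is not a reproduction of figure 1.
TAXONOMY = {
    "elaboration": {
        "distilling": {},
        "presenting": {"paraphrase": {}, "exemplification": {}},
    },
    "extension": {
        "qualifying": {
            "temporal": {},
            "spatial": {},
            "comparative": {},
            "causal": {},
            "conditional": {},
        },
        "expanding": {},
    },
}

# Qualifying types that the text associates with 'telling' and 'arguing'.
NARRATIVE_TYPES = {"temporal", "spatial"}
EXPOSITORY_TYPES = {"comparative", "causal", "conditional"}


def path_to(term, tree=TAXONOMY, prefix=()):
    """Return the chain of categories leading to `term`, or None if absent."""
    for name, subtree in tree.items():
        here = prefix + (name,)
        if name == term:
            return here
        found = path_to(term, subtree, here)
        if found is not None:
            return found
    return None


def leaning(relation_types):
    """Count how many relations point towards narrative or exposition."""
    counts = {"narrative": 0, "expository": 0, "other": 0}
    for rel in relation_types:
        if rel in NARRATIVE_TYPES:
            counts["narrative"] += 1
        elif rel in EXPOSITORY_TYPES:
            counts["expository"] += 1
        else:
            counts["other"] += 1
    return counts
```

Calling path_to("causal") walks down from 'extension' through 'qualifying' to 'causal', and leaning can be applied to the list of relation types found in a stretch of text to get a rough indication of whether it leans towards 'telling' or 'arguing'.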

conjunctively relatable units

Martin proposes that so-called projecting clauses (that is, the verbalisation clauses in quoted and reported speech, i.e. clauses like he said, he said that) should not be seen as separate clauses for the purpose of conjunction analysis. In other words, according to Martin, there is no conjunctive relation between 12a and 12b:

12a) Sir William said

12b) that the dismissal of the former Labour Government (...)

It is for reasons of this kind that Martin has introduced the term 'conjunctively relatable unit' (CRU): not every clause is a CRU - some CRU's may be smaller, some larger than a clause. This is a useful idea, among other things because it gives us a term that can also be applied to visual conjunction, and because we will encounter a similar problem in relation to the shot in visual conjunction. In the case of projecting clauses, however, it may still be important to look both at the relation between two subsequent projecting clauses and at the relation between the projected clauses that go with them. The relation between example 1a and 1b provides an instance. Looking at the projecting clauses the relation is one of time ('Sir William said ...' and then 'Sir William said ...'). Looking at the projected clauses, the relation is causal ('it was insensitive of Whitlam ...' because 'Armistice Day is more important than the dismissal ...'). Thus the logical argument is attributed to Sir William, while the writer restricts him or herself to 'mere reporting', 'telling' of what Sir William said. This kind of 'double structure' is an essential characteristic of news writing and should perhaps come out in the analysis.

analyzing conjunctive structure

In representing conjunctive structure I will follow Martin in using the 'reticulum' diagrams which he, in turn, derived from stratificational grammarians such as Sydney Lamb and H. Gleason. Internal conjunction is shown on the left hand side of the centre, external conjunction on the right hand side, while numbers, in the middle, indicate the CRU's. The advantage of this kind of representation is that one can also display conjunctive relations that go back further than the immediately preceding clause. If, for example, a CRU 1 is followed by 3 CRU's which all function as examples of what is stated in CRU 1, this can be shown as in figure 2:

If, as often happens with subordinating conjunction, a CRU relates conjunctively to the following rather than to the preceding CRU, this can be shown by the direction of the arrows as in figure 3.
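
The information carried by such a diagram can also be held in a simple data structure. The sketch below is one possible way of doing so, not Martin's own notation: the class names Link and Reticulum, their fields, and the helper example_reticulum are all assumptions made for illustration. Each link records which CRU carries the relation, which CRU it relates to (not necessarily the adjacent one), whether the relation is internal or external, and whether it is explicit; the direction of the arrow follows from the order of the two CRU numbers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    source: int              # the CRU that carries the conjunctive relation
    target: int              # the CRU it relates to
    relation: str            # e.g. "exemplification", "simultaneity"
    internal: bool = False   # internal (text-organising) vs external
    explicit: bool = False   # explicitly realised vs implicit

    @property
    def direction(self):
        """'back' if the link points to an earlier CRU, 'forward' otherwise."""
        return "back" if self.target < self.source else "forward"


@dataclass
class Reticulum:
    crus: dict = field(default_factory=dict)    # CRU number -> wording
    links: list = field(default_factory=list)   # all conjunctive links

    def add_cru(self, number, wording):
        self.crus[number] = wording

    def relate(self, source, target, relation, **kwargs):
        if source not in self.crus or target not in self.crus:
            raise KeyError("both CRUs must be added before they are linked")
        link = Link(source, target, relation, **kwargs)
        self.links.append(link)
        return link

    def links_to(self, target):
        """All links that point at the given CRU, in the order they were added."""
        return [link for link in self.links if link.target == target]


def example_reticulum():
    """CRU 1 followed by three CRUs which all exemplify it."""
    r = Reticulum()
    for n in range(1, 5):
        r.add_cru(n, f"CRU {n}")   # placeholder wordings
    for n in (2, 3, 4):
        r.relate(n, 1, "exemplification")
    return r
```

In example_reticulum, CRUs 2, 3 and 4 each point back to CRU 1 rather than to their immediate predecessor. A subordinate CRU that relates to the following CRU would be entered with a target number higher than its own, which gives it the direction 'forward'.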

When conjunction is not explicitly realised, one can analyze the conjunctive relations by testing which conjunction 'fits best'. In most cases this leads to an unambiguous interpretation which it would be hard to contest. Between clause 3 and 4 in figure 3, for example, there is no explicit conjunction. Clause 3 is subordinate and nonfinite, so one can look at the various nonfinite conjunctions possible. Is it as well as speaking in Canberra today Mr. Bowen said? Or perhaps without speaking in Canberra today Mr. Bowen said? From such testing it soon becomes evident that there is only one acceptable conjunction, simultaneity (While speaking in Canberra today Mr. Bowen said). However, there are cases which lend themselves to various interpretations. The relation between 1 and 4 in figure 3, for example, could be seen as merely 'additive' ('The Deputy Prime Minister said Australia needs a new constitution' and 'speaking in Canberra today he said he hoped ...'). It could be seen as 'temporal' ('The Deputy Prime Minister says Australia needs a new constitution' and then 'speaking in Canberra today, he said he hoped ...'). In this case I chose one interpretation, 'particularisation', because I felt that time sequence was not convincingly implied, and because of the change in tense, from says to said, which is, in the context, also a change from the general to the specific, from the 'main gist' to the 'detail', and the concept of 'detail' implies particularisation. However, the alternatives are not as impossible or illogical as, e.g., without speaking he said, and in many cases one should list the several possible interpretations, rather than choose between them.
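
The testing procedure described here can be recorded in a way that keeps every acceptable reading rather than forcing a choice. The following sketch assumes that acceptability is the analyst's own judgement, entered by hand, and is not something the code computes. The names Candidate and interpret are hypothetical, the relation type of the rejected wordings is left unnamed because the text does not name it, and the wording given for 'particularisation' is an invented placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    wording: str               # the text with a conjunction made explicit
    relation: Optional[str]    # type of conjunctive relation, if named
    acceptable: bool           # the analyst's judgement, entered by hand


def interpret(candidates):
    """Keep every acceptable reading; flag ambiguity if more than one survives."""
    readings = [c.relation for c in candidates if c.acceptable]
    return {"readings": readings, "ambiguous": len(readings) > 1}


# Clauses 3 and 4 of figure 3: only one nonfinite conjunction is acceptable.
CLAUSES_3_4 = [
    Candidate("as well as speaking in Canberra today Mr. Bowen said", None, False),
    Candidate("without speaking in Canberra today Mr. Bowen said", None, False),
    Candidate("while speaking in Canberra today Mr. Bowen said", "simultaneity", True),
]

# Clauses 1 and 4 of figure 3: several readings remain possible.
CLAUSES_1_4 = [
    Candidate("... and speaking in Canberra today he said he hoped ...",
              "additive", True),
    Candidate("... and then speaking in Canberra today he said he hoped ...",
              "temporal", True),
    # Placeholder wording: the text gives no explicit paraphrase for this reading.
    Candidate("... in particular, speaking in Canberra today he said he hoped ...",
              "particularisation", True),
]
```

Applied to CLAUSES_3_4, interpret returns a single reading and no ambiguity; applied to CLAUSES_1_4 it returns all three readings and marks the pair as ambiguous, which is the outcome the last sentence above recommends recording.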

Visual conjunction and the theory of film editing

In this section I will discuss some film theorists whose work is relevant to the questions of conjunction in audiovisual texts. The role of visual conjunction in the work of Pudovkin, Eisenstein, Timoshenko and Metz will be discussed first. The role of the conjunction between words and images in the work of Roland Barthes and Bill Nichols will be discussed in the second part of the section.

visual conjunction

The Soviet filmmakers and film theorists (some combining the two roles) of the 1920s were the first to formulate theories of film editing in which conjunctive relations played an important role. They were interested in the question of how the shots in a film can relate to each other in other than spatio-temporal ways, in what can be done with sequences of images beyond telling stories. Film, in that period, functioned as an important medium for political propaganda. The events narrated in films, therefore, were not just to be told, but also to be interpreted and explained in politically desirable ways. This, among other things, necessitated exploring the possibility of logical conjunctive relations between the shots in films.

Pudovkin distinguished between 'structural montage' and 'relational montage'. The former involved principles of selecting actions, reactions etc. with which to construct the story, as well as the cinematic ellipses resulting from this. The latter involved the various ways in which material not directly related to the main action could be introduced. This, at first sight, would seem to correspond to the distinction between spatio-temporal and logical (comparative, causal and conditional) conjunction. But it does so only in part. Pudovkin describes five types of 'relational montage', usually defining them by means of examples. The relation of 'contrast', for instance, is exemplified as follows: 'Suppose it be our task to tell of the miserable situation of a starving man; the story will impress the more vividly if associated with mention of the senseless gluttony of a well-to-do man'. In 'parallelism' two narratively unconnected stories develop in parallel. For instance, a scene of a worker condemned to death for his part in a strike is intercut with a scene showing an employer getting drunk and falling asleep at the very point where the worker is executed. In 'symbolism' 'an abstract concept is introduced into the consciousness of the spectator without the use of a title'. For instance, a scene showing workers being shot is intercut with the slaughter of a bull. 'Simultaneity' is a figure of editing learnt from Hollywood: 'In American films the final section is constructed from the simultaneous rapid development of two actions in which the outcome of one depends on the outcome of the other'. Pudovkin's final category of 'relational montage' is the 'Leitmotiv': 'In an anti-religious scenario that aimed at exposing the cruelty and hypocrisy of the Church in employ of the Tsarist regime, the same shot was repeated several times: a churchbell slowly ringing, and, superimposed on it, the title "The sound of bells sends into the world a message of patience and love"'.

'Relational montage', then, involves not only 'logical' conjunctive relations, but also a temporal one, the relation of 'simultaneity'. Pudovkin conflates two things: the principle of non-narrative conjunction and the principle of non-linear narration. As for the 'logical' relations Pudovkin recognises, they are primarily relations of 'comparison' - 'negative comparison' in the case of 'contrast', 'parallelism' and 'Leitmotiv'; 'positive comparison' in the case of 'symbolism'. Some of his distinctions, e.g. the distinction between 'contrast' and 'parallelism', are based, not on a difference in the type of conjunctive relation employed, but on a difference in the extent over which a certain conjunctive pattern is maintained: 'contrast' would appear to be incidental and local (an 'insert'), while 'parallelism' extends over a whole sequence. The category of 'Leitmotiv', finally, conflates repetition (redundancy) and 'negative comparison' (here between the image and the superimposed text). Repetition can, in fact, be seen as a form of elaborating conjunction alternative to 'presenting' and 'distilling' - and one which Halliday and Martin have overlooked, even though it is capable of explicit realisation (e.g. by an adjunct like to repeat). In language repetition is sometimes seen as a matter of rhetoric rather than grammar - but that distinction is, in the end, false. Figure 4 charts the conjunctive relations recognised in Pudovkin's theory, using, where possible, the terms from figure 1.

In Eisenstein's theory of editing conjunction plays a minor role. His earliest theory, the theory of 'montage of attraction', is based on the idea of a formal 'clash' between shots (e.g. a clash between 'round' and 'square', 'horizontal' and 'vertical', left-pointing and right-pointing vectors): montage creates a 'shock effect' produced by the collision of different forms, and the 'agitation' such a 'stimulus' produces in the audience can be used for political purposes, for stirring people out of their supposed complacency. In his theory of 'intellectual montage' the formal contrast coincides with a contrast of content: the montage of the two shots can create a 'thesis-antithesis' relation which then leads to a tertium quid, a synthesis. In this way, Eisenstein maintained, cinema can express abstract ideas visually. He invoked ideogrammatic forms of writing to lend force to his argument: in hieroglyphics, the sign for 'eyes' and the sign for 'water' can, when combined, produce the meaning 'sorrow'. 'This is exactly what we do in the cinema, combining shots that are depictive, single in meaning, neutral in content, into intellectual contexts and series'.

The form-content distinction also underlies his later emphasis on 'rhythm' and 'tone'. The principle behind the editing of a sequence can be 'metrical', that is, one of alternating different physical lengths of shots in distinct patterns, regardless of the content of the shots, or 'rhythmical', in which case the durational alternation coincides with some semantic contrast. However, conjunction is hardly invoked here. Film is deliberately compared to music, and editing creates symmetrical or asymmetrical arrangements of visual forms, rather than temporal or logical structures. The logic is a logic of emotional impact, rather than a logic of semantic coherence. Eisenstein focusses here on aspects of visual cohesion which are every bit as important as conjunction, fundamental to the creation of the 'texture' of the text, but which fall outside the scope of this paper.

Timoshenko's theory of film editing (as discussed in Arnheim) relates much more closely to the concerns of this paper, and is both richer and more systematic than Pudovkin's theory. He recognises, for instance, two types of temporal relation not mentioned by Pudovkin, 'return to past time' and 'anticipation of future'. He also distinguishes two forms of cinematic 'elaboration': 'enlargement' (in which a Close Shot is followed by a Long Shot of the same setting or action) and 'concentration' (in which a Long Shot is followed by a Close Shot of something that was seen also in the Long Shot). The Long Shot-Close Shot relation is a meronymic (part-whole) relation, rather than a hyponymic relation, and as such not the kind of relation for which there is, strictly speaking, an equivalent in linguistic conjunction, where such a relation would probably have to be analysed as 'additive', and hence a form of 'extension' rather than a form of 'elaboration'. 'Enlargement' and 'concentration' would seem to be conjunctive relations which cannot be realised linguistically.

Timoshenko's 'analytical montage' is an editing figure in which a series of shots which show details of a setting or action is neither followed nor preceded by a Long Shot of that setting or action. He finally distinguishes between two types of comparison, 'similarity of shape and contrast of meaning' and 'similarity of meaning and contrast of shape'. Although this is a very real distinction, and one which could be made also with regard to language, only the semantic aspect relates directly to our present concerns.

It can be seen that Timoshenko's theory conflates two things: the relations between shots or sequences of shots, and a typology of sequences. The categories of 'return to past time' and 'anticipation of future', for instance, clearly designate relations between two shots, or two sequences. The category of 'analytical montage', on the other hand, designates the supposedly homogeneous conjunctive structure of a sequence. We will return to this problem below, in connection with Christian Metz's grande syntagmatique. Figure 5 shows the conjunctive relations which play a role in Timoshenko's theory, again using the terminology from figure 1.

Metz's grande syntagmatique is a more recent account of editing structure, much quoted and applied in contemporary film theory. Like Timoshenko, Metz bases many of his distinctions on the conjunctive relations between shots, but presents his theory as a typology of sequences, rather than as an account of the ways shots and sequences can conjunctively relate to each other. This is why he needs the concept 'insert' for 'a-typical' conjunctive relations within a sequence which is predominantly temporally structured. If one restricted oneself to charting the kinds of conjunction that can occur in visual sequences, one would not need to assume that sequences are always conjunctively homogeneous and one would not need to interpret conjunctive relations that disrupt this homogeneity as 'foreign intrusions' into the sequence. In the next section we will see just how heterogeneous the visual conjunction in a sequence can in fact be, in certain types of film. Typologies of sequences could certainly have a place in theories of cinematic genre, but do not belong in a theory of the 'language' of cinema, such as Metz claimed to be writing.

Metz's theory ignores the soundtrack completely, and assumes the primacy of the visual in the structure of a film. Films in which the verbal element forms an important structuring element, as is the case in many documentaries, are denounced as not 'cinematic':

'It is by no means certain that an independent semiotics of non-narrative genres is possible other than in the form of a series of discontinuous remarks on the points of difference between these films and 'ordinary' films (...) It was precisely to the extent that the cinema confronted the problem of narration (...) that it came to produce a body of specific signifying procedures.'

Figure 6 shows Metz's own diagrammatic representation of his theory, which, despite the label grande syntagmatique, is one in which paradigmatics (conjunctive options) and syntagmatics (typology of sequences) are conflated.

A 'plansequence' is a sequence in which the various actions or settings or objects shown are not displayed in separate, subsequent shots, but linked together by camera movements and/or movements by the actors within one shot. Between the parts of such a complex shot conjunctive relations may evidently obtain - the camera can, for instance, pan to reveal a contrast or similarity between two objects or actions. Whether or not there is a cut between two conjunctively related elements is, in the end, a matter of style, rather than something that affects the conjunctive relation between the elements. Visual CRU's should perhaps be defined on the basis of ideational visual configurations such as described by Kress & Van Leeuwen - though the theory of such configurations still needs to be worked out for the case of moving images.

The 'subjective insert' is a shot showing a 'mental image' of one of the characters in a film. Its conjunctive status is unclear - it may, for example, be a 'preceding event' or a 'future event'. The category of subjectivity pertains, in any case, more to modality than to conjunction. The 'explanatory insert' Metz explains by means of an example: it may, for instance, be a map, showing the itinerary of a journey, or a letter or document which, shown in Close Shot, explains some aspect of the action. The 'displaced shot' is a shot from a later episode of the film which, as an 'insert', forms a kind of narrative enigma which only the later sequence will solve.

The 'parallel sequence' is a sequence in which two kinds of shot, contrasting along a semantic dimension (for instance 'old' and 'new'), are intercut, without there being a narrative relation between the two series. The 'bracketed sequence' is a series of shots with similar content, for instance a series of shots of diverse outdoor activities, without, again, a narrative relation between the shots. The 'descriptive sequence', similar to Timoshenko's 'analytical montage', Metz classifies as temporal, because, he argues, the series of details of, for example, a location which it shows must be understood as co-existing at the same time.

In the 'alternating sequence' relations of simultaneity and succession alternate - the classic example is the 'chase' sequence in which shots of the pursuers and the pursued are alternated. 'Linear narrative sequences' are distinguished according to the kind of ellipses occurring in them. In the 'scene' the action is played out without any temporal gaps, however small. In the 'ordinary sequence' small and ad hoc ellipses serve to eliminate or abridge time-consuming or inconsequential actions. In the 'episodic sequence' the ellipses are large and symmetrical, as, for example, in a sequence which shows, in highly condensed form, a long journey. Like Pudovkin in his conception of 'structural montage', Metz here conflates two systems of cohesion which should perhaps be kept distinct: ellipsis and conjunction. Ellipsis is clearly also of interest, but falls outside the scope of this paper. Figure 7 shows the visual conjunctions that play a role in Metz's theory, abstracted from the other concerns of his theory, such as editing style, subjective modality, sequence typology and ellipsis.

image-text conjunction

Roland Barthes, in 'Rhetoric of the Image', dealt with still rather than moving images, but his concepts may also be relevant for the study of image-text conjunction in audiovisual texts. His key concepts were 'anchorage' and 'relay'. These concepts are evidently very similar to Halliday's concepts of 'elaboration' and 'extension': words can 'present' or 'distil' the meanings of images ('anchorage') or 'extend' that meaning ('relay'). While extension may occur also in the reverse direction (images extending words), 'anchorage' is uni-directional. The reverse directionality ('illustration': images anchoring words) has, according to Barthes, historically been superseded by 'anchorage':

Formerly the image illustrated the text (made it clearer); today, the text loads the image, burdening it with a culture, a moral, an imagination. Formerly, there was reduction from text to image; today, there is amplification from the one to the other.

Anchorage, finally, may be anchorage of the photographic denotation (I will call this 'identification') or anchorage of the photographic connotation (I will call this 'interpretation'). Figure 8 summarises Barthes' distinctions. The emphasis on directionality is perhaps the most interesting aspect of Barthes' theory. It establishes a hierarchy between semiotics, assigns different roles to different semiotics - roles which are socially and historically determined. In terms of image-text conjunction, however, Barthes' theory provides only the most basic distinctions, distinctions which, moreover, are not specific to image-text conjunction. The theory of conjunction could add both greater generality and more precision to his account of image-text conjunction.

Bill Nichols' work has, among other things, dealt with image-text relations in the documentary film. He discussed these in terms of two systems. The first is the system of 'mode of address', distinct from conjunction, of course, but with relations of mutual dependency. This system realises, simultaneously, the relation of the speaker to the audience, the relation of the speaker to the subject of the film, and the relation of the speaker to the image. The relation of the speaker to the audience has two dimensions. One is realised by the distinction between 'direct address' and 'indirect address'. In 'direct address' the speaker speaks directly to the audience, and the speech situation is specific to the genre of documentary film or television, constituted by 'documentary film' as a communicative context (e.g. voice over commentary, on camera anchorperson or reporter, interview, etc.). In 'indirect address' the speaker addresses, not the audience, but 'characters' in the film, and the speech situation is specific, not to the documentary film itself as a communicative context, but to some communicative context represented in the documentary film. Here the audience is not addressed directly, but, so to speak, 'overhears' what is being said. The other dimension relates to the authority of the speaker in relation to the audience, and also in relation to the subject of the film. Reading the rightmost options in figure 9 from top to bottom, this authority gradually decreases, and as it decreases, two other things also decrease: on the one hand 'dogmatism' and 'distanciation' from the events portrayed in the film, on the other hand analytic precision and the overt signalling of the function of the film (as, for instance, propaganda) and the source of the film, its ultimate author. At the same time 'realism' increases: the text increasingly appears a record of reality, rather than a constructed message, with a source, an addressee and a social function. The relation of the speaker to the image, finally, is captured by the distinction between 'sync' (the speaker is seen while speaking) and 'non-sync' (the speaker is not seen while speaking).

In 'direct address' documentaries (or 'direct address' parts of documentaries) the verbal text is primary. It provides the expository structure, the documentary 'diegesis'. Images either show the speaker or provide 'illustration'. Only on a very local level, that is, in relation to the immediately accompanying text (rather than in relation to the preceding and following shots) can the image contribute to the expository structure, by 'confirming', 'counterpointing', 'ironically shading' or 'extending' the verbal text. The latter concept Nichols explains by means of an example: a commentary dealing with the exploitation of workers is accompanied by a shot of prisoners, and this metaphorically extends the meaning of the commentary. Systematising these distinctions somewhat, the critical options in Nichols' theory seem to be whether the image restates the verbal content (either 'literally' or 'metaphorically') or contrasts with it (either strongly or subtly). The concept of 'illustration', meanwhile, is not further defined. The illustration of a general statement in the commentary by a specific image could, for example, be seen as 'particularisation' - but also as 'confirming', by instantiation, exemplification, authentication. In terms of Barthes' definition of 'illustration' (a directionality from text to image) all of Nichols' distinctions would fall under the heading 'illustration'. Because of these indeterminacies figure 10 can only be a speculative attempt at summarising Nichols' distinctions. It is clear, however, that the directionality which, according to Nichols, obtains between the images and the text of 'direct address' documentary films is the opposite of the directionality which, according to Barthes, obtains in print media texts like advertisements: in the former images illustrate text, in the latter text anchors images.

Nichols has also distinguished four broad genres of 'expository structure' in the documentary. In 'classical exposition' one narrator is used throughout. Each scene 'sets into place a block of argumentation which the image track illustrates with more or less redundancy', and the verbal track provides explicit conjunctive links between these blocks. In 'neo-classical exposition' the structure is also that of an argument, but the blocks are formed by comments from 'social actors' (usually elicited in interviews) and the links between the blocks remain implicit. In 'dispersed classical exposition' the structure is again that of an argument, but, instead of being eliminated, the narrator is 'dispersed': 'more than one character, women as narrators, or narrators and characters both choreographed into a singular line of exposition'. In 'cinéma-vérité films', finally, the structure is not that of an argument, but the 'blocks' are more or less loosely organised as 'mini-narratives', although the links between them need not be temporal, and may constitute what Nichols calls a 'mosaic structure'. Bridging commentary may occur between the blocks - either to link them logically, or to condense time.

This is, again, a typology, of whole films, this time, rather than of sequences, and in terms of a number of different features which certainly all come together in concrete texts, but should perhaps be kept distinct in theory. Conjunction plays a role in this typology, albeit in rather general terms. Firstly, 'logical', 'expository' conjunction is always realised verbally; between images there can be only temporal conjunction (and the kind of conjunction found in 'mosaic structures' - additive?). Secondly, in some genres of documentary logical conjunction tends to be realised explicitly, in others implicitly - the former always requires voice over commentary.

Five Texts

In this section I will analyze the conjunctive structure of sequences from five audiovisual texts. They are all 'documentary' in character, and they all use voice over commentary (Nichols' 'non-sync direct address'). But they stem from different periods and represent different kinds of documentary. Grierson's Industrial Britain (1933) is a 'classic documentary' which, in its time, served the purpose of social propaganda (the bolstering of the morale of industry in a time of depression) and made imaginative use of camera work, editing and sound to do so. Chris Marker's Le Joli Mai ('The Lovely Month of May') (1962) is a feature length 'film essay' about Paris and its inhabitants, in the cinéma-vérité style. It combines penetrating location interviews, a novelty at the time, with sequences of images accompanied by personal, reflective narration. Jane Oehr's Stirring (1975) is a documentary about discipline in High Schools, made for Film Australia. It uses an observational, 'direct cinema' style to follow, over a period of several months, a teacher who involved a class of boys in discussions and activities around the theme of school discipline (role play, making a video, petitioning the Head-master, etc.). The sparse commentary serves to bridge sequences and is spoken by the teacher who also is the protagonist of the film. Strike (1986) is a news item from Sydney's Channel 9, about a strike by Ansett baggage handlers. Muslim Community School (1986) is an item from the ABC Current Affairs programme 7.30 Report, about a newly founded Muslim Community School in Perth.

Industrial Britain

The sequence below forms the beginning of the film. It shows images of traditional crafts, to later contrast them with 'the world that coal and steam have created'. Later in the film this contrast will be deconstructed:

'But if you look closely enough you will find that the spirit of craftsmanship has not disappeared. William Gavin Broadcotton of Stoke-on-Trent, whom you see working now, is a young man of 26. But he is working exactly as the Greek potters worked, making the same beautiful things, using the same simple tools. Look at those hands .... That is the sort of thing that is being done behind these industrial chimneystacks ...'

Shot 1
Image: Sails of windmill, moving. Low angle.
Sound: Music: 'old world' theme (sense of circular motion)

Shot 2
Image: Windmill. Higher angle, so that the house becomes [...]

Shot 3
Image: Windmill. Still higher angle so that the ground becomes visible.
Sound: Voice over: (1) The old order changes, (2) making place for the new.

Shot 4
Image: Spinning wheel, woman behind it.

Shot 5
Image: Weaving loom, detail.

Shot 6
Image: Weaving loom, detail, with hands working.

Shot 7
Image: Full Shot weaving loom, with woman working.

Shot 8
Image: Long Shot cornfield. Sheaves of corn stacked against each other. Farmer enters from L. and adds sheaf to stack.

Shot 9
Image: Close Shot sheaf being added to stack.

Shot 10
Image: Medium Shot basketweaver, working.
Sound: Voice over: (3) half the history of England lies behind these scenes of yesterday.

Shot 11
Image: Close Shot hands of basketweaver.
Sound: (4) The history of daily work done, (5) of people who kept on

Shot 12
Image: Long Shot bridge over canal. A man and a towhorse walk along towpath, towards camera.
Sound: through the centuries, growing things, making things, transporting things between the

Shot 13
Image: Long Shot man and towhorse from other side of canal, moving R.-L. through frame. When they have left the frame, the barge moves through frame. A dog walks towards the back of the barge, thus staying in centre frame.
Sound: villages and the English towns.

Shot 14
Image: Close Shot swans on lake.
Sound: Music: 'Swan Lake'

Shot 15
Image: Brig with full sails. The camera moves around it, continuing the circling motion of the swans in Shot 14.

Shot 16
Image: Long Shot sailor at work on the brig.

Shot 17
Image: As Shot 15.
Sound: Music ends

Shot 18
Image: Factory chimney blowing smoke into the sky. Low angle.
Sound: Voice over: (6) But here is the sign and symbol of the new order. Music: 'New World' theme.

Shot 19
Image: Factory chimney. Higher angle, so that more of the chimney is visible.
Sound: Voice over: (7) Steam and smoke.

Shot 20
Image: Factory chimney. Still higher angle, so that still more of the chimney becomes visible.
Sound: (8) There is power behind it.

The conjunction analysis of this sequence is shown in figure 11. The bracketed numbers in the image column refer to the shots. CRU 1 and CRU 16 both consist of three shots. There is, in each case, across the three shots, only one 'process', that of 'coming down to earth', a process which more typically would be realised by a downwards camera movement, but which is here realised by a 'montage' of three shots. The technique is based on Eisenstein's filmmaking methods, and foregrounds the process, 'shocking' us into awareness of it by means of the abrupt jerkiness of the cuts.

The image track of this sequence has an intricate and highly cohesive conjunctive structure. CRUs 1-15 constitute what Metz would call a 'bracketed sequence' - the conjunctive relations between 2 and 1, 3 and 2, and between a number of 'mini-sequences' (3-5, 6-7, 8-9, 10-12 and 13-15) are all of the 'positive comparison' type. Even without the verbal commentary we would see that these shots and mini-sequences present similar activities, activities which are all traditional, 'scenes of yesterday'. They are, in turn, likened to the elegance of the swans and the brig. The mini-sequences cohere internally by means of other conjunctive relations: addition, temporal succession and the kind of elaboration Timoshenko called 'concentration' and 'enlargement' and which I have here called 'detail' and 'overview'. The section as a whole is linked to the next sequence by negative contrast ('old' versus 'new').

There is, in this sequence, a form of explicit visual conjunction. The positive comparison of content is reinforced by a positive comparison of form: the repetition of circular movement (windmill, spinning wheel, circular camera movement, and so on). The negative comparison between CRU 1 and CRU 16 is also made explicit - by the repetition of the 'jerky' montage of three shots, each with only a slightly higher angle.

The conjunctive structure of the verbal track is less intricate and cohesive. There is, for example, a gap between clause 2 and clause 3. The relations are mostly elaborative (particularisation and explanation) and logical (result and a negative comparison which reinforces the negative comparison in the image track).

Conjunction between the visual and the verbal track is elaborative. The more or less enigmatic image of the windmill receives verbal explanation as a symbol of the 'old order'. The series of images and mini-sequences of traditional crafts is verbally explained ('Here is the sign and symbol of the new order'). The points at which the image-text conjunctions occur are key points in the structure - what Nichols would call the 'bridges' between 'blocks of argumentation'. In general each point is first made visually, then elaborated verbally. The verbal track anchors the image track; the images do not 'illustrate' the verbal track or depend on it for their structure. The verbal track is intermittent: the structure shifts its weight, so to speak, from the visual to the verbal track, and back again.

Le Joli Mai

In the following sequence from Le Joli Mai, Chris Marker, too, compares the old and the new. But structure and style are different.

Shot 1
Image: Low angle Long Shot new highrise apartment blocks.
Sound: Voice over: (1) From the top of its towers on its towers on its surrounding hills

Shot 2
Image: High angle shot, from rooftop, of Paris.
Sound: Paris can see the Paris of future rise on the same hills (2) where Sainte [...]

Shot 3
Image: Low angle rooftops.
Sound: saw the Barbarians appear. (23) Now the Barbarians are here. Piano music fades up.

Shot 4
Image: The river Seine, with bridges. High angles.
Sound: Voice over: (24) The metamorphosis which should have inspired an architectural festival has two guardian witches, (25) Anarchy and [...]

Shot 5
Image: Statue in foreground, modern building in background.
Sound: Music crescendo. Voice over: (26) One would like to see New York

Shot 6
Image: Full Shot Cupido statue.
Sound: tempered by the Seine.

Shot 7
Image: Gargoyle. Low angle.
Sound: (27) One does not.

Shot 8
Image: Old buildings.
Sound: (28) Even if the nuances of solitude have 2000 [...]

Shot 9
Image: Pan L.-R. over other old buildings.
Sound: (29) even if what is described as project pathology does not succeed in making us regret the former slums, (30) we know at least

Shot 10
Image: Street in older part of Paris.
Sound: that there was room for [...] Piano fades up.

Shot 11
Image: Wide Shot of street with modern building. A car drives towards camera.
Sound: and here we're not sure

Shot 12
Image: Track R.-L. across modern [...]

The conjunction analysis of the sequence is shown in figure 12.

Even a glance at the reticulum shows that in this sequence the image track is less and the verbal track more intricately structured than in Industrial Britain. No 'mini-sequences' and no explicit conjunction in the image track. No gaps in the verbal track.

The conjunctive relations between the images are non-temporal, for the most part a matter of 'negative comparison', of contrast between old buildings and new buildings, Cupids and gargoyles, etc. - Metz would call it a 'parallel sequence'. Where the sequence from Industrial Britain is concerned with positive comparison, with saying 'all these things belong to the same order', this sequence is based on negative comparison, concerned with saying 'I am showing you two different, contrasting aspects of Paris, the old, slummy, but happy Paris, and the new, modern, but alienating Paris'.

Conjunction between the visual and verbal track is predominantly elaborative. But here elaboration occurs in both directions. Sometimes the text makes a point which is then elaborated by the visual track, sometimes the image makes a point which is then elaborated verbally. Thus, we see a Cupid statue after Paris has been called 'New York tempered by the Seine', and a gargoyle after the text has mentioned 'barbarians invading Paris'. In both cases the image symbolically elaborates the text, more or less in the way described by Nichols.

There is also an instance of what I would like to call imputed image conjunction. Without the commentary we would, presumably, interpret the relation between 13 and 14 as one of similarity: two examples of 'slums'. But the voice over suggests a different interpretation. It says: 'although (13) these are slums, nevertheless (14) one can at least find happiness here'. In other words, the commentary directs us to read the relation between 13 and 14 as concessive. By virtue of imputed conjunction images can stand in relations of logical conjunction to each other, something of which, without words, they would not be capable. I have indicated imputed image conjunction by means of dotted line arrows.


Stirring

The following sequence from Stirring forms a bridge between 'sync' sequences with linear narrative structure. Some of the shots have been analyzed into several CRUs. This is indicated by the letters behind the shot numbers (e.g. shot 1 comprises 3 CRUs: 1a, 1b and 1c).

1a
Image: Medium 2-Shot two teachers, seen from behind as they pour themselves a cup of coffee. Camera follows narrator-teacher as he walks across to ...
Sound: Mixture of voices in canteen fades down as
Narrator-teacher (voice over): (1) I hadn't expressed my personal opinions up to this point, (2) because very often a teacher's attitude can influence a class too much.

1b
Image: a table and sits down next to a female teacher.
Sound: Sync sound of conversation, of which only snippets can be overheard.

1c
Image: Camera pans/zooms to Close Shot female teacher.

2
Image: Group Shot teachers. White-shirted teacher in foreground bends over to light cigarette, then bends back again. Narrator-teacher and female teacher visible in background.

3a
Image: Close Shot curly-haired teacher, smoking cigarette.
Sound: Narrator-teacher (voice over): (3) For my own information I'd asked some of the other subject teachers to write comments on the class. (4) I asked them whether they

3b
Image: Camera tilts up to Medium Close Shot teacher with glasses who is standing and listening in to a [...]
Sound: objected to anything being written, being read out to the boys

3c
Image: Camera pans to Medium Close Shot Asian teacher in front of notice [...]

4
Image: Wide angle Group Shot teachers around table, lunchbox in foreground.
Sound: (5) None of them did. Fade up din of conversation. Bell rings.

5a
Image: Close shot hands holding lunch box. Camera tilts up to

5b
Image: Medium Close Shot teacher getting up from chair and walking out of frame.

5c
Image: Teacher in blue T-shirt walks through frame.

5d
Image: Another teacher walks into frame and camera follows him to door.

The conjunctive structure of this sequence is shown in figure 13.

In this sequence visual conjunction is exclusively temporal: the sequence is concerned with showing events in their chronological order, as are most sequences in 'observational' documentaries.

Between the visual and the verbal track there is no conjunction whatsoever. Each track tells its own story. We see the story of teachers having a coffee break, we hear another story, which occurred previously, of a teacher consulting with other teachers. The story we see flows on chronologically from the previous scene and into the next one. The story we hear takes us back in time with respect to the preceding scene, and relates logically to the subsequent scene, because the actions of the teacher in that scene are constructed as a result of the consultation with other teachers. Thus the next scene conjoins temporally to the visual part of the sequence analyzed, and logically to the verbal part. Despite the fact that the teacher makes logical connections between sequences, the 'natural' temporal flow is maintained.


Strike

The sequence analyzed immediately follows the introduction by newsreader Brian Henderson:

'In a move that could have serious repercussions Ansett Airlines this afternoon sacked its striking airport baggage handlers. 370 people lost their jobs although Ansett chief Sir Peter Abeles is giving them one more chance in saying they'll be re-employed if they agree to toe their management line by 5 o'clock tomorrow afternoon. But the Transport Workers Union has warned of widening industrial action to ground the Ansett fleet.'

2
Image: Full Shot empty baggage trolleys. High angle. Super: 'John Collis reporting'
Sound: Collis (voice over): (7) Ansett porters and baggage handlers supported by

3
Image: Full Shot fuelling truck driving L.-R. towards camera. Camera pans along with truck. Other vehicles, going in opposite direction, pass between camera and [...]
Sound: aircraft refuellers have been on strike for eleven days. (8) The result:

4
Image: Long Shot people waiting at check in counters. Camera pans L. along queue.
Sound: chaos for commuters and millions of dollars in lost revenue for the airline. (9) The row surfaced as an inter-Union wrangle between [...]

5
Image: Medium Long Shot tractor towing trail of baggage trolleys, driving L.-R. alongside aeroplane. Camera pans with tractor as it turns somewhat towards camera, revealing other (full) trolley. The drivers greet each other.
Sound: Transport Workers Union and the Australian Transport Officers Federation over who should have the responsibility for...

6
Image: Full Shot men loading suitcases in hold of Ansett plane.
Sound: four tarmac supervisors.

7
Image: Back to Shot 2
Sound: (10) The TWU is accused of attempting to, in Union terms, body-snatch (11) and since

8
Image: Long Shot group of passengers walking across tarmac. High angle. Camera zooms in as they reach nose of plane and walk L. around plane.
Sound: last Wednesday the men have been defying an Industrial Court Order to return to work.

9
Image: Full Shot Boeing 767 taking off.
Sound: (12) Using giant Boeing 767's in an attempt to beat the strike, Ansett struggled on (13) but today joint chairman

10
Image: Long Shot boardroom with journalists and cameramen silhouetted in foreground and Abeles, under film lights, in background at head of table.
Sound: Sir Peter Abeles decided enough was enough.

11
Image: Full Shot letter of dismissal.
Sound: (14) This afternoon he authorised the salary dismissal

12
Image: Close Shot letter ('A most serious breach of your contract of employment (...) justifies summary termination. Accordingly your employment with Ansett (...) is terminated')
Sound: of all 370 strikers representing about one sixth of the airline's Sydney staff.

13
Image: Tight Close Shot Abeles as he is talking, in shirtsleeves.
Sound: (15) It was a decision, he said taken with regret. Abeles: (16) It was not easy for us to make this decision (17) but eh we came to the conclusion that we've given enough time. (18) We've tried everything (19) and we have to stand firm.

14
Image: Back to Shot 10
Sound: Collis (voice over): (20) Sir Peter said the sackings were with the knowledge (21) but not necessarily the support of the ACTU

15
Image: Close Shot Abeles
Sound: (22) which has already openly defended Ansett in the dispute.

16
Image: Closer version of Shot 6

The conjunctive structure of this segment is shown in figure 14:

The image track of this segment is difficult to analyze. There often appears to be no other conjunctive relation between a pair of shots than the mere fact that they are both taken somewhere in or around the airport ('spatial contiguity'). Three times an action continues across a cut ('succession'), with or without a change from Close Shot to Long Shot ('overview'). Once a Long Shot is followed by a Close Shot of the same subject ('detail'). In general, however, conjunction is intermittent and shallow. The images do not mean much by themselves. The structure, by and large logical and elaborative, is carried by the verbal track.

It is difficult even to see in which way the images could be said to illustrate the verbal track. In one case a picture of stranded passengers becomes, by virtue of imputed image conjunction, the 'result' of the action of strikers. In another case, a shot of a letter particularises the words of a participant, after these words have first been summarised in the voice over. In general, the images, rather than illustrating the verbal track, provide a background for the verbally structured 'news story', a setting, a sense of 'being there'. So unimportant is the content of the images that it does not even seem to matter that, at times, they contradict the words: we see baggage handlers loading suitcases in the hold of a plane when we have just learnt that they have been on strike for eleven days.

Muslim Community School

In this current affairs item two kinds of images are contrasted: images which show the 'Muslim community' as 'others' (dress, praying, etc.) and images which show the children from this community as 'like our children', innocent, playful, creative. The teacher, Linda Walewski, occupies a middle ground and, to some extent, resolves the contradictions. She is an Australian who has adopted the Islamic religion, sometimes seen in Western clothes, sometimes with her head covered according to Islamic custom. Magar is the Headmaster of the school. Bob Clyde is the reporter.

19a
Image: Close Shot Arabic sign on wall.
Sound: Clyde (voice over): (29) And while religion and Muslim teaching remain foremost, (30)

19b
Image: Camera tilts down to picture of clown on wall.
Sound: the school conforms with educational guidelines in all other respects.

20
Image: Medium Close Shot Magar.
Sound: Magar: (31) We'll follow the Government programme exactly. (32) The only difference is, we are giving the environment

21a
Image: Medium Long Shot three men, lying prostrate, praying. Magar is one of them.
Sound: (33) we're giving the atmosphere (34) we are giving the teaching of the Holy Koran on top of that.

21b
Image: Camera pans right to include praying children.
Sound: (35) We are giving them the feeling every day that everything you do is to please God and everything you are not doing good is

21c
Image: The praying people stand up.
Sound: because you are clear (?) of God.

22
Image: Close Shot Linda Walewski and another teacher. Both wear traditional dress.
Sound: Clyde (voice over): (36) The senior teacher is Linda Walewski (37) who applied for the job (38) because she wanted something different.

23
Image: Close Shot children writing.
Sound: (39) The kids, she says, are really no different from any others (40) except that they share one special quality. Linda Walewski (voice over): (41) It comes from the family [...]

24
Image: Medium Shot smiling little girl.
Sound: (42) They have a terrific feeling of community spirit.

25
Image: Close Shot Linda Walewski, in Western dress. Super: 'Linda Walewski, senior teacher'.
Sound: (sync) Family-mindedness of sharing, (44) of caring for these children.

26a
Image: Medium Long Shot children in cloakroom, hanging their bags on coat hooks.
Sound: (voice over) (45) They were happy, I've been told (46) if someone has no lunch at school, (45 cont) they will automatically share their own things.

26b
Image: Camera pans to Full Shot [...]

27
Image: Medium Long Shot children in playground, eating lunch. The two children walk towards camera.
Sound: (47) This is something that they're taught through their religion, (48) that you share (49) before you can give yourself. (50) You give it to someone else rather than yourself.

28
Image: Long Shot children in playground.

The conjunctive relations in this segment are shown in figure 15.

As in the television news item, conjunction between the images is shallow and intermittent, verbal conjunction more tightly structured and continuous. Temporal conjunction is found only between the parts of one complex shot (21a-c). 'Positive comparison' is made between shots showing children in traditional dress. The verbal conjunction is, again, predominantly elaborative and logical.

Between the visual and the verbal track some relations of explanation occur, but here the images elaborate the text rather than the text elaborating the images. Two instances of imputed image conjunction occur. The conjunction between 19a and 19b is not to be read as a mere contrast, but as concessive ('although (19a) religion remains foremost', 'nevertheless (19b) the school complies with educational standards' - a picture of a clown here illustrates 'educational standards'!). The conjunction between a shot of children in traditional dress and a close up of a smiling child is to be read as replacive ('these children are no different from any other (23)', 'except that (24) they have one special quality' - the smile elaborates that 'special quality' visually).

Some conclusions

The discussion of theories of film editing in section 3 and the text analyses in section 4 make it possible to chart the types of conjunction that can obtain between images (figure 16). I have included 'future event', even though there was no instance of it in the sequences and segments analyzed.

Some of the distinctions made by the film theorists discussed in section 3 have not been included: distinctions which pertain to (supposedly homogeneous) conjunctive patternings in sequences rather than to the types of conjunction possible within and between sequences (e.g. Pudovkin's distinction between 'contrast' and 'parallelism', Timoshenko's distinction between, on the one hand, 'analytical montage', and, on the other hand, 'concentration', 'simultaneity' and 'similarity' - all of which converge in 'analytical montage'); distinctions pertaining to ellipsis (Metz's types of 'linear narrative sequence'); and distinctions of which the conjunctive status is unclear (e.g. Metz's 'subjective' and 'explanatory' inserts).

Figure 16 shows the repertoire of possible conjunctive relations between images to be more restricted than the repertoire of conjunctive relations possible in language (figure 1). This should not be interpreted as a restriction which is always and everywhere inherent in the visual medium. It is a socially and historically determined restriction on the semiotic role of the visual, and on its relation to language. The attempts of the 1920s Soviet theorists and filmmakers to broaden the repertoire have, for the time being, not succeeded in changing the semiotic hierarchy and role division. But that does not mean that such attempts will not succeed at some future time, for instance in connection with the much more dialogic forms of visual communication now made possible by iconic computer interfaces.

For the moment the diagram in figure 16 should perhaps still be considered specific to a certain set of genres of audiovisual text. More texts might need to be studied before one could construct a system network applicable to all (Western) audiovisual texts. On the other hand, the texts analyzed do cover a variety of genres and the types of conjunction found in them correspond reasonably well on other types of text. It is, in the long run, desirable that system networks of this kind should be 'language specific' rather than 'genre specific' even though a 'language' (and therefore a 'language specific' system network) is always an artificial construct, resting on someone's decision as to which genres legitimately belong to the 'language'. We have seen that certain genres of documentary tend to make certain conjunctive choices - 'observational' documentaries, for instance, tend to favour temporal conjunction, 'classic documentaries' 'comparison' conjunction. But we have also seen that this is only a tendency, and that genres are not homogeneous with respect to conjunction. A 'language specific' system network can provide a reference point for studying, not only how genres differ in, for instance, the conjunctional choices they favour, but also how fluid or strict their boundaries with other genres are in this respect - and this would seem to be an important aspect of textual politics and textual history.

Two areas remain underdeveloped here. The first relates to the question of visual CRU's. A more precisely defined method for delimiting such CRU's still needs to be developed. The second relates to the question of the polyinterpretability of implicit conjunctive relations (and the visual conjunctive relations in modern audiovisual texts are usually implicit). The approach proposed here allows us to perceive such polyinterpretability. We have seen, for instance, that it is more common in modern television texts than in the documentaries analyzed, even though this increase in polyinterpretability seems to be accompanied by a decrease in the importance of the role of the visual track. The question is worth further study, because the degree to which texts are explicit and to be read as singular in meaning or implicit and polyinterpretable is also an important aspect of textual politics and textual history.

In image-text conjunction the question of directionality is important: conjunction can move from text to image ('illustration') or from image to text (Barthes' 'anchorage'). When the verbal track conjunctively relates to images seen previously, all types of verbal conjunction would, one might think, in principle be possible. In fact conjunction is almost exclusively elaborating ('explanation' and 'summative'). When the visual track conjunctively relates to preceding text, all types of image conjunction would, in principle, be possible, but, again, elaborating conjunction dominates ('explanation', 'example', 'particularisation'). Text and image, then, elaborate each other, the text generalising what is seen in the image, 'summing it up' for example, the image making specific what is said in the text, 'particularising' it, 'exemplifying' it. This is shown in figure 17.

Comparing the two earlier documentaries (Industrial Britain and Le Joli Mai) to the two television texts, we can note what is perhaps a historical shift in the relation between image and text. In Grierson's film the image has primacy. Much in the way Barthes has described, images function here as 'nature', as empirical evidence, and text functions as 'anchorage', burdening (the image) with a culture, a moral, an imagination. The movement is from 'nature' to 'discourse'. In the two television texts the verbal has primacy. The authoritative text of the anchorperson precedes 'images of the world', and in the voice over section directionality is reversed: the visual authenticates, particularises and exemplifies the verbal. The movement is from 'discourse' to 'exemplification' (and 'authentication'). We have also seen that in the film documentaries the conjunctive structure of the image track is constant and cohesive, that of the verbal track more intermittent, while in the two TV texts visual conjunctive structure is shallow and intermittent, and verbal conjunctive structure cohesive and continuous. Le Joli Mai in fact occupies a somewhat intermediary position, and has, on the one hand, a still uninterrupted and fairly cohesive visual structure, but, on the other hand, a continuous narration and some instances of visual elaboration of the voice over.

Stirring provides 'nature' in the raw, ostensibly without any form of elaboration. It may be, however, that 'direct cinema' has been only a shortlived interruption in the historical development sketched here.

For Roland Barthes 'illustration' belonged to an earlier period in which images illustrated the 'holy writ' of the culture (the Bible, ancient mythology), and was, in modernity, superseded by 'anchorage'. In today's audiovisual texts 'anchorage' seems, in turn, to have been superseded by new forms of illustration. This shift perhaps reflects deeper cultural changes in what has force of evidence in the interpretation of 'reality'. It has already made documentary, the older mode, the medium of political and artistic avantgardes which may yet turn out to have been rearguards.


1. Jim Martin, 'Conjunction: The Logic of English Text' in J.S. Petöfi and S. Sözer eds., Micro and Macro Connexity of Text (Hamburg: Helmut Buske, 1983).

2. Sergei Eisenstein, The Film Sense (London: Faber, 1943); Sergei Eisenstein, Film Form (London: Dennis Dobson, 1963); Vladimir Pudovkin, Film Technique and Film Acting (New York: Bonanza Books, 1926). On Timoshenko, see Rudolph Arnheim, Film as Art (Berkeley: University of California Press, 1933).

3. Christian Metz, Film Language (New York: Oxford University Press, 1974).

4. M.A.K. Halliday & Ruqaiya Hasan, Cohesion in English (London: Longman, 1976); M.A.K. Halliday, Introduction to Functional Grammar (London: Edward Arnold, 1985); Martin, 'Conjunction'.

5. Halliday, Introduction; Martin, 'Conjunction'.

6. Martin, 'Conjunction'.

7. Pudovkin, Film Technique.

8. Pudovkin, Film Technique, p.47.

9. Pudovkin, Film Technique, p.49.

10. Pudovkin, Film Technique, p.49.

11. Pudovkin, Film Technique, p.50.

12. Eisenstein, The Film Sense; Film Form.

13. Eisenstein, Film Form, p.30.

14. Arnheim, Film as Art.

15. Metz, Film Language.

16. Metz, Film Language, pp.99-95.

17. Gunther Kress & Theo van Leeuwen, Reading Images (Geelong: Deakin University Press, 1990).

18. Roland Barthes, Image-Music-Text (London: Fontana, 1977).

19. Barthes, Image-Music-Text, p.26.

20. Bill Nichols, 'Documentary Theory and Practice' Screen v.17 n.4 (1976) pp.34-39; Bill Nichols, Ideology and the Image (Bloomington: Indiana University Press, 1981).

21. Nichols, 'Documentary Theory', p.40.

22. Nichols, Ideology and the Image, p.197.

23. Nichols, Ideology and the Image, ch.7.
