V. I. Pudovkin. Film technique. Translator, Ivor Montagu. Vision Press. 1914. 05. Asynchronism as a principle of sound film. 06. Rhythmic problems in my first sound film.


The technical invention of sound has long been accomplished, and brilliant experiments have been made in the field of recording. This technical side of sound-film making may be regarded as already relatively perfected, at least in America. But there is a great difference between the technical development of sound and its development as a means of expression. The expressive achievements of sound still lie far behind its technical possibilities. I assert that many theoretical questions whose answers are clear to us are still provided in practice only with the most primitive solutions. Theoretically, we in the Soviet Union are in advance of Western Europe and U.S.A.
Our first question is: What new content can be brought into the cinema by the use of Sound? It would be entirely false to consider sound merely as a mechanical device enabling us to enhance the naturalness of the image. Examples of such most primitive sound effects: in the silent cinema we were able to show a car, now in sound film we can add to its image a record of its natural sound; or again, in silent film a speaking man was associated with a title, now we hear his voice. The role which sound is to play in film is much more significant than a slavish imitation of naturalism on these lines; the first function of sound is to augment the potential expressiveness of the film’s content.
If we compare the sound to the silent film, we find that it is possible to explain the content more deeply to the spectator with relatively the same expenditure of time. It is clear that this deeper insight into the content of the film cannot be given to the spectator simply by adding an accompaniment of naturalistic sound; we must do something more. This something more is the development of the image and the sound strip each along a separate rhythmic course. They must not be tied to one another by naturalistic imitation but connected as the result of the interplay of action. Only by this method can we find a new and richer form than that available in the silent film. Unity of sound and image is realised by an interplay of meanings which results, as we shall presently show, in a more exact rendering of nature than its superficial copying. In silent film, by our editing of a variety of images, we began to attain the unity and freedom that is realised in nature only in its abstraction by the human mind. Now in sound film we can, within the same strip of celluloid, not only edit different points in space, but can cut into association with the image selected sounds that reveal and heighten the character of each—wherever in silent film we had a conflict of but two opposing elements, now we can have four.
A primitive example of the use of sound to reveal an inner content can be cited in the expression of the stranding of a town-bred man in the midst of the desert. In silent film we should have had to cut in a shot of the town; now in sound film we can carry town-associated sounds into the desert and edit them there in place of the natural desert sounds. Uses of this kind are already familiar to film directors in Western Europe, but it is not generally recognised that the principal elements in sound film are the asynchronous and not the synchronous; moreover, that the synchronous use is, in actual fact, only exceptionally correspondent to natural perception. This is not, as may first appear, a theoretical figment, but a conclusion from observation.
For example, in actual life you, the reader, may suddenly hear a cry for help; you see only the window; you then look out and at first see nothing but the moving traffic. But you do not hear the sound natural to these cars and buses; instead you hear still only the cry that first startled you. At last you find with your eyes the point from which the sound came; there is a crowd, and someone is lifting the injured man, who is now quiet. But, now watching the man, you become aware of the din of traffic passing, and in the midst of its noise there gradually grows the piercing signal of the ambulance. At this your attention is caught by the clothes of the injured man: his suit is like that of your brother, who, you now recall, was due to visit you at two o’clock. In the tremendous tension that follows, the anxiety and uncertainty whether this possibly dying man may not indeed be your brother himself, all sound ceases and there exists for your perceptions total silence. Can it be two o’clock? You look at the clock and at the same time you hear its ticking. This is the first synchronised moment of an image and its caused sound since first you heard the cry.
Always there exist two rhythms, the rhythmic course of the objective world and the tempo and rhythm with which man observes this world. The world is a whole rhythm, while man receives only partial impressions of this world through his eyes and ears and to a lesser extent through his very skin. The tempo of his impressions varies with the rousing and calming of his emotions, while the rhythm of the objective world he perceives continues in unchanged tempo.
The course of man’s perceptions is like editing, the arrangement of which can make corresponding variations in speed, with sound just as with image. It is possible therefore for sound film to be made correspondent to the objective world and man’s perception of it together. The image may retain the tempo of the world, while the sound strip follows the changing rhythm of the course of man’s perceptions, or vice versa. This is a simple and obvious form for counterpoint of sound and image.
Consider now the question of straightforward Dialogue in sound film. In all the films I have seen, persons speaking have been represented in one of two ways. Either the director was thinking entirely in terms of theatre, shooting his whole speaking group through in one shot with a moving camera, using thus the screen only as a primitive means of recording a natural phenomenon, exactly as it was used in early silent films before the discovery of the technical possibilities of the cinema had made it an art-form. Or else, on the other hand, the director had tried to use the experience of silent film, the art of montage in fact, composing the dialogue from separate shots that he was free to edit. But in this latter case the effect he gained was just as limited as that of the single shots taken with a moving camera, because he simply gave a series of close-ups of a man speaking, allowed him to finish the given phrase on his image, and then followed that shot with one of the man answering. In doing so the director made of montage and editing no more than a cold verbatim report, and switched the spectator’s attention from one speaker to another without any adequate emotional or intellectual justification.
Now, by means of editing, a scene in which three or more persons speak can be treated in a number of different ways. For example, the spectator’s interest may be held by the speech of the first, and—with the spectator’s attention—we hold the close-up of the first person, lingering with him when his speech is finished and hearing the voice of the commenced answer of the next speaker before passing on to the latter’s image. We see the image of the second speaker only after becoming acquainted with his voice. Here sound has preceded image.
Or, alternatively, we can arrange the dialogue so that when a question occurs at the end of the given speech, and the spectator is interested in the answer, he can immediately be shown the person addressed, only presently hearing the answer. Here the sound follows the image.
Or, yet again, the spectator having grasped the import of a speech may be interested in its effect. Accordingly, while the speech is still in progress, he can be shown a given listener, or indeed given a review of all those present and mark their reactions towards it.
These examples show clearly how the director, by means of editing, can move his audience emotionally or intellectually, so that it experiences a special rhythm in respect to the sequence presented on the screen.
But such a relationship between the director in his cutting-room and his future audience can be established only if he has a psychological insight into the nature of his audience and its consequent relationship to the content of the given material.
For instance, if the first speaker in a dialogue grips the attention of the audience, the second speaker will have to utter a number of words before they will so affect the consciousness of the audience that it will adjust its full attention to him. And, contrariwise, if the intervention of the second speaker is more vital to the scene at the moment than the impression made by the first speaker, then the audience’s full attention will at once be riveted on him. I am sure, even, that it is possible to build up a dramatic incident with the recorded sound of a speech and the image of the unspeaking listener where the latter’s reaction is the most urgent emotion in the scene. Would a director of any imagination handle a scene in a court of justice where a sentence of death is being passed by filming the judge pronouncing sentence in preference to recording visually the immediate reactions of the condemned?
In the final scenes of my first sound film Deserter my hero tells an audience of the forces that brought him to the Soviet Union. During the whole of the film his worse nature has been trying to stifle his desire to escape these forces; therefore this moment, when he at last succeeds in escaping them and himself desires to recount his cowardice to his fellow-workers, is the high-spot of his emotional life. As he is unable to speak Russian, his speech has to be translated.
At the beginning of this scene we see and hear shots longish in duration, first of the speaking hero, then of his translator. In the process of development of the episode the images of the translator become shorter and the majority of his words accompany the images of the hero, according as the interest of the audience automatically fixes on the latter’s psychological position. We can consider the composition of sound in this example as similar to the objective rhythm and dependent on the actual time relationships existing between the speakers. Longer or shorter pauses between the voices are conditioned solely by the readiness or hesitation of the next speaker in what he wishes to say. But the image introduces to the screen a new element, the subjective emotion of the spectator and its length of duration; in the image longer or shorter does not depend upon the identity of the speaking man, but upon the desire of the spectator to look for a longer or shorter period. Here the sound has an objective character, while the image is conditioned by subjective appreciation; equally we may have the contrary—a subjective sound and an objective image. As illustration of this latter combination I cite a demonstration in the second part of Deserter; here my sound is purely musical. Music, I maintain, must in sound film never be the accompaniment. It must retain its own line.
In the second part of Deserter the image shows at first the broad streets of a Western capital; suave police direct the progress of luxurious cars; everything is decorous, the ebb and flow of an established life. The characteristic of this opening is quietness, until the calm surface is broken by the approach of a workers’ demonstration bearing aloft their flag. The streets clear rapidly before the approaching demonstration, its ranks swell with every moment. The spirit of the demonstrators is firm, and their hopes rise as they advance. Our attention is turned to the preparations of the police; their horses and motor-vehicles gather as their intervention grows imminent; now their champing horses charge the demonstrators to break their ranks with flying hoofs, the demonstrators resist with all their might and the struggle rages fiercest round the workers’ flag. It is a battle in which all the physical strength is marshalled on the side of the police, sometimes it prevails and the spirit of the demonstrators seems about to be quelled, then the tide turns and the demonstrators rise again on the crest of the wave; at last their flag is flung down into the dust of the streets and trampled to a rag beneath the horses’ hoofs. The police are arresting the workers; their whole cause seems lost, suppressed never to re-arise —the welter of the fighting dies down—against the background of the defeated despair of the workers we return to the cool decorum of the opening of the scene. There is no fight left in the workers. Suddenly, unexpectedly, before the eyes of the police inspector, the workers’ flag appears hoisted anew and the crowd is re-formed at the end of the street.
The course of the image twists and curves, as the emotion within the action rises and falls. Now, if we used music as an accompaniment to this image we should open with a quiet melody, appropriate to the soberly guided traffic; at the appearance of the demonstration the music would alter to a march; another change would come at the police preparations, menacing the workers—here the music would assume a threatening character; and when the clash came between workers and police—a tragic moment for the demonstrators—the music would follow this visual mood, descending ever further into themes of despair. Only at the resurrection of the flag could the music turn hopeful. A development of this type would give only the superficial aspect of the scene, the undertones of meaning would be ignored; accordingly I suggested to the composer (Shaporin) the creation of a music the dominating emotional theme of which should throughout be courage and the certainty of ultimate victory. From beginning to end the music must develop in a gradual growth of power. This direct, unbroken theme I connected with the complex curves of the image. The image succession gives us in its progress first the emotion of hope, its replacement by danger, then the rousing of the workers’ spirit of resistance, at first successful, at last defeated, then finally the gathering and reassembly of their inherent power and the hoisting of their flag. The image’s progress curves like a sick man’s temperature chart; while the music in direct contrast is firm and steady. When the scene opens peacefully the music is militant; when the demonstration appears the music carries the spectators right into its ranks. With its batoning by the police, the audience feels the rousing of the workers; wrapped in their emotions, the audience is itself emotionally receptive to the kicks and blows of the police.
As the workers lose ground to the police, the insistent victory of the music grows; yet again, when the workers are defeated and disbanded, the music becomes yet more powerful still in its spirit of victorious exaltation; and when the workers hoist the flag at the end the music at last reaches its climax, and only now, at its conclusion, does its spirit coincide with that of the image.
What role does the music play here? Just as the image is an objective perception of events, so the music expresses the subjective appreciation of this objectivity. The sound reminds the audience that with every defeat the fighting spirit only receives new impetus to the struggle for final victory in the future.
It will be appreciated that this instance, where the sound plays the subjective part in the film, and the image the objective, is only one of many diverse ways in which the medium of sound film allows us to build a counterpoint, and I maintain that only by such counterpoint can primitive naturalism be surpassed and the rich deeps of meaning potential in sound film creatively handled be discovered and plumbed.

It is sad to find that, since the introduction of sound and the predominance of talking films, directors both in the West and in the Soviet Union have suddenly lost the sense of dynamic rhythm that they had built up during the last years of the silent cinema. It is almost impossible to-day to find a film with the sharp dramatic rhythm of, for instance, the Odessa Steps sequence in Potemkin, or of certain episodes in the early picture Intolerance, which belongs to the first period when the hitherto mechanical film record became a creative medium. Most of the latest sound films are characterised by exceedingly slow development of subject and dialogue full of interminable pauses. Many directors are developing a talkie style that involves the use of explanatory words for matters that should be conceived visually; this kind of style introduces elements from the Theatre into a medium where they are out of place. Theatre has its own technique, depending on the power of the spoken word since it is incapable of presenting visual changes in rapid sequence, while Cinema is based on the possibility of presenting a variety of visual impressions in a time and space differing from that obtaining in the natural material recorded.
I do not believe that this change of method is indicative of any audience change of taste. I think that the real situation is that directors hesitate to make experiments with sound, and particularly hesitate to apply montage to the sound strip.
Many hold the view that, with the introduction of sound into film, the cutting methods established during the development of silent films must all go by the board. The development of constructive editing of frequent changes of shot made possible in silent film the achievement of great richness of visual form. The human eye is capable of perceiving, easily and immediately, the content of a succession of visual shots, whereas, as they point out, the ear cannot with the same immediacy detect the significance of alterations in sound. Accordingly, they maintain, the rhythm of changing sound must be much slower than need be that of changing image. They are right, in so far as concerns the combination with a succession of short images of a series of equally short sound effects matched with them in a purely naturalistic relation. Certainly it would be impossible to compose the short shots of Eisenstein’s Odessa Steps sequence in Potemkin—the soldiers shooting, the woman screaming, the children weeping—with sound cut in a parallel manner. Consequently, it is held, we must make each image longer, thus diminishing the richness of the visual form; the rapid montage of the silent film must give place to more leisurely scenes recorded from a more set distance and with a relatively fixed camera position, the construction being linked by the spoken word and not by the sequence of dynamically edited images. This policy, I maintain, is the line of least resistance, and instead of helping film to progress, holds it back, forcing it once again into its primitive position of mere photographic record of material actually suited to the Theatre. There is no necessity, in my view, to begin a sound when its corresponding image first appears and to cut it when its image has passed. 
Every strip of sound, speech, or music may develop unmodified while the images come and go in a sequence of short shots, or, alternatively, during images of longer duration the sound strip may change independently in a rhythm of its own. I believe that it is only along these lines that the Cinema can keep free from theatrical imitation, and advance beyond the bounds of Theatre, for ever limited by the supremacy of the spoken word, the fixture to one significant position throughout of decor and properties, the dependence of both action and audience’s attention entirely upon the actor, and reduction of the world’s wide globe to a single room less its fourth wall.
One of the most important problems in my Deserter was posed by the mass scenes—meetings, demonstrations, etc. First, it is necessary to understand that the mass never has been and never will be mere quantity; it is a differentiated quality. It is a collection of individuals and quite different from their sum; each mass consists of groups, each group of persons. These may be united by one emotion and one thought, and in that case their mass is the greatest force in the world. The conflicting processes at work within the groups to produce this result afford immediately obvious dramatic material, and accent upon the characteristics of individuals is an integral part of the creation of a living mass. What real method can there be of creating this qualitatively altered mass of individuals save by the editing of close-ups? I have seen a German film in which Danton is shown speaking to the citizens of Paris; he was placed at a window, and all we were allowed to know of his audience was their mass voice, like the traditional “voices off.” Such a scene in a film is nothing else than a photograph of bad Theatre.
In the first reel of Deserter I have a meeting addressed by three persons one after the other, each producing a complexity of reactions in their audience. Each one is against the other two; sometimes a member of the crowd interrupts a speaker, sometimes two or three of the crowd have a moment’s discussion among themselves. The whole of the scene must move with the crowd’s swaying mood, and the clash of opposing wills must be shown; to achieve these ends I cut the sound exactly as freely as I cut the image. I used three distinct elements. First, the speeches; second, sound close-ups of the interruptions—words, snatches of phrases, from members of the crowd; and third, the general noise of the crowd varying in volume and recorded independently of any image.
I sought to compose these elements by the system of montage. I took sound strips and cut, for example, for a word of a speaker broken in half by an interruption, for the interrupter in turn overswept by the tide of noise coming from the crowd, for the speaker audible again, and so on. Every sound was individually cut and the images associated are sometimes much shorter than the associated sound piece, sometimes as long as two sound pieces—those of speaker and interrupter, for example—while I show a number of individual reactions in the audience. Sometimes I have cut the general crowd noise into the phrases with scissors, and I have found that with an arrangement of the various sounds by cutting in this way it is possible to create a clear and definite, almost musical, rhythm: a rhythm that develops and increases short piece by short piece, till it reaches a climax of emotional effect that swells like the waves on a sea.
I maintain that directors lose all reason to be afraid of cutting the sound strip if they accept the principle of arranging it in a distinct composition. Provided that they are linked by a clear idea of the course to be pursued, various sounds can, exactly like images, be set side by side in montage. Remember the early days of the cinema, when directors were afraid to cut up the visual movement on the screen, and how Griffith’s introduction of the close-up was misunderstood and by many labelled an unnatural and consequently an inadmissible method. Audiences in those days even cried: “Where are their legs!”
Cutting was the development that first transformed the cinema from a mechanical process to a creative one. The slogan Cut remains equally imperative now that sound film has arrived. I believe that sound film will approach nearer to true musical rhythm than silent film ever did, and this rhythm must derive not merely from the movement of artist and objects on the screen, but also—and this is the consideration most important for us to-day—from exact cutting of the sound and arrangement of the sound pieces into a clear counterpoint with the image.
I worked out in fine rhythm, suitable to sound film, a special kind of musical composition for the May Day demonstration in Deserter. A hundred thousand men throng the streets, the air is filled with the echoing strains of massed bands, lifting the masses to exuberance. Into the patchwork of sound breaks singing, and the strains of accordions, the hooting of motor-cars, snatches of radio noises, shouts and huzzas, the powerful buzzing of aeroplanes. Certainly it would have been stupid to have attempted to create such a sound scene in the studio with orchestras and supers.
In order to give my future audience a true impression of this gigantic perspective of mass sound, its echoes and its multitudinous complexities, I recorded real material. I used two Moscow demonstrations, those in May and November of one year, to assemble the variety of sounds necessary for my future montage. I recorded pieces of various music and sound, varying in their volume, transitions from bands to crowd noises, and from hurrahs to the whirling propellers of aeroplanes, slogans from the radio and snatches of our songs. Just like long-shots and close-ups in silent film. Then followed the task of editing the thousand metres of sound to create the hundred metres of rhythmical composition. I tried to use the pieces like the separate instruments that combine to form an orchestra. I recorded two marching bands, and as passage of transition from one to the other cut between them some dominating sound like a mass hurrah or a whirling propeller. I endeavoured to bring the pieces already possessing a musical rhythm of their own into a new montage over-rhythm.
The images that go with this sound are edited with similar exactness, smiling workers, merry marching youths, a handsome sailor and the girls that flirt with him. But this sequence of images is but one of the rhythmical lines that make up the whole composition; the music is never an accompaniment but a separate element of counterpoint; both sound and image preserve their own line.
Perhaps a purer example of establishing rhythm in sound film occurs in another part of Deserter—the docks section. Here again I used natural sounds, heavy hammers, pneumatic drills working at different levels, the smaller noise of fixing a rivet, voices of sirens and the crashing crescendo of a falling chain. All these sounds I shot on the dock-side, and I composed them on the editing table, using various lengths; they served me as notes of music. As finale of the docks scene I made a half-symbolic growth of the ship in images at an accelerated pace, while the sound in a complicated syncopation mounts to an ever greater and grandiose climax. Here I had a real musical task, and was obliged to “feel” the length of each strip in the same spirit as a musician “feels” the accent necessary for each note.
I have used only real sound because I hold the view that sound, like visual material, must be rich in its association, a thing impossible for reconstructed sound to be. I maintain that it is impossible artificially to establish perspective in sound; it is impossible, for instance, to secure a real effect of a distant siren call in a closed studio and relatively near the microphone. A “distant” call achieved by a weak tone in the studio can never create the same reality of effect as a loud blast recorded half a mile away in the open air.
For the symphony of siren calls with which Deserter opens I had six steamers playing in a space of a mile and a half in the Port of Leningrad. They sounded their calls to a prescribed plan and we worked at night in order that we should have quiet.
Now that I have finished Deserter I am sure that sound film is potentially the art of the future. It is not an orchestral creation centring round music, nor yet a theatrical dominated by the factor of the actor, nor even is it akin to opera, it is a synthesis of each and every element—the oral, the visual, the philosophical; it is our opportunity to translate the world in all its lines and shadows into a new art form that has succeeded and will supersede all the older arts, for it is the supreme medium in which we can express to-day and to-morrow.

