What’s that coming over the hill?

Is it a monster? Or a cow with a ring modulator on?

So how exactly do you make the sound of something that doesn’t exist? How exactly do you record the sound of an intergalactic squid monster? Well, a squid monster has at least a basis in reality, but what about a squerch monster? What the hell does that sound like and how do you get the sound of one anyway?

Take the creature below as an example.

Creature 1 with no sound

This is obviously not a creature we will find in reality, so we can’t get near to one and record it. In fact, looking at it, I wouldn’t want to go near one, and especially not if I was carrying heavy recording equipment.

So we have to imagine what the creature might sound like. So we need to think about it for a bit. Is it friendly or unfriendly? Big or small? Loud or quiet? Aggressive or defensive? Intelligent or not?

To create this sound, I imagined that this was a big, unfriendly, aggressive animal that is no more intelligent than a dog. This gave me an idea as to the type of sound I could give this creature. As it’s an ‘animal’, it doesn’t need a language, so I don’t need to worry about words, or the creature giving instructions or describing something – it’s all about ‘noise’. This is why we need to be able to assess the intelligence of the thing we are making the sounds for. If something needs to speak in order to communicate something, then it needs to sound like words, or at least, the shape of the sound needs to follow what we know and understand of speech. For example, a question will rise in pitch at the end.

Looking at this creature though, and the shape and size of its mouth, I think it’s fair to say it needs to be a ‘roar’ kind of sound. However, in reality, most animals that roar aren’t lizard-like, and this animal is. The nearest animal that exists is probably a crocodile or alligator, and whilst they do make a very satisfying noise, it doesn’t fit this animal.

This is the final result. Read on and I’ll show you how I got to this.

Creature 1 with sound

I wanted something ‘shouty’ and ‘brash’ so started with an elephant. And yes, I know an elephant is a mammal. I know this creature looks nothing like an elephant, but bear with me on this.

This came across as a bit high pitched on its own and we know that big things sound lower in pitch than small things. Yes, I know elephants are big. I know, I’ve just said big things make low, deep noises and that the elephant’s sound is too high pitched... but these ‘rules’ are generalisations. You can always find the exception to the rule and this particular elephant trumpet sound I’m using is that exception. There are probably better elephant sounds available and probably ones with a deeper pitch, but this is the one I have that has the brash edge I want. In any case, I haven’t finished with the sound yet.

It needed more weight to the sound. So, I added a cow, but I used a pitch-shifter to lower the pitch a little bit.

Then I made the cow sound backwards and lowered it in pitch even further.

Then I added a walrus. They sound big and scary. Unlike the cow, there is a sharp attack to the walrus sound. It sounds like a shout or a bark.

And finally, so that the background animals had a sound as well as the voice of the one in the foreground, I added a stampede of wildebeest.

Listen to the finished sound again in the video above. Can you hear the separate sounds now you know what to listen for?

This is how it works. When we can’t record something as it is, we make it up by blending sounds together of things that do exist to create something new. You can clearly hear the different sounds in this now you know what to listen out for, but when put together, it sounds like one creature roaring away. Big, brash and quite scary.
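The layering steps above can be sketched in a few lines of Python. This is only a rough illustration, not the actual session: plain sine tones stand in for the elephant, cow and walrus recordings, the stampede layer is left out, and every function name, frequency and gain value is an assumption made for the example. Note that `lower_pitch` drops the pitch by simple resampling, which also stretches the sound, whereas a real pitch-shifter can keep the length the same.

```python
import math

SAMPLE_RATE = 8000  # assumed low rate, just to keep the example quick


def tone(freq_hz, seconds, rate=SAMPLE_RATE):
    """Stand-in for a recorded sample: a plain sine wave."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def lower_pitch(samples, factor):
    """Lower the pitch by resampling; factor 2.0 drops it one octave.

    This also makes the sound `factor` times longer.
    """
    out = []
    for i in range(int(len(samples) * factor)):
        pos = i / factor
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Linear interpolation between the two nearest original samples.
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def reverse(samples):
    """Play the sound backwards."""
    return samples[::-1]


def mix(layers):
    """Sum (gain, samples) layers, then normalise the peak to 1.0."""
    length = max(len(s) for _, s in layers)
    out = [0.0] * length
    for gain, s in layers:
        for i, v in enumerate(s):
            out[i] += gain * v
    peak = max(abs(v) for v in out) or 1.0
    return [v / peak for v in out]


elephant = tone(900, 0.5)                                  # brash top layer
cow = lower_pitch(tone(300, 0.5), 1.25)                    # lowered a little
cow_reversed = lower_pitch(reverse(tone(300, 0.5)), 2.0)   # backwards, lower still
walrus = tone(200, 0.3)                                    # short 'bark'

roar = mix([(1.0, elephant), (0.8, cow), (0.8, cow_reversed), (0.9, walrus)])
```

With real recordings loaded in place of `tone(...)`, the same three operations (pitch down, reverse, mix) cover every step described in this post.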

These sound samples came from Sound Ideas XV.

Rabbit Rabbit Rabbit



Dialogue in games is difficult to integrate so that it sounds natural. Real conversations can flow into an almost infinite number of possibilities, but computers are limited and simply can’t follow them all.

In some games, like Dishonored (2012), NPC dialogue is stilted and repetitive.

However, dialogue in games like Detroit: Become Human (2018) shows that dialogue in games has come on a lot.

Dialogue has to be written in such a way that it is natural. In a game, this means that it is going to be, to a greater or lesser extent, dynamic. As the player goes through the game, they will meet different people in different situations, which changes the dialogue they hear. Detroit: Become Human is an example of this, and so is Life Is Strange (2015).

The dialogue choices in a game like this mean the script is going to be very different from the script of a film. A story like this has dialogue written in nodes, rather than as a whole script. After a few choices have been made, these get complicated very quickly.

To manage this type of writing, a technique called ‘bottlenecking’ is used. This is where different choices eventually meet in one node and then split off again. An example of this style of writing is the Fighting Fantasy books by Steve Jackson and Ian Livingstone. This excerpt from The Forest of Doom (1983) shows how this works.

From this, the story nodes branch off like this:

You can clearly see a node like 267 is a bottleneck. There are lots of ways into that section of the story, but only two ways out. Dialogue works in the same way. There aren’t endless possibilities, but it feels like there are. After all, when you get to a bottleneck, you won’t realise it as a game player.
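The idea of a bottleneck can be sketched as a small graph in Python. The node numbers and links below are made up for illustration (apart from borrowing 267 as the bottleneck, with its two exits); they are not the real section links from The Forest of Doom, and the function names are my own.

```python
from collections import defaultdict

# Hypothetical story graph: each node lists the nodes the player can go to next.
story = {
    1: [12, 45],
    12: [80, 267],
    45: [267, 90],
    80: [267],
    90: [267],
    267: [300, 310],  # many ways in, only two ways out
    300: [],
    310: [],
}


def ways_in(graph):
    """Count how many links lead into each node."""
    counts = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


def bottlenecks(graph, min_in=3):
    """Nodes that several branches merge into before splitting off again."""
    incoming = ways_in(graph)
    return [n for n in graph if incoming[n] >= min_in and len(graph[n]) >= 2]


def count_paths(graph, start, end):
    """Number of different routes a player could take from start to end."""
    if start == end:
        return 1
    return sum(count_paths(graph, nxt, end) for nxt in graph[start])


print(bottlenecks(story))          # nodes where the branches merge
print(count_paths(story, 1, 267))  # different routes into the bottleneck
```

However many routes lead into the bottleneck, the writer only has to continue the story from one place, which is what keeps the node count manageable.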



The Sound of the Sound of the Sound of Video Games

Jesper Juul wrote a paper (August 2018) about the ‘Aesthetics of the Aesthetics of the Aesthetics of Video Games’ and it’s a good read. I want to explore this idea by looking at sound rather than aesthetics as a whole. Of course, the aesthetics of a game must include sound – a game world is aesthetically very different if you change or remove the sound elements from it. Imagine, for example, a Call of Duty game with the weapon sounds made by cheap air rifles, or a Gran Turismo race where Yakety Sax plays on the final lap. The aesthetics of the game would change, as would the level of immersion, in terms of the senses and the imagination at least.

Juul outlines the three levels of aesthetics in this way:

Level 1, the aesthetics of video games, is that games have no utility beyond themselves, but play at utility. When playing as a character, you play at having purpose: to rescue the princess, or find the treasure or whatever, but this is just play within the game world, and has no utility beyond itself.

Level 2, the aesthetics of the aesthetics, draws out a contradiction to this in that games are ‘anti-play’: they are goal-oriented and emphasise a goal-directed nature of the game. Like a sport, winning is what matters. The game is no longer ‘play’ but a ‘labour’ to achieve the goal.

Level three, the aesthetics of the aesthetics of the aesthetics of video games, points to a modern and growing trend in video games to reject goal-orientated games and abandon utility in its entirety. Games like Proteus fall into this category by having no goal, no ‘point’ to them, and this, according to Juul, removes many elements of play.

As with Juul’s distinctions outlined above, sound plays an important role in this and falls into these distinctions as well.

At the first level, the purpose of sound is immersion of the senses, immersion of the imagination and immersion by challenge. To immerse the senses, the sound must be real, and realistic to the world the game is set in. This concept even transcends a visual aesthetic, as shown in games like The Walking Dead. Whilst the visuals are cel-shaded and ‘comic book’ in style, the diegetic sounds are realistic and the backing music could come straight out of a prime-time TV series or Hollywood film. This helps with immersion of the senses and imagination by providing what the visuals do not: an anchor in the known world. The world does not look like a graphic novel (although a strong argument can be made that, on a political level, it would be better if it did), and as a graphic novel has no sound, a real-world anchor must be found: sound.

Level 2, the sound of the sound, is about ludic, or gameplay, function. If, aesthetically, this level is concerned with goal orientation, then the sound has to provide for that. This needs to be done whilst keeping to the nature and rules of the game world. As Rhianna Pratchett says in her TEDx talk, the job of a game writer is to find a way of giving instructions, information, and hints to the player without breaking character or the internal logic of the game world, and the same is true of the sound designer. Sound, when used well, can show a player where to go, can help a player choose a tactic in a new game area, and can warn of danger and indicate many different statistics. In Call of Duty, sound is used to let you know when your health is low – you gasp for breath. In Battlefield, it is used to let you know when your ammo is running out – a low-pass filter is used to change the sound of the gun. Games such as Metal Gear Solid have a very distinctive ‘you’ve been spotted’ sound that, whilst non-diegetic, doesn’t break immersion because it shocks the senses into action. Challenge-based immersion kicks in and, in the case of MGS, even a cardboard box isn’t going to save you now.
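To make the low-ammo cue a little more concrete, here is a minimal Python sketch of the general idea: the emptier the magazine, the lower the cutoff of a low-pass filter, so the gun sounds duller. This is not how Battlefield actually implements it. The one-pole filter, the cutoff range of 800 Hz to 8000 Hz, the function names and the 6 kHz test tone standing in for a gunshot are all assumptions for illustration.

```python
import math

RATE = 44100  # samples per second (assumed)


def cutoff_for_ammo(ammo_fraction, low_hz=800.0, high_hz=8000.0):
    """Map remaining ammo (0.0 = empty, 1.0 = full) to a cutoff in Hz."""
    ammo_fraction = max(0.0, min(1.0, ammo_fraction))
    return low_hz + (high_hz - low_hz) * ammo_fraction


def low_pass(samples, cutoff_hz, rate=RATE):
    """Simple one-pole low-pass filter: smooths out high frequencies."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / rate)
    out, prev = [], 0.0
    for x in samples:
        prev += alpha * (x - prev)
        out.append(prev)
    return out


def rms(samples):
    """Average level of a sound."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


# Stand-in for the bright 'crack' of a gunshot: a 6 kHz tone.
shot = [math.sin(2 * math.pi * 6000 * i / RATE) for i in range(2048)]

full_magazine = low_pass(shot, cutoff_for_ammo(1.0))
nearly_empty = low_pass(shot, cutoff_for_ammo(0.1))

# The nearly empty version keeps much less of the high-frequency 'crack'.
print(rms(full_magazine) > rms(nearly_empty))
```

The player never sees a number, but the duller gun tells them the same thing an ammo counter would, without leaving the game world.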

Level three, the sound of the sound of the sound, takes us in a different direction as to the utility of sound in games. This level in aesthetics is concerned with removing goal-orientated play and, Juul argues, this removes a lot of play itself. He says:

The third layer, the aesthetics of aesthetics of aesthetics is not, as we might first think, about going back to play, about letting players be creative in an open universe. It is the reverse: it is about keeping almost all of game structure, keeping goals and “winning”, but removing the playful element of games, removing the element of games where players improve their skills, or where they improvise creatively, where they plan.

Here, Juul is arguing for a definition of play that involves learning skillsets and then using these skills in the game world to do new things. This is similar to learning a musical instrument. At first, the learner is instructed closely and must follow the teacher’s instructions and mimic what they do. As the learner grows in skill, they gain more freedom from the teacher and can begin to find their own voice with the instrument, eventually creating new and exciting ways of playing, all whilst being structured by what they learned in the initial lessons.

Juul argues that if we take away the ‘goals’ and concepts of winning from the gameplay at the outset, then this removes play rather than enhancing it. The beginner players lose their lessons and are simply lost.

Chris Bateman responds that Juul is applying a very specific definition of ‘play’, one that these games fall outside. I’m with Bateman on this one. As a subscriber to Wittgenstein’s theory of Language Games, I hold that ‘play’ cannot be defined in the way Juul is attempting. Evidence of usage of the word ‘play’ (and underlying concepts) falls outside his definition. Whilst this is a gross oversimplification, I don’t want to get bogged down in this at this point.

So where does this leave us with audio? On this third level, if we accept Juul’s argument, then there is no real ludic function as previously described in ‘level 2’. If, however, we take Bateman’s counter-argument into account, then an entirely new and exciting range of possibilities opens up before us. Bateman says:

Proteus, which is my favourite game of this century, is rife with play – what it is devoid of is the play of utility. Bees, frogs, squirrels, sunsets, shamanic figures all provide ample playful elements where the player has ways to assert their agency within the distinct and definite authorial intent, not to mention (since the landscape is a soundscape) the playful expression of an audio journey to match the Zhuangzi-inspired hiking play that lies at the core of Ed Key and David Kanaga’s masterwork.

Play here is not used as a measure of achieving anything other than a nice wander through a nice world. When creating audio, this gives a whole new world to play in (pun intended). Procedurally generated audio and visuals are an exciting direction for game audio to take. In this way, the audio is fully ludic in purpose. Whilst the audio cannot help us achieve our goals – there aren’t any – it can help us create and explore our game world.

When we reject challenge as a necessary part of a game, then, whilst this closes off many applications of audio, it opens up new ones. If the point of playing a game is, in essence, the same as the point of reading a novel, simply to read it and finish it, then it is ludic audio that draws a player aesthetically into the game. The distinction between immersive and ludic audio blurs to the point of non-existence. Immersion in all its forms is the gameplay, the goal, the utility and perhaps even the Holy Grail of the future of game design.

There’s a procedure for everything

Imagine being on a rollercoaster. Every time you ride the rollercoaster, the experience is the same. It has been designed to provide a captive audience with a carefully planned and designed experience. This is an example of ‘ordinary media’, like films or TV. The director plans everything out for you and each time you watch it, you experience the same things in the same order along the same timeline.

Interactive media is different and so cannot be planned out by a director in the same way. Video games can loosely fall into two categories: semi-linear and completely free. Semi-linear games are analogous to a maze, where the player makes choices at pre-determined junctions and is therefore relatively controlled by the game. The player has a limited amount of control and choice. Many adventure games, or narrative-driven games, fall into this category in broad terms. Each level begins and ends at the same place with the same cutscene, no matter how you choose to play the game. A completely free game, or ‘sandbox’ game, like the GTA series, offers different challenges to the sound designer. You can’t have a soundtrack for a level, or a prescribed length of time for a piece of music to play – the player may well not cooperate with your direction.

One answer to this issue is procedural sound design.

Imagine there are four Ace playing cards face down in front of you. You have to guess which suit each card belongs to before it’s turned over. After 6 or 7 goes, you are going to know the order of the cards. That game will become very boring very quickly. Now imagine you add a King, Queen and Jack into the game. Not only do you have to name the suit of each card, but also the order in which they come, AND I have mixed up that order at random. The guessing game now becomes impossible. There is a limited number of possibilities, but the number is so great that there is no discernible pattern. This is what we mean by procedural sound design.

Instead of playing cards, imagine we’ve chopped up five explosion sounds into the sections ‘crack’, ‘boom’, ‘tail’ and ‘thump’. If we stitch these back together in a completely random order, perhaps even randomising some filters, modulation and volume, then from the listener’s point of view, we have an endless cycle of explosions without two full explosions ever repeating themselves. This has the benefit of sounding natural, yet we’ve managed to do it using only a small amount of memory and storage space. To quote the Genie in Aladdin, “phenomenal cosmic powers... itty-bitty living space” (Disney 1992).
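The explosion idea above can be sketched roughly in Python. This is an assumption-heavy illustration rather than a real audio engine: the clip names are hypothetical, the gain and pitch ranges are invented, and I’ve assumed the four sections always play in the same order while the source explosion for each section is picked at random.

```python
import random

SECTIONS = ["crack", "boom", "tail", "thump"]  # assumed playback order
SOURCE_EXPLOSIONS = 5                          # five recordings, each chopped up


def build_explosion(rng):
    """Pick one of the five sources for each section, with random variation."""
    parts = []
    for section in SECTIONS:
        source = rng.randint(1, SOURCE_EXPLOSIONS)
        parts.append({
            "clip": f"explosion{source}_{section}",   # hypothetical clip name
            "gain": round(rng.uniform(0.8, 1.0), 2),  # random volume
            "pitch": round(rng.uniform(0.9, 1.1), 2), # random pitch factor
        })
    return parts


def clip_combinations():
    """Distinct clip sequences before any gain or pitch variation."""
    return SOURCE_EXPLOSIONS ** len(SECTIONS)


rng = random.Random()
for part in build_explosion(rng):
    print(part)
print(clip_combinations(), "clip combinations from", SOURCE_EXPLOSIONS, "recordings")
```

Twenty short clips give 5 × 5 × 5 × 5 = 625 different sequences, and the random gain and pitch push that number far higher, which is why the player is very unlikely to hear the same explosion twice.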

Here’s an example of it:

According to the Oxford Handbook of Interactive Audio,

In the visual and tangible realms, the apprehension of conditions and process is required, but it is usually the results of the process, not the process itself, that are of interest. (Andy Farnell; edited by Karen Collins, Bill Kapralos and Holly Tessler, 2014)

So, basically, how it works is of little interest to most people, but the results are important. When looking at a game that doesn’t use procedural sound design, the effect is obvious.

Although it can be argued that in a game like Minecraft a more realistic or natural sound design policy wouldn’t fit with the aesthetics of the game, the point still stands that the same explosion sound repeated does not sound realistic at all.

However, in games where procedural sound design does operate, the effect is a very natural result.

Gunfire, footsteps, shouts, explosions and most sounds feel natural and don’t instantly repeat on themselves.

In practice, however, it’s not necessarily easy to get right. If one is chopping up a sound that would normally be continuous, the pieces being stitched together need to have similar enough timbres that they don’t jar on the ear. This is my first attempt and it suffers from not sounding properly ‘joined up’. This is obviously a skill that I need to learn.

So after a lot more work and (only a few) tears, I think I’m happy with this version. It feels more natural to me.

Hello!

Welcome to my blog and portfolio. I will be writing about all things game audio and posting examples of my work. I hope you enjoy.

I teach game design at Leeds City College, make music with A Handful of Fools and make games with Sleepy Brain Studios. I love designing sound and music for games as well as designing levels, puzzles and game environments.