kyliethicc
Member
Some devs from PlayStation's Creative Arts Sound team gave a very detailed (and very long) interview on the sound design of Returnal, and also explained a lot more about the PS5's 3D audio. They discuss mixing in 3D as well, and how they created the haptic sounds/feels.
Here are some of the main sections broken out to read. Even so, there's a lot more in the full interview (linked at the end).
Quotes are from Loic Couthier, Simon Gumbleton, Ash Read, Peter Hanson, and Lewis Everest, all members of PlayStation's sound team.
On being PS5 only:
"The advantage of being a PS5 exclusive title is that it gives us a license to go “all in” with the platform features. 3D Audio, from the beginning of our involvement, was our target and we were aiming at doing the maximum we could to provide the best possible audio for Tempest (and haptics). At CSG, we have years of expertise designing, implementing, and mixing 3D Audio, as we have been making PSVR titles since the launch. Returnal has been our first title applying this expertise to the PS5 with a non-VR title."
On designing sound for 3D:
"I think a good point to make is that it’s really about “designing for 3D” across the whole game. The obvious advantage for this game is the situational awareness in combat (hearing an enemy at the right location can literally save your life) and this shines with the variety of enemies in the game, and the great use of verticality in the level design.
But beyond the combat advantage, 3D audio allows us to build totally new experiences for the player. A good example is the mural particles, where we can surround and immerse the player in a dynamic and reactive “cloud” of sound, something that was simply not possible before.
Going beyond these showpiece moments, and extending the philosophy of “designing for 3D” to everything in the game, we have been able to craft rich, detailed, and enveloping environments for the player to explore and truly feel immersed in the world of Atropos.
As for challenges, certainly designing for 3D requires more content. There is much more space for audio to surround the player; we need to consider the location and size of objects more carefully at the design stage. It also has an impact on the implementation, as we need to use more game objects, and play more voices (e.g. a single ambient bed on its own is 16 channels!)."
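For context on that 16-channel figure: a full-sphere ambisonic signal of order N carries (N + 1)^2 channels, so 16 channels is what a third-order bed would need. The interview doesn't say which order the ambient beds use, so that part is my inference. A quick sketch of the arithmetic (the function name is just mine):

```python
def ambisonic_channels(order: int) -> int:
    """Channel count of a full-sphere ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    # Each order n adds 2n + 1 spherical-harmonic components,
    # which sums to (order + 1) squared.
    return (order + 1) ** 2


for order in range(1, 6):
    print(f"order {order}: {ambisonic_channels(order)} channels")
```

By the same formula, the 5th Order Ambisonics mentioned in the mixing section works out to 36 channels.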
On mixing 3D audio:
"The mix of a 3D title is complex due to the hybrid nature of the content and formats you are summing together.
In Returnal, we have a mix of 5th Order Ambisonics, 3D Objects, 7.1 Passthrough, as well as stereo haptics and controller speaker. This is the norm for a title that supports all the PS5 audio features.
Each of these bus structures have very specific requirements as to how they can actually be manipulated (for example, objects cannot benefit from bus processing), how expensive that is (ambisonics is 4-times more demanding than 7.1), and how they behave when they are downmixed.
The gain structure takes time and experimentation to get under control. The mixer in Returnal is 540 busses wide, with another 260 aux busses. While this really isn’t a quality factor, it shows the complexity of the systems you have to deal with as a mixer, and the time it takes to assign tens of thousands of sounds to that mixer, with the correct rules."
...
"In terms of “content bandwidth” (i.e., how much we can play at a time), it’s important to acknowledge that our brains are the real bottleneck, especially with this generation where we can play as many sounds as we want (technically).
While the technology can play hundreds of sounds in 3D, less than 5 are enough to saturate our ability to understand what is going on. We do have more available “space” to cover with positional sounds, and more CPU to generate them, but we don’t have “more brain” to understand the soundscape! And this is obviously where the mix comes in.
The 3D mix ends up delivered in two channels of binaural stereo. That is a lot more information to print on the “same old” two waveforms. Having more “playback space” doesn’t make the mix easier, on the contrary, there is a lot more content fighting for our brain’s attention.
It took some time for me to understand how to handle the mix in Returnal (conceptually) and a lot more time to manually set it up in Wwise haha..
I started in the wrong direction of treating the game like a blockbuster shooter and realized that the powerful player gun approach, as satisfying as it felt, was not right. You need to hear everything around you when you play, otherwise you’re dead, haha! People would get really frustrated if they were dying because of the mix."
...
"Then I drafted a list of the most important actions/attacks that, as a player, you want/need to know are happening — the ones that can deal more damage, the ones you must avoid as a priority, the ones that could feel unfair if the sound wasn’t helping.
It led to creating a priority-based system, where all enemy sounds were assigned a priority level (this includes everything, footsteps, etc.). The higher priority sounds are given the right to affect the lower, usually by ducking them as they play. Much of mixing the game was to define what sound needed what priority in what context, and how it should affect the rest. And also making sure no exception or manual mistake could ruin it all, haha…
Obviously, while those enemies are attacking, you also have Selene firing, moving, potentially talking, the music blasting, and the UI punctuating all of this. There’s a good average of 70 projectiles around you when it’s busy — all that for the mixer to digest."
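To make the priority idea concrete, here's a minimal sketch of how priority-based ducking could work. This is not how the team set it up in Wwise. The priority scale, the sound names, and the dB amounts are all made up for illustration. Each active sound gets turned down according to how far it sits below the highest priority currently playing, with a floor so nothing vanishes completely.

```python
from dataclasses import dataclass


@dataclass
class Sound:
    name: str
    priority: int          # hypothetical scale: higher number = more important
    gain_db: float = 0.0   # gain before any ducking is applied


def apply_priority_ducking(active, duck_db_per_level=-3.0, max_duck_db=-12.0):
    """Return {name: gain_db} after lower priorities are ducked by higher ones."""
    if not active:
        return {}
    top = max(s.priority for s in active)
    mixed = {}
    for s in active:
        levels_below = top - s.priority
        # Duck more the further below the top priority, but never past the floor.
        duck = max(levels_below * duck_db_per_level, max_duck_db)
        mixed[s.name] = s.gain_db + duck
    return mixed


# Hypothetical moment in combat: one dangerous attack plus background sounds.
active = [
    Sound("boss_charge_attack", priority=3),
    Sound("enemy_projectile", priority=2),
    Sound("enemy_footsteps", priority=0),
]
levels = apply_priority_ducking(active)
print(levels)
```

With these made-up numbers the charge attack stays at 0 dB, the projectile drops 3 dB, and the footsteps drop 9 dB. A real mix would also need attack and release times on the ducking, which this sketch leaves out.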
On DualSense Haptics:
"The advent of haptics on PS5 was another novel concept for the audio team to get to grips with and the approach for using it in Returnal took some time to figure out. The strengths and weaknesses of haptics took some getting used to and what might seem obvious now was an unknown ruleset at the start of development.
Initially, attempts were made to utilize the haptics signal path as a secondary set of speakers (or a third set if you count the controller speaker) in order to invoke the same kind of response you would expect from hearing sound effects. That was the natural instinct because, up to that point, our day-to-day tasks were about designing sound content for your ears and consequently designing for the brain that interprets that content, deciphers what that content means informationally, what that content is related to visually, what emotion is consequently generated from that content, and so on.
The initial assumption was that perhaps the haptics were capable of delivering the same kind of response — delivering information to the brain (via the hands) that would interpret it in the same way as sound. Through experimentation, we learned that the perception of haptics content is almost always swayed by the other senses. For example, trying to design haptics content that suggests any kind of emotion is next to impossible without the context provided by the audio and visuals.
Similarly, the exact same haptic content can be perceived differently depending on the visual and audio context. For example, the same haptic pulses used for landing on solid ground after jumping will be perceived as feeling different if landing in mud (despite the content being the same). The brain makes the connection between what is happening on-screen and informs the hands what they are feeling.
So, with that knowledge behind us we established a general rule for what worked best: haptic content works best when all senses are in agreement. If the eyes see rain, the ears hear rain (better yet, they hear rain in 3D Audio) and the hands feel a sensation like rain. So the brain can be totally convinced by all the data it is receiving that it is indeed raining. This is where haptics can really shine, adding new physicality which aids the depth of immersion."
...
"The nature of each feature in-game would dictate what approach would work best for the haptic content; many of the haptic sensations in-game are actually synths being processed at runtime, which allows the most flexibility when designing the feel of the haptics in context, actually playing the game and feeling the haptics react."
On Returnal:
"A lot of those audio concepts and iterations were also reviewed by Housemarque directors, to ensure it was fitting their creative expectations. Many elements in the game have a background story and deep meaning; we wanted to make sure this was conveyed by audio (the sonic DNA I mentioned previously), on top of the “primary requirement” of sounding exciting and fitting the sonic style of the game.
Worth noting, it was only at a short peak of the production that we had 30 simultaneous sound designers (this includes outsourcing), and yes it was crazy haha! For the most part of our 2-year audio development on Returnal, we had around 10 people between Housemarque and Sony CSG. The reason for expanding so much was the sheer amount of varied content to produce under tight deadlines. It was too much, but we had to!"
...
"From the technical side, the “ever-changing” nature of the game presented some exciting challenges for us. Firstly, the world is different each cycle. The level layout is generated on-the-fly from smaller modules we called “rooms.” Because there is no fixed level, we had to build systems to automatically handle the ambience beds and reverb zones as well as stitching them all together with portals to allow for propagation between rooms. This allows the player to move seamlessly through any combination of spaces in a level and always be enveloped by the right ambience and reverb."
...
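The rooms-and-portals setup described above can be pictured as a graph: rooms are nodes, portals are edges, and sound loses some level each time it passes through a portal. Here's a rough sketch under that assumption. The room names, the per-portal loss values, and the cutoff are hypothetical, and the interview doesn't describe the actual propagation model.

```python
import heapq


def audible_rooms(portals, listener_room, max_loss_db=18.0):
    """Smallest accumulated portal loss (dB) from the listener to each reachable room.

    `portals` maps a room name to a list of (neighbour, loss_db) pairs.
    Rooms that can only be reached with more than `max_loss_db` are left out.
    """
    best = {listener_room: 0.0}
    queue = [(0.0, listener_room)]
    while queue:
        loss, room = heapq.heappop(queue)
        if loss > best.get(room, float("inf")):
            continue  # stale queue entry, a quieter path was already found
        for neighbour, portal_loss in portals.get(room, []):
            total = loss + portal_loss
            if total <= max_loss_db and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# A layout as it might be stitched together for one cycle (names are invented).
portals = {
    "crash_site": [("ruins_hall", 6.0)],
    "ruins_hall": [("crash_site", 6.0), ("cavern", 9.0)],
    "cavern": [("ruins_hall", 9.0), ("boss_arena", 6.0)],
    "boss_arena": [("cavern", 6.0)],
}
reachable = audible_rooms(portals, "crash_site")
print(reachable)
```

Because the function only needs the portal graph, it works for whatever combination of rooms a cycle generates, which is the point the quote makes about not having a fixed level.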
"To help define what an Alien gun should sound like, we studied the three main categories of weapon-type as defined by Housemarque: Electromechanic, Biomechanic, and Cosmic.
We decided that the Electromech arsenal would draw heavily from synth and electricity sources (we were keen to steer away from laser ‘pew pew’ sounds). The Biomech wanted some organic content to bring the weapon to life (they are literally alive) while the Cosmic category needed to sound unlike any traditional gun but still very powerful (‘air ripping thunder’ was a frequent reference)."
...
"For the alien fauna vocals, we wanted to use organic elements that reflected their mass and the kind of materials that make up their physical structure. The first four-legged creatures you come across have large, almost bird-like skulls, so along with using bird call source material for their shrieks and screams we used resonance to give the sound a more hollow and flavored tone to match how the sound could emit from their jaws."
For most non-humanoid creatures, resonance played an important part in making their vocal tones not only flavourful but believable. The massive bipedal monstrosity that claimed many players on their first encounter consists of numerous deep breaths, hisses, and gurgly inhales (mostly blowing air into yoghurt through plastic tubing). The plastic tube breaths were great to convey the creature’s potentially massive and messy airways, while we used a mixture of sub-harmonics plugins and resonance effects on other source material to match that tube flavor."
Full interview link
Returnal: How its glorious, dark electronic sound was made (and making the most of PS5's new 3D audio engine) | A Sound Effect
Get a powerful game audio deep-dive with Loic Couthier, Simon Gumbleton, Ash Read, Peter Hanson, Lewis Everest, and Harvey Scott:
www.asoundeffect.com