News
Creating the Next Generation of Audio Tools
Nicolas Fournel, owner of Tsugi, talks to Behind the Glass about starting the audio tools company, how he meets customer demand, and how he has seen the game industry change.
In 2011, Fournel moved to Niigata in Japan. With the area more famous for mouth-watering sushi, sake and rice than for its audio or gaming industries, he decided to create his own company, Tsugi, or ‘next’ in Japanese. “The name seemed apt as it was literally the next adventure in my life, and the goal behind Tsugi was to focus on R&D for the next generation of game audio tools and engines, using mainly procedural audio and machine learning techniques, as well as new types of user interfaces.”
The development of the sound design tools for which Tsugi is best known came later. The first graphic designer he hired was a local artist, well versed in manga and anime production (another speciality of Niigata). As a side project, DSP Anime was born. “Right after its release, it was rather unexpectedly picked up by all the biggest news portals in Japan, even making the front page of Yahoo News. Having DSP Anime suddenly classed as the most interesting new audio software ahead of Apple’s GarageBand in Japan put us on the map, and we have continued the development of software for sound designers and game developers since then.”
“R&D is still our main business and we have now worked with most of the big Japanese game companies and a lot of AAA studios around the world as well, still focusing on procedural audio and artificial intelligence which are part of the DNA of the company.”
Over the years Fournel has seen the game audio industry change. “When I started my career a few decades ago, the audio programmer and the sound designer working on a project were often one and the same person. There was no middleware, and everything was pretty much hardcoded. A decade or so later, audio middleware radically changed the way game audio was implemented, by putting more control in the hands of the sound designer, with the advent of tools like ADX and FMOD (I was then working on MusyX, the audio tool shipped with the Dolphin SDK for the Nintendo GameCube). After a while, Wwise also appeared. As these systems became more and more complex and the size of the projects grew exponentially, the need for technical sound designers, who would bridge the technical and creative worlds (similar to what occurred in graphics), became apparent. So basically, the maturation of the discipline triggered a specialization of its actors.”
Where tools are concerned, he has also observed that, in the same way music producers don’t want to leave their DAW and prefer everything to run as plug-ins, sound designers for games increasingly want to work in the game middleware itself, where the rest of the development happens, for better integration. “Game middleware is starting to offer advanced audio features that are sufficient for a lot of games. It seems to me that the days of audio middleware as a huge piece of software sitting outside of the main development tool are numbered.”
Customer expectations have naturally changed in terms of quality, with recording gear becoming more capable, but also in the sheer number of sound assets (for sound banks) or presets (for synths) that vendors are expected to provide. User interface and ease of integration into the designer’s workflow are now more carefully examined as well. “The way we meet some of these expectations is inherent to the way our tools work. Since procedural audio can generate sounds from scratch, it uses the full dynamic range right from the beginning, and we do not have to deal with potential background noise, etc.”
“The fact that procedural audio is based on sound models, for which adjusting a parameter will generate a new sound, allows our customers to get an almost infinite number of sounds from our products. You could design a wood impact sound, then change the size of the wood piece, its resonance, the strength of the impact, etc., and generate a new sound from it. If needed, you can assign a range of useful values to each parameter based on your requirements and creative choices, and then automatically generate hundreds of variations.”
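To make that workflow concrete, here is a minimal sketch of a parametric sound model in the spirit he describes: a toy wood impact built from a small bank of decaying resonators, with size, resonance and strength parameters, followed by batch generation of variations. The model, its parameter mappings and all the numbers are illustrative inventions, not Tsugi's implementation.

```python
# Illustrative only: a toy "wood impact" model with user-facing parameters,
# not Tsugi's actual technology. All mappings and constants are invented.
import numpy as np
from scipy.io import wavfile

SR = 44100

def wood_impact(size=0.5, resonance=0.5, strength=0.8, seconds=0.6):
    """Render one impact from a small modal (resonator-bank) model.
    size      -> shifts the modal frequencies down as the object grows
    resonance -> lengthens the decay of each mode
    strength  -> scales the output level
    """
    t = np.arange(int(SR * seconds)) / SR
    base = 180.0 / (0.3 + size)            # bigger piece of wood, lower pitch
    out = np.zeros_like(t)
    # A few inharmonic modes, as in struck-object sounds.
    for ratio, gain in [(1.0, 1.0), (2.31, 0.6), (3.72, 0.35), (5.18, 0.2)]:
        decay = 4.0 + 20.0 * (1.0 - resonance)
        out += gain * np.sin(2 * np.pi * base * ratio * t) * np.exp(-decay * t)
    return strength * out / np.max(np.abs(out))

# Assign a useful range per parameter, then batch-generate variations.
rng = np.random.default_rng(7)
for i in range(100):
    y = wood_impact(size=rng.uniform(0.3, 0.9),
                    resonance=rng.uniform(0.4, 0.8),
                    strength=rng.uniform(0.5, 1.0))
    wavfile.write(f"wood_impact_{i:03d}.wav", SR, (y * 32767).astype(np.int16))
```

The point is the workflow rather than the synthesis itself: define useful ranges once, then render as many distinct, full-quality assets as the project needs.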
This is possible across all the tools, from the DSP series, which focuses on one genre of game at a time, to Tsugi’s flagship tool, GameSynth. “In addition, the patch repository of GameSynth is the largest collection of procedural sound models available for games and animations. Combined with a patching environment to model any sound you can imagine and the possibility to generate infinite variations, it is a tool you can spend your life in!”
As for integration, all of their products interface with the main tools of the trade. For example, the sound generators from the DSP Series export sound variations directly to Unity, even generating C# scripts on the fly. GameSynth interfaces with Unity, GameMaker Studio and Cocos2D, but also with game audio middleware such as Wwise, ADX2 or FMOD Studio. So, you can automatically generate a few whoosh variations, add them to your project, create the right type of container and apply the default settings you want, all in one click. It also exports directly to Reaper and Audacity.
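As a rough illustration of that export idea, the sketch below drops a set of hypothetical whoosh variations into a Unity project folder together with a small auto-generated C# player script. The file layout, names and generated script are invented for illustration; they are not the DSP Series' actual output format.

```python
# Purely illustrative of the "export variations plus a script" idea; the file
# layout and the generated C# below are invented, not the DSP Series' output.
from pathlib import Path

out = Path("Assets/Audio/Whooshes")
out.mkdir(parents=True, exist_ok=True)

# Suppose the generator has just rendered these variation files into 'out'.
clips = [f"Whoosh_{i:02d}.wav" for i in range(4)]

script = """using UnityEngine;

// Auto-generated companion script: plays a random whoosh variation
// each time, to avoid audible repetition.
[RequireComponent(typeof(AudioSource))]
public class WhooshPlayer : MonoBehaviour
{
    public AudioClip[] variations;   // assign the exported clips in the Inspector

    public void PlayWhoosh()
    {
        var clip = variations[Random.Range(0, variations.Length)];
        GetComponent<AudioSource>().PlayOneShot(clip);
    }
}
"""
(out / "WhooshPlayer.cs").write_text(script)
print(f"Exported {len(clips)} variations and WhooshPlayer.cs to {out}")
```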
“In addition, if you are designing sound for animation, GameSynth can import animation curves (and even motion capture data) from many graphics packages such as Houdini, Spine, Sprite Studio, Live2D, Maya, 3DS Max, Blender, etc., to drive its sound parameters, so the audio synchronizes perfectly with the movements.”
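The principle is easy to sketch: resample the imported curve to a control rate and map it onto a model parameter. In the hypothetical example below, a hand-speed curve exported as CSV drives the brightness and level of a whoosh; the file format, column layout and mapping are assumptions made for illustration, not GameSynth's importers.

```python
# Hypothetical sketch: an exported animation curve driving sound parameters,
# so the audio tracks the motion. CSV format and mappings are assumptions.
import csv
import numpy as np

def load_curve(path):
    """Load (time, value) keyframes, e.g. a hand-speed curve from a DCC tool."""
    with open(path) as f:
        rows = [(float(t), float(v)) for t, v in csv.reader(f)]
    times, values = zip(*rows)
    return np.array(times), np.array(values)

times, speed = load_curve("swing_speed.csv")

# Resample to an audio control rate and map motion speed to the whoosh's
# center frequency and gain: faster movement -> brighter, louder sweep.
ctrl_t = np.arange(times[0], times[-1], 1 / 200)       # 200 Hz control rate
ctrl_v = np.interp(ctrl_t, times, speed)
norm = (ctrl_v - ctrl_v.min()) / (ctrl_v.ptp() or 1.0)  # normalize to 0..1
whoosh_freq = 200.0 + norm * 3800.0                     # Hz
whoosh_gain = 0.2 + norm * 0.8
```

If the animation changes length or timing during production, only the CSV changes; the sound is simply regenerated, which is exactly the advantage described for linear media below.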
So what makes Tsugi different from other plugin companies? “I guess it is the fact that we are the only company fully specialized in procedural audio for games and animations, with more than 10 years of experience in that domain, and a very strong R&D focus. All our technology is proprietary, and we also invest a lot of time in developing new user interfaces for interacting with sounds: from creation to editing and even browsing.”
“For example, one of our signature UI elements is the Sketch Pad, on which you can literally draw sounds with a mouse or graphic tablet. More than a 2D surface, it also takes speed, acceleration and pressure into account, and can sometimes even interact with generative graphic triggers and backgrounds. This is especially visible in DSP Motion and DSP Action, two very affordable sound design tools which let you draw evolving and high-impact sounds, respectively, but also in GameSynth, in which you can draw contact sounds, whooshes, footsteps, weather elements, or even control a massive amount of sound particles with your movements. In short, in our sound design tools you are sure to find ways to create and interact with audio that are not available anywhere else, which makes them such good complements to whatever you already have in your arsenal.”
Tsugi’s tools are built on procedural audio. “Procedural audio is the natural evolution of audio for interactive media, just as 3D graphics were the natural evolution of still pictures. It allows sounds to be rendered differently based on what is happening in the game, like a 3D object would be rendered differently based on the camera’s position, lights, etc. Of course, artists still use pictures for textures or even for sprites and backgrounds in games, and we will always use samples as well (possibly in some cases to inform procedural audio models via feature analysis). However, for anything interactive, procedural audio offers undeniable advantages.”
“You can take a sound model, adjust a few parameters, and render a new variation, potentially in real-time, without any loss in audio quality. This is perfect for games, both to modify a sound dynamically based on the context (strength of impact, material, etc.) and to create a different sound each time it is needed, to fight ear fatigue. Moreover, because the sound model itself is saved instead of many audio files, the memory footprint is orders of magnitude lower than with sampled data.”
“Actually, procedural audio is also great for linear media. For instance, if the duration or the movement of an animation has changed during production, you simply need to import the new animation curves and generate the sound again, instead of needing to record new source material (which is not always possible) or to apply some artefact-prone post-processing.”
There have been many technical challenges along the way. “As we have been the only company specializing in procedural audio for games and animation for a long time, we had to come up with new ways to create models, new paradigms to interact with them, ways to store them, and so on. In procedural audio, some of the hardest models to design are probably those related to animal vocalizations.”
Technical hurdles are usually dealt with relatively easily. “I think the main challenge we had to overcome at first was the perception that procedural audio could not compete with traditional sample libraries. This perception has changed considerably though, especially for interactive audio, and we are lucky enough to have now developed custom procedural audio solutions for most of the major Japanese studios. GameSynth now has several thousand users across the world as well, with accelerating sales, which proves that we have passed that milestone.”
Sound designers face recurring challenges, and Tsugi is known for meeting them. “One of the main challenges for game sound designers is not only to create great-sounding assets, but to implement them successfully in their game. The audio needs to sound good in context, and often to be interactive and non-repetitive as well. Procedural audio is obviously already a good solution for all of this. That being said, the main question we usually get after this is how to make a procedural audio model sound more natural, or realistic. Over time, we have developed a series of control generators for GameSynth that simulate natural phenomena, physical movements and so on. In addition, audio analysis can be used in most GameSynth models to extract audio features (pitch, volume, and noisiness curves, but also resonances, event distribution, etc.), in order to create more realistic control curves. Audio analysis is something that can advantageously be used outside of procedural audio to inform a design process, and I feel it is still very underused.”
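The analysis workflow he describes can be approximated in a few lines: frame a reference recording, extract volume and pitch curves, then reuse those curves to drive a procedural model. The sketch below uses a crude RMS measure and an autocorrelation pitch tracker purely for illustration; GameSynth's own analysis features are, of course, its own.

```python
# A minimal sketch of the analysis idea: extract volume and pitch curves from
# a reference recording for reuse as control curves. The frame sizes and the
# autocorrelation pitch tracker are illustrative choices, not GameSynth's.
import numpy as np
from scipy.io import wavfile

sr, x = wavfile.read("reference.wav")
x = x.astype(np.float64)
if x.ndim > 1:
    x = x.mean(axis=1)                  # mix down to mono
x /= np.max(np.abs(x)) or 1.0           # normalize (guard against silence)

frame, hop = 2048, 512
volume, pitch = [], []
for start in range(0, len(x) - frame, hop):
    w = x[start:start + frame] * np.hanning(frame)
    volume.append(np.sqrt(np.mean(w ** 2)))            # RMS volume curve
    ac = np.correlate(w, w, mode="full")[frame - 1:]   # autocorrelation
    lo, hi = sr // 1000, sr // 60                      # lags for ~1 kHz..60 Hz
    lag = lo + int(np.argmax(ac[lo:hi]))
    pitch.append(sr / lag)                             # crude pitch estimate

# 'volume' and 'pitch' can now drive a synth model instead of raw samples.
```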
Another challenge that all sound designers are confronted with is managing a very large quantity of assets. “We have developed different ‘smart’ technologies based on audio feature extraction and machine learning, such as Dataspace, which lets you browse files based on their perceptual features. So, for example, all similar-sounding files will be located close to each other, independently of their type, name, and location on disk. This is not only a very fast way to find similar sounds, but also a great way to get sound design ideas. If you find that a specific thunderclap sounds like a distant explosion, why not use it in your battle scene design? This technology is actually coming to GameSynth 2022 as well.”
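Conceptually, that kind of similarity map can be sketched as follows: summarize each file with a small feature vector, project the vectors to 2D for browsing, and query nearest neighbours for "sounds like this" searches. The features, file names and library layout below are invented stand-ins, not Dataspace itself.

```python
# Conceptual sketch of similarity-based browsing (not Dataspace): each file
# becomes a small spectral feature vector; nearby points sound alike.
import numpy as np
from scipy.io import wavfile
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def features(path):
    """Crude descriptor: log-spectrum statistics, RMS level, dominant frequency."""
    sr, x = wavfile.read(path)
    x = x.astype(np.float64)
    if x.ndim > 1:
        x = x.mean(axis=1)
    spec = np.abs(np.fft.rfft(x[: sr * 2], n=4096))    # first ~2 s of audio
    logspec = np.log1p(spec)
    return np.array([logspec.mean(), logspec.std(),
                     np.sqrt(np.mean(x ** 2)),          # overall level
                     np.argmax(spec) * sr / 4096])      # dominant frequency (Hz)

paths = ["thunder.wav", "explosion.wav", "door.wav"]    # your library here
X = np.vstack([features(p) for p in paths])

coords = PCA(n_components=2).fit_transform(X)           # 2D browsing map
nn = NearestNeighbors(n_neighbors=2).fit(X)
_, idx = nn.kneighbors(X[:1])                           # nearest to paths[0]
print([paths[i] for i in idx[0]])                       # e.g. thunder ~ explosion
```

A production system would use far richer perceptual features, but even this toy version surfaces the thunderclap-versus-explosion kind of kinship described above.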
As game technology is always pushing boundaries, Tsugi makes sure it stays at the forefront of customer needs. “Having been in the industry for a (very!) long time definitely helps us spot trends and allows us not to invest too much time in “flashy” technologies that we believe may not survive past a year. More generally, we try to determine which core technologies will play an important role in the long term and be beneficial to sound artists. Of course, we also listen to our customers. Then, we work on these new technologies until we are satisfied that they can bring something to our users once integrated into our creative tools. That was for example the case with the aforementioned ‘smart’ asset management system.”
So where does Fournel see sound libraries heading in the next 5-10 years? “I think there may be a lot of ‘smarter’ tools that will go much further than the basic digital sound processing functions and will assist the designer based on what he/she/they are actually working on (editing, processing, tagging, etc.). This implies some kind of AI, coupled with audio or music analysis functions, that will know what you are trying to achieve and will help you do it. This has actually already started in some ways.”
“New methods of generating sounds such as procedural audio and machine learning (using generative adversarial networks or newer techniques) may change the way sound libraries are created. The goal will not be to replace the sound designer of course, but rather to help capture his/her/their process to build models and templates more suitable for interactive audio than a collection of fixed recordings (which will still be needed!).”