Project: Real-time Interactive 3D Rendering of Musical Recordings
i3DMusic aims to enable enhanced playback of existing musical content on current and emerging 3D audio systems. i3DMusic will create a new class of audio playback systems for mono, stereo, and 5.1 content by combining specifically designed audio source separation techniques and sound spatialization algorithms into real-time, interactive products targeted at sound engineers, DJs and consumers.

Audio source separation is an emerging technology that separates several sound sources from mixed audio files. Ideally, it aims to reverse-engineer the mix to recover the original source tracks. Spatialization refers to sound rendering techniques that reproduce the spatial properties of individual sound sources (angle, height, distance) and of their environment (reflections from the boundaries of the real or virtual room, reverberation).

Existing systems do not rely on source separation; instead, they make it possible to isolate or spread certain spatial slices of the signal (for example, the elements of the signal located in the centre of the sound stage). This leads to poor spatialization and limited interaction possibilities. These systems, generally called “upmixing” systems, are found in several professional products (DB 600-C by Dolby, Neural Surround Upmix by DTS, Sonic Spatializer by sonic emotion). Upmixing is also sometimes used in consumer products (for example in sound cards and hi-fi equipment). By contrast, i3DMusic will provide access to the sources contained in the signal, for example a voice, drums or a solo instrument, thus greatly improving spatialization quality and making it possible to manipulate the spatial position or other spatial characteristics of each source in real time.

Perfect source separation remains an unsolved problem to date.
Nevertheless, approximate source separation as performed by current state-of-the-art systems can be sufficient for high-quality spatialization purposes. Due to the masking properties of the human ear, distortions between the estimated sources and their original counterparts may not be heard when all tracks are rendered simultaneously on multiple channels, depending on the chosen rendering setup, the type of distortions and the degree of content manipulation. i3DMusic will seek to understand and model the conditions under which such distortions are inaudible, and will develop improved real-time source separation algorithms based on the findings. Similarly, perfect spatialization for a large number of listeners would require an immense number of loudspeakers and remains out of reach. The algorithms developed within i3DMusic will account for the limited capabilities of human perception for sound source localization in order to propose highly efficient solutions.

The market applications will be:
- 3D audio for events and installations, in which mono or stereo signals are input and rendered on 3D audio systems, with interaction possibilities for the audio engineer or DJ,
- music production/post-production systems, enabling an operator to transform mono or stereo music content into spatialized content in a simple way,
- consumer products, derived from the professional applications.

The consortium will be composed of:
- Audionamix, French SME specialized in audio source separation technologies and implementation,
- sonic emotion, Swiss SME specialized in 3D sound,
- INRIA, METISS team, French laboratory specialized in research on audio source separation,
- EPFL, Acoustic Group, Swiss laboratory specialized in audio engineering and sound quality assessment.
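
The contrast drawn in the description between isolating a "spatial slice" and recovering an actual source can be illustrated with a small sketch. The code below is only a textbook mid/side decomposition and a constant-power pan, written under the assumption of a toy two-channel mix; it is not the method used by any of the products or partners named above, and the names `mid_side`, `constant_power_pan`, `voice` and `guitar` are hypothetical.

```python
import math


def mid_side(left, right):
    """Split a stereo signal into a centre (mid) and a side component."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def constant_power_pan(mono, angle_deg):
    """Pan a mono signal between two loudspeakers.

    angle_deg runs from -45 (fully left) to +45 (fully right).
    """
    theta = math.radians(angle_deg + 45.0)  # map the angle to 0..90 degrees
    gain_left, gain_right = math.cos(theta), math.sin(theta)
    return [gain_left * x for x in mono], [gain_right * x for x in mono]


# Toy stereo mix (assumed data): a "voice" in the centre, a "guitar" hard left.
voice = [0.5, -0.5, 0.25, 0.0]
guitar = [0.2, 0.1, -0.3, 0.4]
left = [v + g for v, g in zip(voice, guitar)]
right = list(voice)

# The centre slice is voice + guitar / 2, not the voice alone.
mid, side = mid_side(left, right)

# Re-position the centre slice slightly to the right of the stage.
new_left, new_right = constant_power_pan(mid, 20.0)
```

In this toy case the centre slice still carries half of the guitar signal, so moving it also moves part of the guitar. A separated voice track, by contrast, could be repositioned on its own, which is the kind of per-source manipulation the project targets.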
Acronym | i3DMusic (Reference Number: 5582) |
Duration | 01/10/2010 - 31/03/2014 |
Project Topic | i3DMusic will provide a system enabling real-time interactive respatialization of mono or stereo music content. This will be achieved by applying the emerging technology of audio source separation in the context of 3D audio rendering. |
Project Results (after finalisation) |
The goal of the R&D program was to implement a voice unmixing technology operating in real time (the responsibility of Audionamix and INRIA-IRISA METISS) and to combine it with real-time 3D sound spatialization (developed by sonic emotion). The combination was evaluated by the LEMA laboratory at EPFL. The main use case was the spatial mixing of the voice in music by DJs. In this project, Audionamix first prototyped a causal version of its algorithm (converting it from offline to online) and evaluated the results. Given the high computation time and the resulting compromise on quality, other algorithms were implemented, prototyped and compared with the baseline offline algorithm. The use case was modified to take the technological challenge into account: since the technology was not mature enough to enable real-time unmixing, we assumed that a user would prepare the unmixed tracks before the actual performance (which DJs already do to a certain extent). The offline algorithm was then used to build the final prototype and was improved from a sound quality point of view, especially for the extraction of reverb and consonants. Most of the technologies developed within the project have been integrated into Audionamix's ADX Trax product, released in January 2014. The product was also used in the final demo of the project in Zürich in May 2014. In summary: (1) an online version of the baseline voice extraction algorithm was implemented, but with limited results (the speed/quality compromise was not good); (2) variants of the online algorithm were implemented; (3) the baseline algorithm was improved; (4) the technology was integrated into a software product released in 2014. |
Network | Eurostars |
Call | Eurostars Cut-Off 4 |
Project partner
Number | Name | Role | Country |
---|---|---|---|
4 | AUDIONAMIX | Coordinator | France |
4 | Ecole Polytechnique Fédérale de Lausanne | Partner | Switzerland |
4 | INRIA METISS Project-Team | Partner | France |
4 | sonic emotion | Partner | Switzerland |