The S3A project created an integrated audio-visual system for object-based audio. This led to a coherent ecosystem of software tools to capture, create, represent, reproduce, evaluate and archive object-based content. Our novel system architecture enabled new scientific contributions around objectification: the creation of audio object metadata from various audio-visual inputs and processing. We first introduce the system, then present an evaluation of the end-to-end system with visual interfaces (for performer and listener tracking), and show that combining beamforming with channel-based recording can provide an enhanced listening experience.
The S3A project has focussed on listener-centric reproduction, putting the listener at the centre of the experience. Achieving this involves developing a detailed understanding of the listeners’ perception. In object-based reproduction there is much more to consider than the audio objects alone: the listener, their devices and their environment are also treated as objects. These objects interact via metadata, which allows the audio to be optimised based on listener, device and environmental attributes. This adaptation is informed by the results of perceptual experiments and allows, for example, personalised audio mixes, adjustment of reverberation to match the environment, or automated up/downmixing.
We have developed a framework for semantically informed rendering of object-based audio, in which metadata can be adapted to optimise audio reproduction. The Metadapter software enables creation of processors for modifying metadata based on semantic descriptions of audio objects, loudspeakers, listeners, and/or the reproduction space. Adaptation can be driven by rulesets (which might, for example, be derived from experiments with mix engineers). Example implementations include: automatic remixing of 3D content for 2D loudspeaker arrays; personalisation of envelopment; choosing appropriate loudspeakers from an ad hoc array of connected devices; and modifying reverberation depending on the acoustics of the reproduction space.
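The idea of rule-driven metadata adaptation can be illustrated with a minimal sketch. It is not the Metadapter API: the names `AudioObject`, `Context`, `flatten_to_2d`, `boost_dialogue` and `adapt`, and the metadata fields themselves, are hypothetical and chosen only for illustration. The sketch assumes that each rule takes an object's metadata plus a description of the reproduction context and returns (possibly modified) metadata, and that a ruleset is simply an ordered list of such rules.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class AudioObject:
    """Hypothetical per-object metadata (not the S3A metadata schema)."""
    name: str
    azimuth_deg: float
    elevation_deg: float
    gain: float      # linear gain
    category: str    # semantic label, e.g. "dialogue" or "ambience"


@dataclass(frozen=True)
class Context:
    """Hypothetical description of the listener and reproduction system."""
    has_height_speakers: bool
    dialogue_boost_db: float


# A rule maps (object metadata, context) to adapted object metadata.
Rule = Callable[[AudioObject, Context], AudioObject]


def flatten_to_2d(obj: AudioObject, ctx: Context) -> AudioObject:
    """Project elevated objects onto the horizontal plane for 2D arrays."""
    if ctx.has_height_speakers:
        return obj
    return replace(obj, elevation_deg=0.0)


def boost_dialogue(obj: AudioObject, ctx: Context) -> AudioObject:
    """Apply a listener-specific gain (in dB) to objects labelled as dialogue."""
    if obj.category != "dialogue":
        return obj
    return replace(obj, gain=obj.gain * 10 ** (ctx.dialogue_boost_db / 20))


def adapt(objects: List[AudioObject], ctx: Context,
          rules: List[Rule]) -> List[AudioObject]:
    """Apply every rule, in order, to every object; inputs are not mutated."""
    adapted = []
    for obj in objects:
        for rule in rules:
            obj = rule(obj, ctx)
        adapted.append(obj)
    return adapted


if __name__ == "__main__":
    scene = [
        AudioObject("voice", 0.0, 0.0, 1.0, "dialogue"),
        AudioObject("birds", 45.0, 60.0, 0.5, "ambience"),
    ]
    ctx = Context(has_height_speakers=False, dialogue_boost_db=6.0)
    for obj in adapt(scene, ctx, [flatten_to_2d, boost_dialogue]):
        print(obj)
```

Keeping the rules as small, independent functions means a ruleset can be reordered or extended without touching the renderer, which only ever sees the adapted metadata. A real system would likely need richer rules (for instance, ones that consider several objects or loudspeakers at once), which this per-object sketch does not cover.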