
WHEN
15 July 2025
WHAT
Talks and performances exploring the intersection of creative audio synthesis and AI-enabled synthesizer programming.
WHO ARE WE
Organised by C4DM's Communication Acoustics Lab
WHERE
G2 & Performance Lab, Engineering Building, QMUL’s Mile End campus (access instructions)
Schedule
@ G2
9:30 – Coffee
10:00 – Workshop Introduction
10:10 – Session 1
12:40 – Lunch
13:40 – Session 2
15:40 – Workshop Close
@ Performance Lab
16:00 – Concert Setup
17:00 – Concert Begins
Invited Talks and Performances
TALKS
Title: Facilitating serendipitous sound discoveries with simulations of open-ended evolution
Author: Björn Thor Jónsson -- University of Oslo
Abstract: Björn will explore the application of evolutionary algorithms for sound discovery, particularly how Quality Diversity search can facilitate serendipitous discoveries beyond what can be found by prompting models trained on existing data. A specific sound synthesis approach based on pattern-producing networks, coupled with DSP graphs, will also be examined, along with the integration of its neuro-evolution within the evolutionary simulations. Furthermore, the relevance of these sound discoveries in compositional contexts will be addressed, along with efforts to visualise the evolutionary processes and enable interaction with the discovered artefacts.
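To make the Quality Diversity idea concrete, the sketch below shows a minimal MAP-Elites-style loop that keeps one elite sound per region of a behaviour-descriptor grid. The synthesizer, descriptors, and fitness function are placeholders for illustration only, not the pattern-producing-network and DSP-graph system discussed in the talk.

```python
# Illustrative MAP-Elites-style Quality Diversity loop over synthesizer parameters.
# The render, descriptor, and fitness functions are placeholders.
import random

GRID = 10          # resolution of the behaviour-descriptor grid
N_PARAMS = 8       # number of (normalised) synth parameters
ITERATIONS = 5000

def render(params):
    """Stand-in for rendering a sound from synth parameters."""
    return params  # a real system would return audio here

def descriptors(sound):
    """Map a sound to a 2-D behaviour descriptor in [0, 1] (e.g. brightness, noisiness)."""
    return (sum(sound[:4]) / 4.0, sum(sound[4:]) / 4.0)

def fitness(sound):
    """Placeholder quality measure; a real system might use a learned or perceptual score."""
    return 1.0 - abs(0.5 - sum(sound) / len(sound))

archive = {}  # cell -> (fitness, params): one elite per region of descriptor space

for _ in range(ITERATIONS):
    if archive and random.random() < 0.9:
        parent = random.choice(list(archive.values()))[1]
        child = [min(1.0, max(0.0, p + random.gauss(0, 0.1))) for p in parent]
    else:
        child = [random.random() for _ in range(N_PARAMS)]
    sound = render(child)
    cell = tuple(min(GRID - 1, int(d * GRID)) for d in descriptors(sound))
    fit = fitness(sound)
    if cell not in archive or fit > archive[cell][0]:
        archive[cell] = (fit, child)  # keep the best individual found in each cell

print(f"discovered {len(archive)} distinct sound regions")
```

Keeping one elite per cell, rather than a single global best, is what lets the search spread across many qualitatively different sounds instead of converging on a single optimum.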
Title: Perceptually Aligned Deep Image Sonification
Author: Balint Laczko -- University of Oslo
Abstract: Imaging technology has dramatically expanded our understanding of biological systems. This overabundance of images has come with unique problems, such as visual overload, which can potentially obscure data relationships and induce eye fatigue or divert vision from important tasks. Image sonification offers potential solutions to these problems by channeling data into the auditory domain, leveraging our natural pattern recognition skills through hearing. In my PhD project I have been exploring the potential of Machine Learning in solving the two fundamental challenges of image sonification: the perceptually aligned representations of images and sounds, and the cross-modal mapping between them. In this talk I will present my journey through timbre spaces, Differentiable DSP, and cross-modal domain transfer in search of new methods for image sonification.
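As a point of reference for the mapping problem the talk addresses, here is a deliberately naive parameter-mapping sonification that turns per-row image statistics into pitch and loudness. It is a hand-crafted baseline, not the learned, perceptually aligned mapping described above.

```python
# Naive image sonification baseline: per-row brightness -> pitch, per-row contrast -> loudness.
import numpy as np

SR = 44100
image = np.random.rand(64, 64)                   # placeholder grey-scale image in [0, 1]

freqs = 220.0 * 2 ** (image.mean(axis=1) * 3)    # row brightness -> pitch (220-1760 Hz)
amps = image.std(axis=1)                         # row contrast   -> loudness

segment = int(0.05 * SR)                         # 50 ms of audio per image row
audio, phase = [], 0.0
for f, a in zip(freqs, amps):
    t = np.arange(segment) / SR
    audio.append(a * np.sin(2 * np.pi * f * t + phase))
    phase += 2 * np.pi * f * segment / SR        # keep phase continuous across rows
audio = np.concatenate(audio)
audio /= np.abs(audio).max() + 1e-9              # normalise before playback or writing
```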
Title: Autonomous control of synthesis parameters with listening-based reinforcement learning
Author: Vicenzo Madaghiele -- University of Oslo
Abstract: Music improvisation requires listening, reflection, and timely decision-making. When improvising together, musicians make decisions in time, responding to other musicians’ ideas, their style, and the musical context of the performance. I present a novel approach for agent-based autonomous control of synthesis in an improvised music context. The model employs reinforcement learning to control the continuous parameters of a sound synthesizer in response to live audio from a musician. The agent is trained on a corpus of audio files that exemplify the musician’s instrument and stylistic repertoire. During training, the agent listens to the incoming sound through a set of perceptual descriptors and adapts by continuously adjusting the parameters of the synthesizer it controls, guided by a reward function expressing a musical objective. To achieve this objective, the agent learns specific strategies that characterize its autonomous behavior in a live interaction. I discuss the theoretical background, the formal model of improvisational interactions, a preliminary implementation of the model, and its application in three selected scenarios.
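The sketch below illustrates the shape of such a listen, act, reward loop, with a placeholder synthesizer, descriptor, and policy; it is not the model presented in the talk, only an outline of the interaction it describes.

```python
# Minimal listen / act / reward loop with placeholder synth, descriptors, and policy.
import random

def synth_descriptor(params):
    """Stand-in perceptual descriptor (e.g. brightness) of the synthesizer output."""
    return sum(params) / len(params)

def live_input_descriptor(step):
    """Stand-in descriptor stream extracted from the musician's live audio."""
    return 0.5 + 0.4 * ((step % 100) / 100.0 - 0.5)

def policy(observation, params):
    """Placeholder for the trained RL policy: outputs small continuous parameter changes."""
    return [random.gauss(0, 0.02) for _ in params]

params = [0.5, 0.5, 0.5, 0.5]
for step in range(1000):
    target = live_input_descriptor(step)
    obs = (target, synth_descriptor(params))
    action = policy(obs, params)
    params = [min(1.0, max(0.0, p + a)) for p, a in zip(params, action)]
    # A reward expressing one possible musical objective: track the live sound's descriptor.
    reward = -abs(synth_descriptor(params) - target)
    # During training the reward would drive policy updates (e.g. with an actor-critic
    # method); here it is only computed to show the loop structure.
```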
Title: Timbre latent space transformations for interactive musical systems oriented to timbral music-making
Author: Andrea Bolzoni -- The Open University
Abstract: This talk presents a set of timbre latent space transformations implemented in an interactive musical system for timbral music-making, called TimbReflex. The system enables musicians to explore timbre through turn-taking interactions that structure the musical exchange. By transforming the timbre latent space of the performer’s material, TimbReflex generates sonic variations that elicit further exploration of the musician’s timbral vocabulary. The results show the system's potential to support rich timbral exploration in the context of free improvisation. These findings point toward promising directions for further development and the application of similar methods in interactive, timbre-oriented musical systems.
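A bare-bones version of the latent-space variation step might look like the following, where the encoder and decoder are random linear maps standing in for a trained timbre model; the actual TimbReflex transformations are not reproduced here.

```python
# Illustrative latent-space variation: encode the performer's material, transform the
# latent code, and decode a response. Encoder/decoder are placeholder random maps.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, FEAT_DIM = 8, 32
enc = rng.normal(size=(LATENT_DIM, FEAT_DIM))    # placeholder encoder weights
dec = rng.normal(size=(FEAT_DIM, LATENT_DIM))    # placeholder decoder weights

def variation(features, scale=1.2, jitter=0.1):
    z = enc @ features                                   # timbre features -> latent vector
    z = scale * z + jitter * rng.normal(size=z.shape)    # scale and perturb the latent code
    return dec @ z                                       # latent vector -> transformed features

performer_features = rng.normal(size=FEAT_DIM)   # stand-in analysis frame from the performer
response_features = variation(performer_features)
```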
Title: Designing Percussive Timbre Remappings: Negotiating Audio Representations and Evolving Parameter Spaces
Author: Jordie Shier -- Queen Mary University of London
Abstract: Timbre remapping is an approach to audio-to-synthesizer parameter mapping that aims to transfer timbral expressions from a source instrument onto synthesizer controls. This process is complicated by the ill-defined nature of timbre and the complex relationship between synthesizer parameters and their sonic output. In this work, we focus on real-time timbre remapping with percussion instruments, combining technical development with practice-based methods to address these challenges. As a technical contribution, we introduce a genetic algorithm – applicable to black-box synthesizers including VSTs and modular synthesizers – to generate datasets of synthesizer presets that vary according to target timbres. Additionally, we propose a neural network-based approach to predict control features from short onset windows, enabling low-latency performance and feature-based control. Our technical development is grounded in musical practice, demonstrating how iterative and collaborative processes can yield insights into open-ended challenges in DMI design. Experiments on various audio representations uncover meaningful insights into timbre remapping by coupling data-driven design with practice-based reflection. This work is accompanied by an annotated portfolio, presenting a series of musical performances and experiments with reflections.
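The preset-search idea can be sketched as a simple genetic algorithm driving a black-box render-and-analyse function towards a target timbre; the rendering and feature extraction below are placeholders rather than a real VST or modular synthesizer.

```python
# Sketch of a genetic algorithm searching black-box synth parameters towards a target timbre.
import random

N_PARAMS, POP, GENS = 16, 64, 100

def render_features(params):
    """Stand-in for: render the preset on the black-box synth, then extract timbre features."""
    return [p * p for p in params]

def distance(feats, target):
    return sum((f - t) ** 2 for f, t in zip(feats, target))

def evolve(target):
    population = [[random.random() for _ in range(N_PARAMS)] for _ in range(POP)]
    for _ in range(GENS):
        scored = sorted(population, key=lambda ind: distance(render_features(ind), target))
        parents = scored[: POP // 4]                                   # truncation selection
        population = list(parents)
        while len(population) < POP:
            a, b = random.sample(parents, 2)
            child = [random.choice(pair) for pair in zip(a, b)]        # uniform crossover
            child = [min(1.0, max(0.0, g + random.gauss(0, 0.05))) for g in child]  # mutation
            population.append(child)
    return min(population, key=lambda ind: distance(render_features(ind), target))

target_timbre = [random.random() for _ in range(N_PARAMS)]  # one target, e.g. from a drum hit
best_preset = evolve(target_timbre)
```

A dataset of presets would then be built by repeating this search for many target timbres extracted from the percussion source.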
Title: Can a Sound Matching Model Produce Audio Embeddings that Align with Timbre Similarity Rated by Humans?
Author: Haokun Tian -- Queen Mary University of London
Abstract: So-called psychoacoustic “timbre spaces” map perceptual similarity ratings of instrument sounds onto low-dimensional embeddings via multidimensional scaling, but they suffer from scalability issues and are incapable of generalization. Recent results from audio (music and speech) quality assessment as well as image similarity have shown that deep learning provides emergent embeddings that align well with human perception while being largely free from these constraints. In this talk, we present metrics to evaluate the alignment between three representations, extracted from a simple sound matching model, and the existing “timbre space” data containing 2,614 pairwise ratings on 334 audio samples. Among the tested representations, we highlight the effectiveness of the “style” embeddings, which are inspired by the style transfer task in computer vision. We compare our results to commonly used representations, including MFCCs and CLAP embeddings. Furthermore, based on the performance of the style embedding derived from the CLAP model, we demonstrate its broader effectiveness in modeling human timbre perception.
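One common way to quantify such alignment is to correlate embedding distances with the human ratings over the rated pairs, for example with a Spearman rank correlation as in the synthetic sketch below; the talk's actual metrics, embeddings, and data are not reproduced here.

```python
# Alignment metric sketch: Spearman rank correlation between embedding distances and
# human dissimilarity ratings over rated pairs. All data here is synthetic.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_items, dim = 50, 64
embeddings = rng.normal(size=(n_items, dim))          # stand-in audio embeddings

pairs = [(i, j) for i in range(n_items) for j in range(i + 1, n_items)]
human_dissimilarity = rng.uniform(size=len(pairs))    # stand-in pairwise ratings

model_distance = np.array([np.linalg.norm(embeddings[i] - embeddings[j]) for i, j in pairs])
rho, _ = spearmanr(model_distance, human_dissimilarity)
print(f"Spearman correlation with human ratings: {rho:.3f}")
```

Higher correlation indicates that distances in the embedding space order sound pairs more like human listeners do.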
Title: GuitarFlow: Realistic Electric Guitar Synthesis From Tablatures via Flow Matching and Style Transfer
Author: Jackson Loth -- Queen Mary University of London
Abstract: Music generation in the audio domain using artificial intelligence (AI) has witnessed steady progress in recent years. However, for some instruments, particularly the guitar, controllable instrument synthesis remains limited in expressivity. We introduce GuitarFlow, a model with a high degree of controllability designed specifically for electric guitar synthesis. The generative process is guided by tablature, a ubiquitous and intuitive guitar-specific symbolic format. The tablature format easily represents guitar-specific playing techniques (e.g. bends, muted strings and legato), which are more difficult to represent in other common music notation formats such as MIDI. Our model relies on an intermediary step of first rendering the tablature to audio using a simple sample-based virtual instrument, then performing style transfer using Flow Matching in order to transform the virtual instrument audio into more realistic-sounding examples. This results in a model that is quick to train and run at inference time, requiring less than 6 hours of training data. We present the results of objective evaluation metrics, together with a listening test, in which we show a significant improvement in the realism of the generated guitar audio from tablatures.
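As a rough illustration of the flow-matching ingredient, the toy training loop below regresses a velocity field that carries features of a virtual-instrument rendering towards features of a real recording along straight-line paths; the data and network are placeholders and do not reflect the GuitarFlow architecture.

```python
# Toy flow-matching objective for style transfer: learn a velocity field moving
# "virtual instrument" features towards "real guitar" features along straight paths.
import torch
import torch.nn as nn

DIM = 80                                            # e.g. one mel-spectrogram frame
model = nn.Sequential(nn.Linear(DIM + 1, 256), nn.ReLU(), nn.Linear(256, DIM))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def batch(n=64):
    x0 = torch.randn(n, DIM)    # stand-in: features of the sample-based virtual instrument render
    x1 = torch.randn(n, DIM)    # stand-in: features of the matching real guitar recording
    return x0, x1

for step in range(1000):
    x0, x1 = batch()
    t = torch.rand(x0.size(0), 1)
    xt = (1 - t) * x0 + t * x1                      # point on the straight path at time t
    v_target = x1 - x0                              # constant velocity of that path
    v_pred = model(torch.cat([xt, t], dim=1))
    loss = ((v_pred - v_target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# At inference time, the virtual-instrument features would be integrated along the
# learned field (e.g. with a few Euler steps) to obtain more realistic guitar features.
```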
More talks to be announced!
PERFORMANCES
Title: Experience Replay
Author: Vicenzo Madaghiele -- University of Oslo
Abstract: Experience Replay is an improvised performance in which autonomous sound-generating processes from various sources are combined into a network of feedback relations focusing on dynamic and textural evolution. The performance employs electric guitar, sound granulation and chaotic synthesis to explore states of fragile equilibrium and apparent disorder.
More performances to be announced!
ORGANISERS
