How can an easily accessible Mobile Phone Orchestra (MPO) platform be developed to facilitate high-end, flexible artistic expression suitable for concert hall performances? Moreover, what challenges and possibilities arise in the development process of a novel MPO platform embracing school children as performers? In this paper we discuss these questions, which were integrated in the development process of mobilephoneorchestra.com. In particular, we highlight the main challenge of developing a performance platform that is both mobile and artistically rewarding, yet still easily accessible for school children. Mobilephoneorchestra.com is developed as a Web Audio API application to enable large-scale concert hall performances of fixed polyphonic contemporary art music: a kind of philharmonic orchestra for electronic sounds, embracing school children as performers, using smartphones as instruments. Through an autoethnographic approach, the research is carried out in multiple iterations, inspired by action research. In this paper we present findings from three iterations including performances of music involving MPO. School children and youths between ten and seventeen years old were addressed as MPO performers. The findings reveal both artistic challenges and possibilities with the platform. We specifically highlight the use of mobile music interfaces in combination with animated notation as a novel approach for an MPO concept.
A number of tools for creating online listening tests are now available. They provide an integrated platform consisting of a user-facing frontend and a backend to collect responses. These platforms provide an out-of-the-box solution for setting up static listening tests, where questions and audio stimuli remain unchanged and user-independent. In this paper, we detail the changes we made to the webMUSHRA platform to convert it into a frontend for adaptive online listening tests. Some of the more advanced workflows that can be built around this frontend include session management to resume listening tests, server-based sampling of stimuli to enforce a certain distribution over all participants, and follow-up questions based on previous responses. The backends required for such workflows need a large amount of customisation based on the exact listening test specification, and are therefore deemed out of scope for this project. The proposed frontend is therefore not meant as a replacement for the existing webMUSHRA platform, but as a starting point for creating custom listening tests. Nonetheless, a fair number of the proposed changes are also beneficial for the creation of static listening tests.
This paper introduces the new music live coding language Glicol (graph-oriented live coding language) and its web-based run-time environment. As the name suggests, this language is designed to represent directed acyclic graphs (DAG), using a syntax optimised for live music performances. The audio engine and the language interpreter are both developed with the Rust programming language. With the help of WebAssembly and AudioWorklet, this language can run in web browsers. It also supports co-performance through collaborative editing. Taking advantage of the design of the Rust programming language, the run-time environment is both safe and efficient. Documentation and error handling messages can be accessed in the web browser. All in all, we see Glicol as an efficient and future-oriented language for collaborative text-based musicking.
immaterial.cloud is an immersive audiovisual installation that explores a possible networked future of peer-to-peer technologies, away from cloud computing. Participants experience the work via two to four smartphones placed in different locations in a room. As participants walk up to a phone, they see a representation of themselves through data. If the participant gets close enough, the phone triggers a change in the sound of immaterial.cloud and the other phones follow.
Web technologies in general, and the Web Audio API in particular, have great potential as a learning platform for developing interactive sound and music applications. Earlier studies at the Royal College of Music in Stockholm have led to a wide range of student projects but have also indicated that there is a high threshold for novice programmers to understand and use the Web Audio API. We developed the WebAudioXML coding environment to solve this problem, and added a statistics module to analyze student works. The current study is the first presentation and evaluation of the technology. Three classes of students, with technical and artistic backgrounds respectively, participated through online courses by building interactive, sound-based applications. We analysed the source code and self-reflective reports from the projects to understand the impact WebAudioXML has on creativity and the learning process. The results indicate that WebAudioXML can be a useful platform for teaching and learning how to build online audio applications. The platform makes mapping between user interactions and audio parameters accessible for novice programmers and supports artists in successfully realizing their design ideas. We show that templates can be a great help for students getting started, but can also limit them in expanding ideas beyond the presented scope.
A large number of web applications are available for videoconferencing, and these have been very helpful during the lockdown periods caused by the COVID-19 pandemic. However, none of them offers high-fidelity stereo audio for music performance, mainly because the current WebRTC RTCPeerConnection standard only supports compressed audio formats. This paper presents the results achieved by implementing 16-bit PCM stereo audio transmission on top of the WebRTC RTCDataChannel with the help of Web Audio and AudioWorklets. Several measurements with different configurations, browsers, and operating systems are presented. They show that, at least on the loopback network interface, this approach can achieve better quality and lower latency than using RTCPeerConnection, i.e., latencies as low as 50-60 ms have been achieved on macOS.
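As an illustration of the kind of conversion such a transmission scheme involves, the following is a minimal sketch of packing two float channels into interleaved 16-bit PCM and unpacking them again. It is not the authors' implementation: the function names `floatToInt16`, `encodeStereoPcm16`, and `decodeStereoPcm16` are hypothetical, and the L, R, L, R interleaving is an assumption.

```typescript
// Hypothetical helpers for illustration; the names are not from the paper.

/** Convert one float sample (nominally in [-1, 1]) to a signed 16-bit integer. */
function floatToInt16(sample: number): number {
  const clamped = Math.max(-1, Math.min(1, sample));
  // Negative values scale by 32768 and positive values by 32767,
  // so both extremes map onto the int16 range without overflow.
  return Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
}

/** Interleave two equally long channels as L, R, L, R, ... 16-bit PCM (assumed layout). */
function encodeStereoPcm16(left: Float32Array, right: Float32Array): Int16Array {
  if (left.length !== right.length) {
    throw new Error("channels must have the same length");
  }
  const pcm = new Int16Array(left.length * 2);
  for (let i = 0; i < left.length; i++) {
    pcm[2 * i] = floatToInt16(left[i]);
    pcm[2 * i + 1] = floatToInt16(right[i]);
  }
  return pcm;
}

/** Inverse operation for the receiving side: split back into float channels. */
function decodeStereoPcm16(pcm: Int16Array): { left: Float32Array; right: Float32Array } {
  const frames = Math.floor(pcm.length / 2);
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = pcm[2 * i] / 0x8000;
    right[i] = pcm[2 * i + 1] / 0x8000;
  }
  return { left, right };
}
```

For a block of 128 stereo frames, the encoded payload is 128 × 2 channels × 2 bytes = 512 bytes. The resulting `Int16Array` could then be handed to `RTCDataChannel.send()`, which accepts typed arrays.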
In this paper, we describe artistic practices with the web-based music making tool Playsound.space held over the last two years since the inception of the platform. After completing a first design phase, which is documented in previous publications, Playsound has been regularly used by the first author as her main musical instrument to perform in live improvisation contexts. The tool proved to be especially useful during the quarantine period due to Covid-19 in Brazil, by enabling the musician (i) to take part in performances with other musicians through online gatherings, and (ii) to compose solo pieces from home leveraging crowd-sourced sounds. We review these endeavours and provide a critical analysis exploring some of the benefits of online music making and related challenges yet to tackle. The use of Playsound “in vivo”, in real artistic practices outside the laboratory, has enabled us to uncover playing strategies and user interface improvements which can inform the design of similar web-based music making tools.
We present the Choir Singers Pilot, a web-based system that assists choir singers in their individual learning and practice. Our system is built using modern web technologies and provides singers with an interactive view of the musical score along with an aligned audio performance created using state-of-the-art singing synthesis technology. The Web Audio API is used to dynamically mix the choir voices and give users control over sound parameters. In-browser audio latency compensation is used to keep user recordings aligned to the reference music tracks. The pitch is automatically extracted from user recordings and can then be analyzed using an assessment algorithm to provide intonation ratings to the user. The system also facilitates communication and collaboration in choirs by enabling singers to share their recordings with conductors and receive feedback. Our work, in the larger scope of the TROMPA project, aims to enrich musical activities and promote use of digital music resources. To that end, we also synthesize thousands of public-domain choir scores and make them available in a searchable repository alongside relevant metadata for public consumption.
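One simple way to picture latency compensation is as a shift of the recording by a known delay. The sketch below assumes the round-trip latency has already been estimated (how the system estimates it is not described here) and drops the corresponding number of frames from the start of the recording. The function name `alignRecording` and its parameters are hypothetical, and this is not claimed to be the method used in the system.

```typescript
/**
 * Shift a recording earlier by an already-estimated latency so that it lines up
 * with the reference track. The output keeps the input length and is zero-padded
 * at the end. Hypothetical helper; the latency estimate is an input, not computed.
 */
function alignRecording(
  recording: Float32Array,
  latencySeconds: number,
  sampleRate: number
): Float32Array {
  // Convert the latency to whole frames and keep it inside the buffer bounds.
  const frames = Math.round(latencySeconds * sampleRate);
  const offset = Math.min(recording.length, Math.max(0, frames));
  const aligned = new Float32Array(recording.length);
  // Copy everything after the latency gap to the start of the output.
  aligned.set(recording.subarray(offset));
  return aligned;
}
```

Rounding to whole frames keeps the sketch short; sub-sample alignment would need interpolation, which is omitted here.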
Freesound has a huge community dedicated to uploading and sharing sounds online for other users to use in their productions. In some cases, the sounds need to be transformed, and users have to edit them on their own devices in order to get the exact version that best fits their production. In this paper, an audio editor is developed and added to the Freesound environment in order to bring those two steps together in the same workflow. This approach adds considerable value to the Freesound experience and differentiates it from its competitors. For the development of this editor, different JavaScript and HTML frameworks have been used, such as Wavesurfer (based on the Web Audio API) and Bootstrap, always trying to keep the interface and functionalities accessible and easy to use for a general audience. Finally, two evaluation methods have been developed: the first to gather general feedback about the use of the editor, and the second to verify that all its functionalities behave as expected. The first evaluation showed the need for a simpler interface and simpler functionalities, which were developed and then tested using the second evaluation approach, showing that all actions, filters and effects worked as expected.
Lawrence Fyfe (Centre National de la Recherche Scientifique–UMR9912 / STMS Laboratoire (IRCAM))*; Daniel Bedoya (IRCAM); Corentin Guichaoua (STMS, IRCAM, Sorbonne Université); Elaine Chew (CNRS-UMR9912/STMS IRCAM, Paris, France)
CosmoNote is a web-based citizen science tool for annotating musical structures, with a focus on structures created by the performer during expressive musical performance. The software interface enables the superimposition of synchronized discrete and continuous information layers which include note data representations, audio features such as loudness and tempo, and score features such as harmonic tension in a visual and audio environment. The tools provide the means for users to signal performance decisions such as segmentation and prominence using boundaries of varying strengths, regions, comments, and note groupings. User-friendly interaction features have been built in to facilitate ease of annotation; these include the ability to zoom in, listen to, and mark up specific segments of music. The data collected will be used to discover the vocabulary of performed music structures and to aid in the understanding of expressive choices and nuances.
Soundcool Online is a Web Audio re-implementation of the original Max/MSP implementation of Soundcool, a system for collaborative music creation. Soundcool has many educational applications, and because Linux has been adopted in many school systems, we turned to Web Audio to enable Soundcool to run on Linux as well as many other platforms. An additional advantage of Soundcool Online is the elimination of a large download, allowing beginners to try the system more easily. Another advantage is the support for sharing provided by a centralized server, where projects can be stored and accessed by others. A cloud-based server also facilitates collaboration at a distance where multiple users can control the same project. In this scenario, local sound synthesis provides high-quality sound without the large bandwidth requirements of shared audio streams. Experience with Web Audio and latency measurements are reported.
Developing real-time audio applications, particularly those with an element of user interaction, can be a difficult task. When things go wrong, it can be challenging to locate the source of a problem when many parts of the program are connected and interacting with one another in real-time.
We present a time-travel debugger for the Flow Web Audio framework that allows developers to record a session interacting with their program, play back that session with the original timing still intact, and step through individual events to inspect the program state at any point in time. In contrast to the browser’s native debugging features, audio processing remains active while the time-travel debugger is enabled, allowing developers to listen out for audio bugs or unexpected behaviour.
We describe three example use-cases for such a debugger. The first is error reproduction using the debugger’s JSON import/export capabilities to ensure the developer can replicate problematic sessions. The second is using the debugger as an exploratory aid instead of a tool for error finding. Finally, we consider opportunities for the debugger’s technology to be used by end-users as a means of recording, sharing, and remixing ideas. We conclude with some options for future development, including expanding the debugger’s program state inspector to allow for in situ data updates, visualisation of the current audio graph similar to existing Web Audio inspectors, and possible methods of evaluating the debugger’s effectiveness in the scenarios described.
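The record, replay, and step workflow described above can be sketched in a few lines. The following is an illustrative model only, not the Flow debugger itself: it assumes the program state is produced by a pure `update(model, msg)` function, and the names `RecordedEvent` and `SessionRecorder` are hypothetical.

```typescript
// A recorded event: the message plus its time offset from the session start.
interface RecordedEvent<Msg> {
  atMs: number;
  msg: Msg;
}

// Hypothetical recorder; assumes `update` is pure so that replaying the log
// from the initial model always reproduces the same states.
class SessionRecorder<Model, Msg> {
  private readonly initial: Model;
  private readonly update: (model: Model, msg: Msg) => Model;
  private log: RecordedEvent<Msg>[] = [];

  constructor(initial: Model, update: (model: Model, msg: Msg) => Model) {
    this.initial = initial;
    this.update = update;
  }

  /** Append an event to the session log. */
  record(atMs: number, msg: Msg): void {
    this.log.push({ atMs, msg });
  }

  /** Program state after the first `step` events (0 gives the initial state). */
  stateAt(step: number): Model {
    let model = this.initial;
    for (const event of this.log.slice(0, step)) {
      model = this.update(model, event.msg);
    }
    return model;
  }

  /** Gaps between consecutive events, so a replay can keep the original timing. */
  delaysMs(): number[] {
    return this.log.map((event, i) => event.atMs - (i === 0 ? 0 : this.log[i - 1].atMs));
  }

  /** Serialise the session so that a problematic run can be shared. */
  exportJson(): string {
    return JSON.stringify(this.log);
  }

  /** Replace the current log with a previously exported session. */
  importJson(json: string): void {
    this.log = JSON.parse(json) as RecordedEvent<Msg>[];
  }
}
```

Recomputing the state from the start on every step keeps the sketch small; a real tool would more likely cache intermediate states.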
Narcissus is a work created in 1987 by Thea Musgrave for clarinet or flute and delayed audio, using a delay box. In order to recreate the piece using modern technology, our research group developed a web application to emulate the functions of the original equipment in Narcissus. “Surfing with Narcissus” is our report about the experience of recreating Musgrave’s work using web audio and web music APIs.
The Freesound Player is a digital instrument developed in Max/MSP that uses the Freesound API to make requests for as many as 16 sound samples, which are filtered based on the sounds’ content. Once the samples are returned, they are loaded into buffers and can be performed using a MIDI controller and processed in a variety of ways. The filters implemented will be discussed and demonstrated using three music compositions by the authors, along with considerations for composing and improvising using sound content-based descriptive filtering.
Functional Reactive Programming (FRP) is a way to model discrete and continuous signals using events, which carry information corresponding to a precise moment in time, and behaviors, which represent time-varying values. This paper shows how the behavior pattern can be used to build sample-accurate interactive audio that blends the WebAudio API with other browser-based APIs, such as mouse events and MIDI events. It will start by presenting a brief history of FRP as well as definitions of polymorphic behavior and event types. It will then discuss the principal challenges of applying the behavior pattern to WebAudio, including named audio units, front-loaded evaluation, and scheduling precise temporal events using different clocks.
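The two core notions can be illustrated with a small sketch in which a behavior is modelled as a function from time to a value and an event as a timestamped value. This is only one classic encoding of FRP and is not the paper's implementation; the names `Behavior`, `TimedEvent`, `stepper`, `lift2`, and `render` are chosen for illustration.

```typescript
type Seconds = number;

// A behavior is a time-varying value; an event carries a value at a precise time.
type Behavior<A> = (t: Seconds) => A;
interface TimedEvent<A> {
  time: Seconds;
  value: A;
}

/** A behavior that never changes. */
function constant<A>(value: A): Behavior<A> {
  return () => value;
}

/** Combine two behaviors point-wise in time. */
function lift2<A, B, C>(f: (a: A, b: B) => C, a: Behavior<A>, b: Behavior<B>): Behavior<C> {
  return (t) => f(a(t), b(t));
}

/** Turn discrete events into a behavior that holds the most recent event value. */
function stepper<A>(initial: A, events: TimedEvent<A>[]): Behavior<A> {
  const sorted = [...events].sort((x, y) => x.time - y.time);
  return (t) => {
    let current = initial;
    for (const event of sorted) {
      if (event.time > t) break;
      current = event.value;
    }
    return current;
  };
}

/** Sample a numeric behavior once per audio frame, starting at t = 0. */
function render(signal: Behavior<number>, sampleRate: number, frames: number): Float32Array {
  const out = new Float32Array(frames);
  for (let n = 0; n < frames; n++) {
    out[n] = signal(n / sampleRate);
  }
  return out;
}
```

Because the behavior is evaluated at every frame time `n / sampleRate`, an event such as a mouse click or MIDI message takes effect at the first frame at or after its timestamp, which is the sense in which this sketch is sample-accurate.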
MIDI (Musical Instrument Digital Interface) is the standard communication protocol for connecting audio hardware and software. The protocol is commonly used to send messages from hardware MIDI controllers to software that controls music synthesis and playback. Although a variety of hardware and software MIDI controllers exist, they typically use traditional, skeuomorphic input modes like keys, buttons, faders, and knobs. Since 2015, web browsers have started supporting this protocol through the Web MIDI API, opening up an enormous untapped potential for integrations.
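To make the message format concrete, here is a minimal sketch of decoding the raw bytes of a channel voice message, such as those delivered in the `data` field of a Web MIDI message event. The `MidiMessage` type and the `parseMidiMessage` function are hypothetical names for illustration; only Note On, Note Off, and Control Change are handled.

```typescript
type MidiMessage =
  | { type: "noteOn"; channel: number; note: number; velocity: number }
  | { type: "noteOff"; channel: number; note: number; velocity: number }
  | { type: "controlChange"; channel: number; controller: number; value: number }
  | { type: "other"; status: number };

/** Decode a single MIDI message from its raw bytes (hypothetical helper). */
function parseMidiMessage(data: Uint8Array): MidiMessage {
  const status = data.length > 0 ? data[0] : 0;
  const data1 = data.length > 1 ? data[1] : 0;
  const data2 = data.length > 2 ? data[2] : 0;
  // The high nibble of the status byte is the message kind, the low nibble the channel.
  const kind = status & 0xf0;
  const channel = status & 0x0f;

  switch (kind) {
    case 0x90:
      // A Note On with velocity 0 is conventionally treated as a Note Off.
      return data2 === 0
        ? { type: "noteOff", channel, note: data1, velocity: 0 }
        : { type: "noteOn", channel, note: data1, velocity: data2 };
    case 0x80:
      return { type: "noteOff", channel, note: data1, velocity: data2 };
    case 0xb0:
      return { type: "controlChange", channel, controller: data1, value: data2 };
    default:
      return { type: "other", status };
  }
}
```

Channels are reported here as 0 to 15, as encoded in the status byte; user interfaces usually display them as 1 to 16.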
To overcome the deficit of live events in 2020 and the lack of a feeling of live performance in streaming, a trio of concerts was created using the web as a medium for performance. The first concert was an attempt to infuse live event dynamics into a series of recorded pieces. The second incorporated live performances streamed to the audience in novel ways, such as 360-degree video and a composite of browser-based and live-streamed media. The third was done entirely live with 11 performers located in different spaces.
This talk will cover the novelty of using a website and distributed performance tools as a platform for delivering live performance and reintroducing the audience/performer connection in mediated performance.
In this talk we will explain why and how we used modern web technologies, and specifically the Web Audio API, to build a creative digital tool intended to be used in the entry levels of our educational system (primary and secondary education). We are Rubén Alonso (artist and co-creator of the “Antropoloops” project: a creative approach to ethnomusicology, where musical expressions from different cultures and times come into dialogue through electronic music) and Daniel Gómez (musician and software developer). We are part of a team made up of artists, educators and technicians who asked ourselves how to combine sound collage, body movement and electronics to celebrate and engage cultural diversity in educational environments. Over the last three years, in collaboration with Carasso Foundation, we have developed both a methodology and an online remixing tool (https://play.antropoloops.com) that aims to explore the potential of music remixing as a way to work towards cultural inclusion at schools.
Accompanying systems provide musical accompaniment to a human performer. These systems can adapt to changes in musical context, such as tempo or tonality, which makes them an interesting tool for music students. In this paper I describe a browser-based multitrack reactive music player. This player makes it possible to bring automatic accompanying systems to the browser. The player is implemented in TypeScript, using the Web Audio API and MIDI.js soundfonts.
From its very beginning, the Web Audio Conference was a place which showcased systems using web technology to enable remote audience participation. Numerous talks, workshops and live performances were dedicated to this topic. But up until recently this was considered to be a nice add-on for live events outside of the Web Audio world. However, the current pandemic changed this drastically.
Many events can’t be attended in person anymore. This is true for small concerts and conferences but also for usually well attended sports events. In many cases remote participation became the only possible form of participation. For sports events this imposes a special problem since the reactions and chants from the stands are considered to be a part of the experience. Therefore many TV stations choose to play artificial background noise when broadcasting sports events without any onsite spectators.
I would like to present a system which uses web technologies to bring the voices of sports fans back into the arenas. It builds upon common rituals that many fans are used to. It is meant to be as unobtrusive as possible. It works by sending the announcements of the PA announcer to the fans at home. They can respond as they are used to. The responses are recorded as well and sent back. In the arena, the players will hear the slightly delayed announcement together with the responses from their fans. The system is very scalable and can be used by as many people as would otherwise be in the stadium.
All above testing tools have been successfully used by the Open Source community.
Collab-Hub is a networking tool for sharing data across the internet for multimedia collaboration. Utilizing the Max graphical coding environment and its integrated Node.js functionality, Collab-Hub allows for multiple users to send data easily and remotely so they can collaborate with that data in real-time.
In this talk, we discuss the design of the Collab-Hub framework and provide examples of using it to create scalable and reconfigurable interaction layouts between Max patches and web-based instruments/interfaces. An analysis of pertinent historical precedents highlights the advantages Collab-Hub provides to artists who have little to no web development experience or to those who may be new to creating telematic and remote performance environments. A showcase of works created with Collab-Hub demonstrates the wide variety of artistic endeavors made possible through the framework.
This demo consists of two elements: the web technology for real-time vocoding and its application for live voice transformation in web conferencing. This is an interesting approach, now more than ever in times of COVID-19, for modifying your speech in virtual conferences. We present a software-based solution that should be available to anyone doing web conferencing.
While there are many examples of interactive applications being built with the Web Audio API – from new and unique synthesis instruments, to emulators of popular signal processors, to audio engines for games – there are remarkably few that make use of the API’s ability to implement systems which autonomously generate music. The work presented at the link below is a collection of six pieces in which all sounds are synthesized in real time: https://www.paulparoczai.net/#/webaudio/
immaterial.cloud is an immersive audiovisual installation that explores a possible networked future of peer-to-peer technologies, away from the cloud. Participants experience the work via two to four smartphones placed in different locations in a room. As participants walk up to a phone, they see a representation of themselves through data. If the participant gets close enough, the phone triggers a change in the sound of immaterial.cloud and the other phones follow.
LudoTune is a browser-based musical toy that allows users to create music in a 3D environment by building cube-based structures. Users’ creations can range from simple loops, to strange musical sculptures, to entire songs involving hundreds of cubes.
In this paper, I describe the artwork Radioactive Monstrosities, a web-audio interface that addresses ways of listening to collective voices and certain female-sounding voices that are perceived as inappropriate or annoying — because of the quality of their sound, their gender, the medium’s distortions but also stereotypes and collective memories that they often awake. These are verbal expressions that have been associated with forms of ‘monstrosity’ since ancient times. Visitors of the page are invited to record themselves and choose a type of distortion to participate in, forming new imaginaries around technologically-mediated voices, which through their technical ‘monstrosity’, can reveal other forms of speech.
Noise symphony is part of a series of pieces created from a selection of 690 sound samples taken by the author from the https://freesound.org database using the http://playsound.space software, from a query using the word “noise” that retrieved around 37000 results. Sounds were selected based on their timbre morphology, following a criterion of apparent visual difference. In this piece, sounds will be played together with solo extended vocal techniques.
Aerial Glass is a real-time performance, delivered through a website, of aerial silk acrobatics composited with a live browser performance of various video clips and audio composition. It uses the NexusHub framework to distribute a live in-browser performance of pre-rendered, transparent video clips, audio files, and web audio effects to everyone else visiting the website. A live physical performance on aerial silks is recorded in front of a green screen and live streamed to the website to be composited with the browser performance.
We propose a live performance of the work during the Web Audio Conference with performers at LSU, distributed to a worldwide audience through a website.
We present a hybrid live coding musical performance using the latest version of Sema, the live coding language design and performance playground for the Web, with a custom Web Audio API signal engine. We explore machine learning, bespoke languages and interfaces as creative material in musical performance. Bernardo, Kiefer and Magnusson will collaborate as an ensemble of three, with networked and synchronized instances of Sema, in a performance which combines elements of improvisation and machine agency.
SHP of THSEUS is an audiovisual composition we worked on collectively, drawing inspiration from the Greek myth in which, while on his adventures, Theseus would slowly replace parts of his ship until every part was eventually replaced. This is a philosophical quandary of identity. We navigate this concept through the use of Collab-Hub and our individual performance setups, in combination with the score.
The composition uses a series of 10 graphical images, selected at random, that are sequentially presented to each performer. These graphical images are categorized as either ‘sound’ or ‘control’. When performers receive a ‘sound’ image, they interpret their image and contribute sonically. When performers receive a ‘control’ image, they interpret their image and send control data to the other performers, affecting the gestures they perform. As we alter and perform one another’s patches over time, each of the individual performers relinquishes control over some aspects of their own performance, even as they continue to steer forward with their interpretation of the score.
This submission proposes a web-based collaborative performance using Floop, a loop jamming system based on the Freesound database. The performers meet in a virtual room with a collection of loops that have been matched for rhythm, and organized according to timbre similarity. Their exploration develops as a DJ session, which is visible to the audience through the projected interface.
This performance combines machine learning algorithms with music information retrieval techniques to retrieve crowdsourced sounds from the online database Freesound.org. The use of a live coder virtual agent companion complements a human live coder in her/his practice. The core themes of legibility, agency and negotiability in performance are researched through the collaboration between the human live coder, the virtual agent live coder and the audience.
Glicol is a graph-oriented live coding language developed with Rust, WebAssembly and AudioWorklet. This language can run in its web-based IDE that supports collaborative coding. In this performance, we invite the participants from the Glicol workshop to join the performance either on site or virtually. Each performer will be given the opportunity to write at least one node chain in the audio graph, and the first author will be in charge of when to execute the code. The music style will be improvised experimental/ambient techno.
Web Audio is an intrinsic part of the next generation of applications for multimedia content creators, designers, researchers, music tutors, artists, and consumers. New advances in web audio and software for audio analysis, music information retrieval (MIR), and machine learning open up exciting possibilities. We have recently released Essentia.js, based on one of the most popular MIR libraries on the native platform. We have also created various pre-trained deep learning models for inference with TensorFlow.js. In this tutorial, we introduce the key concepts in MIR and cover the basics of using the library for music and audio analysis. We will show example use-cases and assist the participants in building their MIR Web applications.
In this workshop, participants will be invited to try out Glicol, a graph-oriented live coding language written in Rust.
Participants will get familiar with the syntax of Glicol, as well as its browser-based environment developed with WebAssembly, AudioWorklet and SharedArrayBuffer. In the browser-based interface, a new form of interaction in collaborative live coding will be introduced too. After that, participants can brainstorm new features together and learn how to customise the language. In addition, there will be a scheduled live coding performance with Glicol at the conference, and participants of the workshop can choose to join as co-performers.
This workshop serves as an introduction to building remote/local networked audiovisual performances and pedagogical tools using Collab-Hub, a package for remote collaboration based on Node.js and implemented within Max and as a web-based interface. Collab-Hub is a system built for sharing data that eliminates the need for collaborators to be aware of their own or each other’s IP addresses. It has applications in many performance paradigms, including telematic performance, laptop orchestra, mixed ensemble with digital elements, distributed control, net-to-physical interaction, and more.
This paper introduces a design and the implementation of a proof of concept for a sonic cyberspace. The purpose of this is to explore new media, and find potential in our existing technology and infrastructure. The central theme of this cyberspace is collective collaboration, and documenting the process of developing speculative creativity platforms. It was found that some streaming technology, such as Icecast, is not suitable for more complex use-cases. The paper proposes an appropriation of modern streaming protocols, and discusses the potential of incorporating out-of-band metadata to explore unique applications of this design. The paper discusses how the attitude towards composition transforms when the ability to dominate experience is countered by randomness. Additionally, the design suggests that only the creative experience can have no latency as well as a certainty of realness, questioning the relevance of real-time and live streaming for performance and collaboration in music.
In this workshop, we describe building a project integrating Web Audio with REST APIs. We will highlight an approach for quantifying audio quality. This semi-supervised algorithm helps assess changes in speech-based audio quality.
In the past, two standards for WebAudio plug-ins existed, with a certain degree of compatibility: WAP (for WebAudio Plugins) and WAM (for WebAudio Modules). Such plugins could be used in different hosts, including a commercial online DAW (AmpedStudio.com), see screenshots at the end of this proposal.
The two were related: some authors worked on both projects, and WAMs were a particular case of WAPs, but this was a bit confusing.
All the people involved (Jari Kleimola and Oliver Larkin from WebAudioModules.org, engineers from the online DAW AmpedStudio.com, Michel Buffa and Shihong Ren, Steven Yi from Csound, Stéphane Letz and Yann Orlarey from the FAUST DSL team, and the small French company 53JS.com) decided to merge and unify their work in early 2020.