Making sounds in JavaScript

First sound

Let's make a sound in JavaScript. Browsers implement WebAudio, which provides two ways of making sounds:

  1. The simpler way is by connecting a collection of primitive nodes that WebAudio provides - these are things like oscillators, barebones IIR filters etc.
  2. The more advanced way is to use a thing called "AudioWorklet", wherein we write code to compute the individual samples themselves.

Let's make a sound using the first method.

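const ctx = new AudioContext();

const osc = new OscillatorNode(ctx);
// Reduce volume, the default is too loud!
const mix = new GainNode(ctx, { gain: 0.1 });
// Connect oscillator -> gain -> speakers, then start the oscillator.
osc.connect(mix).connect(ctx.destination);
osc.start();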

That is what the button above does.


When you toggle the button above, you'll notice a glitch when the sound starts and stops.

Our ears are more sensitive than you might consciously imagine. When we start the sound above, it immediately starts playing, and this sudden movement is heard as a rather ugly pop. Ditto when the sound immediately stops.

Real life sounds don't behave like that. They have an attack phase, when the sound slowly rises from nothing (say if you hit a drum with a stick). And there is a release phase where the sound slowly dies away instead of suddenly stopping.

Let's add both phases to our sound, and also move it into a function.

const beep = (duration) => {
  const attack = 0.001; // Attack of 1 ms
  const release = 0.1; // Release of 100 ms
  const t = ctx.currentTime;

  const osc = new OscillatorNode(ctx);
  // Envelope: ramp up over the attack, then decay towards 0 after the duration.
  const env = new GainNode(ctx);
  env.gain.setValueCurveAtTime([0, 1], t, attack);
  env.gain.setTargetAtTime(0, t + attack + duration, release / 5);
  const mix = new GainNode(ctx, { gain: 0.1 });
  osc.connect(env).connect(mix).connect(ctx.destination);
  osc.start(t);
  osc.stop(t + attack + duration + release);
};
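
To see why the release uses a time constant of release / 5, it helps to work out the envelope by hand. The sketch below assumes the exponential approach that the Web Audio specification describes for setTargetAtTime, v(t) = target + (v0 - target) * exp(-(t - t0) / timeConstant), and a linear ramp for the two-point attack curve. envelopeAt is a hypothetical helper written for this illustration, not part of WebAudio.

```javascript
// Hypothetical helper: the envelope gain `dt` seconds after the beep starts.
const envelopeAt = (dt, duration, attack = 0.001, release = 0.1) => {
  if (dt <= 0) return 0;
  // Attack: linear ramp from 0 to 1.
  if (dt < attack) return dt / attack;
  // Hold at full gain until the release begins.
  const releaseStart = attack + duration;
  if (dt < releaseStart) return 1;
  // Release: exponential decay towards 0 with time constant release / 5.
  const timeConstant = release / 5;
  return Math.exp(-(dt - releaseStart) / timeConstant);
};

// Gain left when the oscillator is stopped, one `release` after the decay begins.
const residual = envelopeAt(0.001 + 0.2 + 0.1, 0.2);
console.log(residual.toFixed(4)); // "0.0067", i.e. exp(-5)
```

After five time constants the gain has fallen to under 1% of its peak, so stopping the oscillator at that point should no longer produce an audible pop.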


Instead of clicking the button ourselves, let us ask the computer to click it for us, 7 times every second.

setInterval(() => {
  beep(0.01); // the duration passed here is an assumed, illustrative value
}, 1000 / 7);

Why 7?

In music, a second is like an eternity. Events in music happen at the time scale of milliseconds, and there are a thousand milliseconds in a second.

However, something interesting happens at around 20 to 50 milliseconds, give or take. You might recall that movies have 24 frames per second (i.e. about 42 ms for each frame). So if we take a still picture, and move it 24 times per second, it starts to look animated to us.

The same happens with sound! If there is some movement of air more than around 24 times per second, we start perceiving it as a single sound. Below that rate, we hear the movements as individual musical notes.

So the number of beeps per second has to be less than about 24 for us to hear them as individual beeps and not as a single sound. In practice, this threshold varies depending on the exact sound being played and a lot of other factors. 7 per second is a conservative value that we'll always hear as events and not as a single sound.
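
The arithmetic behind these numbers is easy to check in code. This sketch treats the 24 per second figure as a rough threshold, not a hard limit, and periodMs is a hypothetical helper name.

```javascript
// Hypothetical helper: milliseconds between events at a given rate per second.
const periodMs = (perSecond) => 1000 / perSecond;

const fusionPeriod = periodMs(24); // about 41.7 ms: the rough threshold
const beepPeriod = periodMs(7); // about 142.9 ms: our beeps

// Our beeps are spaced more than three times farther apart than the threshold.
console.log((beepPeriod / fusionPeriod).toFixed(2)); // "3.43"
```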

7 is also interesting for other reasons (e.g. musical scales usually max out at 7 notes, even though there are 12 available to us). This might be related to The Magical Number Seven, Plus or Minus Two, but this aside is already too long and I'm now ranting.

Rest of the owl

Those of you in the know might've winced at my using setInterval above to set the time between individual notes.

Even though we can't hear individual notes shorter than, say, 40 milliseconds, we are still surprisingly sensitive to other aspects of sound at smaller timescales. We can perceive music as being "out of beat" even if notes are off by, say, 10 ms. For some aspects like the attack time, we're sensitive to millisecond-level differences.

The setInterval function in JavaScript has too much jitter for it to be useful for triggering musical events.

Fortunately, there is an alternative. The time kept by WebAudio itself is precise (we used it above as ctx.currentTime). We can tell WebAudio to do something at a particular time, and it'll do it exactly then.
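
As a rough sketch of what that looks like, the function below schedules a short run of beeps up front, using times on the audio clock instead of a timer. scheduleBeeps is a hypothetical name. I've used the createOscillator and createGain factory methods and a linearRampToValueAtTime attack here, which are standard WebAudio calls but not the exact ones from the beep function above, and the 50 ms head start is an arbitrary choice.

```javascript
// Sketch: schedule `count` beeps on the audio clock, `period` seconds apart.
// `ctx` is expected to be an AudioContext.
const scheduleBeeps = (ctx, count, period, duration = 0.01) => {
  const attack = 0.001;
  const release = 0.1;
  // Start slightly in the future so the first beep is not already late.
  const t0 = ctx.currentTime + 0.05;
  const times = [];
  for (let i = 0; i < count; i++) {
    const t = t0 + i * period;
    const osc = ctx.createOscillator();
    const env = ctx.createGain();
    // Envelope with the volume reduction folded in: peak gain of 0.1.
    env.gain.setValueAtTime(0, t);
    env.gain.linearRampToValueAtTime(0.1, t + attack);
    env.gain.setTargetAtTime(0, t + attack + duration, release / 5);
    osc.connect(env).connect(ctx.destination);
    // Start and stop take times on the same clock as ctx.currentTime.
    osc.start(t);
    osc.stop(t + attack + duration + release);
    times.push(t);
  }
  return times;
};

// In a browser, after a user gesture:
//   scheduleBeeps(new AudioContext(), 7, 1 / 7);
```

Everything is handed to WebAudio in one go, so the spacing between beeps no longer depends on when our JavaScript happens to run.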

Unfortunately, Safari spoils the party here. Safari throttles both setInterval and its better alternative, requestAnimationFrame, when the user switches tabs. While WebAudio time is precise, we still need to generate and schedule the next batch of events if we're trying to play a generative piece of music. For such music, it is expected that the user will keep our tab running in the background instead of staring at it.
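
One common way of generating events in batches is a lookahead scheduler: a timer wakes up every so often and schedules, on the audio clock, every event that falls within a short window ahead. The sketch below isolates that bookkeeping in a pure function to show where throttling hurts. dueTimes is a hypothetical name, and the window and period values are made up for illustration.

```javascript
// Hypothetical helper: given the time of the next unscheduled event, return the
// events that start before `now + lookahead`, and the new cursor.
const dueTimes = (next, now, lookahead, period) => {
  const due = [];
  while (next < now + lookahead) {
    due.push(next);
    next += period;
  }
  return { due, next };
};

// Timer firing on schedule: it wakes at 0.95 s and finds one event, at 1.0 s,
// that is still in the future.
const onTime = dueTimes(1.0, 0.95, 0.1, 0.25);
console.log(onTime.due.length); // 1

// Timer throttled: it wakes at 2.0 s instead. The events it should have
// scheduled earlier are already in the past, so they can only be played late
// or dropped.
const throttled = dueTimes(1.0, 2.0, 0.1, 0.25);
console.log(throttled.due.filter((t) => t < 2.0).length); // 4
```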

So WebAudio's standard APIs are not helpful for playing long generative pieces of music. For that, we'll need to use the more advanced option of AudioWorklets instead, which run unthrottled in the background.
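
To give a flavour of what computing the samples ourselves means, here is a minimal sketch of a worklet module that produces a plain sine tone. fillSine, SineProcessor and the "sine-processor" name are all hypothetical, and the 440 Hz frequency and 0.1 gain are arbitrary choices. AudioWorkletProcessor, registerProcessor and the global sampleRate are provided by the browser inside a worklet. The page would still need to load this file with ctx.audioWorklet.addModule and connect an AudioWorkletNode.

```javascript
// Hypothetical helper: fill `out` with sine samples and return the updated phase.
const fillSine = (out, phase, freq, rate, gain = 0.1) => {
  const step = (2 * Math.PI * freq) / rate;
  for (let i = 0; i < out.length; i++) {
    out[i] = gain * Math.sin(phase);
    phase += step;
  }
  return phase % (2 * Math.PI);
};

// The guard lets this file also load outside a worklet, where the
// AudioWorkletProcessor global does not exist.
if (typeof AudioWorkletProcessor !== "undefined") {
  class SineProcessor extends AudioWorkletProcessor {
    constructor() {
      super();
      this.phase = 0;
    }
    process(inputs, outputs) {
      // Write the same samples to every channel of the first output.
      const start = this.phase;
      for (const channel of outputs[0]) {
        this.phase = fillSine(channel, start, 440, sampleRate);
      }
      return true; // keep the processor alive
    }
  }
  registerProcessor("sine-processor", SineProcessor);
}
```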

That's for a future post. But keep in mind that if you're not looking for background music, the standard APIs might be enough. Here is a small synth with a lot of source code comments explaining the aspects we covered here in more depth.

I also wrote a tutorial on making music using Euclidean rhythms.