How Shazam Works 👂🎵

Morning Bruno!

You’re at a wedding.

The bride and groom procrastinated, so the DJ is from Craigslist.

No sweat, because he knows exactly what the people want to get down to.

You hear that fateful sound…

It’s the first beat of Mark Ronson and Bruno Mars’ killer wedding anthem: Uptown Funk.

Now, you easily recognize the song from its first beat.

The specific combination of sounds fires neurons in your brain that pull up your memory of the song.

But how does Shazam do that?

After all, a computer doesn’t understand music. It doesn’t get rhythm, or pitch, or timbre.

It thinks in bips and bops – 1s and 0s.

So for a computer, identifying Uptown Funk is like trying to find a needle in a haystack.

Except it can only find the needle by looking at a picture of the needle.

Then comparing that picture to every. single. piece. of. hay. one. by. one.

Did I mention the stack of hay is 230 MILLION songs deep?

So with this huge haystack comparison problem, the engineers at Shazam HQ got smart:

When you press SHAZAM!, your phone’s microphone starts listening.

Then Shazam produces a spectrogram of the sound waves it receives.

That’s time on the x-axis.

Frequency on the y-axis.

And loudness on the z-axis. Or, in this case, the depth of color from black to yellow.
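
If you want to see what that looks like in code, here’s a minimal sketch of a spectrogram in plain Python. It’s not Shazam’s actual code: the function name `spectrogram`, the frame size, the hop, and the 1,000 Hz test tone are all made up for illustration, and the slow textbook DFT stands in for the fast FFT a real app would almost certainly use.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Chop audio into overlapping frames and measure loudness per frequency.

    Returns a list of frames (time, the x-axis). Each frame is a list of
    magnitudes, one per frequency bin (frequency, the y-axis). The magnitude
    itself is the loudness (the color).
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]

        # Hann window: fade the edges of the frame so the cut-off
        # doesn't smear energy across every frequency bin.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]

        # Naive discrete Fourier transform. Only the first half of the
        # bins is kept, because the second half mirrors it for real input.
        magnitudes = []
        for k in range(frame_size // 2):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Assumed toy input: a pure 1,000 Hz tone sampled at 8,000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [
    math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)
]

spec = spectrogram(samples)

# Find the loudest frequency bin in the first frame and convert it to Hz.
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
loudest_hz = loudest_bin * sample_rate / 64  # 64 = frame_size used above
print(len(spec), "time slices,", len(spec[0]), "frequency bins")
print("loudest frequency in the first slice:", loudest_hz, "Hz")
```

Each row of `spec` is one slice of time, each column is one frequency band, and the number in the cell is the loudness that would get painted as a color.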

With this data, Shazam can distill any song into a fingerprint: