Short-time Fourier transform

A spectrogram, computed with the short-time Fourier transform (STFT), is a visual representation of how the frequency content of a signal changes over time. It is built by splitting the signal into short, often overlapping, time segments (windows), applying the Fourier transform to each, and displaying the resulting magnitudes as a heatmap with time on the x-axis, frequency on the y-axis, and color intensity representing amplitude. A longer window of duration T gives finer frequency resolution (Δf ≈ 1/T) but poorer time localization; a shorter window does the opposite. This trade-off is a form of the uncertainty principle: a signal cannot be sharply localized in both time and frequency at once.
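The windowing procedure and the resolution trade-off described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the helper name `stft_magnitude`, the Hann window, the hop sizes, and the two-tone test signal are all assumptions chosen for the example.

```python
import numpy as np

def stft_magnitude(signal, sample_rate, window_size, hop_size):
    """Hypothetical helper: magnitude STFT using a Hann window.

    Returns (times, freqs, magnitude), where magnitude has shape
    (frequency bins, frames).
    """
    window = np.hanning(window_size)
    n_frames = 1 + (len(signal) - window_size) // hop_size
    # Cut the signal into overlapping segments and taper each one.
    frames = np.stack([
        signal[i * hop_size : i * hop_size + window_size] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per segment; transpose so frequency is the first axis.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    # Time stamp of each frame, taken at the centre of its window.
    times = (np.arange(n_frames) * hop_size + window_size / 2) / sample_rate
    return times, freqs, magnitude

# Assumed test signal: 1 s at 8 kHz, 440 Hz for the first half, 1760 Hz after.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 440 * t),
                  np.sin(2 * np.pi * 1760 * t))

# Short window: T = 256 / 8000 s, so bins are sample_rate / 256 = 31.25 Hz apart.
times, freqs, mag = stft_magnitude(signal, sample_rate,
                                   window_size=256, hop_size=128)
bin_spacing_short = freqs[1] - freqs[0]

# Window four times longer: bins four times finer, but each frame
# averages over four times as much signal.
times_long, freqs_long, mag_long = stft_magnitude(signal, sample_rate,
                                                  window_size=1024, hop_size=512)
bin_spacing_long = freqs_long[1] - freqs_long[0]

# Strongest frequency bin in the first and last frame of the short-window STFT.
peak_start = freqs[mag[:, 0].argmax()]
peak_end = freqs[mag[:, -1].argmax()]
```

Here the bin spacing equals 1/T exactly; the frequency resolution actually achieved is somewhat coarser, because the Hann taper widens each spectral peak. To view the result as a spectrogram, `mag` (often converted to decibels) can be drawn as a heatmap against `times` and `freqs` with any plotting library.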

The ear and brain together perform a process similar to computing a spectrogram. The cochlea in the inner ear acts like a frequency analyzer, breaking incoming sounds into different frequency components, much as the STFT does. These frequency-specific signals are then sent to the brain, which processes and interprets them as speech, music, or other sounds.

Related: how the sound is produced by software.