Sampling

Pulse Code Modulation (PCM) as a method to digitize sound was proposed by Alec Reeves in 1937. It had to wait until the transistor and the IC became available at low cost before it was successfully introduced in the consumer market. Almost all of today's digital audio is PCM audio.

Sampling and quantization of a signal (red) for 4-bit PCM

The principle is simple; the analogue signal is measured at uniform intervals.

The number of measurements taken per second is called the sample rate.

The magnitude of the signal at each measurement is translated into a numeric value, and as it is a number, it can be represented as a binary number. These are the bits.
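The two steps above (measure at uniform intervals, then turn each magnitude into a binary number) can be sketched in a few lines of Python. This is only an illustration: the function name, the 1 kHz test tone, the 8 kHz sample rate and the simple rounding scheme are assumptions chosen to keep the example small, not how any particular converter works.

```python
import math

def sample_and_quantize(signal, sample_rate, duration, bits):
    """Measure `signal` at uniform intervals and map each value to an integer code.

    `signal` is a function of time (in seconds) returning values in [-1.0, 1.0].
    """
    levels = 2 ** bits                          # 4 bits give 16 possible codes
    n_samples = int(round(sample_rate * duration))
    codes = []
    for n in range(n_samples):
        t = n / sample_rate                     # uniform sampling instants
        x = signal(t)
        # Map [-1, 1] onto 0 .. levels-1 and round to the nearest code.
        codes.append(round((x + 1.0) / 2.0 * (levels - 1)))
    return codes

def tone(t):
    """Hypothetical test input: a 1 kHz sine wave at full scale."""
    return math.sin(2 * math.pi * 1000 * t)

codes = sample_and_quantize(tone, sample_rate=8000, duration=0.001, bits=4)
bit_patterns = [format(c, "04b") for c in codes]   # each code written as 4 bits
print(codes)
print(bit_patterns)
```

With 4 bits every sample ends up as one of only 16 codes, which is why the quantized signal looks like a staircase.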

Sampling rate

According to the Nyquist-Shannon sampling theorem, you must sample at more than twice the highest frequency present in the input.
If you sample at 44.1 kHz, any frequency above the Nyquist frequency (22.05 kHz) is captured incorrectly, as you get fewer than 2 sample points per cycle.
To avoid this, the input must be band limited. This is a matter of using a low pass filter.
If you want to cover the entire audible range, up to 20 kHz, you need a very steep filter (brick wall) to keep out signals above Nyquist.
This filter might have a negative effect on sound quality, just like the brick wall filters in the first generation of CD players.
If you sample at 88.2 kHz, the problem remains the same: no frequencies above Nyquist, please.
This time Nyquist is at 44.1 kHz.
As there aren’t many instruments producing frequencies this high and as our hearing stops at 20 kHz, we can now use a gentle low pass filter that starts at e.g. 30 kHz and rolls off smoothly.
We avoid the artifacts of brick wall filters this way.
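To put numbers on this, here is a small sketch of what happens to an unfiltered tone above Nyquist and of how much room the low pass filter has in each case. The function names and the 25 kHz example tone are assumptions made for illustration; the folding rule itself is the standard one for sampled signals.

```python
def alias_frequency(f, fs):
    """Frequency at which a tone of `f` Hz shows up after sampling at `fs` Hz."""
    f = f % fs                       # sampling cannot tell f apart from f + k*fs
    return fs - f if f > fs / 2 else f

def transition_band_hz(filter_start, nyquist):
    """Room the low pass filter has between its start and the Nyquist frequency."""
    return nyquist - filter_start

# A 25 kHz tone that slips past the filter:
print(alias_frequency(25_000, 44_100))          # 19100, inside the audible range
print(alias_frequency(25_000, 88_200))          # 25000, captured correctly

# Room for the filter to do its work:
print(transition_band_hz(20_000, 44_100 / 2))   # 2050.0
print(transition_band_hz(30_000, 88_200 / 2))   # 14100.0
```

At 44.1 kHz the filter has about 2 kHz to go from passing everything to blocking everything. At 88.2 kHz, starting at 30 kHz, it has about 14 kHz.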

A direct effect of the higher sampling rate is that for an identical filter design the time
displacements will scale inversely with sample rate. Hence an improvement can be
made just from raising the sample rate - even for those who cannot hear above
20kHz.
Julian Dunn

Bit depth

Each sample is represented by a number of bits. In the case of CD audio, this number is 16. The entire range from silence to the highest possible signal is divided into 2^16 = 65,536 steps. Using 24 bits, the same range is divided into 2^24 = 16,777,216 steps.
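The step counts are easy to check, and they translate directly into a dynamic range figure. The sketch below assumes the usual 20*log10 convention for expressing an amplitude ratio in dB; the function names are made up for this example.

```python
import math

def quantization_steps(bits):
    """Number of distinct values a sample of `bits` bits can take."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio between full scale and one step, expressed in dB."""
    return 20 * math.log10(quantization_steps(bits))

for bits in (8, 16, 24):
    print(bits, quantization_steps(bits), round(dynamic_range_db(bits), 1))
```

Each extra bit doubles the number of steps, which adds about 6 dB. This gives roughly 96 dB for 16 bits and roughly 144 dB for 24 bits.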

 

8 bit, 16 bit and 24 bit


Today almost all recordings are done in 24 bits.
You always need some headroom when recording, as you can’t predict the loudest passage. Recording at 16 bits with 12 dB of headroom (1 bit is 6.02 dB) means you are effectively recording with 16-2=14 bits of resolution. Do the same at 24 bits and you still have 22 bits of resolution.
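The headroom arithmetic can be written out as a short sketch. The function name is an assumption for this example; the 6.02 dB per bit figure is 20*log10(2).

```python
import math

DB_PER_BIT = 20 * math.log10(2)      # about 6.02 dB per bit

def effective_bits(bit_depth, headroom_db):
    """Bits left for the signal after reserving `headroom_db` below full scale."""
    return bit_depth - headroom_db / DB_PER_BIT

print(round(effective_bits(16, 12), 2))
print(round(effective_bits(24, 12), 2))
```

12 dB is just under 2 bits, so the results come out very close to 14 and 22 bits.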

Oversampling

Playing digital audio is converting the digital samples to an equivalent analogue value.

This is done by a piece of circuitry aptly called a DAC (Digital to Analogue Converter).

 

Inherent to the mathematics involved in the conversion is that higher-frequency copies of the audio are created, the so-called aliases (also known as images). In the case of CD audio, the sampling rate is 44.1 kHz and the audio range is half of that, so the first alias starts at 22.05 kHz.
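For a single tone the positions of these copies are easy to list: a tone at f sampled at Fs reappears around every multiple of Fs, at k*Fs - f and k*Fs + f. The sketch below assumes an ideal, unfiltered conversion; the function name and the 1 kHz example are chosen for illustration.

```python
def image_frequencies(f, fs, count):
    """First `count` pairs (k*fs - f, k*fs + f) at which a tone of `f` Hz reappears."""
    return [(k * fs - f, k * fs + f) for k in range(1, count + 1)]

print(image_frequencies(1_000, 44_100, 2))
# [(43100, 45100), (87200, 89200)]

# The highest audio frequency (just below 22.05 kHz) produces the lowest copy,
# which is why the first alias starts right at half the sampling rate.
print(image_frequencies(22_000, 44_100, 1))
# [(22100, 66100)]
```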

 

 

These aliases are outside the audible range, so why bother? Although we can't hear them, they are there and can create intermodulation distortion in the audible range. Maybe your tweeters don't like them either.

The answer is a low pass filter (anti alias filter). If this filter starts at 20 kHz and has to remove everything before the first alias starts (22.05 kHz, assuming CD audio), then it has to be very steep (brick wall filter). Filters this steep are expensive and complex, and they introduce all kinds of artifacts like phase distortion and pre-ringing.

 

The big trick is oversampling.
Oversample a 44.1 kHz recording 8 times.
Now it runs at 352.8 kHz.
As it is digital, the content remains the same, so we still have the aliases. But as they still start at 1/2 Fs, the first alias is now at 176.4 kHz.
Oversampling simply pushes the aliases much farther away from the audible range. This allows using a very gentle low pass filter well above the upper limit of our hearing.
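The gain can be expressed as the distance, in octaves, between the top of the audible range and the first alias. This sketch follows the simplified picture used above (first alias at 1/2 Fs); the function names are assumptions for this example.

```python
import math

def oversample(fs, factor):
    """Return the new sample rate and its half (1/2 Fs) after oversampling."""
    new_fs = fs * factor
    return new_fs, new_fs / 2

def octaves_between(f_low, f_high):
    """Distance between two frequencies in octaves (one octave is a doubling)."""
    return math.log2(f_high / f_low)

new_fs, first_alias = oversample(44_100, 8)
print(new_fs, first_alias)                                # 352800 176400.0

print(round(octaves_between(20_000, 22_050), 2))          # without oversampling
print(round(octaves_between(20_000, first_alias), 2))     # with 8x oversampling
```

Without oversampling the filter has a small fraction of an octave to work with. With 8x oversampling it has more than three octaves.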

 

Upsampling

In principle there is no difference between upsampling and oversampling. Both are about stepping up the sample rate.
In practice oversampling is done by a whole-number factor (8 in today’s DACs) and is often implemented in hardware.


Upsampling is often done in software. Say you have a 44.1 kHz audio file and your sound card supports 48 kHz only. Playing 44.1 kHz audio at a rate of 48 kHz is like playing an LP at 45 rpm.
To play it properly, the 44100 samples per second must be transformed into 48000 samples per second. This can only be done by interpolating (guessing the points in between using math).
This is more complex than plain oversampling.
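A minimal sketch of the idea, assuming plain linear interpolation between neighbouring samples. Linear interpolation is used here only because it is the simplest way to show the principle; it is a stand-in, and real sample rate converters use far better interpolation filters. The function name is made up for this example.

```python
from fractions import Fraction

def resample_linear(samples, src_rate, dst_rate):
    """Resample by drawing a straight line between neighbouring input samples."""
    ratio = Fraction(dst_rate, src_rate)       # 48000/44100 reduces to 160/147
    n_out = int(len(samples) * ratio)
    out = []
    for m in range(n_out):
        pos = Fraction(m) / ratio              # output instant on the input time grid
        i = int(pos)                           # input sample just before that instant
        frac = float(pos - i)                  # how far we are towards the next sample
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out

print(Fraction(48_000, 44_100))                # 160/147

ramp = list(range(147))                        # 147 input samples ...
converted = resample_linear(ramp, 44_100, 48_000)
print(len(converted))                          # ... become 160 output samples
```

The ratio 160/147 is not a whole number, so almost every output sample falls between two input samples and has to be computed. With whole-number oversampling, every 8th output sample coincides with an original one.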

Asynchronous sample rate conversion

This topic is covered here
