[Introduction]
A previous article gave an example of how audio data is affected by post-processing and by the release process, using a master audio source recorded in Hi-Res in April of last year. Specifically, it covered dithering and noise shaping when reducing the bit depth, and showed graphs of how the audio changes when uploaded to YouTube.
However, its coverage of how the codecs used by Internet services change sound quality felt a little undercooked. We thought that more detailed testing would lead to a better understanding, so we conducted a new test, and we hope you find it useful.
In the previous test, we used a step-like input of specific-frequency signals between 20Hz and 160kHz. However, the compression algorithms used by many codecs thin out bits according to the dynamic behavior of the signal, and most of them draw on psychoacoustics, so steady test tones alone do not show how a codec treats real music. Specifically, some methods use equal-loudness curves as the compression threshold, and others exploit the masking effect. In simple terms, the codec thins out sounds that are inaudible to the human ear and omits sounds that are masked by louder ones. Actual compression algorithms use more elaborate processing, combined with lossless entropy coding such as Huffman coding.
Considering the above, the main purpose this time is to upload actual music and let you hear the difference. As additional information, we have prepared graphs that show visually what happens when a 20Hz to 20kHz sweep signal is input. Also, last time we only looked at YouTube, but since many videos are also posted to Facebook and X (formerly Twitter), we cover the change in sound quality on those sites as well.
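For readers who want to try a similar measurement, a 20Hz to 20kHz sweep can be generated with a few lines of Python. This is only a minimal sketch and not the signal used in this article: the logarithmic sweep shape, the 10-second duration, the amplitude, and the names `log_sweep` and `write_wav_24bit` are all assumptions made for illustration.

```python
import math
import struct
import wave


def log_sweep(f_start=20.0, f_end=20000.0, seconds=10.0, rate=48000, amplitude=0.5):
    """Return float samples of a logarithmic sine sweep (assumed sweep shape)."""
    n = int(seconds * rate)
    k = math.log(f_end / f_start)
    samples = []
    for i in range(n):
        t = i / rate
        # Phase is the integral of the instantaneous frequency
        # f_start * exp(k * t / seconds), which rises from f_start to f_end.
        phase = 2.0 * math.pi * f_start * seconds / k * (math.exp(k * t / seconds) - 1.0)
        samples.append(amplitude * math.sin(phase))
    return samples


def write_wav_24bit(target, samples, rate=48000):
    """Write mono float samples in [-1, 1] as 24-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        value = int(round(clipped * 8388607))  # 2**23 - 1
        # Pack as little-endian 32-bit and keep the low three bytes.
        frames += struct.pack("<i", value)[:3]
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(3)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
```

Calling `write_wav_24bit("sweep.wav", log_sweep())` would produce a 24bit/48kHz file that can be used as the audio track of a test video.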
[Verification of differences in audio codecs used when creating videos]
The audio for the video was created by mastering a piano recording (32bit/352.8kHz) made by the author, down-converting it to 24bit/48kHz as appropriate, and then encoding it into the following three types of videos for uploading.
Audio track of each video
・AAC_LC 256kbps
・Opus 256kbps
・PCM 24bit/48kHz (uncompressed)
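For a sense of scale, the data rates of these three formats can be worked out with a little arithmetic. The sketch below is illustrative only: it assumes a stereo (2-channel) track, which this article does not state, and the variable names are made up for the example.

```python
from fractions import Fraction

BIT_DEPTH = 24          # bits per sample
SAMPLE_RATE = 48_000    # samples per second
CHANNELS = 2            # assumption: stereo track
LOSSY_KBPS = 256        # AAC_LC and Opus bitrate used for the test videos

# Uncompressed PCM data rate in kbps.
pcm_kbps = BIT_DEPTH * SAMPLE_RATE * CHANNELS // 1000

# How many times smaller the 256kbps streams are than the PCM stream.
compression_ratio = Fraction(pcm_kbps, LOSSY_KBPS)

# Sample-rate ratio for the 352.8kHz to 48kHz down-conversion.
resample_ratio = Fraction(48_000, 352_800)

print(f"PCM: {pcm_kbps} kbps")
print(f"PCM is {compression_ratio}x the size of a {LOSSY_KBPS} kbps stream")
print(f"48kHz / 352.8kHz = {resample_ratio}")
```

Under the stereo assumption, the PCM track runs at 2304 kbps, nine times the 256 kbps given to AAC_LC and Opus, and the down-conversion ratio is 20/147.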
First, let's look at a graph of a 20Hz to 20kHz sweep signal.
[Changes in sound quality due to YouTube]
Original (PCM 24bit/48kHz)
Audio data created in AAC_LC 256kbps was uploaded to YouTube and then played back.
Audio data created in Opus 256kbps was uploaded to YouTube and then played back.
Audio data created in PCM 24bit/48kHz was uploaded to YouTube and then played back.
The order of least change compared to the original (least deterioration in sound quality) appears to be PCM → Opus → AAC_LC. In particular, from 500Hz upward, AAC_LC shows bleeding over a fairly wide range, and the deterioration in the mid-to-high range is clearly more severe than with PCM and Opus.
On the other hand, it is also interesting to see how audio encoded in Opus is processed by YouTube. Since YouTube itself encodes much of its audio in Opus, you might expect that preparing Opus audio in advance would avoid re-encoding on the YouTube server side. However, as you can see from the graph, there is a difference, although it is slight, between PCM and Opus. Therefore, I think it can be said that even if you upload in Opus, it will be re-encoded.
[Audible differences when uploading music]
Now let's compare the results of uploading the music video with each codec. To make the comparison easier to understand, we extract the difference from the original and compare that. The quieter the difference audio is, and the less of the original music you can make out in it, the less the codec has degraded the sound quality.
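The idea of a difference signal can be sketched in Python as follows. This is not the tool used for this article. It assumes both signals are mono lists of float samples in the range -1 to 1 at the same sample rate and level, and that the upload only adds a short delay. The helper names `best_lag`, `difference`, and `rms_dbfs` are hypothetical.

```python
import math


def best_lag(original, processed, max_lag=64):
    """Estimate the delay (in samples) of `processed` by brute-force correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(original), len(processed) - lag)
        if n <= 0:
            break
        score = sum(original[i] * processed[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best


def difference(original, processed, lag=0):
    """Subtract the original from the delayed copy, sample by sample."""
    n = min(len(original), len(processed) - lag)
    return [processed[i + lag] - original[i] for i in range(n)]


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for silence."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)
```

A lower `rms_dbfs(difference(...))` value means the uploaded audio stayed closer to the original. In practice, sample-accurate alignment and level matching are the hard part, and a real comparison would need more care than this brute-force search.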
*The volume of the difference data is low, so please listen with headphones etc.
What do you think? As the sweep graph suggested, in the difference between AAC_LC and the original, the melody and chords of the song are more noticeable than in the difference for the PCM upload. In other words, I think it can be said that AAC_LC contains more noise and distortion relative to the original.
[Changes in sound quality when posting videos on Facebook]
Next, let's look at what happens when you post a video on Facebook.
As before, let's look at a graph of a 20Hz to 20kHz sweep signal.
Original (PCM 24bit/48kHz)
Audio data created in AAC_LC 256kbps was uploaded to Facebook and then played back.
Audio data created in Opus 256kbps was uploaded to Facebook and then played back.
Audio data created in PCM 24bit/48kHz was uploaded to Facebook and then played back.
The order of least change (least deterioration in sound quality) compared to the original appears to be PCM → AAC_LC → Opus. What all three have in common is that they are significantly worse than the versions uploaded to YouTube. Although it is unclear what process causes this, several thin lines that look like reflections of the original sound can be seen across the entire range, and from 8kHz upward a band of overlapping distortion appears at a fairly high level, as if scribbled with a magic marker. Also, the high range is not reproduced up to 20kHz and is cut off at around 17.5kHz.
It is noteworthy that the audio encoded with Opus is the most degraded, with distortion components spreading widely not only in the high frequencies but also in the low frequencies, which makes Opus clearly the worst choice for posting videos to Facebook. AAC_LC does not show a significant difference compared to PCM, which is likely related to the fact that Facebook's server-side encoding is currently AAC_LC. However, the audio appears to be re-encoded even when the codec is the same, and it can be confirmed that distortion around 2kHz to 7kHz has increased with AAC_LC.
Now, let's test it with music data, just like we did with YouTube.
*The volume of the difference data is low, so please listen with headphones etc.
[Changes in sound quality due to video posting on X (formerly Twitter)]
Finally, let's look at the change in sound quality caused by video posts on X (formerly Twitter). X's video posts did not accept PCM or Opus audio, so we will only post videos encoded in AAC_LC.
Let's look at a graph of a 20Hz to 20kHz sweep signal.
Original (PCM 24bit/48kHz)
Audio data created in AAC_LC 256kbps was uploaded to X and then played back.
The degradation caused by X's encoding was not as bad as I expected, and the sound was relatively clean, with significantly less degradation than on Facebook. However, the high frequencies were limited to 15kHz and were cut off before reaching 20kHz.
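A high-frequency cutoff such as the 15kHz limit on X or the 17.5kHz limit on Facebook can also be checked numerically, without a spectrogram. The sketch below uses the Goertzel algorithm to measure the power at a single frequency and compares the original with the downloaded audio. It is an illustrative approach rather than the method used for the graphs in this article, and the function names and the mono float-sample format are assumptions.

```python
import math


def goertzel_power(samples, rate, freq):
    """Power at `freq` (Hz), scaled so a full-length sine of amplitude A gives about A**2."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    magnitude_sq = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return magnitude_sq / (len(samples) ** 2 / 4.0)


def level_drop_db(original, processed, rate, freq, floor=1e-20):
    """Level change in dB at `freq` between the original and processed signals."""
    p_orig = max(goertzel_power(original, rate, freq), floor)
    p_proc = max(goertzel_power(processed, rate, freq), floor)
    return 10.0 * math.log10(p_proc / p_orig)
```

Evaluating `level_drop_db` at, say, 14kHz, 16kHz, and 18kHz should show a sharp drop at the frequencies above the service's cutoff, while values near 0 dB indicate that the band survived the upload.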
What if you upload music data?
*The volume of the difference data is low, so please listen with headphones etc.
In the case of X, it was not possible to compare different codecs, so try comparing this difference audio with the ones from the YouTube and Facebook uploads. There is little blurring, but the afterimage of the original music is the strongest. Of course, the degree of degradation differs, but each compression algorithm has its own quirks, which I think contribute to the differences in sound quality.
[Conclusion]
Above, we have examined the change in sound quality when uploading videos to YouTube, Facebook, and X, using concrete material. Many people use different social networks depending on the purpose of their posts. Of course, for following a conversation or understanding what is happening in a casual video clip, I think all of the services offer perfectly adequate quality.
However, if you want people to listen to your music content in high quality, you may need to stop and think for a moment.
For example, Facebook and X posts play automatically on the timeline, which gives them an advantage in frequency of exposure and number of views, so they may be useful for initial promotion. It may also be necessary to tailor the content of posts to the user demographics of each social network (age, music genre preferences, etc.).
Taking these things into consideration, there may be cases where you choose a social networking site that sacrifices sound quality but can increase the number of views from a marketing perspective. However, you may also want to keep in mind that you will end up spreading a lot of music that contains significant degradation and noise that is unsuitable for playing music.
We hope that this information will be of use to you as a reference when making your decision.