GNOME Bugzilla – Bug 753275
Opus: playback gain always relative to EBU R128 level (-23db LUFS), must be played 5db louder to match ReplayGain level (-18db LUFS) when using ReplayGain
Last modified: 2018-11-03 11:39:57 UTC
(Not sure if this is the correct place to ask for this.) I use Clementine (GStreamer-based) and apply ReplayGain to all my music. While playing around with the new Opus format, I noticed that those files are played much quieter than music in other formats. After digging around, I found that Opus files are by design always normalized to -23 dB LUFS (EBU R128), while ReplayGain defaults to -18 dB LUFS. Foobar2000 seems to simply preamp these files by +5 dB when playing. Could GStreamer do the same? Or maybe take an argument for the reference level to use, so players can set and forget it? (Maybe related: Bug 751534)
I'm not sure I get the problem, but this would normally be the player's choice. opusdec has an option to automatically scale audio based on the header's stated gain (enabled by default). Opus can have a secondary gain in the comment header, which is not used by opusdec, but may be used by a player as it appears in a tag. Now, it may be that the automatic scaling opusdec does is wrong, I'm not familiar with the use of replay gains so I'm not sure how I'd tell.
I reported this to Clementine first but was told to report it here. A cursory look at the source code and a comment tells me that the player sets up this pipeline: "queue ! audioconvert ! <caps32> ! rgvolume ! rglimiter ! audioconvert2 ! tee [etc.]". Grepping the code further doesn't reveal any other handling of ReplayGain, so it seems all responsibility is handed off to the rg* components of GStreamer. The problem is that GStreamer correctly turns down the volume while playing an Opus file, but undershoots the mark by 5 dB. The playback gain header of Opus files is tied to the EBU R128 specification, which means turning down to a reference level of -23 dB LUFS. ReplayGain defaults to -18 dB LUFS. REPLAYGAIN tags like those in other formats are forbidden by the Opus specification, so Opus files need special treatment by ReplayGain handlers. I know nothing of the inner workings of GStreamer, but it seems to me that rgvolume should catch this and adjust the volume accordingly.
PS: https://tools.ietf.org/id/draft-ietf-codec-oggopus-08.txt pages 12, 13, 23
Do you have a small sample file with this setup ?
Created attachment 350681 [details] Ogg Vorbis sample, ReplayGained
WAV sample from http://soundbible.com/2164-Steam-Train.html, converted to Ogg Vorbis using oggenc2.exe from http://www.rarewares.org/ogg-oggenc.php#oggenc-libvorbis, ReplayGained with Foobar2000.
Created attachment 350682 [details] Opus sample, ReplayGained with Foobar2000
WAV sample from http://soundbible.com/2164-Steam-Train.html, converted to Opus using opusenc.exe from http://opus-codec.org/downloads/, ReplayGained with Foobar2000.
Thanks. When this bug is fixed, these two samples are supposed to sound the same, volume wise, right ?
Correct.
Created attachment 350789 [details] [review] taglist: add R128_TRACK_GAIN
Created attachment 350790 [details] [review] vorbistag: add R128_TRACK_GAIN
Created attachment 350791 [details] [review] rgvolume: add R128_TRACK_GAIN
Created attachment 350792 [details] [review] rgvolume: add R128_TRACK_GAIN
Make the 5 a define with comment
Hmm. I think I broke it when cleaning up, the Opus one sounds louder now. Checking...
Played without rgvolume, those sound about the same, I think. The Vorbis one has a track gain of -8.79, the Opus one of 0. If I change the code to offset by -5 (the R128 difference from current replay gain) minus 3.79, I get something that sounds about the same volume. Is the R128_TRACK_GAIN in the Opus file really supposed to be 0?
I'm no expert here, but opusinfo tells me that while R128_TRACK_GAIN is 0, playback gain is -13.9922 dB, so I guess it's correct? I read somewhere that since all (?) players honor the playback gain but not necessarily the tag, the Opus people wanted the gain encoded mainly into the playback gain field. Here's the most current RFC I could find: https://tools.ietf.org/rfc/rfc7845.txt

Page 14, regarding the output gain field:

"""
Players and media frameworks SHOULD apply it by default. If a player chooses to apply any volume adjustment or gain modification, such as the R128_TRACK_GAIN (see Section 5.2), the adjustment MUST be applied in addition to this output gain in order to achieve playback at the normalized volume.
"""

Page 24 talks about the tags:

"""
If present, R128_TRACK_GAIN and R128_ALBUM_GAIN MUST correctly represent the R128 normalization gain relative to the 'output gain' field specified in the ID header. If a player chooses to make use of the R128_TRACK_GAIN tag or the R128_ALBUM_GAIN tag, it MUST apply those gains _in addition_ to the 'output gain' value. If a tool modifies the ID header's 'output gain' field, it MUST also update or
"""

If I understand correctly, you need to add the output gain and the track/album gain, and then add 5 dB to that to get to what ReplayGain does.

Also, I think you should expand on your comment on the 5 a bit. Something like "We normalize to a reference volume of -18 dB LUFS because that's what early ReplayGain implementations did; the Opus file format instead defers to the EBU R128 standard of volume normalization, which normalizes to -23 dB LUFS, so we have a difference of 5 dB that we need to add to be consistent with the ReplayGained content in the wild."
There were two additional things:

- opusparse is nuking the header gain, which was showing up as 0. This was introduced by 51edbeb9d94b2a85b303097759af7cf35a344cb5, and this patch fixes it (though probably breaks what slomo changed):

diff --git a/ext/opus/gstopusparse.c b/ext/opus/gstopusparse.c
index 56e8bb8..41007b3 100644
--- a/ext/opus/gstopusparse.c
+++ b/ext/opus/gstopusparse.c
@@ -366,7 +366,7 @@ gst_opus_parse_parse_frame (GstBaseParse * base, GstBaseParseFrame * frame)
   }
 
   if (!(frame->flags & GST_BASE_PARSE_FRAME_FLAG_QUEUE)) {
-    if (FALSE && parse->id_header && parse->comment_header) {
+    if (/*FALSE &&*/ parse->id_header && parse->comment_header) {
       guint16 pre_skip;
 
       gst_buffer_map (parse->id_header, &map, GST_MAP_READWRITE);

This causes the header to be parsed again, and thus use the real gain from the header, instead of assuming 0.

- since there is no peak specified, rgvolume will assume 1.0, and refuse to increase volume past what it sees as potentially clipping. If I add headroom=200 to get rid of that check, volume with the sample you gave does increase by the 5 dB (or at least the Vorbis and Opus versions sound similar in loudness to me).

slomo: ideally, I reckon we should parse the id_header (if we have one) in the else branch (the one that's always taken currently) and use preskip (which AFAICT is always 0 now) and gain (and possibly other fields) when creating the new id header (since I don't really know why it's now re-created if we have one).
Created attachment 351171 [details] [review] opusparse: do not drop preskip and gain from OpusHead
Well, I've made a patch which parses preskip and gain without interfering with slomo's changes.
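For readers unfamiliar with the header in question, here is a minimal sketch of reading the two fields from an OpusHead packet. It is not taken from the attached patch; the struct and function names are hypothetical. The byte offsets follow the ID header layout in RFC 7845 (pre-skip at bytes 10-11, output gain at bytes 16-17, both little-endian, the gain signed and in Q7.8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two fields discussed in this bug. */
typedef struct
{
  uint16_t pre_skip;            /* samples to discard at start of stream */
  int16_t output_gain_q78;      /* gain in Q7.8 fixed point, in dB */
} OpusHeadInfo;

/* Parse pre-skip and output gain from an OpusHead packet.
 * Returns 1 on success, 0 if the buffer is too short or lacks the magic. */
static int
parse_opus_head (const uint8_t * data, size_t size, OpusHeadInfo * out)
{
  unsigned int raw_gain;

  assert (data != NULL && out != NULL);

  /* The fixed part of the ID header is 19 bytes long. */
  if (size < 19 || memcmp (data, "OpusHead", 8) != 0)
    return 0;

  out->pre_skip = (uint16_t) (data[10] | (data[11] << 8));

  /* Reassemble the little-endian value, then sign-extend it by hand. */
  raw_gain = (unsigned int) data[16] | ((unsigned int) data[17] << 8);
  out->output_gain_q78 = (int16_t) (raw_gain >= 0x8000u
      ? (int) raw_gain - 0x10000 : (int) raw_gain);

  return 1;
}
```

Dividing output_gain_q78 by 256.0 gives the gain in dB; a stored value of -3582 corresponds to the -13.9922 dB that opusinfo reports for the attached sample.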
Review of attachment 350789 [details] [review]:

::: gst/gsttaglist.c
@@ +409,3 @@
+  gst_tag_register_static (GST_TAG_R128_TRACK_GAIN, GST_TAG_FLAG_META,
+      G_TYPE_DOUBLE, _("EBU-128 replaygain track gain"),
+      _("EBU-128 track gain in db"), NULL);

It's called EBU-R128, or not?

::: gst/gsttaglist.h
@@ +1089,3 @@
+ * GST_TAG_R128_TRACK_GAIN:
+ *
+ * track gain in db using EBU-128 (double)

It's dB, not db

@@ +1090,3 @@
+ *
+ * track gain in db using EBU-128 (double)
+ */

Since: 1.14

@@ +1091,3 @@
+ * track gain in db using EBU-128 (double)
+ */
+#define GST_TAG_R128_TRACK_GAIN "r128-track-gain"

ebu-r128-track-gain?
Review of attachment 350789 [details] [review]:

::: gst/gsttaglist.h
@@ +1087,3 @@
 #define GST_TAG_PRIVATE_DATA "private-data"
 
+/**
+ * GST_TAG_R128_TRACK_GAIN:

And here of course also GST_TAG_EBU_R128_TRACK_GAIN
Review of attachment 350792 [details] [review]:

::: gst/replaygain/gstrgvolume.c
@@ +97,3 @@
+/* R128 needs 5 dB more to match usual replay gain semantics */
+#define R128_GAIN_ADJUSTMENT 5

It's just replaygain offset by 5dB? Seems unlikely? Are you sure it's the same algorithm for measuring/calculating it?
They're different algorithms and can come to different conclusions, the 5 dB are just the difference between the reference levels. There are problem samples that show differences of 10 dB (search Hydrogenaudio). However: Foobar2000 seems to use the EBU R128 algorithm in newer versions and still writes the results to ReplayGain tags. So you never know what algorithm was actually used by looking at whatever tags. And the difference for the majority of audio out there is probably small. You could lower ReplayGain things by 5 dB, too, it would just go against what users might be accustomed to ;)
See https://hydrogenaud.io/index.php/topic,109076.msg899298.html#msg899298 and the rest of the thread for more info.
Great, that sounds like foobar2000 does it very wrong then :P I'd prefer to use the correct calculations for each, and make sure to keep them separately everywhere.
Well, uh, you're not losing anything by using a good enough statistical match for a dB-difference? All you can really do on your end here is, I think, settling on one reference volume level. And maybe offer a shiny new tool that scans your music and video collection and applies the correct (R128) algorithm so I don't have to use Foobar2000 anymore on my Fedora box :) Also, what other tags should Foobar2000 write to, given that tag support is spotty as is (which is why the Opus people firm-coded it into the spec) and nobody wanted to invent a new tag for all supported container formats? I think mixing them is okay, since, sadly, only technical people seem to know about volume normalization and most RG tags are probably homegrown anyway.
Well, EBU R128 specifies how to measure the value, and so does ReplayGain. It seems wrong to call something EBU R128 or ReplayGain if the algorithm used is different. While the difference shouldn't matter much in practice, it just does not seem right. However elements can of course opt-in to handle both tags the same (+/- 5dB offset).
Created attachment 351417 [details] [review] taglist: add R128_TRACK_GAIN
Created attachment 351418 [details] [review] vorbistag: add R128_TRACK_GAIN
Created attachment 351419 [details] [review] vorbistag: add R128_TRACK_GAIN
Created attachment 351420 [details] [review] rgvolume: add R128_TRACK_GAIN
Just renames, no algorithm change.
commit 413406d28a39a088b4a4749af5f0523fcf92f094
Author: Vincent Penquerc'h <vincent.penquerch@collabora.co.uk>
Date:   Fri May 5 11:05:40 2017 +0100

    opusparse: do not drop preskip and gain from OpusHead header

    https://bugzilla.gnome.org/show_bug.cgi?id=753275
The attached patches are missing the R128_ALBUM_GAIN tag, see the RFC I linked to above.

> However elements can of course opt-in to handle both tags the same (+/- 5dB offset).

Which means that all clients would need to be updated to handle this?
(In reply to Nikolaus Waxweiler from comment #31)
> > However elements can of course opt-in to handle both tags the same (+/- 5dB offset).
>
> Which means that all clients would need to be updated to handle this?

Not really. From the application point of view, they should just use elements that automatically apply the gain as needed and support both. Or for media creation, use one element to generate both tags at once and then have muxers/encoders store those in any way they can.
Created attachment 351526 [details] [review] taglist: add R128_TRACK_GAIN
Created attachment 351527 [details] [review] vorbistag: add R128_TRACK_GAIN
Created attachment 351529 [details] [review] rgvolume: add R128_TRACK_GAIN
I found the algorithm in ITU-R BS.1770-2. The resulting loudness of the Opus sample is noticeably lower than that of the Vorbis sample when using it, however.
Created attachment 351702 [details] [review] rgvolume: add R128_TRACK_GAIN
Oops, forgot to add header file changes in the patch.
What remains here?
The rgvolume patch has a big #if 0 block, and Vincent said that it behaves noticeably differently with Vorbis and Opus, which needs investigation. Also, from what I can see, so far only rgvolume has the different algorithm. rganalysis should probably get the same too.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/210.