GNOME Bugzilla – Bug 786539
Ensure mp3 extractor doesn't step out of mmap'ped memory
Last modified: 2017-08-22 16:56:05 UTC
Coverity taints the big endian->host format translations done for ID3, which ends up in warnings because those sometimes contain offsets within the file (CID #162187). It seems to me all calculations already prevent any jumps out of mmap'ped memory boundaries (NB: This may happen on legit cases if tags exceed the 5MB limit we read, this is feasible with eg. files containing multiple album art tags). However, I'm attaching some patches that straighten out the checks a bit and make others to happen explicitly.
Created attachment 358013 [details] [review] extract/mp3: Check for experimental flag right after fetching flags No need to do further operations here if the experimental flag is enabled.
Created attachment 358014 [details] [review] extract/mp3: Make clearer checks on offsets read from file Put these on one side of the operand without added values. Spotted by Coverity (CID #162187) (Tentative fix)
Created attachment 358015 [details] [review] extract/mp3: Assert that the reader position is within mmap boundaries The 'pos' variable is always at the start of a frame, which means pos can't surpass the mmap size minus the frame size. This just happens implicitly, and is a programming error if we let that ever happen, so let's make it really sure this isn't possible.
Review of attachment 358013 [details] [review]: Looks good. ::: src/tracker-extract/tracker-extract-mp3.c @@ +1838,3 @@ + /* We don't handle experimental cases */ + if (experimental) { + g_message ("[v24] Experimental MP3s are not extracted, doing nothing"); ooi, shouldn’t this be g_debug() rather than g_message()? (That would be a change for a separate patch, if at all.)
I think this code would benefit from merging parse_id3v23() and parse_id3v24(). From comparing their implementations, the only differences are: • The sub-calls to get_id3v*_tags(). • 5 lines of code around ‘We found a IDv2 footer’ in the v24 implementation. • A missing `continue` after “[v23] Content size was 0, moving to next frame” in the v23 implementation (I suspect this is a bug?). Similarly, get_id3v*_tags() look almost identical, apart from a missing (offset >= csize) check in the get_id3v23_tags() implementation (ID3V24_COMM case).
Review of attachment 358014 [details] [review]: This looks good to me. The only non-obvious change is for (pos + frame_size + csize > size) ⇒ (csize > size - frame_size - pos), where we want (frame_size + pos <= size) otherwise we underflow. That’s guaranteed, however, by the fact that: • (pos + frame_size <= tsize + header_size) from the start of the loop • (tsize + header_size <= size) from before entering the loop • Hence (pos + frame_size <= size)
(In reply to Philip Withnall from comment #5) > I think this code would benefit from merging parse_id3v23() and > parse_id3v24(). From comparing their implementations, the only differences > are: > • The sub-calls to get_id3v*_tags(). > • 5 lines of code around ‘We found a IDv2 footer’ in the v24 implementation. > • A missing `continue` after “[v23] Content size was 0, moving to next > frame” in the v23 implementation (I suspect this is a bug?). > > Similarly, get_id3v*_tags() look almost identical, apart from a missing > (offset >= csize) check in the get_id3v23_tags() implementation (ID3V24_COMM > case). Same also for parse_id3v20(), actually.
Review of attachment 358015 [details] [review]: ++
Thanks Philip for the extra eyes :). (In reply to Philip Withnall from comment #4) > ::: src/tracker-extract/tracker-extract-mp3.c > @@ +1838,3 @@ > + /* We don't handle experimental cases */ > + if (experimental) { > + g_message ("[v24] Experimental MP3s are not extracted, doing nothing"); > > ooi, shouldn’t this be g_debug() rather than g_message()? Tracker executables have a special log handler that also manage visibility of g_message(). Would be nice to reduce/conflate logging levels but not too high priority :). (In reply to Philip Withnall from comment #5) > I think this code would benefit from merging parse_id3v23() and > parse_id3v24(). From comparing their implementations, the only differences > are: > • The sub-calls to get_id3v*_tags(). > • 5 lines of code around ‘We found a IDv2 footer’ in the v24 implementation. > • A missing `continue` after “[v23] Content size was 0, moving to next > frame” in the v23 implementation (I suspect this is a bug?). Oops, it is. > > Similarly, get_id3v*_tags() look almost identical, apart from a missing > (offset >= csize) check in the get_id3v23_tags() implementation (ID3V24_COMM > case). It would indeed be nice to put all that together... IIRC there's also frame/header incompatibilities, at least between ID3v20 and 2.3/2.4, but still easy enough to merge in a single codepath. (In reply to Philip Withnall from comment #6) > Review of attachment 358014 [details] [review] [review]: > > This looks good to me. The only non-obvious change is for (pos + frame_size > + csize > size) ⇒ (csize > size - frame_size - pos), where we want > (frame_size + pos <= size) otherwise we underflow. That’s guaranteed, > however, by the fact that: > • (pos + frame_size <= tsize + header_size) from the start of the loop > • (tsize + header_size <= size) from before entering the loop > • Hence (pos + frame_size <= size) Exactly :). The last patch making that an assert() should clear up all doubts to the casual reader.
Attachment 358013 [details] pushed as d7643db - extract/mp3: Check for experimental flag right after fetching flags Attachment 358014 [details] pushed as be8b267 - extract/mp3: Make clearer checks on offsets read from file Attachment 358015 [details] pushed as 2f3d04c - extract/mp3: Assert that the reader position is within mmap boundaries
(In reply to Carlos Garnacho from comment #9) > Thanks Philip for the extra eyes :).