GNOME Bugzilla – Bug 721676
typefind does not find the correct media type for mpg with http streaming
Last modified: 2018-11-03 12:19:36 UTC
Created attachment 265502 [details] [review] The patch analyzes the data to find the maximum probability until the maximum data size is reached.

For some MPEG streams in the HTTP streaming case, only audio is decoded and no video is displayed, even though the stream contains both audio and video.

Analysis: the issue comes from the typefind element. There, the minimum data size required for parsing is set to 2048 bytes and the maximum to 128*1024 bytes.

Failing case: souphttpsrc hands typefind 2625 bytes as its first buffer. This buffer is parsed to find a suitable media type for further autoplugging. Within that data, five consecutive MP3 frames are found, so typefind declares audio/mpeg with probability 99, which is the best match for that amount of data. As a result only an audio pipeline is built and there is no video at all.

Passing case: when playback works (both video and audio are shown), souphttpsrc first delivers 1165 bytes, which is below the 2048-byte minimum, so the data is stored; soup then delivers another 4096 bytes, giving typefind 5261 bytes in total. With that much data, video/mpeg-sys is declared with probability 100, which leads to autoplugging of the demuxer and of the audio/video decoding paths, so everything works fine.

These figures come from logs and may vary between runs, but the underlying issue stays the same: once some format is found, typefind does not check whether another format might be found with a higher probability if more data were used for parsing. A patch to fix the issue is attached.
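For reference, a minimal C sketch of the gating described above; this is an assumption-laden illustration, not the actual gsttypefindelement.c code. The 2048-byte and 128*1024-byte constants come from the analysis, while try_typefind() and the GST_TYPE_FIND_LIKELY acceptance threshold are illustrative stand-ins for the element's configurable minimum probability.

#include <gst/gst.h>
#include <gst/base/gsttypefindhelper.h>

#define TYPE_FIND_MIN_SIZE 2048           /* minimum bytes before scanning */
#define TYPE_FIND_MAX_SIZE (128 * 1024)   /* upper bound on collected data */

/* Illustrative only: returns caps as soon as any match clears the
 * acceptance threshold.  This is how a 2625-byte first buffer can latch
 * onto audio/mpeg @ 99 and never see the video/mpeg-sys @ 100 match
 * that a larger scanning window would produce. */
static GstCaps *
try_typefind (GstElement * elem, const guint8 * data, gsize size)
{
  GstTypeFindProbability prob = GST_TYPE_FIND_NONE;
  GstCaps *caps;

  if (size < TYPE_FIND_MIN_SIZE)
    return NULL;                /* too little data: store it and wait */

  /* run all registered typefind functions over the collected data */
  caps = gst_type_find_helper_for_data (GST_OBJECT (elem), data, size, &prob);

  if (caps != NULL && prob >= GST_TYPE_FIND_LIKELY)
    return caps;                /* good enough: emit have-type and stop */

  gst_caps_replace (&caps, NULL);
  return NULL;
}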
Comment on attachment 265502 [details] [review] The patch analyzes the data to find the maximum probability until the maximum data size is reached.

Please update this patch so that it applies cleanly against 1.2 or git master; 0.10 has not been maintained for a very long time. Chances are also good that this is already fixed in 1.2, so please provide a test case or check whether that is the case.
This is still reproducible with the latest git master code. I will submit a new patch for it.
Created attachment 265818 [details] [review] Patch against git master. New patch set created against git master.
I am getting an error message because the test file is larger than 1600 kB, the maximum size allowed for non-patch attachments.
Maybe you could upload it somewhere else? Or maybe the bug is reproducible with just a small part of the clip? (head --bytes=1500k foo.mpg > head.mpg)
Created attachment 265893 [details] Test stream Test stream attached.
Thank you for the patch and the test file. I can confirm the issue with git master.
Thanks for confirmation and suggestion for uploading the file.
Hello Tim, is the patch OK to go ahead with?
If I'm not mistaken this is not very efficient... if we get hundreds of little buffers, we will combine them into a single big one, one by one, copying data over and over again. And then we typefind over that data over and over again, every time with a little bit more data at the end. This doesn't seem like a great default behaviour.
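To make the concern concrete, here is a rough sketch assuming the patch concatenates each incoming buffer onto one growing buffer and re-runs typefinding over all of it; the callback name and the use of gst_buffer_append() are illustrative, not taken from the patch.

#include <gst/gst.h>
#include <gst/base/gsttypefindhelper.h>

static GstBuffer *collected = NULL;

/* Illustrative: called for every incoming buffer, taking ownership. */
static void
on_new_buffer (GstElement * elem, GstBuffer * buf)
{
  GstTypeFindProbability prob = GST_TYPE_FIND_NONE;
  GstMapInfo map;
  GstCaps *caps;

  /* gst_buffer_append() chains the memory blocks; mapping the result
   * below merges them into one contiguous region, copying everything
   * received so far.  N small buffers therefore cost O(N^2) bytes
   * copied in total. */
  collected = collected ? gst_buffer_append (collected, buf) : buf;

  if (!gst_buffer_map (collected, &map, GST_MAP_READ))
    return;

  /* ...and every call re-scans all of the data from the beginning. */
  caps = gst_type_find_helper_for_data (GST_OBJECT (elem), map.data,
      map.size, &prob);

  gst_buffer_unmap (collected, &map);
  if (caps != NULL)
    gst_caps_unref (caps);
}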
(In reply to comment #10)
> If I'm not mistaken this is not very efficient... if we get hundreds of little
> buffers, we will combine them into a single big one... one by one, copying data
> over and over again. And then typefind over that data over and over again,
> every time with a little bit of more data in the end.
>
> This doesn't seem like a great default behaviour

Thanks, Sebastian. Do you have any suggestion on how to proceed? How should this use case be handled?
(In reply to comment #10)
> If I'm not mistaken this is not very efficient... if we get hundreds of little
> buffers, we will combine them into a single big one... one by one, copying data
> over and over again. And then typefind over that data over and over again,
> every time with a little bit of more data in the end.
>
> This doesn't seem like a great default behaviour

Since we know that the amount of data typefind uses is not sufficient to detect the media type correctly for some streams, I suppose it would be better to add a maximum data size property to typefind for detection. An application could then simply set that property for outlier streams that don't play with the default typefind data size.
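A hedged usage sketch of that proposal, assuming a hypothetical "max-size" property on typefind; no such property exists today, and the function name is illustrative.

#include <gst/gst.h>

/* Hypothetical sketch only: "max-size" is an assumed property name for
 * the proposal above; today's typefind element has no such property. */
static void
configure_typefind_for_outlier_stream (GstElement * typefind)
{
  /* Only streams known to be misdetected with the default 2048-byte
   * window would need this; 128 KiB matches typefind's internal upper
   * bound mentioned in the analysis above. */
  g_object_set (typefind, "max-size", 128 * 1024, NULL);
}

In a hand-built pipeline the element would come from gst_element_factory_make ("typefind", NULL); with playbin or decodebin the application would first have to locate the internal typefind instance.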
(In reply to comment #10)
> If I'm not mistaken this is not very efficient... if we get hundreds of little
> buffers, we will combine them into a single big one... one by one, copying data
> over and over again. And then typefind over that data over and over again,
> every time with a little bit of more data in the end.
>
> This doesn't seem like a great default behaviour

Hi Sebastian, with the current default implementation we already iterate again and again with more data until we find a probability greater than the minimum probability. The minimum probability is not always sufficient for finding the correct media type (as in the reported case). In the patch I extend the same approach to look for the maximum probability over the given scanning range of data; when a 100% probability is found in between, we simply break and declare the type found. For most cases a 100% probability should be found within very few iterations.

BR/satish
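For illustration, a minimal sketch of the search loop described above, assuming the data has already been collected; the attached patch instead accumulates incoming buffers, but the stopping idea is the same: remember the best match and stop early at GST_TYPE_FIND_MAXIMUM or at the size limit. The function name and the doubling window are assumptions.

#include <gst/gst.h>
#include <gst/base/gsttypefindhelper.h>

#define SCAN_MAX_SIZE (128 * 1024)

/* Illustrative sketch, not the attached patch: scan growing prefixes of
 * the collected data, keep the most probable result, and stop early on
 * a 100% match. */
static GstCaps *
find_best_type (GstElement * elem, const guint8 * data, gsize size)
{
  GstTypeFindProbability best_prob = GST_TYPE_FIND_NONE;
  GstCaps *best_caps = NULL;
  gsize limit = MIN (size, (gsize) SCAN_MAX_SIZE);
  gsize win = 2048;             /* start with the current minimum size */

  while (win <= limit) {
    GstTypeFindProbability prob = GST_TYPE_FIND_NONE;
    GstCaps *caps =
        gst_type_find_helper_for_data (GST_OBJECT (elem), data, win, &prob);

    /* remember the most probable result seen so far */
    if (caps != NULL && prob > best_prob) {
      gst_caps_replace (&best_caps, caps);
      best_prob = prob;
    }
    if (caps != NULL)
      gst_caps_unref (caps);

    /* 100%, or nothing left to scan: stop */
    if (best_prob >= GST_TYPE_FIND_MAXIMUM || win == limit)
      break;

    win = MIN (win * 2, limit); /* widen the window and try again */
  }

  /* e.g. video/mpeg, systemstream=true at probability 100 */
  return best_caps;
}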
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/48.