After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 614479 - [mpegtspacketizer] Use CRC to check if tables are duplicate
[mpegtspacketizer] Use CRC to check if tables are duplicate
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
git master
Other Linux
: Normal normal
: 1.1.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2010-03-31 15:23 UTC by Sebastian Pölsterl
Modified: 2013-06-23 07:07 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Added CRC comparison (2.64 KB, patch)
2010-03-31 15:23 UTC, Sebastian Pölsterl
committed Details | Review
mpegtspacketizer: Better detect already seen section (5.03 KB, patch)
2013-06-19 18:46 UTC, Edward Hervey
none Details | Review
mpegtspacketizer: Better detect already seen section (5.07 KB, patch)
2013-06-20 15:20 UTC, Edward Hervey
none Details | Review

Description Sebastian Pölsterl 2010-03-31 15:23:57 UTC
Created attachment 157595 [details] [review]
Added CRC comparison

The current code just uses table id, subtable extension and version number to check if the section has been seen before. However, this comparison is not sufficient, causing actually new tables being dismissed.

The attached patch retrieves the CRC and compares those as well.

The patch is based on a discussion I had with Zaheer Merali.
Comment 1 Sebastian Dröge (slomo) 2010-04-02 17:53:21 UTC
commit d7ab75abb8dca1f1b09509a98ce5edb441c5e3fd
Author: Sebastian Pölsterl <sebp@k-d-w.org>
Date:   Fri Apr 2 19:52:25 2010 +0200

    mpegtspacketizer: Additionally use the CRC to check if tables are duplicates
    
    The current code just uses table id, subtable extension and version number to
    check if the section has been seen before. However, this comparison is not
    sufficient, causing actually new tables being dismissed.
    
    Fixes bug #614479.
Comment 2 Edward Hervey 2013-06-19 18:37:59 UTC
So actually it seems to be a bit more subtle than that. That fix did avoid "losing" data ... but a side effect is re-posting duplicated data.

I have a stream here capture from Numericable (french cable ISP, 400+ channels) with a *lot* of NIT updates (over 4500 in 30seconds)

while we do indeed need to check for both version AND CRC ... it seems to cycle.

Here's a snippet of logs for a changed section (same PID, same subtable)

4707 version number:6 (previous6) crc 0xd3a293fa (previous:0xe71e8911)
4707 version number:6 (previous6) crc 0x49fc0a8e (previous:0xd3a293fa)
4707 version number:6 (previous6) crc 0x73de41ba (previous:0x49fc0a8e)
4707 version number:6 (previous6) crc 0x509e86da (previous:0x73de41ba)
4707 version number:6 (previous6) crc 0x11291ee (previous:0x509e86da)
4707 version number:6 (previous6) crc 0x7779998f (previous:0x11291ee)
4707 version number:6 (previous6) crc 0x73c9b6da (previous:0x7779998f)
4707 version number:6 (previous6) crc 0xe71e8911 (previous:0x73c9b6da)
4707 version number:6 (previous6) crc 0xd3a293fa (previous:0xe71e8911)
4707 version number:6 (previous6) crc 0x49fc0a8e (previous:0xd3a293fa)
4707 version number:6 (previous6) crc 0x73de41ba (previous:0x49fc0a8e)
4707 version number:6 (previous6) crc 0x509e86da (previous:0x73de41ba)
4707 version number:6 (previous6) crc 0x11291ee (previous:0x509e86da)

Notice how it cycles back to a CRC of 0xd3a293fa with the same version number.

So it seems that, when the version number doesn't change, we should accumulate all past CRC and check against all past CRC for that version number. If we already saw that CRC *AND* version_number, then we can skip parsing it. Else we parse it (since we never saw it).

If only the version number changes, then we obviously parse it.
Comment 3 Edward Hervey 2013-06-19 18:46:34 UTC
Created attachment 247280 [details] [review]
mpegtspacketizer: Better detect already seen section

In some cases (NIT on highly-populated DVB-C operator for example), there
will be more than one section emitted for the same subtable and version
number.

In order not to lose those updates for the same version number, we checked
against the CRC of the previous section we parsed.

The problem is that, while it made sure we didn't lose any information, it
also meant that if the same section came back (same version, same CRC) later
on we would re-process it, re-parse it and re-emit it.

This version improves on that by keeping a list of previously observed CRC
for identical PID/subtable/version-number and will only process sections if
they really were never seen in the past (as opposed to just before).

On a 30s clip, this brings down the number of NIT section parsing from 4541
down to 663.
Comment 4 Edward Hervey 2013-06-20 05:19:53 UTC
After more investigation, we are actually ignoring the section_number and last_section_number of various SI.

For a same PID/table_id/sub_table, and due to the limitation that sections can't be bigger than 1024bytes (or 4096 for EIT), if you want to carry more information than that you need to split it up in several "sub-sections" (not to be confused with sub-tables).

I guess that could be used as an improvement on the above patch to know how many sub-sections are expected, and ensure the CRC has not changed for that specific section_number ... or not and just satisfy ourselves with a list of CRC :)

I think I remembered there being some sections that did not have those fields though (but can't find it yet :( )
Comment 5 Edward Hervey 2013-06-20 06:37:19 UTC
As a reminder, the section_syntax_indicator 1bit flag is what specifies whether the section_number/last_section_number (and some other fields) are present or not.

Wondering if we shouldn't handle this properly. It would also avoid having misc GList usage (creation, iteration, ...).
Comment 6 Edward Hervey 2013-06-20 15:20:23 UTC
Created attachment 247334 [details] [review]
mpegtspacketizer: Better detect already seen section

In some cases (NIT on highly-populated DVB-C operator for example), there
will be more than one section emitted for the same subtable and version
number.

In order not to lose those updates for the same version number, we checked
against the CRC of the previous section we parsed.

The problem is that, while it made sure we didn't lose any information, it
also meant that if the same section came back (same version, same CRC) later
on we would re-process it, re-parse it and re-emit it.

This version improves on that by keeping a list of previously observed CRC
for identical PID/subtable/version-number and will only process sections if
they really were never seen in the past (as opposed to just before).

On a 30s clip, this brings down the number of NIT section parsing from 4541
down to 663.
Comment 7 Edward Hervey 2013-06-23 07:07:09 UTC
commit 0592bcc3c90125f8a9a68ee8e8940176b9c8cc97
Author: Edward Hervey <edward@collabora.com>
Date:   Wed Jun 19 20:39:54 2013 +0200

    mpegtspacketizer: Better detect already seen section
    
    In some cases (NIT on highly-populated DVB-C operator for example), there
    will be more than one section emitted for the same subtable and version
    number.
    
    In order not to lose those updates for the same version number, we checked
    against the CRC of the previous section we parsed.
    
    The problem is that, while it made sure we didn't lose any information, it
    also meant that if the same section came back (same version, same CRC) later
    on we would re-process it, re-parse it and re-emit it.
    
    This version improves on that by keeping a list of previously observed CRC
    for identical PID/subtable/version-number and will only process sections if
    they really were never seen in the past (as opposed to just before).
    
    On a 30s clip, this brings down the number of NIT section parsing from 4541
    down to 663.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=614479