After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 631200 - flacparse: major performance improvements
flacparse: major performance improvements
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
git master
Other Linux
: Normal normal
: 0.10.21
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks: 616921
 
 
Reported: 2010-10-03 00:32 UTC by Felipe Contreras (banned)
Modified: 2010-10-23 09:05 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
trivial caps fix (883 bytes, patch)
2010-10-03 00:33 UTC, Felipe Contreras (banned)
committed Details | Review
delay tags push (1.83 KB, patch)
2010-10-03 00:34 UTC, Felipe Contreras (banned)
rejected Details | Review
fix picture parsing (921 bytes, patch)
2010-10-03 00:37 UTC, Felipe Contreras (banned)
committed Details | Review
get rid of compilation warnings (2.42 KB, patch)
2010-10-03 00:38 UTC, Felipe Contreras (banned)
rejected Details | Review
remove useless stuff (9.63 KB, patch)
2010-10-03 00:39 UTC, Felipe Contreras (banned)
rejected Details | Review
refactor into frame_header_is_valid() (9.68 KB, patch)
2010-10-03 00:39 UTC, Felipe Contreras (banned)
rejected Details | Review
rewrite sync matching (9.97 KB, patch)
2010-10-03 00:40 UTC, Felipe Contreras (banned)
rejected Details | Review
tidying up (10.98 KB, patch)
2010-10-03 00:40 UTC, Felipe Contreras (banned)
rejected Details | Review
simplify frame_header_is_valid (12.45 KB, patch)
2010-10-03 00:41 UTC, Felipe Contreras (banned)
rejected Details | Review
fix some warnings (1.59 KB, patch)
2010-10-03 00:41 UTC, Felipe Contreras (banned)
rejected Details | Review
we don't need no headers (1.38 KB, patch)
2010-10-03 00:43 UTC, Felipe Contreras (banned)
rejected Details | Review

Description Felipe Contreras (banned) 2010-10-03 00:32:31 UTC
These are the patches in gst-maemo-xiph that look like the can be somewhat acceptable in GStreamer in one form or the other.

I wasn't really intending push them here, so there might be some horrible things for GStreamer guidelines.

Also, I didn't test these on top of the latest g-p-bad, so take these with a grain of salt.
Comment 1 Felipe Contreras (banned) 2010-10-03 00:33:46 UTC
Created attachment 171594 [details] [review]
trivial caps fix
Comment 2 Felipe Contreras (banned) 2010-10-03 00:34:43 UTC
Created attachment 171595 [details] [review]
delay tags push
Comment 3 Felipe Contreras (banned) 2010-10-03 00:37:51 UTC
Created attachment 171596 [details] [review]
fix picture parsing
Comment 4 Felipe Contreras (banned) 2010-10-03 00:38:18 UTC
Created attachment 171597 [details] [review]
get rid of compilation warnings
Comment 5 Felipe Contreras (banned) 2010-10-03 00:39:00 UTC
Created attachment 171598 [details] [review]
remove useless stuff

You would probably not like it.
Comment 6 Felipe Contreras (banned) 2010-10-03 00:39:41 UTC
Created attachment 171599 [details] [review]
refactor into frame_header_is_valid()
Comment 7 Felipe Contreras (banned) 2010-10-03 00:40:28 UTC
Created attachment 171600 [details] [review]
rewrite sync matching
Comment 8 Felipe Contreras (banned) 2010-10-03 00:40:42 UTC
Created attachment 171601 [details] [review]
tidying up
Comment 9 Felipe Contreras (banned) 2010-10-03 00:41:05 UTC
Created attachment 171602 [details] [review]
simplify frame_header_is_valid
Comment 10 Felipe Contreras (banned) 2010-10-03 00:41:22 UTC
Created attachment 171603 [details] [review]
fix some warnings
Comment 11 Felipe Contreras (banned) 2010-10-03 00:43:53 UTC
Created attachment 171604 [details] [review]
we don't need no headers
Comment 12 Felipe Contreras (banned) 2010-10-03 00:45:03 UTC
BTW. It took me a full 10 minutes to create this bug report; git send-email would have taken a few seconds.
Comment 13 Sebastian Dröge (slomo) 2010-10-03 06:29:33 UTC
Thanks for the patches, I'll take a look at them later. You could use git-bz to create bugreports like this, this should only take a few seconds too ;)

One thing that should be done additionally to these patches here is, to optimize GstBitReader but converting the get_bits functions to macros and to provide unchecked variants. This would be similar to what was done with GstByteReader some time ago.
Comment 14 Felipe Contreras (banned) 2010-10-03 11:43:19 UTC
(In reply to comment #13)
> Thanks for the patches, I'll take a look at them later. You could use git-bz to
> create bugreports like this, this should only take a few seconds too ;)

Not really. Maybe it would help if it actually worked:
http://felipec.wordpress.com/2010/01/19/why-bugzilla-sucks-for-handling-patches/
Comment 15 Sebastian Dröge (slomo) 2010-10-03 11:50:02 UTC
Well, let's not argue about that here (and FWIW I'm successfully using git-bz).

From a short look at your patches you're still parsing the complete FLAC frames. Didn't you say yesterday on IRC that you're only looking for a) a sync marker and b) that a valid frame header follows and then simply push everything until the next valid sync marker/frame header downstream?

And what's the reason for dropping the optional CRC16 frame check (which defaults to FALSE)? I added it some time ago and planned to post a message on the bus for broken frames to provide a tool for checking FLAC files for corruptions.

Other than that the patches look good from a first glance... will merge them later and do a deeper review then. After optimizing GstBitReader a bit :)
Comment 16 Felipe Contreras (banned) 2010-10-03 11:59:45 UTC
(In reply to comment #15)
> Well, let's not argue about that here (and FWIW I'm successfully using git-bz).

Yes, some project's bugzillas are easier to setup than others. But just the though of re-basing my patches in case I need configurations not supported scares me away.

> From a short look at your patches you're still parsing the complete FLAC
> frames. Didn't you say yesterday on IRC that you're only looking for a) a sync
> marker and b) that a valid frame header follows and then simply push everything
> until the next valid sync marker/frame header downstream?

Yes:
(In reply to comment #7)
> Created an attachment (id=171600) [details] [review]
> rewrite sync matching

See:
 * get_next_sync -> frame_header_is_valid

> And what's the reason for dropping the optional CRC16 frame check (which
> defaults to FALSE)? I added it some time ago and planned to post a message on
> the bus for broken frames to provide a tool for checking FLAC files for
> corruptions.

I was porting the parsing code and I didn't want to port that, I don't find it useful anyway. As I said, I wasn't intending to post the patches.
Comment 17 Sebastian Dröge (slomo) 2010-10-03 13:33:41 UTC
This might be interesting for you:


commit dd762eb49f8e04f9f6fee84d9a00265e0410798f
Author: Sebastian Dröge <sebastian.droege@collabora.co.uk>
Date:   Sun Oct 3 15:27:37 2010 +0200

    bitreader: Add inlined and unchecked versions of the important functions
    
    API: gst_bit_reader_skip_unchecked
    API: gst_bit_reader_skip_to_byte_unchecked
    API: gst_bit_reader_get_bits_uint16_unchecked
    API: gst_bit_reader_get_bits_uint32_unchecked
    API: gst_bit_reader_get_bits_uint64_unchecked
    API: gst_bit_reader_get_bits_uint8_unchecked
    API: gst_bit_reader_peek_bits_uint16_unchecked
    API: gst_bit_reader_peek_bits_uint32_unchecked
    API: gst_bit_reader_peek_bits_uint64_unchecked
    API: gst_bit_reader_peek_bits_uint8_unchecked
    
    This alone makes flacparse about 3 times faster.



Next I'll get to your patches :)
Comment 18 Sebastian Dröge (slomo) 2010-10-03 21:58:53 UTC
Comment on attachment 171595 [details] [review]
delay tags push

This was already fixed some time ago
Comment 19 Sebastian Dröge (slomo) 2010-10-03 22:00:05 UTC
(In reply to comment #4)
> Created an attachment (id=171597) [details] [review]
> get rid of compilation warnings

What are the warnings you're trying to fix here?
Comment 20 Sebastian Dröge (slomo) 2010-10-03 22:03:38 UTC
Comment on attachment 171598 [details] [review]
remove useless stuff

Yes, I don't :) The property was added to add a tool for checking if a FLAC file is valid and it's disabled by default. The requested_frame_size calculations everywhere are there for giving baseparse a better estimate of the possible framesize by looking at the minimum/maximum frame sizes in the flac streaminfo, etc.

The gotos are there to keep the error cases outside the normal code flow.
Comment 21 Sebastian Dröge (slomo) 2010-10-03 22:05:34 UTC
Comment on attachment 171604 [details] [review]
we don't need no headers

The header buffers must be pushed to allow pipelines like
... ! flacparse ! filesink

Otherwise we're dropping information from the flac file
Comment 22 Sebastian Dröge (slomo) 2010-10-03 22:10:39 UTC
Comment on attachment 171599 [details] [review]
refactor into frame_header_is_valid()

This patch uses C99 features but the idea of not parsing every frame twice is good. I'll commit something similar
Comment 23 Felipe Contreras (banned) 2010-10-04 14:46:00 UTC
(In reply to comment #19)
> (In reply to comment #4)
> > Created an attachment (id=171597) [details] [review] [details] [review]
> > get rid of compilation warnings
> 
> What are the warnings you're trying to fix here?

-Wall -Wextra -Wno-unused-parameter
Comment 24 Felipe Contreras (banned) 2010-10-04 14:55:32 UTC
(In reply to comment #20)
> (From update of attachment 171598 [details] [review])
> Yes, I don't :) The property was added to add a tool for checking if a FLAC
> file is valid and it's disabled by default.

Which is useless most of the time. I would have a separate tool for that, a separate branch, or at least add G_UNLIKELY.

> The requested_frame_size
> calculations everywhere are there for giving baseparse a better estimate of the
> possible framesize by looking at the minimum/maximum frame sizes in the flac
> streaminfo, etc.

It's not used at all, anywhere. Not on flacparse, not on baseparse... nowhere.

> The gotos are there to keep the error cases outside the normal code flow.

They do two things, print a message on EOS, and need streaminfo, and return -1 or -2 depending on that, but then, that FALSE is returned in gst_flac_parse_check_valid_frame() regardless. In my resulting code EOS is still printed, so nothing is lost, only the "need streaminfo" message, which is useless IMO; I never saw that, can be done much more easily with a single 'if' at the top, and I don't think it matters after the reorganization.
Comment 25 Felipe Contreras (banned) 2010-10-04 14:56:23 UTC
(In reply to comment #21)
> (From update of attachment 171604 [details] [review])
> The header buffers must be pushed to allow pipelines like
> ... ! flacparse ! filesink
> 
> Otherwise we're dropping information from the flac file

That information is already in the caps.
Comment 26 Felipe Contreras (banned) 2010-10-04 15:03:49 UTC
(In reply to comment #18)
> (From update of attachment 171595 [details] [review])
> This was already fixed some time ago

Where? In 65f620a? From what I can see you pushed that about the same time you made that comment.

And that commit message is very bad; it doesn't answer the question 'why is this needed?' It also doesn't mention any attribution... where did you get the idea from? And did you test this?
Comment 27 Sebastian Dröge (slomo) 2010-10-05 10:13:24 UTC
(In reply to comment #26)
> (In reply to comment #18)
> > (From update of attachment 171595 [details] [review] [details])
> > This was already fixed some time ago
> 
> Where? In 65f620a? From what I can see you pushed that about the same time you
> made that comment.
> 
> And that commit message is very bad; it doesn't answer the question 'why is
> this needed?' It also doesn't mention any attribution... where did you get the
> idea from? And did you test this?

No idea when this was fixed but in latest GIT (before 65f620a) the tags were already pushed not immediately but after the newsegment event and after the caps were known. I only changed it to push the tags *before* the header buffers.
Comment 28 Sebastian Dröge (slomo) 2010-10-07 21:39:12 UTC
commit 812075dc5db7c479a367815e6f386bf63d84504d
Author: Sebastian Dröge <sebastian.droege@collabora.co.uk>
Date:   Thu Oct 7 23:37:36 2010 +0200

    flacparse: Simplify frame header parsing by using lookup tables
    
    Based on a patch by Felipe Contreras.
    
    See bug #631200.

commit 64407ca94b3295bf88dd491d79eeb3aae9b4686a
Author: Sebastian Dröge <sebastian.droege@collabora.co.uk>
Date:   Thu Oct 7 23:28:08 2010 +0200

    flacparse: Don't parse the complete FLAC frames but only look for valid fram
    
    Thanks to Felipe Contreras for the suggestion. This is partially
    based on his patches and makes flacparse more than 3.5 times faster.
    
    Looking for valid frame headers is unlikely to give false positives
    because every frame header is at least 9 bytes long, contains a
    14 bit sync code and a 8 bit checksum over the first 8 bytes.
    
    Fixes bug #631200.
Comment 29 Sebastian Dröge (slomo) 2010-10-07 21:41:44 UTC
Comment on attachment 171597 [details] [review]
get rid of compilation warnings

Except the GValue initializations this is fixed now. I don't see what's wrong with these initializations and GLib and GStreamer is doing that *everywhere*
Comment 30 Sebastian Dröge (slomo) 2010-10-07 21:42:40 UTC
Comment on attachment 171600 [details] [review]
rewrite sync matching

Something similar is committed now.
Comment 31 Sebastian Dröge (slomo) 2010-10-07 21:45:58 UTC
Comment on attachment 171603 [details] [review]
fix some warnings

I don't see anything wrong with this code either.

a) Every list node always contains a valid buffer reference
b) Casting gst_mini_object_unref() to a GFunc is valid and also passing the buffer and NULL to the function is valid because the C calling conventions say that the caller has to clean up the function parameters.
Comment 32 Felipe Contreras (banned) 2010-10-23 01:37:40 UTC
(In reply to comment #29)
> (From update of attachment 171597 [details] [review])
> Except the GValue initializations this is fixed now. I don't see what's wrong
> with these initializations and GLib and GStreamer is doing that *everywhere*

That doesn't make it right.

See bug #577231.
Comment 33 Felipe Contreras (banned) 2010-10-23 01:43:06 UTC
(In reply to comment #31)
> (From update of attachment 171603 [details] [review])
> I don't see anything wrong with this code either.
> 
> a) Every list node always contains a valid buffer reference

Not true, there are some error conditions where this doesn't happen.

I know because I saw the warnings of gst_mini_object_unref(), which why I bothered writing the patch.

BTW. gst_mini_object_unref has a silly API, should be:
void gst_mini_object_unref(void *mini_object);

And should accept NULL, that would make it really useful, similar to free(), and g_object_unref().
Comment 34 Felipe Contreras (banned) 2010-10-23 01:48:41 UTC
For the record:
% git log --author=felipe.contreras -- gst/audioparsers

Shows basically nothing. Way to incentivize contributors.

I have more luck contributing patches to the linux kernel, where they:
 a) actually review the patches
 b) comment the exact changes needed, or what in general would they do
 c) give a chance to the contributor to do the changes themselves
 d) leave the original author in the commit

So, in case this wasn't clear; this process was not rewarding at all.
Comment 35 Sebastian Dröge (slomo) 2010-10-23 07:42:55 UTC
(In reply to comment #32)
> (In reply to comment #29)
> > (From update of attachment 171597 [details] [review] [details])
> > Except the GValue initializations this is fixed now. I don't see what's wrong
> > with these initializations and GLib and GStreamer is doing that *everywhere*
> 
> That doesn't make it right.
> 
> See bug #577231.

Everything after the first 0 will be initialized to 0 too, that warning only means that you didn't explicitely initialize the remaining struct members.

(In reply to comment #33)
> (In reply to comment #31)
> > (From update of attachment 171603 [details] [review] [details])
> > I don't see anything wrong with this code either.
> > 
> > a) Every list node always contains a valid buffer reference
> 
> Not true, there are some error conditions where this doesn't happen.

Which cases?

> I know because I saw the warnings of gst_mini_object_unref(), which why I
> bothered writing the patch.
> 
> BTW. gst_mini_object_unref has a silly API, should be:
> void gst_mini_object_unref(void *mini_object);

There's a gst_buffer_unref() for the correct parameter type.

> And should accept NULL, that would make it really useful, similar to free(),
> and g_object_unref().

g_object_unref() does not accept NULL, neither does free() (in general). g_free() accepts NULL, yes.

(In reply to comment #34)
> For the record:
> % git log --author=felipe.contreras -- gst/audioparsers
> 
> Shows basically nothing. Way to incentivize contributors.

I committed two of your patches which could be included. The caps fix and the picture parsing fix.
I rejected other patches because they were not necessary (as explained above) or the large rewrite of the parsing because you added unrelated changes and also used C99 features everywhere.

Instead I've done the rewrite of the parsing myself based on your idea, which was done faster than splitting your patches, cleaning things up, etc. But if you look at the commit message of that change you'll see your name.

> I have more luck contributing patches to the linux kernel, where they:
>  a) actually review the patches

Not sure why you think nobody reviewed your patches

>  b) comment the exact changes needed, or what in general would they do

... and told you what was wrong with them or why I think they were not necessary.

>  c) give a chance to the contributor to do the changes themselves

You complained that the patch submitting process wasted too much time for you and that you didn't plan to submit the patches at all and also thought that they might not be good for inclusion yet. I assumed that you don't plan to improve them and as such I've simply fixed your patches myself (or have done them in a way that is acceptable for inclusion)

>  d) leave the original author in the commit

And if you took at look at the commit message you would see that the rewritten change still says that it's based on your patch
Comment 36 Felipe Contreras (banned) 2010-10-23 09:05:33 UTC
(In reply to comment #35)
> (In reply to comment #32)
> > (In reply to comment #29)
> > > (From update of attachment 171597 [details] [review] [details] [details])
> > > Except the GValue initializations this is fixed now. I don't see what's wrong
> > > with these initializations and GLib and GStreamer is doing that *everywhere*
> > 
> > That doesn't make it right.
> > 
> > See bug #577231.
> 
> Everything after the first 0 will be initialized to 0 too, that warning only
> means that you didn't explicitely initialize the remaining struct members.

I know that, but -Wextra is pretty essential.

> (In reply to comment #33)
> > (In reply to comment #31)
> > > (From update of attachment 171603 [details] [review] [details] [details])
> > > I don't see anything wrong with this code either.
> > > 
> > > a) Every list node always contains a valid buffer reference
> > 
> > Not true, there are some error conditions where this doesn't happen.
> 
> Which cases?

I don't know, invalid clips?

> > I know because I saw the warnings of gst_mini_object_unref(), which why I
> > bothered writing the patch.
> > 
> > BTW. gst_mini_object_unref has a silly API, should be:
> > void gst_mini_object_unref(void *mini_object);
> 
> There's a gst_buffer_unref() for the correct parameter type.

Yeah, but that's why g_object_unref accepts a void, so you don't need the correct parameter type.

> > And should accept NULL, that would make it really useful, similar to free(),
> > and g_object_unref().
> 
> g_object_unref() does not accept NULL, neither does free() (in general).
> g_free() accepts NULL, yes.

1) g_object_unref() receives a void
2) free accepts NULL

See the C standard on "7.20.3.2 The free function":
If ptr is a null pointer, no action occurs.

> (In reply to comment #34)
> > For the record:
> > % git log --author=felipe.contreras -- gst/audioparsers
> > 
> > Shows basically nothing. Way to incentivize contributors.
> 
> I committed two of your patches which could be included. The caps fix and the
> picture parsing fix.
> I rejected other patches because they were not necessary (as explained above)
> or the large rewrite of the parsing because you added unrelated changes and
> also used C99 features everywhere.
> 
> Instead I've done the rewrite of the parsing myself based on your idea, which
> was done faster than splitting your patches, cleaning things up, etc. But if
> you look at the commit message of that change you'll see your name.

That's not enough. I would have expected you to add review comments so that I could do the changes and remain the author of the patches, that would give the appropriate attribution.

Now you remain as the author of the patch. Even though I'm mentioned on the commit message as making the "suggestion", nobody will really see my name, not on the log, or statistics of any kind:

https://www.ohloh.net/p/gstreamer/contributors

> > I have more luck contributing patches to the linux kernel, where they:
> >  a) actually review the patches
> 
> Not sure why you think nobody reviewed your patches

By "review", I mean go through the code, and as you go, make comments about it.

Not keep the comments to yourself, and afterwards explain why you rewrote, or dropped the patch.

> >  b) comment the exact changes needed, or what in general would they do
> 
> ... and told you what was wrong with them or why I think they were not
> necessary.

Mentioning what was wrong after rewriting the code is not the same as comment on my patch itself.

> >  c) give a chance to the contributor to do the changes themselves
> 
> You complained that the patch submitting process wasted too much time for you
> and that you didn't plan to submit the patches at all and also thought that
> they might not be good for inclusion yet. I assumed that you don't plan to
> improve them and as such I've simply fixed your patches myself (or have done
> them in a way that is acceptable for inclusion)

Well, I did send the patches, didn't I? Maybe I was giving a chance to the process.

Moreover, that's not an excuse to not follow a process that keeps good attribution (if you even had such process in place). I would have expected you to follow this process, and only if I say I'm not interested on doing these changes myself you go ahead. Just like you should do with everybody else.

In fact, even if you rewrite the patch, in a good review process you would post the patch here, and only after I had reviewed the patch too, you commit it with my signed-off-by.

> >  d) leave the original author in the commit
> 
> And if you took at look at the commit message you would see that the rewritten
> change still says that it's based on your patch

"Thanks to Felipe Contreras for the suggestion." Right, that gives a lot of credit.

That's very far from giving me a chance to be the author of the patch, which in a way, I am.

Anyway, I have complained about the patch review and contribution process, and what happened here is a good example of why it doesn't work. All I'm saying is that FWIW, this was not rewarding, and that's a fact.