GNOME Bugzilla – Bug 613416
[performance] slowdown when project resolution != clip resolutions
Last modified: 2013-10-30 18:51:33 UTC
I just tested pitivi 0.13.4's playback performance on a decent machine (Pentium M 1.8 GHz), and if the project resolution != the clips' resolution, playback performance is destroyed: it can't play back 500x282 Theora smoothly if you use 1280x720 as the project resolution, for example. If you set the project resolution to 500x282, playback is smooth enough.
For the record:
<nekohayo> gst problem or pitivi problem?
<bilboed> hard to say, scaling isn't cheap unfortunately :(
<nekohayo> huh, I thought scaling was done by the hardware/xv/something for free
<twi_> ideally it should be using XV during playback
<bilboed> oh, then that's just videoscale which is made of fail
<bilboed> it can't keep PAR/DAR properly so forces scaling
<twi_> oh really
<bilboed> twi_, you were the one who added the scaler for playback, right ?
<twi_> dunno, maybe? i think i added ffmpegcolorspace back
<twi_> i yes i think i added videoscale for pngs
* bilboed wants a optimized scale+colorspaceconv+coffee element
Setting the milestone as this will probably bite many users, if you feel like nothing at all can be done in the near future, feel free to remove it.
I just upgraded last night from 0.13.1 to 0.13.4 and saw my performance go from about 40-45 fps, to 0.3 fps!!! When new features are added, sure, I do expect to see some degradation in performance. But definitely not to that level (150x). At this point, this can only be a bug, rather than a normal evolutionary development progression.

My own *guess* as to what's going on is the new composite thing. I don't think that it knows that it's on 100% brightness, even when it is, and so it rounds up to about 99% or something. Therefore, the application *thinks* that I have changed the compositing, even if I haven't, and goes into calculation mode. When I actually do change the compositing red line to something like 50%, the app doesn't get any slower, which means compositing is never set to 100%, and so it's always in calculation mode. Hence, the slowness (although having a 150x slowdown is still too much to bear for such a feature). But that's just my guess.

My project properties are correctly set to 720p btw. Frame rate is set to 29.97 fps, however the frame rate from these MPEG4-SP Kodak 720p HD digicams I'm using fluctuates from 30.00 to 30.20 fps (each clip is different, therefore I have to force the project at 29.97). Test video here: http://tuxtops.com/images/Z1275.mov

Please fix before June if possible. I'm giving that Atom 1.6 Ghz laptop to my mom with Pitivi, and she won't be able to use the app if this is not fixed. I can't fix it at a later date for her either, since she lives in Greece, and I'm in the US. Failing a fix, I wonder if there's a way to downgrade Pitivi on Ubuntu 10.4.

As I mentioned above, with the previous Pitivi, these kind of MPEG4-SP videos were *more than real time* on that laptop. There were no performance issues whatsoever (since MPEG4-SP is so easy to decode compared to MPEG4-ASP or h.264).
Hardware-accelerated blending and scaling (i.e. compositing) in GStreamer is not yet mature enough to support compositing in PiTiVi. Currently PiTiVi uses software-only compositing.

The performance penalty is mainly caused by scaling. As of 0.13.4, all video streams are scaled to the project resolution. This proved necessary for implementing transitions, which were added in this release.

You can improve performance significantly by adjusting project settings to match the width and height of the majority of your input (rather than the poorly-chosen default of 720x576). PiTiVi will not have to scale the input of any video stream which matches the project resolution.

I do development and testing on an Asus eeePC 901 which also has a 1.6 GHz Atom processor. I have found that PiTiVi performs reasonably well with 320x240 (QVGA) resolution files, so long as the project resolution is set to the same; however the machine will drop frames with 640x480 and higher resolution, even if the project resolution is adjusted accordingly.
>Please fix before June if possible. I'm giving that Atom 1.6 Ghz laptop to my
>mom with Pitivi, and she won't be able to use the app if this is not fixed.

I'm very sorry to hear that.

>I can't fix it at a later date for her either, since she lives in Greece, and I'm
>in the US. Failing a fix, I wonder if there's a way to downgrade Pitivi on
>Ubuntu 10.4.

Consult your distribution's documentation. Failing that, you can install an older release from source.

>My project properties are correctly set to 720p btw. Frame rate is set to 29.97
>fps, however the frame rate from these MPEG4-SP Kodak 720p HD digicams I'm
>using fluctuates from 30.00 to 30.20 fps (each clip is different, therefore I
>have to force the project at 29.97). Test video here:
>As I mentioned above, with the previous Pitivi, these kind of
>MPEG4-SP videos were *more than real time* on that laptop. There were no
>performance issues whatsoever (since MPEG4-SP is so easy to decode compared to
>MPEG4-ASP or h.264).

These files are simply too large for the current implementation on that processor, since everything is done in software (whereas during playback some part of the processing can still be done in hardware).

In the absence of a better machine, a work-around would be to create temporary low-resolution copies of your sources for the purposes of editing, moving the originals some place safe. If you don't change file names, you can move the originals back for the final render.
Brandon, thank you for the reply. However, as I said above, prior to 0.13.4, these files would play *faster than real time* on Pitivi while editing, on that very same netbook. This is NOT a matter of my laptop not being fast enough, or modern enough. That laptop has *demonstrated* that it can playback these 720p files in *faster* than real time inside a video editor (which I know that they're more demanding than just media players). So I just refuse to believe that the project has decided to change its architecture in a way that it's 150 times slower than before. To what end??? To me, it looks like there's either a bug somewhere, or the new architecture is not optimized. What I'm seeing there is not normal. Reading between the lines of your reply, are you telling me that there won't be a fix? That I will never see the kind of performance I had before 0.13.4 on that laptop? You see, what I'm seeing here I can't gasp. It's one thing to add a new feature, or change the architecture a bit and be 2-3 times slower. It happens. It's how software evolution works. But 150 times slower ***is not acceptable***. Your application is currently slower than OpenShot with these files, while it was super-fast with 0.13.1 and was previously leaving Openshot to dust. I implore you to either optimize it, or return to the old way of doing things. I much prefer the speed of editing (even if all I can do is straight cuts), than the ability to do whatever fancy feature you added lately but be so dog slow. Your app went from the fastest Linux video editor to the slowest in a single release, and that's just unacceptable. That's the kind of slowness that should have happened in 10 years time of continuous development, not 2 months. Please take the issue with the rest of the decision leaders of the project. This is too major of a problem. I'm not talking about a little crash bug, or a small optimization here and there. 
I'm talking about a kind of blocker that makes people uninstall the app and never want to use it again. My mom using the suggested proxy files is out of the question btw. It's too complex for her. Pitivi either works as it used to by June, or she will have to move to the somewhat faster OpenShot. I prefer her to use Pitivi because its UI is much nicer, simpler than Openshot. But the current version is unbearable.
>You can improve performance significantly by adjusting project
>settings to match the width and height of the majority of your input

I always set the right project properties, since this is also important for my main video editor, Sony Vegas (so I'm well trained for such details). However, this didn't do squat in this case. 720/30p that used to be 40-45 fps (in terms of max speed) on 0.13.1, is now 1 frame per 2-3 seconds on PiTiVi. Same hardware.
I can relate to Eugenia's pain. I agree that it's pretty much the #1 issue with 0.13.4+ right now, and users will either loathe us or restart the pitivi-bashing/negative impression that plagued us in the past.

On the technical side, what I understand of this situation is:
- In order to be able to do transitions, opacity curves and certain effects, you need "video mixing" (aka compositing?), which, currently, uses video scaling done in software (which is super duper slow)
- This would plague us until hardware compositing is implemented

Some questions for the record:
- How hard is hardware-accelerated compositing? And won't that kind of stuff depend on really good, 100% supported hardware?
- Is there not a stopgap trickery that can be used in the meantime? Or a big bottleneck that was overlooked in software-only compositing? (comment #1)
>Reading between the lines of your reply, are you telling me that there won't be
>a fix? That I will never see the kind of performance I had before 0.13.4 on
>that laptop? You see, what I'm seeing here I can't gasp.

We are aware of the performance issues and are working on improvements. It's not clear to me when they will become available. I was merely trying to be helpful in suggesting alternatives.

>Please take the issue with the rest of the decision leaders of the project.
>This is too major of a problem.

The decision was made prior to the 0.13.4 release to include video mixing and transitions despite the concerns about performance. It was decided that it was better to have slow compositing than none at all. We have been working to improve performance in the meantime.

According to Edward:

09:43 < bilboed-pi> there's 3 areas to improve : faster/more-flexible existing elements (slomo is working on that), smarter/more-efficient negotiation in gnlcomposition (working on that) and finally switching to hardware accelerated elements
09:43 < bilboed-pi> improving the first two are definitely within reach

Translated into plain English, this means we expect some minor optimizations will become available within the next couple of months, followed by full hardware acceleration within six months.
> The decision was made prior to the 0.13.4 release > to include video mixing and transitions despite > the concerns about performance. Wait a second. I just realized that the new Ubuntu, released today, comes with Pitivi by default. So you chose to go ahead with this dramatic drop in performance, at the one point in time where such a major distro installed your app by default. That must have been one of the worst product management decisions if I ever saw one. Can you imagine the hordes of users crying out that the thing is unusable, and never want to use it again? Even after you "fix it"? Trust me, it's much better to just be able to do straight cuts and nothing else, but do it elegantly, than having transitions and fades but not being able to preview the timeline at all. In my time as a professional editor I have used some formats occasionally that were very slow to edit, and made the conscious decision to not use proxies. None of these cuts came out properly, because the low performance would not clearly show me if a particular take was shaky or not. When viewing the final exported video, it was obvious that many of the takes I used were junk. Resulting my whole video to be just junk. Of course, some formats are just slow, no matter what editor/PC you're using. But this is not one such case. Getting 150x slower in a single release, at least on a format/resolution that used to work perfectly, to me translates as a major gaffe. The right way to deal with the problem is to _remove_ all that code that makes it slow, along the new features it brings (essentially, make it like 0.13.3 would have been, plus a few backported fixes). Then, make a new release and ask Ubuntu to include it in its upcoming 10.04 Updates. Save what you can of Pitivi's reputation. The fact that Pitivi would be a "basic editor" at that point does not diminish further your reputation, because being a basic editor, we already knew. 
But adding major performance problems on top, it will make further damage. So, create a new branch on your source tree, and put the compositing code there. Continue working on that source tree until you have hardware acceleration put on, and you have optimized it. If it takes 6 months, it takes 6 months. If it takes 1 year, well, it'd take 1 year. We can wait. Right now, you treated your users as rats to experiment on (or maybe you succumbed to Ubuntu's pressures to add features in order to be included??). Maybe all this was semi-acceptable in the past, since we all knew that Pitivi was somewhat experimental by nature. However, from the moment Ubuntu includes Pitivi by default, a distro that has over 10 million users, and you know about that inclusion well ahead of time, what you did with 0.13.4 was just wrong. And I write all this because I care about the project more than you think (I find it to be the most usable editor on Linux for new users). So please don't take it the wrong way. I'm just trying to express my view from the point of the ex-developer (project could do differently), ex-tech journalist (a new review would be a disaster), now filmmaker (impossible to use the app even for fun).
Edward Hervey of Pitivi emailed me, asking me to stop the ranting. A bug database is indeed a technical engineering tool, not a place to editorialize. As a developer, I do know that bug reports with too much blah blah get looked at even less. So I apologize for unnecessarily expanding my posts and straining the devs. I just hope that a solution is on the horizon soon. Please let me know if you require any debug info.
I profiled the usage of playing back one 640x480@30fps MJPEG file with the default 720x576@30fps project settings. The cpu breakdown is as follows (using oprofile during playback of said file):

32% ffmpegcolorspace (from/to I420/AYUV)
14% jpegdec (decoding the mjpeg file)
10% videomixer
9% videoscale (scaling from 640x480 to 720x540)
7% alpha
5% videobox (converting from 720x540 to 720x576)
3-5% audioconversions

So yes, 14% decoding and >60% for conversions, of which most could be avoided.

The detailed graph of the pipeline is available here : http://people.collabora.co.uk/~edward/pitivi1.png

Some ideas/pointers for optimisations:
* videomixer : Only has one input stream, it should be smart enough to just passthrough (i.e. take 0% cpu).
* smartvideoscale (videoscale+videobox) : This is used in the source to make sure all input feeds to the videomixer are at the same resolution (so that if you mix two streams/files with different width/height/PAR they will be roughly the same size). The problem here is that we've decided to pick the project settings regardless of the current configuration (i.e. how many sources are mixed, are we previewing/rendering,...). We should only have to scale if needed. If not needed, videoscale/videobox should also act passthrough. BTW, forcing the width/height/framerate of the various gnlcomposition is not needed when previewing, only when rendering.
* alpha : It would be great if we could do passthrough on any caps when alpha==1.0 , which would be most of the case. That would avoid us having to convert to AYUV that early, but only when needed.
* ffmpegcolorspace : We use it for converting to/from AYUV. We should not force unconditionally that common colorspace (which is forced in the input to videomixer with a capsfilter). In this use-case we should have been able to do I420 all the way to the sink. This is related to problems with videomixer not being good at caps negotiation.
GNonLin : As you might have noticed, a lot of issues are related to caps negotiation. The problem is that due to the nature of how a gnlcomposition is built (source preroll before they're linked) we need to specify restrictions/capsfilter in various places. I have started implementing some features in gnonlin that will help solve this issue (basically use the caps property of the gnlobject, propagate it down from composition to its childs, and have the various gnlobject use the caps property to make smarter decisions). One place where this could be used is when a videomixer sees that it has more than one input it could *then* specify a common set of caps to be used by the gnlobjects connected to it. I've managed to do lossless rendering of compositions (i.e. avoid decoding of streams when not needed) using that technique and so far it works pretty well.

A final note : alpha and video positioning/scaling of each video streams will be a *lot* more efficient when we have buffer metadata implemented in GStreamer, since we'll be able to delay the actual processing as downstream as possible (like in videomixer or even the sinks) : http://cgit.freedesktop.org/gstreamer/gstreamer/tree/docs/design/draft-buffer2.txt
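For readers who want to check the ">60% for conversions" figure, the oprofile breakdown quoted above can be tallied with a few lines of Python. This is only arithmetic on the numbers in the comment; the dictionary and the function name are made up for illustration, and the 3-5% audio conversions are left out because they were given as a range.

```python
# CPU shares (percent) copied from the oprofile breakdown in the comment above.
# Audio conversions (3-5%) are omitted because they were reported as a range.
profile = {
    "ffmpegcolorspace": 32,  # I420 <-> AYUV conversion
    "jpegdec": 14,           # the actual MJPEG decoding
    "videomixer": 10,
    "videoscale": 9,         # 640x480 -> 720x540
    "alpha": 7,
    "videobox": 5,           # 720x540 -> 720x576
}


def conversion_overhead(shares, decoder="jpegdec"):
    """Sum every video share that is not spent on decoding (hypothetical helper)."""
    return sum(pct for name, pct in shares.items() if name != decoder)


print(conversion_overhead(profile))  # 63
```

So roughly 63% of the CPU goes to elements that could in principle be passthrough for a single, fully opaque clip, against 14% for decoding.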
We should move the smartvideoscale into the smartvideomixer and use it only if needed based on caps and number of inputs. That way:
* if previewing AND only one stream => smartvideoscale with no limitations => passthrough
* if previewing AND multiple streams => choose a smart common size (biggest ? smallest ?) in a capsfilter *after* videomixer => only one stream gets resized
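The two cases above, plus the rendering case mentioned earlier in the thread (project size is only forced when rendering), could be sketched roughly as follows. This is not Pitivi code: `plan_scaling` and its parameters are hypothetical, sizes are plain (width, height) tuples, and "biggest" versus "smallest" is left as a parameter since the comment leaves that choice open.

```python
def plan_scaling(previewing, input_sizes, project_size, pick=max):
    """Return {stream_index: target_size} for the streams that need resizing.

    Hypothetical sketch of the proposal above:
    - rendering: every stream is brought to the project size;
    - previewing, at most one stream: nothing is resized (passthrough);
    - previewing, several streams: pick one common size among the inputs
      (largest by area by default, pass pick=min for the smallest).
    """
    if not previewing:
        target = project_size
    elif len(input_sizes) <= 1:
        return {}
    else:
        target = pick(input_sizes, key=lambda wh: wh[0] * wh[1])
    # Streams that already have the target size are left alone.
    return {i: target for i, size in enumerate(input_sizes) if size != target}


# Previewing a single 640x480 clip in a 720x576 project: no scaling at all.
print(plan_scaling(True, [(640, 480)], (720, 576)))
# Previewing two clips: only the smaller one is resized.
print(plan_scaling(True, [(640, 480), (1280, 720)], (720, 576)))
```

With two inputs of different sizes, picking one of the input sizes as the common size means exactly one stream is resized, which is the point of the second bullet.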
Some of the cpu usage is also related to insane caps negotiation; that issue is being taken care of in GStreamer core.
Would this issue affect rendering as well as playback? I'm experiencing some rendering issues when I mix video files in one project that have different resolutions and I'm not sure if this is the same issue or if I should file/search for a separate bug.
Lots of improvements in dependent libraries:
* gst_pad_link_full : allows skipping useless (but expensive) checks; we need to use that
* videoscale with black-border : this should replace the SmartVideoScale
* alpha works passthrough when alpha == 1.0
* ffmpegcolorspace : Most of its cpu time was in fact taken by caps negotiation, which has already been sped up.
* caps nego in general : Much faster
Another item that made it into the latest gst-plugins-good release is imagefreeze, which is the C implementation of pitivi.elements.imagefreeze.ImageFreeze.
Created attachment 166598 [details] [review] Make SmartVideoScale use videoscale add-borders=true Initial attempt at converting SmartVideoScale's border-padding to maintain DAR feature over to using videoscale add-borders=true. As a consequence a lot of redundant code can be removed.
Created attachment 166599 [details] [review] Remove pitivi's ImageFreeze element and use GStreamer's imagefreeze instead
(In reply to comment #18)
> Created an attachment (id=166598) [details] [review]
> Make SmartVideoScale use videoscale add-borders=true
>
> Initial attempt at converting SmartVideoScale's border-padding to maintain DAR
> feature over to using videoscale add-borders=true. As a consequence a lot of
> redundant code can be removed.

We should just remove SmartVideoScale altogether and replace it by:
* videoscale add-borders=True
* followed by (if not present and needed) a capsfilter with the requested caps
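As a rough illustration of the border padding being discussed (scale to fit, then pad to keep the display aspect ratio), here is the geometry in plain Python. It is a sketch under the assumption of square pixels, not the videoscale implementation; `fit_with_borders` is a made-up name. It reproduces the 640x480 to 720x540 plus borders to 720x576 case from the profile earlier in this report.

```python
def fit_with_borders(src_w, src_h, dst_w, dst_h):
    """Fit a src_w x src_h frame inside dst_w x dst_h, keeping its aspect ratio.

    Returns (scaled_w, scaled_h, pad_x, pad_y), where pad_x / pad_y is the
    border added on each side. Assumes square pixels (PAR 1:1); a real
    element also has to take the pixel aspect ratio into account.
    """
    # Compare src_w/src_h with dst_w/dst_h without floating point.
    if src_w * dst_h >= dst_w * src_h:
        # Source is relatively wider: the width is the limiting dimension.
        scaled_w = dst_w
        scaled_h = dst_w * src_h // src_w
    else:
        # Source is relatively taller: the height is the limiting dimension.
        scaled_h = dst_h
        scaled_w = dst_h * src_w // src_h
    return scaled_w, scaled_h, (dst_w - scaled_w) // 2, (dst_h - scaled_h) // 2


# 640x480 clip in the default 720x576 project: scaled to 720x540,
# with 18 rows of border at the top and at the bottom.
print(fit_with_borders(640, 480, 720, 576))  # (720, 540, 0, 18)
```

When the clip already matches the project size, the function returns the same size with zero padding, which is the case where the scaler could act as passthrough.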
Comment on attachment 166599 [details] [review] Remove pitivi's ImageFreeze element and use GStreamer's imagefreeze instead Committed along with the removal of an import ImageFreeze in the testsuite (which wasn't used).
Created attachment 166760 [details] [review] Remove SmartVideoScale and use videoscale add-borders=true For this patch I also had to fix a bug where the output framerate set in the project/export settings wasn't being set on the capsfilter after videorate in the rendering branch of the pipeline. Because of other reworkings in the code, it made sense to me to keep the fix and the changes as one commit.
Review of attachment 166760 [details] [review]:

Some small comments, but looks good overall and works as expected.

::: pitivi/factories/base.py
@@ +493,3 @@
+ # remove format as we will have converted to AYUV
+ if structure.has_key("format"):
+     del structure["format"]

Shouldn't you take care of removing the RGB specifiers too ?

I still don't get why this method would be called with complex caps that contain YUV or RGB format specifiers.

::: pitivi/factories/operation.py
@@ +120,2 @@
 idt = gst.element_factory_make("capsfilter")
+ idt.props.caps = gst.Caps(self.output_streams[0].caps)

Why do you do a copy of the caps here ? (i.e. use gst.Caps(...))
(In reply to comment #23)
> Review of attachment 166760 [details] [review]:
>
> Some small comments, but looks good overall and works as expected.
>
> ::: pitivi/factories/base.py
> @@ +493,3 @@
> + # remove format as we will have converted to AYUV
> + if structure.has_key("format"):
> +     del structure["format"]
>
> Shouldn't you take care of removing the RGB specifiers too ?

I think it's rather that I forgot about RGB when writing the comment. :) It will output either AYUV or ARGB or whatever.

> I still don't get why this method would be called with complex caps that
> contain YUV or RGB format specifiers.

In pitivi/project.py in Project the function is called from a couple of methods in there with the caps from, as I understand it, the project settings. There's also a call from the encoder ui dialogue window code too. Is anything done to take the format caps from the encoder? The problem was that if the format cap was present it wasn't linking alpha and the following capsfilter because the format couldn't be satisfied.

> ::: pitivi/factories/operation.py
> @@ +120,2 @@
> idt = gst.element_factory_make("capsfilter")
> + idt.props.caps = gst.Caps(self.output_streams[0].caps)
>
> Why do you do a copy of the caps here ? (i.e. use gst.Caps(...))

It's probably left over from other changes. It can be a straight assignment without copying.

Updated patch incoming...
Created attachment 166763 [details] [review] Remove SmartVideoScale and use videoscale add-borders=true
(In reply to comment #25)
> Created an attachment (id=166763) [details] [review]
> Remove SmartVideoScale and use videoscale add-borders=true

Oops, I forgot to mark the previous one as obsoleted by this one. I also removed smartscale.py from the elements Makefile.am.
The frame rate fix in this patch should also address #625579.
Created attachment 166818 [details] [review] alpha pass-through Allow pass-through when alpha=1.0. I had to remove an AYUV caps being set on the capsfilter inside the videomixer bin. This capsfilter still has video/x-raw-yuv being set on it. I'm not sure why so this should be reviewed. If all input clips are I420 and the output is I420 too, no conversions (according to caps) take place. I also tried with Y444, Y42B and I420 clips with Y444 output. The videomixer bin converted to I420 and then the renderer to Y444. I'm going to have a look at improving the choice made by the videomixer bin as to what conversions should be done at any particular time. For now, the focus will be on performance.
When that patch was committed, it actually broke the use of alpha. For some reason, to be able to use alpha, the capsfilters for each input inside the SmartVideomixerBin need to have AYUV set on them. I have been working on a reasonable fix for this and it got more complicated as it went on. I added some code to check the alpha interpolators of all track objects and then if any of them have alpha < 1.0 I set AYUV on all the aforementioned capsfilters. If alpha is 1.0 on all the interpolators then I make sure AYUV is not set. As mentioned before, I wasn't sure why video/x-raw-yuv was being set on the capsfilters in the videomixer at all. I'm thinking that it should have video/x-raw-yuv; video/x-raw-rgb and then when alpha is used add the alpha format for each cap structure. I pushed two commits (one implementing the above and another unit test) to a github repo at http://github.com/superdump/pitivi Alessandro was reviewing the changes but I also uncovered an old bug he'd seen before that causes a segfault when adding a new keyframe.
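The rule described in this comment (AYUV on every input capsfilter as soon as any interpolator goes below 1.0, plain video/x-raw-yuv otherwise) can be sketched as below. This is not the code from the pushed commits: the function names are hypothetical, the interpolators are reduced to plain lists of keyframe alpha values, and the caps are returned as GStreamer 0.10-style strings rather than set on real capsfilters.

```python
def any_alpha_used(keyframe_alphas_per_track_object):
    """True if any keyframe of any track object has alpha below 1.0."""
    return any(
        alpha < 1.0
        for alphas in keyframe_alphas_per_track_object
        for alpha in alphas
    )


def mixer_input_caps(keyframe_alphas_per_track_object):
    """Caps string to put on every input capsfilter of the mixer bin.

    Assumption for illustration only: one caps string shared by all inputs,
    as described in the comment above.
    """
    if any_alpha_used(keyframe_alphas_per_track_object):
        # At least one clip is partly transparent: force the alpha format.
        return "video/x-raw-yuv,format=(fourcc)AYUV"
    # Everything is fully opaque: leave the format open so the alpha
    # element can stay in passthrough and no AYUV conversion is needed.
    return "video/x-raw-yuv"


print(mixer_input_caps([[1.0, 1.0], [1.0]]))
print(mixer_input_caps([[1.0, 0.5], [1.0]]))
```

The decision is global on purpose: a single clip with alpha < 1.0 switches all inputs to AYUV, matching the "set AYUV on all the aforementioned capsfilters" behaviour described above.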
Created attachment 167253 [details] [review] Only allow alpha passthrough when all alpha=1.0
Created attachment 167254 [details] [review] Add unit tests for alpha passthrough logic
Those two patches are the ones to which I was referring. Just for ease of review.
The patches were committed but the bug is kept open because videomixer is still slow, and the way we scale is still a bit of a disaster. So further work needs to be done.
Alright, after some more consideration, I'll put this bug report to rest now:
- Videomixer is better now that Mathieu spent a while hammering at it
- We now have the notion of "restriction caps" in GES (and Pitivi) that size down the image resolution to whatever is set in the project settings, which helps performance a bit
- At some point we will have even better perfs by using restriction caps to size down dynamically to the actual dimensions of the viewer widget, which should be bug #572440
- This bug report is nearly four years old and references stuff that is quite likely to not even exist anymore (hello, GStreamer 0.10.x). A whole revolution occurred since then.

Please open new bug reports if there still are significant performance issues around.