GNOME Bugzilla – Bug 793338
videometa: add support for encoder parameters to ROI meta
Last modified: 2018-02-21 17:31:08 UTC
See the commit message for the rationale and use cases of this new API.
Created attachment 368187 [details] [review] videometa: add support for downstream parameters to ROI meta The current GstVideoRegionOfInterestMeta API allows elements to detect and name ROI but doesn't tell anything about how this information is meant to be consumed by downstream elements. Typically, encoders may want to tweak their encoding settings for a given ROI to increase or decrease their quality. Each encoder has its own set of settings so that's not something that can be standarized. This patch adds encoder-specific parameters to the meta which can be used to configure the encoding of a specific ROI. A typical use case would be: source ! roi-detector ! encoder with a buffer probe on the encoder sink pad set by the application. Thanks to the probe the application will be able to tell to the encoder how this specific region should be encoded. Users could also develop their specific roi detectors meant to be used with a specific encoder and directly putting the encoder parameters when detecting the ROI.
Created attachment 368189 [details] [review] videometa: add support for downstream parameters to ROI meta The current GstVideoRegionOfInterestMeta API allows elements to detect and name ROI but doesn't tell anything about how this information is meant to be consumed by downstream elements. Typically, encoders may want to tweak their encoding settings for a given ROI to increase or decrease their quality. Each encoder has its own set of settings so that's not something that can be standardized. This patch adds encoder-specific parameters to the meta which can be used to configure the encoding of a specific ROI. A typical use case would be: source ! roi-detector ! encoder with a buffer probe on the encoder sink pad set by the application. Thanks to the probe the application will be able to tell to the encoder how this specific region should be encoded. Users could also develop their specific roi detectors meant to be used with a specific encoder and directly putting the encoder parameters when detecting the ROI.
In gstreamer-vaapi, we're using GstCustomEvent to achieve roi encoding because GstVideoRegionOfInterestMeta is not enough to use all suppored parameters for vaapi encoders. See https://bugzilla.gnome.org/show_bug.cgi?id=768248 I like this idea since each encoder might need its specific parameters.
(In reply to Hyunjun Ko from comment #3) > In gstreamer-vaapi, we're using GstCustomEvent to achieve roi encoding > because > GstVideoRegionOfInterestMeta is not enough to use all suppored parameters > for vaapi encoders. Which this proposal is about, we also have time to revert back to something more "standard" in gstreamer-vaapi before the release. The custom event has some design issues. You must send an event to remove them, you cannot have two ROI on the save rectangle with two different meaning, no hierarchy, it cannot be exercised from gst-launch-1.0 (which means it will get low validation). What we are experimenting on the gst-omx side, is adding a property for the "default" value. You can then from gst-launch-1.0 combine facedetect that always attach ROI and this to exercise the API manually.
What worries me, or perhaps I just don't understand, is with a list of structures, the ROI information will be very coupled between the "detector" and the encoder. If the detector fills the list with pairs of data-value that the encoder doesn't understand, what should happen?
Indeed. We are considering adding a 'default-roi-quality' property on the OMX encoder which will be used if no custom settings have been provided by the user. Note that applications could also use a pad probe at the output of the 'detector' and add the encoder specific params (based on the ROI type) before it reaches the encoder.
Another questions: 1\ A buffer can hold a list for metas of the same type? AFAIK, they don't 2\ An encoder can process several ROIs per frame? AFAIK they do If I am correct we might have a problem, how we could process several regions of interest per frame?
1) Yes they can and we can use gst_buffer_iterate_meta_filtered() it iterate on all of them. But I'm confused with the ROI meta API actually. From its struct it's supposed to have a 'roi_type' (its semantic, like 'face') and an opaque id (as we can have more than one ROI of the same type on a given buffer). But in the implementation both seems to be mixed up: gst_buffer_add_video_region_of_interest_meta() passes the roi_type as ID, and gst_buffer_add_video_region_of_interest_meta_id() doesn't set the id at all. Am I missing something here or is it a bug in the API? 2) Our OMX implementation can yes.
(In reply to Víctor Manuel Jáquez Leal from comment #7) > Another questions: > > 1\ A buffer can hold a list for metas of the same type? AFAIK, they don't Wrong they do. GstMeta are stored in a list, regardless of their type, allowing multiple meta of the same type to be stored. Element can use IDs for RegionOfInterest in order to further identify them, even though this is currently used/needed. See gst_buffer_iterate_meta_filtered() > 2\ An encoder can process several ROIs per frame? AFAIK they do > > If I am correct we might have a problem, how we could process several > regions of interest per frame? VAAPI, OMX/ALG and x264enc supports a pretty large amount of ROI rectangles. Obviously, some of them will be prioritized, so it does not mean they will all be super neat. Sometimes the bitrate is just not high enough. (In reply to Guillaume Desmottes from comment #8) > 1) Yes they can and we can use gst_buffer_iterate_meta_filtered() it iterate > on all of them. > But I'm confused with the ROI meta API actually. From its struct it's > supposed to have a 'roi_type' (its semantic, like 'face') and an opaque id > (as we can have more than one ROI of the same type on a given buffer). But > in the implementation both seems to be mixed up: > gst_buffer_add_video_region_of_interest_meta() passes the roi_type as ID, > and gst_buffer_add_video_region_of_interest_meta_id() doesn't set the id at > all. This is missed named API indeed, should have been gst_buffer_add_video_region_of_interest_meta_quark(). The ID is optionally set by the element producing a certain type of ROI. If the same type of ROI can be added, then it might be important to define an ID convention for it. We could deprecate and rename I guess, nothing urgent though.
Anything else ? I'd like to go forward and make a patch for gstreamer-vaapi, we could then get both gst-omx and gstreamer-vaapi to do the same thing. Later on we could update textoverlay (inspired from gstreamer-vaapi example) to setup a "text" ROI and also add ROI support to handsdetect too.
(In reply to Nicolas Dufresne (stormer) from comment #10) > Anything else ? I'd like to go forward and make a patch for gstreamer-vaapi, > we could then get both gst-omx and gstreamer-vaapi to do the same thing. > Later on we could update textoverlay (inspired from gstreamer-vaapi example) > to setup a "text" ROI and also add ROI support to handsdetect too. I'm OK with it.
Attachment 368189 [details] pushed as f5855d5 - videometa: add support for downstream parameters to ROI meta