GNOME Bugzilla – Bug 704070
video: Add Region Of Interest (roi) meta
Last modified: 2013-07-17 09:43:08 UTC
Created attachment 248990 [details] [review] 1st full patch GstVideoRegionOfInterestMeta and manipulators. Add a region-of-interest meta to gstvideometa.{h,c}. To be used firstly for sending face detection events downstream as buffer-associated meta.
Review of attachment 248990 [details] [review]: ::: gst-libs/gst/video/gstvideometa.c @@ +536,3 @@ +{ + static volatile GType type; + static const gchar *tags[] = { NULL }; Tags should probably be size and orientation as for the crop meta @@ +564,3 @@ + /* we always copy no matter what transform */ + gst_buffer_add_video_region_of_interest_meta (transbuf, emeta->x, emeta->y, emeta->w, + emeta->h); And a transform function similar to the one of the crop meta too ::: gst-libs/gst/video/gstvideometa.h @@ +255,3 @@ + guint w; + guint h; + GQuark roi_type; I guess the type should be in front of x Also add some padding in the end And some defines for pre-defined ROI types, like "face" :) @@ +261,3 @@ +const GstMetaInfo *gst_video_region_of_interest_meta_get_info (void); + +#define gst_buffer_get_video_region_of_interest_meta(b) ((GstVideoRegionOfInterestMeta*)gst_buffer_get_meta((b),GST_VIDEO_REGION_OF_INTEREST_META_API_TYPE)) Maybe another macro to get the number of these metas and a specific one at a index? Similar to GstVideoMeta @@ +264,3 @@ +GstVideoRegionOfInterestMeta *gst_buffer_add_video_region_of_interest_meta (GstBuffer * buffer, + guint x, guint y, + guint w, guint h); Add a const gchar * roi_type parameter before x, y And maybe a second function that takes a GQuark instead of the string
<__tim> might one want/need some kind of shape-extension in future for the ROI? to indicate that it's more than a rectangle? or a probability perhaps? what about points of interest within the ROI? or ROIs (eyes,mouth,nose) within a ROI? <wtay`> with an id you could make parent-child relations <wtay`> not sure about shapes, it sounds maybe too much for now <__tim> right, just saying one might want to leave some possibility of expansion <wtay`> a probability, maybe dunno what it would mean <wtay`> yes, but if you think of them now we can add them now <wtay`> with an id per ROI we can always add more metadata to describe the ROI some more <__tim> like the probability that this ROI resembles a face (though maybe one would just set the threshold on the facedetect element and not comunicate it) <wtay`> or maybe add metadata to group ids <wtay`> but a roi as the parent with a quark that mentions the grouptype etc sounds enough for now <__tim> as for 'shape', I was mostly thinking of whether it's a bounding rectangle, and what kind of shape is inside it <__tim> or whether the ROI is the entire rectangle, sort of <__tim> not sure if it matters, it might if you have an element whose purpose is to blur detected areas (though it could also have a hard-coded list of known ROI types then) <slomo> for an arbitrary shape... you might need more than 4 parameters to define it <__tim> yeah, I don't know if we need to express "arbitrary shapes" <wtay`> let's make sure it is advertized as a ROI bounding box, maybe we can add the exact shape in another meta when needed
Putting comments from Wim from the other bug here: Let's add a unique ID to the ROI and a parent ID: struct ROI { GstMeta meta gint id; gint parent_id; GQuark roi_type; guint x, y, width, height; } this way we can: - add more meta later to describe the ROI bounding rectangle with ID in more detail - Make a big ROI with sub ROI having the parent_id of the big ROI to make subregions. For arbitrary shapes, I would now advertise the ROI as a bounding rectangle containing the object(s) of interest, for the shape of the actual object(s) we would need another meta to describe them.
I think Wim's proposal covers everything that might be of potential interest to be added later
Created attachment 249190 [details] [review] 2nd full patch GstVideoRegionOfInterestMeta and manipulators Ok, I did my best to address all the comments but please have a good look: I'm no familiar with the GQuarks, metas and the like :) - added tags size and orientation to Region of Interest Meta - add a Region of Interest transform meta function - Added quark part for a region of interest of type "bounding-box" - added a method to get the meta indexed by id (but not all of them at once :( ) - changed gst_buffer_add_video_region_of_interest_meta() into two other functions, one with a gchar* type and another with a GQuark type. To make it a bit more generic, the quark is only defined for bounding-box: "gst-video-region-of-interest-bounding-box"
Review of attachment 249190 [details] [review]: Almost ready :) ::: gst-libs/gst/video/gstvideometa.c @@ +532,3 @@ +/* Region of Interest Meta implementation *******************************************/ +GQuark +gst_video_region_of_interest_meta_bounding_box_get_quark (void) I don't think that's what was meant :) Just mention in the documentation of this GstMeta that it only describes a bounding box for the ROI, and not the exact shape @@ +562,3 @@ +{ + GstVideoRegionOfInterestMeta *dmeta, *smeta; + Need to copy the parent_id and id here too @@ +647,3 @@ +GstVideoRegionOfInterestMeta * +gst_buffer_add_video_region_of_interest_meta_by_name (GstBuffer * buffer, + gchar * roi_type, const gchar * @@ +657,3 @@ + meta = (GstVideoRegionOfInterestMeta *) gst_buffer_add_meta (buffer, + GST_VIDEO_REGION_OF_INTEREST_META_INFO, NULL); + meta->roi_type = g_quark_from_string (roi_type); Just let this function call the other one that takes the GQuark @@ +682,3 @@ + meta->y = y; + meta->w = w; + meta->h = h; Initialize the parent_id and id here ::: gst-libs/gst/video/gstvideometa.h @@ +39,3 @@ +#define GST_VIDEO_REGION_OF_INTEREST_META_API_TYPE (gst_video_region_of_interest_meta_api_get_type()) +#define GST_VIDEO_REGION_OF_INTEREST_META_INFO (gst_video_region_of_interest_meta_get_info()) +typedef struct _GstVideoRegionOfInterestMeta GstVideoRegionOfInterestMeta; Remove this
(In reply to comment #6) > Review of attachment 249190 [details] [review]: > > Almost ready :) > > ::: gst-libs/gst/video/gstvideometa.c > @@ +532,3 @@ > +/* Region of Interest Meta implementation > *******************************************/ > +GQuark > +gst_video_region_of_interest_meta_bounding_box_get_quark (void) > > I don't think that's what was meant :) Just mention in the documentation of > this GstMeta that it only describes a bounding box for the ROI, and not the > exact shape > Ok, I added the documentation but left the GQuark part :) > @@ +562,3 @@ > +{ > + GstVideoRegionOfInterestMeta *dmeta, *smeta; > + > > Need to copy the parent_id and id here too Done. > > @@ +647,3 @@ > +GstVideoRegionOfInterestMeta * > +gst_buffer_add_video_region_of_interest_meta_by_name (GstBuffer * buffer, > + gchar * roi_type, > > const gchar * > Done. > @@ +657,3 @@ > + meta = (GstVideoRegionOfInterestMeta *) gst_buffer_add_meta (buffer, > + GST_VIDEO_REGION_OF_INTEREST_META_INFO, NULL); > + meta->roi_type = g_quark_from_string (roi_type); > > Just let this function call the other one that takes the GQuark > > @@ +682,3 @@ > + meta->y = y; > + meta->w = w; > + meta->h = h; > > Initialize the parent_id and id here > Both to 0. > ::: gst-libs/gst/video/gstvideometa.h > @@ +39,3 @@ > +#define GST_VIDEO_REGION_OF_INTEREST_META_API_TYPE > (gst_video_region_of_interest_meta_api_get_type()) > +#define GST_VIDEO_REGION_OF_INTEREST_META_INFO > (gst_video_region_of_interest_meta_get_info()) > +typedef struct _GstVideoRegionOfInterestMeta GstVideoRegionOfInterestMeta; > > Remove this I moved it together with the rest GstVideoRegionOfInterestMeta since I use them in the manipulators, and the other metas have them too :S
Created attachment 249207 [details] [review] 3rd full patch GstVideoRegionOfInterestMeta and manipulators Addressed Sebastian's comments.
Review of attachment 249207 [details] [review]: ::: gst-libs/gst/video/gstvideometa.c @@ +532,3 @@ +/* Region of Interest Meta implementation *******************************************/ +GQuark +gst_video_region_of_interest_meta_bounding_box_get_quark (void) Remove this, the idea is that this is *always* the bounding box, and nothing else. The ROI type would be stuff like "face", "chicken", etc @@ +658,3 @@ + return gst_buffer_add_video_region_of_interest_meta_by_quark(\ + buffer, + gst_video_region_of_interest_meta_bounding_box_get_quark(), g_quark_from_string(roi_type) instead @@ +666,3 @@ +gst_buffer_add_video_region_of_interest_meta_by_quark (GstBuffer * buffer, + GQuark roi_type, + gint id, gint parent_id, I wouldn't expose the ids here ::: gst-libs/gst/video/gstvideometa.h @@ +263,3 @@ +GQuark gst_video_region_of_interest_meta_bounding_box_get_quark (void); +#define GST_VIDEO_REGION_OF_INTEREST_META_IS_BOUNDING_BOX(type) \ + ((type) == gst_video_region_of_interest_meta_bounding_box_get_quark()) Remove these bounding box things @@ +278,3 @@ + guint x, guint y, + guint w, guint h); +GstVideoRegionOfInterestMeta *gst_buffer_add_video_region_of_interest_meta_by_quark (GstBuffer * buffer, meta_from_quark() maybe
Created attachment 249212 [details] [review] 4th full patch GstVideoRegionOfInterestMeta and manipulators Addressed last round of Sebastian's comments (hopefully :D )
Review of attachment 249212 [details] [review]: Almost :) ::: gst-libs/gst/video/gstvideometa.c @@ +1,1 @@ +/* Why did you remove this? :) @@ +536,3 @@ +{ + static volatile GType type; + static const gchar *tags[] = { "location", " size", NULL }; This should be orientation and size, location has no meaning yet. What do you mean with location here? @@ +594,3 @@ +{ + GstVideoRegionOfInterestMeta *emeta = (GstVideoRegionOfInterestMeta *) meta; + emeta->x = emeta->y = emeta->w = emeta->h = 0; Initialize the other fields too
(In reply to comment #11) > Review of attachment 249212 [details] [review]: > > Almost :) > > ::: gst-libs/gst/video/gstvideometa.c > @@ +1,1 @@ > +/* > > Why did you remove this? :) > Oops, well let me blame the tool, I changed emacs with sublime and C-x C-s turns into: delete line, save :) > @@ +536,3 @@ > +{ > + static volatile GType type; > + static const gchar *tags[] = { "location", " size", NULL }; > > This should be orientation and size, location has no meaning yet. What do you > mean with location here? > I understood x,y -> location of the roi (anchor point) w,h -> size of the roi > @@ +594,3 @@ > +{ > + GstVideoRegionOfInterestMeta *emeta = (GstVideoRegionOfInterestMeta *) meta; > + emeta->x = emeta->y = emeta->w = emeta->h = 0; > > Initialize the other fields too Done.
Created attachment 249257 [details] [review] 5th full patch GstVideoRegionOfInterestMeta and manipulators Reverted comment, initialising id and parent_id on "constructor"
(In reply to comment #12) > > @@ +536,3 @@ > > +{ > > + static volatile GType type; > > + static const gchar *tags[] = { "location", " size", NULL }; > > > > This should be orientation and size, location has no meaning yet. What do you > > mean with location here? > > > > I understood > x,y -> location of the roi (anchor point) > w,h -> size of the roi The tags are there to specify to which aspects of a buffer the meta applies, i.e. if the size changes the meta needs to be transformed or removed.
Changed that and pushed commit 95605a79c9c5fecf0f569ce443516fe0cf2e9b02 Author: Sebastian Dröge <slomo@circular-chaos.org> Date: Tue Jul 16 10:09:27 2013 +0200 videometa: Add to the docs and make function names more consistent with others commit 38837bd468713e2b0861fce2b06a2dd288791c63 Author: Miguel Casas-Sanchez <miguelecasassanchez@gmail.com> Date: Tue Jul 16 10:04:00 2013 +0200 videometa: Add Region Of Interest meta https://bugzilla.gnome.org/show_bug.cgi?id=704070
All right thanks Sebastian!
I'd suggest the trivial addition of a float/double angle, so oriented bounding boxes are supported (seems trivial enough of an addition that it doesn't warrant a separate type). Tilted faces are pretty common after all. :)
(In reply to comment #17) > I'd suggest the trivial addition of a float/double angle, so oriented bounding > boxes are supported (seems trivial enough of an addition that it doesn't > warrant a separate type). Tilted faces are pretty common after all. :) Sounds good! Isn't it a bit late to do it now in this bug? (Sadly enough, tilted faces are not commonly searched for or found by face detectors :( )
(In reply to comment #15) > commit 38837bd468713e2b0861fce2b06a2dd288791c63 > Author: Miguel Casas-Sanchez <miguelecasassanchez@gmail.com> > Date: Tue Jul 16 10:04:00 2013 +0200 > > videometa: Add Region Of Interest meta > > https://bugzilla.gnome.org/show_bug.cgi?id=704070 Please add docs too?
Done