After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 691299 - API: GstFileMemAllocator - an allocator that uses disk storage to provide memory space
API: GstFileMemAllocator - an allocator that uses disk storage to provide mem...
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other Linux
: Normal enhancement
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-01-07 16:08 UTC by Krzysztof Konopko
Modified: 2018-11-03 11:23 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Proposed implementation (28.99 KB, patch)
2013-01-07 16:08 UTC, Krzysztof Konopko
none Details | Review
Proposed implementation: added documentation and other minor improvements (31.85 KB, patch)
2013-01-16 14:48 UTC, Krzysztof Konopko
needs-work Details | Review
Moved mapped file allocator to gst-plugins-base (33.88 KB, patch)
2013-07-26 23:21 UTC, Krzysztof Konopko
needs-work Details | Review
Proposed implementation (34.50 KB, patch)
2014-03-15 18:54 UTC, Krzysztof Konopko
none Details | Review
Proposed implementation (34.23 KB, patch)
2014-03-15 19:01 UTC, Krzysztof Konopko
none Details | Review

Description Krzysztof Konopko 2013-01-07 16:08:09 UTC
Created attachment 232911 [details] [review]
Proposed implementation

This is a proposal for a new allocator deriving from GstAllocator: GstFileMemAllocator.

Rationale
---------
The default allocator uses virtual memory which might be inconvenient when there's a demand to keep large amount of buffers (e. g. a media ring buffer) while the amount of physical memory is limited (e. g. an embedded system).

Given that the disk space is available instead, GstFileMemAllocator offers memory blocks which are mapped to file system blocks in a temporary file. The idea is to get memory blocks mapped to chunks of a file (mmap() on *nix). This way it's up to the OS to page them out and in when needed while keeping memory use to the minimum.

This is particularly useful on systems where physical memory is scarce while the address space (VM) is still available.

Reusing freed blocks
--------------------
Currently the allocator doesn't reclaim deallocated file block to make it reusable. This is for simplicity at this stage.

Give the simplicity described above, at least it reclaims the actual disk space when freeing a memory block (FALLOC_FL_PUNCH_HOLE if supported). This also means that the allocation should guarantee that the disk space is available if in the future implementation memory blocks are reusable.

Tests
-----
A number of tests exist. They can be run as follows:

make gst/gstfilememallocator.check
make gst/gstfilememallocator.valgrind

The allocator was also verified in a production environment of an embedded system (see www.youview.com) as a foundation for a ring buffer for the TS timeshifer.

Problems
--------

- portability
What is the best approach to not break other platforms support?
It should be possible to add the support for this allocator on other platform as needed but at the moment ideally the allocator files should be complied only for *nix system. I'll have a look how to best achieve it unless someone can suggest me patterns already used (I haven't been looking for these yet).

- name
I don't feel particularly comfortable about the allocator name, i. e. GstFileMemAllocator. I'll revisit it but if someone can suggest some short and descriptive name that would be great.

- initialization
I feel that the initialization is a bit awkward, i. e. the allocator needs to know some information in advance (the file size and where to store it). It sort of make sense if one think of the allocator as a global resource but I was wondering whether it'd be useful to have an ability to create them on a whim, i. e. not necessarily add them to the registry.

Other considerations
--------------------
A potential downside of the allocator used in some scenarios could be downgrading the performance of the system by paging out pages more critical for process execution. An example scenario is a ring buffer where there's a potentially intensive use of memory pages (reads and writes). This should be dealt with by the application itself (a sit depends on how it uses the allocator) but the allocator should provide facilities for doing it (additional API).

Useful web resources:
http://code.google.com/p/pagecache-mangagement/
http://kerneltrap.org/node/7563
http://lkml.indiana.edu/hypermail/linux/kernel/1005.2/01845.html
http://lkml.indiana.edu/hypermail/linux/kernel/1005.2/01953.html


Any comments and suggestion are very welcomed.
Comment 1 Krzysztof Konopko 2013-01-07 16:13:30 UTC
The repository for this work is here:
https://github.com/kkonopko/gstreamer/tree/filemem-allocator
Comment 2 Krzysztof Konopko 2013-01-16 14:48:01 UTC
Created attachment 233601 [details] [review]
Proposed implementation: added documentation and other minor improvements
Comment 3 Krzysztof Konopko 2013-01-16 15:54:39 UTC
> Problems
> --------
> 
> - portability
> What is the best approach to not break other platforms support?
> It should be possible to add the support for this allocator on other platform
> as needed but at the moment ideally the allocator files should be complied only
> for *nix system. I'll have a look how to best achieve it unless someone can
> suggest me patterns already used (I haven't been looking for these yet).
> 

In the second version of the patch [1] I added simple handling for unsupported platforms. If the required features are not available, the implementation code is not compiled. The allocator initialization function reports an error on the runtime instead.

> - initialization
> I feel that the initialization is a bit awkward, i. e. the allocator needs to
> know some information in advance (the file size and where to store it). It sort
> of make sense if one think of the allocator as a global resource but I was
> wondering whether it'd be useful to have an ability to create them on a whim,
> i. e. not necessarily add them to the registry.
> 

I'm leaning towards the allocator as a global resource. I have a full production use case where the application initializes the allocator in the main context and passed its name around through object properties. The object (element) that wants to use different allocation schemes uses the allocator name as a property. By default the property is set to NULL for such an object which ends up using the default allocator. Otherwise it picks up selected (e. g. "FileMemory") allocator. 

[1] https://bugzilla.gnome.org/review?bug=691299&attachment=233601
Comment 4 Sebastian Dröge (slomo) 2013-07-17 09:46:04 UTC
I think the fd and other parameters should be part of the GstMemory and the allocator can be made a singleton then. It should also be possible to e.g. let filesrc create such memory instead of only using it for random allocations. The name should probably be GstMappedFileMemory or GstMMapMemory.

Take a look at gst-plugins-base/gst-libs/gst/allocators btw, that already contains a dmabuf allocator which is very similar (but should be kept separate). Could you merge your allocator into that library too?
Comment 5 Wim Taymans 2013-07-17 10:30:43 UTC
There are 2 interesting ways to use this:

 1) as a replacement for the system memory allocator.

 2) as an allocator proposed in allocation queries.

I can see two separate APIs for this:

 1) needs something like 

      alloc = gst_filemem_allocator_new(gsize size, const gchar *templ);

    and then 

      gst_allocator_set_default (alloc);
    
    You could also have elements do that and propose the allocator but I can't
    see why they would right now.

 2) needs something like:

     alloc = gst_filemem_allocator_new_from_fd (gint fd, ...);

    Because if you are to propose this in an allocator you probably have your
    own fd somewhere that you want others to write into. I can only think of
    a ringbuffer or maybe a muxer.

Maybe you also want to make it possible to grow the file automatically or never overwrite things (in the case of a muxer).
Comment 6 Krzysztof Konopko 2013-07-18 00:51:02 UTC
(In reply to comment #4)
> I think the fd and other parameters should be part of the GstMemory and the
> allocator can be made a singleton then.
Making it a singleton was my first thought but then I faced scenarios when an app (a long running process or a daemon if you like) created and destroyed GStreamer pipelines quite frequently and I had to add an API to reset the allocator (not included to the attached patch yet).

Passing fd sounds like an interesting idea.

> It should also be possible to e.g. let
> filesrc create such memory instead of only using it for random allocations. The
> name should probably be GstMappedFileMemory or GstMMapMemory.
Sure, I'm strongly considering this.
> 
> Take a look at gst-plugins-base/gst-libs/gst/allocators btw, that already
> contains a dmabuf allocator which is very similar (but should be kept
> separate). Could you merge your allocator into that library too?
Thanks for the hint.  I had a brief look and it seems like a good idea.  I noticed that I should review thread safety in the allocator (I didn't need it at all in my use case due to locking at a different level).
Comment 7 Krzysztof Konopko 2013-07-18 00:55:33 UTC
(In reply to comment #5)
> 
>  2) needs something like:
> 
>      alloc = gst_filemem_allocator_new_from_fd (gint fd, ...);
> 
>     Because if you are to propose this in an allocator you probably have your
>     own fd somewhere that you want others to write into. I can only think of
>     a ringbuffer or maybe a muxer.
> 
> Maybe you also want to make it possible to grow the file automatically or never
> overwrite things (in the case of a muxer).

Thanks for the suggestions, worth considering.  My goal was to extend the allocator as new use cases emerge but considering these features might be a good thing for the allocator design (currently focused only on one use case).  Let me do some hacking first and I'll get back to you as soon as I can.
Comment 8 Krzysztof Konopko 2013-07-26 23:21:27 UTC
Created attachment 250246 [details] [review]
Moved mapped file allocator to gst-plugins-base

For a good start I moved the allocator to gst-plugins-base and renamed classes to GstMappedFileMemory and GstMappedFileallocator as Sebastian suggested in his comment [1].

This is the most tedious part of the work to get this proposal into better shape.  Now it's ready for more work which will hopefully be also more fun.
  
[1] https://bugzilla.gnome.org/show_bug.cgi?id=691299#c4
Comment 9 Sebastian Dröge (slomo) 2013-07-28 15:49:41 UTC
I think there should be a GstMMapMemory that is independent of this allocator and only has an fd and the offset inside it, and the mapping functions would then call mmap() or whatever is required. This very specific allocator here would be used to implement a special version of the GstMMapMemory, but there would also be a GstMMapAllocator that just provides the default functions for handling mmap'd memory.
Comment 10 Sebastian Dröge (slomo) 2013-07-28 15:56:44 UTC
Review of attachment 250246 [details] [review]:

::: gst-libs/gst/allocators/gstmappedfile.c
@@ +162,3 @@
+      maxsize);
+
+  if (allocator->f_offset_next + maxsize > allocator->file_size) {

I think you should implement a better block allocation strategy, that allows to properly deallocate and reallocate blocks

@@ +207,3 @@
+     of blocks available or more advanced if needed. */
+
+#if defined (HAVE_FALLOCATE) && HAVE_DECL_FALLOC_FL_PUNCH_HOLE

So if this is not supported, the memory will slowly run full and then nothing can be allocated anymore? I think the allocator should not be supported at all then

@@ +256,3 @@
+  /* Can't really control the alignment
+     Can we do something about the data offset (GstAllocationParams)? Is it
+     the right place to handle it? */

No the alignment and padding and everything should be handled during allocation

@@ +503,3 @@
+  GST_ERROR ("%s allocator is not supported on this platform",
+      GST_ALLOCATOR_MAPPEDFILE);
+#endif // GST_MAPPEDFILEALLOCATOR_SUPPORTED

No C99 comments please

@@ +532,3 @@
+
+  gst_allocator_register (GST_ALLOCATOR_MAPPEDFILE,
+      GST_ALLOCATOR_CAST (allocator));

Not sure if registering the allocator globally with a fixed temp template makes sense. What if multiple such allocators with different temp templates should be used?

::: tests/check/libs/allocators.c
@@ +29,3 @@
+  GstAllocator *alloc = NULL;
+
+  gst_mappedfile_allocator_init (size, "/tmp/gstmappedfile-XXXXXX");

Use g_get_tmp_dir() here
Comment 11 Krzysztof Konopko 2013-07-30 22:20:20 UTC
(In reply to comment #9)
My gut feeling is that what you suggest makes a lot sense but please see my questions below.

> I think there should be a GstMMapMemory that is independent of this allocator
> and only has an fd and the offset inside it, and the mapping functions would
> then call mmap() or whatever is required.

As I understand data redundancy is acceptable here.  I mean GstMMapMemory (or GstMappedFileMemory as I renamed it to) inherits from GstMemory which already contains a reference to the allocator and GstMMapAllocator (or GstMappedFileAllocator as I renamed it to) has got the file descriptor.  My understanding is that any memory is always somehow related to and dependent on its allocator.  Is your intention to decouple things here?

> This very specific allocator here

Ie. which one as opposed to GstMMapAllocator?

> would be used to implement a special version of the GstMMapMemory, but there
> would also be a GstMMapAllocator that just provides the default functions for
> handling mmap'd memory.

Does it mean that the implementation of all of these would live in the same place (file)?  How would GstMMapMemory be reused/shared after making it independent of its allocator?

BTW, I renamed classes to GstMappedFile* as IMHO this is more generic as opposed to implementation specific MMap (what about Windows folks?).
Comment 12 Krzysztof Konopko 2013-07-30 22:33:30 UTC
(In reply to comment #10)
> Review of attachment 250246 [details] [review]:
> 
> ::: gst-libs/gst/allocators/gstmappedfile.c
> @@ +162,3 @@
> +      maxsize);
> +
> +  if (allocator->f_offset_next + maxsize > allocator->file_size) {
> 
> I think you should implement a better block allocation strategy, that allows to
> properly deallocate and reallocate blocks
> 
> @@ +207,3 @@
> +     of blocks available or more advanced if needed. */
> +
> +#if defined (HAVE_FALLOCATE) && HAVE_DECL_FALLOC_FL_PUNCH_HOLE
> 
> So if this is not supported, the memory will slowly run full and then nothing
> can be allocated anymore? I think the allocator should not be supported at all
> then

This allocator is bound to run out of space sooner or later as it's supposed to be intialised with a specific total size.  The reason for this is that there's no way to reliably resize the file if any of its mmap()-ed blocks have been given out and are in use.  I'd like it wasn't true so please let me know if I'm wrong here.

I agree that not reusing freed blocks is poor but given my reasoning above it's not the end of the world.  I agree reusing blocks should be supported (and should be a condition to accept this allocator at all) but there are at least two ways to approach it:
- declare the allocator to allocate blocks of a certain size only
This makes it dead simple as it boils down to maintaining a list of free blocks

- support arbitrary block size allocations
This requires merging subsequent free blocks (buddy allocation, defragmentation etc.).

> 
> @@ +532,3 @@
> +
> +  gst_allocator_register (GST_ALLOCATOR_MAPPEDFILE,
> +      GST_ALLOCATOR_CAST (allocator));
> 
> Not sure if registering the allocator globally with a fixed temp template makes
> sense. What if multiple such allocators with different temp templates should be
> used?
>

Yes, that hurts me and I'm keen to make it better based on previous suggestions (pass a file descriptor to a file opened somewhere else etc.).

Thanks for the other spots.  I fixed them immediately as they were minors.  I need a bigger chunk of spare time to tackle bigger issues.  Any additional feedback would be very helpful.
Comment 13 Sebastian Dröge (slomo) 2013-08-12 11:25:50 UTC
(In reply to comment #11)
> My understanding is that any memory is always somehow related to and dependent on
> its allocator.  Is your intention to decouple things here?
> 
> > This very specific allocator here
> 
> Ie. which one as opposed to GstMMapAllocator?

Yes, the idea would be to have a generic GstMMapMemory that can be created by multiple allocators. E.g. v4l2 could provide such memory too, or filesrc. You can have multiple allocators implementing the same GstMemory.
Comment 14 Sebastian Dröge (slomo) 2013-08-12 11:34:53 UTC
(In reply to comment #12)
> (In reply to comment #10)
> > Review of attachment 250246 [details] [review] [details]:
> > 
> > ::: gst-libs/gst/allocators/gstmappedfile.c
> > @@ +162,3 @@
> > +      maxsize);
> > +
> > +  if (allocator->f_offset_next + maxsize > allocator->file_size) {
> > 
> > I think you should implement a better block allocation strategy, that allows to
> > properly deallocate and reallocate blocks
> > 
> > @@ +207,3 @@
> > +     of blocks available or more advanced if needed. */
> > +
> > +#if defined (HAVE_FALLOCATE) && HAVE_DECL_FALLOC_FL_PUNCH_HOLE
> > 
> > So if this is not supported, the memory will slowly run full and then nothing
> > can be allocated anymore? I think the allocator should not be supported at all
> > then
> 
> This allocator is bound to run out of space sooner or later as it's supposed to
> be intialised with a specific total size.  The reason for this is that there's
> no way to reliably resize the file if any of its mmap()-ed blocks have been
> given out and are in use.  I'd like it wasn't true so please let me know if I'm
> wrong here.

No, that's true unfortunately. Having a fixed size is ok, as long as blocks can be reused.

> I agree that not reusing freed blocks is poor but given my reasoning above it's
> not the end of the world.  I agree reusing blocks should be supported (and
> should be a condition to accept this allocator at all) but there are at least
> two ways to approach it:
> - declare the allocator to allocate blocks of a certain size only
> This makes it dead simple as it boils down to maintaining a list of free blocks
> 
> - support arbitrary block size allocations
> This requires merging subsequent free blocks (buddy allocation, defragmentation
> etc.).

Yeah I agree that this is not trivial :) But without that this allocator isn't really that useful, it will always go out of memory sooner or later. Going with fixed (configurable) size blocks in the beginning would be a good start I guess.
Comment 15 Krzysztof Konopko 2014-03-15 18:54:08 UTC
Created attachment 272026 [details] [review]
Proposed implementation

Here's my recent attempt to implement the allocator.  This time, as suggested, I implemented reusing allocated blocks.  Memory blocks can be of any size (although they are aligned to page size) and are merged when claimed back if they are continuous.  For example, if the user allocates 2 x 1k, 2k and 1k, and then deallocated the blocks, they get merged into a single 4k block.

I also replaced awkward API with a single function to create the allocator which take a file descriptor and the maximum file size (the allocator resizes the file when created).  The file descriptor gets associated with the allocator rather than memory (as in dmabuf) because the intention is for a single file to provide a memory arena where multiple blocks can be requested from.  For this reason the allocator has to do all the bookkeeping.

Cheers,
Kris
Comment 16 Krzysztof Konopko 2014-03-15 19:01:00 UTC
Created attachment 272027 [details] [review]
Proposed implementation
Comment 17 GStreamer system administrator 2018-11-03 11:23:09 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/78.