After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 729655 - [GSOC] libarchive wrapper: feedback
[GSOC] libarchive wrapper: feedback
Status: RESOLVED FIXED
Product: gnome-boxes
Classification: Applications
Component: general
unspecified
Other All
: Normal enhancement
: 3.22
Assigned To: GNOME Boxes maintainer(s)
GNOME Boxes maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2014-05-06 15:19 UTC by Lasse Schuirmann
Modified: 2016-03-31 13:22 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
libarchive wrapper - WIP (4.30 KB, text/x-vala)
2014-05-06 15:19 UTC, Lasse Schuirmann
Details
Standalone async issue (472 bytes, text/x-vala)
2014-05-18 11:55 UTC, Lasse Schuirmann
Details

Description Lasse Schuirmann 2014-05-06 15:19:12 UTC
Created attachment 275989 [details]
libarchive wrapper - WIP

Hello,

I'm currently writing a libarchive wrapper. Since I am unsure about standards in Boxes I'd like to have a quick feedback which does not have to go into detail since this is work in progress.

Just one or two sentences are enough. And maybe someone knows a better name for the class... Archive is not possible due to the ambiguity with the Archive namespace.

Greetings,

Lasse
Comment 1 Marc-Andre Lureau 2014-05-06 15:41:31 UTC
You are opening up a very common request:
http://www.hadess.net/2013/12/on-beauty-of-libarchive.html

In general, people want a libarchive with a Glib/GIO API (async/cancellable etc), but that's a summer of code on its own I am afraid.

So it looks your class is a pretty good start already (depending on what you want to do with it), but don't try to do everything it's the best way to fail :).
Comment 2 Lasse Schuirmann 2014-05-06 15:53:00 UTC
OK. I'll make it cancelable. This is just a little helper class so I won't implement the whole libarchive functionality.

For my needs this api will probably do, I just want to explicitly extract and insert files from and to archives. This is I needed for the preseed injection and on the fly I can remove the mtools dependency.

Thanks for looking at it.
Comment 3 Lasse Schuirmann 2014-05-16 12:18:30 UTC
Hello again,

I do have a functional wrapper now. You can look at the code here:
https://github.com/sils1297/GSOC-organization-repo/tree/master/example-projects/libarchive-wrapper

If you want, you can run and compile the example with:
make run

(Output should be like this:
  Time: 33.478200 s
  Preseed.cfg is in the new archive.

The time may vary.)

It will extract the initrd from the testiso.iso, and insert the preseed.cfg into it. (This will be checked and you should get a verification on the console. There will be a temporary archive named initrd~ until destruction of the object)

However, you need to replace the libarchive.vapi in /usr/share/vala-*/vapi with mine for the compilation to work. (I already filed a bug for this in vala - the patch is currently in the review process.)

I know zeenix doesn't like unnecessary code in Boxes: I can strip unneeded functions away when integrating the wrapper into Boxes.

I would appreciate any feedback.


A word on asynchronity: when I make e.g. the eatract_files function asynchronous somehow the destructor does not get called (-> the ~ file will not be moved to replace the original). I did research long about this and would appreciate a hint about how to do this.
Comment 4 Zeeshan Ali 2014-05-17 18:26:09 UTC
Its a bit hard to review code that is not attached here as patches. Anyway, looks really good and here is some comments (mostly nitpicks about coding style) that I gathered while reading the code:

* Better naming for classes is usually nouns, not verbs. i-e ArchiveRead -> ArchiveReader.

* When you move the wrapper to Boxes, I assume the namespace will be changed from Utils to Boxes?

* 'throws' and 'requires' (if on newline) needs to be aligned with arguments.

* We decided to not use 'this' unless to avoid ambiguity.

* I think its more useful to have an extract_file than extract_files. That way caller doesn't have to create a hashtable just for calling this function. If caller has a hashtable already, it can simply look on its own or you can provide a extract_files that does this.

* You should always try to treat passed parameters as read-only, especially if its just some container (lists, hashtable). extract_file really should not be removing things from passed hashtable.

* You really shouldn't need to explicitly call malloc from vala. 'var buffer = new uint8[LENGTH];' should do the trick.

* the methods that deal with entries (rather than files) don't seem very useful. Unless you have a particular use case in mind, I recommend not exposing such API as 'public'. Remember, app developers appreciate if API is as slim (and therefore simple) as possible.

* in RW class, instead of checking for readable and writable in each an every method, its better if you do it just once in construction method.

* Coding style is to have only space before '(', not after.

About your async issue, I can only know for sure whats happening after looking at the code but have you ensured that you change the caller of extract_files to either yield on the call or launch it async with 'begin' keyword after you make extract_files async?

As usual, the best way to figure such issues is to create a standalone code that is only aimed at reproducing your issue and hence making it very easy to debug (especially for others who might only be interested in helping with this particular problem but not motivated enough to go through the whole code).
Comment 5 Lasse Schuirmann 2014-05-18 11:54:14 UTC
(In reply to comment #4)

Thank you very much for the feedback!

> Its a bit hard to review code that is not attached here as patches. Anyway,
> looks really good and here is some comments (mostly nitpicks about coding
> style) that I gathered while reading the code:
> 
> * Better naming for classes is usually nouns, not verbs. i-e ArchiveRead ->
> ArchiveReader.

No problem.

> * When you move the wrapper to Boxes, I assume the namespace will be changed
> from Utils to Boxes?

It seemed to me like a general utility, so Utils did make sense for me. However it doesn't make that difference so I'll go with your preference.

> * 'throws' and 'requires' (if on newline) needs to be aligned with arguments.

Ok.

> * We decided to not use 'this' unless to avoid ambiguity.

Ok.

> * I think its more useful to have an extract_file than extract_files. That way
> caller doesn't have to create a hashtable just for calling this function. If
> caller has a hashtable already, it can simply look on its own or you can
> provide a extract_files that does this.

This is a bit psychological: when extracting more thean one file the extract_files method is more performant than invoking an extract_file method twice. (You don't have to iterate through all the headers again.) So instead of letting the user iterate twice through the whole archive I enforce him to create this hashtable which makes it far more convenient to just invoke extract_files once.

In addition we had a similar discussion I had in mind
https://bugzilla.gnome.org/show_bug.cgi?id=700780#c9
where you stated that letting the caller allocating a List (which is the hashtable here) is a good option.

> * You should always try to treat passed parameters as read-only, especially if
> its just some container (lists, hashtable). extract_file really should not be
> removing things from passed hashtable.

You're right, the current version is dirty.
I think I'll put a counter there to ensure that every file that the caller wanted to extract existed. 

> * You really shouldn't need to explicitly call malloc from vala. 'var buffer =
> new uint8[LENGTH];' should do the trick.

Great. One never learns enough.

> * the methods that deal with entries (rather than files) don't seem very
> useful. Unless you have a particular use case in mind, I recommend not exposing
> such API as 'public'. Remember, app developers appreciate if API is as slim
> (and therefore simple) as possible.

This methods are only provided by the write archive and are necessary since the read archive needs access to this low level API in create_writable.

> * in RW class, instead of checking for readable and writable in each an every
> method, its better if you do it just once in construction method.

Maybe I understand you wrong:

You stated on IRC that the preconditions are for the programmer. So if the programmer tries to extract some files from a writable-only archive he should get an assertion error in my opinion. I can't do anything about this in the constructor, can I?

I agree that there are few cases where we can avoid these checks (especially private methods, assuming that they are not erroneously called). I can remove them there.

> * Coding style is to have only space before '(', not after.

Ok.

> About your async issue, I can only know for sure whats happening after looking
> at the code but have you ensured that you change the caller of extract_files to
> either yield on the call or launch it async with 'begin' keyword after you make
> extract_files async?
> 
> As usual, the best way to figure such issues is to create a standalone code
> that is only aimed at reproducing your issue and hence making it very easy to
> debug (especially for others who might only be interested in helping with this
> particular problem but not motivated enough to go through the whole code).

I have tried various variations in the original and a standalone project.

I'll append the file in a minute.
Comment 6 Lasse Schuirmann 2014-05-18 11:55:40 UTC
Created attachment 276725 [details]
Standalone async issue

Output:

make run
valac main.vala --pkg glib-2.0 --pkg gio-unix-2.0
./main
Constructing AsyncTestClass...

(The destructor gets called if I don't call do_something in main)
Comment 7 Zeeshan Ali 2014-05-18 12:53:39 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > * When you move the wrapper to Boxes, I assume the namespace will be changed
> > from Utils to Boxes?
> 
> It seemed to me like a general utility, so Utils did make sense for me. However
> it doesn't make that difference so I'll go with your preference.

We do have a 'Util' namespace so you can use that instead.

> > * I think its more useful to have an extract_file than extract_files. That way
> > caller doesn't have to create a hashtable just for calling this function. If
> > caller has a hashtable already, it can simply look on its own or you can
> > provide a extract_files that does this.
> 
> This is a bit psychological: when extracting more thean one file the
> extract_files method is more performant than invoking an extract_file method
> twice. (You don't have to iterate through all the headers again.) So instead of
> letting the user iterate twice through the whole archive I enforce him to
> create this hashtable which makes it far more convenient to just invoke
> extract_files once.

Ah ok. I think you should rather take two arrays here, one for src and one for dest, where each src and dst pair has the same index in the array. That way caller doesn't need to create a hashtable and put elements into it etc. Also, it'll avoid unnecessary allocations.

Having said all that, in the general API (outside Boxes) you are intending to create would need an function to extract only one file. It could just be a simple wrapper around extract_files.

> > * the methods that deal with entries (rather than files) don't seem very
> > useful. Unless you have a particular use case in mind, I recommend not exposing
> > such API as 'public'. Remember, app developers appreciate if API is as slim
> > (and therefore simple) as possible.
> 
> This methods are only provided by the write archive and are necessary since the
> read archive needs access to this low level API in create_writable.

Why not have a constructor in WriteArchive that takes a ReadArchive instead so you don't need to provide this API in WriteArchive?

BTW, when referring to identifiers, try your best to user their names rather than description, e.g ReadArchive rather than 'read archive' as that could be a bit confusing.
 
> > * in RW class, instead of checking for readable and writable in each an every
> > method, its better if you do it just once in construction method.
> 
> Maybe I understand you wrong:
> 
> You stated on IRC that the preconditions are for the programmer. So if the
> programmer tries to extract some files from a writable-only archive he should
> get an assertion error in my opinion. I can't do anything about this in the
> constructor, can I?

You can have preconditions in constructor too (AFAIK). I was not talking about how to deal with them but rather where to deal with them.
Comment 8 Zeeshan Ali 2014-05-18 13:06:24 UTC
>class AsyncTestClass : GLib.Object {
>    public AsyncTestClass () {
>        stdout.printf ("Constructing AsyncTestClass...\n");
>        stdout.flush ();
>    }
>    ~AsyncTestClass () {
>        stdout.printf ("Destroying AsyncTestClass...\n");
>        stdout.flush ();
>    }
>    public async void do_something () {
>        yield;
>        return;
>    }

Here you are yielding to mainloop without setting up any means to come back from it so of course if you call this, the method will never return.

Do check run_in_thread function in util.vala for an example of how to wrap a sync call into a async wrapper. I think if you can't actually make the I/O of your API actually async, you should be making use of run_in_thread.

>}
>
>int main () {
>    var instance = new AsyncTestClass ();
>    instance.do_something.begin ();
>    yield;

This should be:

You can call 'yield' from a sync function. You want to create a MainLoop object here and use that. Are you sure you have actually read through examples of async in Vala? You really should.

I also recommend learning a bit about mainloop and async a bit. Apart from Gio manual, you might also want to read the recent blog posts from Philip on these: 

https://tecnocode.co.uk/2014/03/27/what-is-gmaincontext/
https://tecnocode.co.uk/2014/04/19/ensuring-functions-are-called-in-the-right-context/
Comment 9 Zeeshan Ali 2014-05-18 13:07:19 UTC
(In reply to comment #8)
>
> This should be:

NM this line. :)
 
> You can call 'yield' from a sync function.

can -> can't.
Comment 10 Lasse Schuirmann 2014-05-18 13:45:44 UTC
(In reply to comment #8)
> >class AsyncTestClass : GLib.Object {
> >    public AsyncTestClass () {
> >        stdout.printf ("Constructing AsyncTestClass...\n");
> >        stdout.flush ();
> >    }
> >    ~AsyncTestClass () {
> >        stdout.printf ("Destroying AsyncTestClass...\n");
> >        stdout.flush ();
> >    }
> >    public async void do_something () {
> >        yield;
> >        return;
> >    }
> 
> Here you are yielding to mainloop without setting up any means to come back
> from it so of course if you call this, the method will never return.
> 
> Do check run_in_thread function in util.vala for an example of how to wrap a
> sync call into a async wrapper. I think if you can't actually make the I/O of
> your API actually async, you should be making use of run_in_thread.
> 
> >}
> >
> >int main () {
> >    var instance = new AsyncTestClass ();
> >    instance.do_something.begin ();
> >    yield;
> 
> This should be:
> 
> You can call 'yield' from a sync function. You want to create a MainLoop object
> here and use that. Are you sure you have actually read through examples of
> async in Vala? You really should.
> 
> I also recommend learning a bit about mainloop and async a bit. Apart from Gio
> manual, you might also want to read the recent blog posts from Philip on these: 
> 
> https://tecnocode.co.uk/2014/03/27/what-is-gmaincontext/
> https://tecnocode.co.uk/2014/04/19/ensuring-functions-are-called-in-the-right-context/

Thanks. I did read through the examples but as you have seen this was not that successful. (And I uploaded in fact the wrong file but that's irrelevant.)

Thanks for the additional hints. I'll study them.
Comment 11 Zeeshan Ali 2014-05-19 10:19:04 UTC
(In reply to comment #10)
> (In reply to comment #8)
> > >class AsyncTestClass : GLib.Object {
> > >    public AsyncTestClass () {
> > >        stdout.printf ("Constructing AsyncTestClass...\n");
> > >        stdout.flush ();
> > >    }
> > >    ~AsyncTestClass () {
> > >        stdout.printf ("Destroying AsyncTestClass...\n");
> > >        stdout.flush ();
> > >    }
> > >    public async void do_something () {
> > >        yield;
> > >        return;
> > >    }
> > 
> > Here you are yielding to mainloop without setting up any means to come back
> > from it so of course if you call this, the method will never return.
> > 
> > Do check run_in_thread function in util.vala for an example of how to wrap a
> > sync call into a async wrapper. I think if you can't actually make the I/O of
> > your API actually async, you should be making use of run_in_thread.
> > 
> > >}
> > >
> > >int main () {
> > >    var instance = new AsyncTestClass ();
> > >    instance.do_something.begin ();
> > >    yield;
> > 
> > This should be:
> > 
> > You can call 'yield' from a sync function. You want to create a MainLoop object
> > here and use that. Are you sure you have actually read through examples of
> > async in Vala? You really should.
> > 
> > I also recommend learning a bit about mainloop and async a bit. Apart from Gio
> > manual, you might also want to read the recent blog posts from Philip on these: 
> > 
> > https://tecnocode.co.uk/2014/03/27/what-is-gmaincontext/
> > https://tecnocode.co.uk/2014/04/19/ensuring-functions-are-called-in-the-right-context/
> 
> Thanks. I did read through the examples but as you have seen this was not that
> successful. (And I uploaded in fact the wrong file but that's irrelevant.)
> 
> Thanks for the additional hints. I'll study them.

No worries. Please keep in mind that first iteration of the API (and Boxes patches) don't need to be async (just put a FIXME there about that). Lets focus on getting sync version in first.
Comment 12 Zeeshan Ali 2014-05-19 10:32:44 UTC
(In reply to comment #7)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > * in RW class, instead of checking for readable and writable in each an every
> > > method, its better if you do it just once in construction method.
> > 
> > Maybe I understand you wrong:
> > 
> > You stated on IRC that the preconditions are for the programmer. So if the
> > programmer tries to extract some files from a writable-only archive he should
> > get an assertion error in my opinion. I can't do anything about this in the
> > constructor, can I?
> 
> You can have preconditions in constructor too (AFAIK). I was not talking about
> how to deal with them but rather where to deal with them.

Thinking more about it, this is likely a runtime error rather than programmer (especially if you move it to constructor): How would a programmer (user of this API) know that if all the archive files are always writable? We throw and error in constructor and programmer can (and will) have a catch around the call to handle the case.

GIO can actually handle the throwing of appropriate error for you if you use GIO to open and/or read/write the file (which IMHO you should if possible, anyway). If you need API to read/write from/to file descriptors, see:

https://developer.gnome.org/gio/stable/GUnixInputStream.html
https://developer.gnome.org/gio/stable/GUnixOutputStream.html