GNOME Bugzilla – Bug 730950
Transaction support
Last modified: 2014-11-05 11:54:08 UTC
At the moment, when I try to INSERT lots of entries in my SQLite DB with gom, things take a long time. This is because each request is run in its own transaction. It would be great if gom had an API to allow adding/updating/deleting lots of items at once. I'd even be willing to give a try at implementing it, as this is pretty much a blocker for me. :) Here's an idea for what the API could look like. GomAdapter *adapter; ItemResource *resource1, *resource2; // Create a new transaction GomTransaction *transaction = gom_transaction_new(); // Add some resources to the transaction resource1 = g_object_new(...); transaction_add_resource(transaction, resource1, &error); resource2 = g_object_new(...); transaction_add_resource(transaction, resource2, &error); // If everything went well transaction_commit(transaction, &error); // If something went wrong transaction_rollback(transaction, &error); What do you think?
You indeed have found the Tao of SQLite :) Transactions and difficult to get "right" (for some definition of right), so until we could get there, I've used https://git.gnome.org/browse/gom/tree/gom/gom-adapter.h#n78 (gom_adapter_queue_write) which will give you a callback in the SQLite thread. You can do everything synchronous from there, including BEGIN/COMMIT. But yes, this is pretty much a hack at the moment. We really need to think through the transaction semantics since someday we'd like to support more than SQLite. My instinct tells me we should have an interface that is shared between the repository and the transaction itself. This would allow us to potentially set the "repository" field on an object (or better name perhaps), and still use the gom_resource_* async functions. Thoughts?
> (gom_adapter_queue_write) which will give you a callback in the SQLite thread. > You can do everything synchronous from there, including BEGIN/COMMIT. I actually tried using it, but didn't really succeed. > My instinct tells me we should have an > interface that is shared between the repository and the transaction itself. > This would allow us to potentially set the "repository" field on an object (or > better name perhaps), and still use the gom_resource_* async functions. Yeah, with the proposal I made, it's not clear what happens when there is an existing transaction and at the same time the application calls gom_resource_save_sync (for example) I'm also not entirely sure what should happen in a multi-threaded app: each thread has its own transaction? Can multiple threads access the same transaction? Etc...
(In reply to comment #2) > Yeah, with the proposal I made, it's not clear what happens when there is an > existing transaction and at the same time the application calls > gom_resource_save_sync (for example) Yeah, we'll need a planning session to figure this stuff out. > I'm also not entirely sure what should happen in a multi-threaded app: each > thread has its own transaction? Can multiple threads access the same > transaction? Etc... I really don't think you want a transaction to live between multiple threads. If we were writing a commercial database with MVCC, we might need that. Certainly not for general purpose desktop/mobile/phone applications.
Discussing this at GUADEC, we thought that we should add something similar to GomResourceGroup, but for writes (as opposed to read queries). Then one would call something like gom_writes_group_save() and it would do all this within a single transaction. Finding a name for this "write" object is probably the hardest part of implementing this though ;)
Created attachment 286744 [details] [review] gom: Export gom_resource_do_save() internally So it can be used for saving when already in worker thread.
Created attachment 286745 [details] [review] gom: Add support for batches inserts XXX: Missing async version
Created attachment 286746 [details] [review] tests: Add test for batched writes
Created attachment 286754 [details] [review] gom: Add support for batched writes
Review of attachment 286754 [details] [review]: Pretty sure we want to allow various types of subclasses to co-exist within the resource-group, but other than that, LGTM. ::: gom/gom-resource-group.c @@ -58,0 +64,11 @@ +GomResourceGroup * +gom_resource_group_new (GomRepository *repository, + GType resource_type) ... 8 more ... Part of me wishes this was an "is-writable" construct param, so that _constructed() could know about it in the future. Not sure it's worth the effort now though. @@ -58,0 +64,21 @@ +GomResourceGroup * +gom_resource_group_new (GomRepository *repository, + GType resource_type) ... 18 more ... g_type_is_a() so that this can work with sub-classes? @@ -58,0 +64,67 @@ +GomResourceGroup * +gom_resource_group_new (GomRepository *repository, + GType resource_type) ... 64 more ... This was a bit deceiving at first, but looks right. Iterate through all items, freeing each as we iterate to avoid a second _list_foreach() to unref, but only execute do_save() until we've failed. Although, we do a list_free() at the end anyway, wish there was a g_list_free_with_destroy().
(In reply to comment #9) > Review of attachment 286754 [details] [review]: > > Pretty sure we want to allow various types of subclasses to co-exist within the > resource-group, but other than that, LGTM. Right, good idea. > ::: gom/gom-resource-group.c > @@ -58,0 +64,11 @@ > +GomResourceGroup * > +gom_resource_group_new (GomRepository *repository, > + GType resource_type) > ... 8 more ... > > Part of me wishes this was an "is-writable" construct param, so that > _constructed() could know about it in the future. Not sure it's worth the > effort now though. Done. > @@ -58,0 +64,21 @@ > +GomResourceGroup * > +gom_resource_group_new (GomRepository *repository, > + GType resource_type) > ... 18 more ... > > g_type_is_a() so that this can work with sub-classes? I've removed that. > @@ -58,0 +64,67 @@ > +GomResourceGroup * > +gom_resource_group_new (GomRepository *repository, > + GType resource_type) > ... 64 more ... > > This was a bit deceiving at first, but looks right. This is to avoid half the list having been processed if we encounter an error, so half the items have been freed after the loop. > Iterate through all items, > freeing each as we iterate to avoid a second _list_foreach() to unref, but only > execute do_save() until we've failed. > > Although, we do a list_free() at the end anyway, wish there was a > g_list_free_with_destroy(). It would probably look nicer with a GptrArray. What do you think?
Created attachment 286756 [details] [review] gom: Add support for batched writes
Created attachment 286757 [details] [review] tests: Add test for batched writes
(In reply to comment #10) > It would probably look nicer with a GptrArray. What do you think? Yeah let's do that. Especially since it only ever prepend/reverse the list.
Created attachment 286759 [details] [review] gom: Add support for batched writes
This has been pushed, but I think that the performance gains aren't that interested when compared to before we landed the very necessary patches from bug 728301. I'll close this as fixed, and we can investigate how long specific parts of the code take.
Yeah, pretty useless actually: /GomRepository/stress: ** Message: writing items sync took 5.998153 OK /GomRepository/stress2: ** Message: preparing batch took 1.872136 ** Message: writing batch took 4.485591 OK
When using real files instead of the memory DBs: /GomRepository/stress: ** Message: writing items sync took 13.392836 OK /GomRepository/stress2: ** Message: preparing batch took 1.857372 ** Message: writing batch took 4.632101 OK So it seems to work fine.