Bug 623891 – last_insert_rowid's performance is bad.

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 623891 - last_insert_rowid's performance is bad.


Summary:	last_insert_rowid's performance is bad.


Status:	RESOLVED OBSOLETE

Product:	libgda
Classification:	Other
Component:	SQLite provider
Version:	4.1.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	malerba
QA Contact:	gnome-db Maintainers

URL:
Whiteboard:

Depends on:
Blocks:	565773

Reported:	2010-07-08 22:56 UTC by Massimo Cora'
Modified:	2018-09-21 13:41 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Massimo Cora' 2010-07-08 22:56:10 UTC

After so much benchmarking and hacking on symbol-db population flow trying to understand why it was so slow, I probably found the reason.

last_insert_rowid information retrieved with a gda_connection_statement_execute_non_select () statement is the bottle neck.

When libgda retrieves last_insert_rowid information it becomes 10x times slower than SQLite native apis! 7.70 vs 0.80

When the sqlite3_last_insert_rowid (db) is used on sqlite side it's performance grows up only for some milliseconds, but outperforms libgda in any case.

Please check the benchmark programs here 
http://git.gnome.org/browse/anjuta/tree/plugins/symbol-db/benchmark/libgda?h=sdb-core-trans
and 
http://git.gnome.org/browse/anjuta/tree/plugins/symbol-db/benchmark/sqlite?h=sdb-core-trans

to have some tests.
you don't need to compile Anjuta to have them working, you can just compile them out of the box.

Comment 1 malerba 2010-07-10 20:08:29 UTC

I've committed some changes to improve performances, but it only goes down to 5.5s instead of 8, which still leaves a big gap...
However last_insert_rowid in Libgda actually retreives a complete row of data, which is more than only getting a rowid, so I guess the SQlite test should also include retreiving the complete row.

Anyway, Libgda can't compete on the raw performances with SQLite so my advice would either be to use SQLite directly (but you'll the ability to use another dbms in the future, if you ever wanted to and you'll use the other benefits of Libgda if you use them), and also I don't think it would solve the performances problem completely because you'll always find a project with more symbols and thus a longer time to insert into a db; so IMHO you'll still have, at one point or another, to use one or more worker threads to actually do the job (if this is already the case, please forget this).

Comment 2 malerba 2010-07-10 20:09:12 UTC

Committed to git/master BTW. I'll commit the changes to LIBGDA_4.0 before the next release.

Comment 3 Massimo Cora' 2010-07-12 14:23:09 UTC

(In reply to comment #1)
> I've committed some changes to improve performances, but it only goes down to
> 5.5s instead of 8, which still leaves a big gap...
> However last_insert_rowid in Libgda actually retreives a complete row of data,
> which is more than only getting a rowid, so I guess the SQlite test should also
> include retreiving the complete row.
> 
> Anyway, Libgda can't compete on the raw performances with SQLite so my advice
> would either be to use SQLite directly (but you'll the ability to use another
> dbms in the future, if you ever wanted to and you'll use the other benefits of
> Libgda if you use them), and also I don't think it would solve the performances
> problem completely because you'll always find a project with more symbols and
> thus a longer time to insert into a db; so IMHO you'll still have, at one point
> or another, to use one or more worker threads to actually do the job (if this
> is already the case, please forget this).

thanks for the patch. Now it goes a little bit faster.
I asked to the other Anjuta devs whether they wanted to switch to pure SQLite apis but I stil haven't a answer. I suppose that for them it's ok to use libgda in the core.
We're using libgda master so there's no hurry to committing in LIBGDA_4.0.
I'm wondering anyway why last_insert GdaSet keeps track of all the info just inserted in a table. Wouldn't it be better to just have an id? On my experience I hardly found a case where I had to use the data already inserted.
Maybe this could be a feature request for libgda next version: let the user choice whether to have only an id (fastest method) or having also additional data.
With this feature we can keep libgda without problems on the core because it would automatically become faster than it's now, even if it has a little (but it's ok) overhead due to gobject architecture.

Comment 4 Massimo Cora' 2010-07-12 14:24:00 UTC

feel free to close the bug, but please consider the feature request on comment #3.

Comment 5 malerba 2010-07-12 17:01:10 UTC

(In reply to comment #3)
> 
> thanks for the patch. Now it goes a little bit faster.
> I asked to the other Anjuta devs whether they wanted to switch to pure SQLite
> apis but I stil haven't a answer. I suppose that for them it's ok to use libgda
> in the core.
> We're using libgda master so there's no hurry to committing in LIBGDA_4.0.
> I'm wondering anyway why last_insert GdaSet keeps track of all the info just
> inserted in a table. Wouldn't it be better to just have an id? On my experience
> I hardly found a case where I had to use the data already inserted.
> Maybe this could be a feature request for libgda next version: let the user
> choice whether to have only an id (fastest method) or having also additional
> data.

The problem is once you've got an ID, how do you go to fetch the actual value? SQLite's got a rowID, postgreSQL's got one sometimes, for MySQL there is something slightly different, hence the current API which returns something useable but at a performance cost.

But I agree something should be done to improve performances, so I think it'd be better to:
* actually execute the SELECT statement only if needed
* subclass the GdaSet returned
* allow the user to specify which column of data should be fetched.

I'll add this to the TODO list.

Comment 6 Massimo Cora' 2010-07-12 18:34:24 UTC

Oh well, you're right, there are different dbs to manage...

Thanks in advance anyway

Comment 7 Massimo Cora' 2011-01-09 22:55:09 UTC

Hi Vivien,

have you scheduled some TODO improvements for this bug?
thanks

Comment 8 GNOME Infrastructure Team 2018-09-21 13:41:18 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/libgda/issues/49.