After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 662376 - OnlineGlom: java-libglom: Wrap SqlBuilder and SqlExpr instead of working around it
OnlineGlom: java-libglom: Wrap SqlBuilder and SqlExpr instead of working arou...
Status: RESOLVED FIXED
Product: glom
Classification: Other
Component: OnlineGlom
git master
Other Linux
: Normal normal
: ---
Assigned To: Murray Cumming
Murray Cumming
Depends on:
Blocks:
 
 
Reported: 2011-10-21 08:25 UTC by Murray Cumming
Modified: 2011-12-08 20:39 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Murray Cumming 2011-10-21 08:25:21 UTC
java-libglom has some custom code that is apparently just to work around the fact that it doesn't wrap Gnome::Gda::SqlBuilder and Gnome::Gda::SqlExpr:
http://gitorious.org/online-glom/java-libglom/blobs/master/src/glom_db_utils.i

However, wrapping those types would:

a) Make the code simpler by avoiding the need for that custom code.

b) Remove the possibility of SQL injection caused by the possibility of Glom::Utils::sqlbuilder_get_full_query() not doing proper quoting or escaping, and avoiding the need for us to test that in addition to the regular SQL injection tests that I have recently added to Glom.
Comment 1 Ben Konrath 2011-10-25 13:28:34 UTC
That custom code is also avoiding Gnome::Gda::Value as seen here:

https://gitorious.org/online-glom/java-libglom/blobs/master/src/glom.i#line40

The reason I wanted avoid wrapping these classes is because they bring in a lot of extra dependencies which adds complexity on its own. For example Glib::RefPtr might be difficult to wrap. Of course, it's possible to selectively ignore parts of Gnome::Gda that are not needed but it's hard to predict exactly how difficult it will be.

(In reply to comment #0)
> a) Make the code simpler by avoiding the need for that custom code.

I agree that this is always a good goal.

> b) Remove the possibility of SQL injection caused by the possibility of
> Glom::Utils::sqlbuilder_get_full_query() not doing proper quoting or escaping,
> and avoiding the need for us to test that in addition to the regular SQL
> injection tests that I have recently added to Glom.

Are you suggesting moving from the JDBC for database access to using libgda in Online Glom? As I understand things, if wrappers for SqlBuilder, SqlExpr and Value are added, Glom::Utils::sqlbuilder_get_full_query() would still be needed to get the query to use with java.sql.Statement.executeQuery().
Comment 2 Murray Cumming 2011-10-25 18:52:44 UTC
(In reply to comment #1)
> That custom code is also avoiding Gnome::Gda::Value as seen here:
> 
> https://gitorious.org/online-glom/java-libglom/blobs/master/src/glom.i#line40

Yes, I see that you are hiding it behind a string instead:
https://gitorious.org/online-glom/java-libglom/blobs/master/src/glom_db_utils.i#line58

That's not very wise. It depends on probably-undocumented text representations of the various value types. It's something that we should fix properly eventually.

> The reason I wanted avoid wrapping these classes is because they bring in a lot
> of extra dependencies which adds complexity on its own. For example
> Glib::RefPtr might be difficult to wrap.

Luckily, I think you've solved that problem with sharedptr<>, which is much the same as RefPtr.

> Of course, it's possible to
> selectively ignore parts of Gnome::Gda that are not needed but it's hard to
> predict exactly how difficult it will be.

Yes, but I'd like us to try. Maybe I will do it.
 
> Are you suggesting moving from the JDBC for database access to using libgda in
> Online Glom?

I didn't mean to, but I guess I did suggest that. That wouldn't be wise, obviously.

> As I understand things, if wrappers for SqlBuilder, SqlExpr and
> Value are added, Glom::Utils::sqlbuilder_get_full_query() would still be needed
> to get the query to use with java.sql.Statement.executeQuery().

That's a good point. I guess I need to make sure that it works properly then, though I would still like to remove as many of these hacks as possible.
Comment 3 Ben Konrath 2011-10-26 07:38:10 UTC
(In reply to comment #2)
> Yes, I see that you are hiding it behind a string instead:
> https://gitorious.org/online-glom/java-libglom/blobs/master/src/glom_db_utils.i#line58
> 
> That's not very wise. It depends on probably-undocumented text representations
> of the various value types. It's something that we should fix properly
> eventually.

Ok the DataItem DTO to pass the primary key value from the client to the server and vice versa. The name of that class could actually be changed to Value to match Gda. There will still need to be a conversion from text when a user loads a details page from a bookmark so this is something to be aware of.

> > The reason I wanted avoid wrapping these classes is because they bring in a lot
> > of extra dependencies which adds complexity on its own. For example
> > Glib::RefPtr might be difficult to wrap.
> 
> Luckily, I think you've solved that problem with sharedptr<>, which is much the
> same as RefPtr.

Ah ok, yeah I wasn't sure about that. Then it shouldn't be to bad.

> > Of course, it's possible to
> > selectively ignore parts of Gnome::Gda that are not needed but it's hard to
> > predict exactly how difficult it will be.
> 
> Yes, but I'd like us to try. Maybe I will do it.

I did some stuff when I was investigating this so I'll commit it.
Comment 4 Ben Konrath 2011-10-26 13:59:41 UTC
Here's the commit for wrapping Gnome::Gda::Value:

https://gitorious.org/online-glom/java-libglom/commit/651dc7c507eb1d4e351bcb26a34a8b9dc734887c

I've only added to the libglom-1-18 branch because I'm setup with Glom master.
Comment 5 Ben Konrath 2011-10-26 22:04:25 UTC
Gnome:Gda classes should be put in their own java package. A method to create multiple packages is described at the end of this section:

http://swig.org/Doc2.0/SWIGDocumentation.html#Java_directors_typemaps
Comment 6 Murray Cumming 2011-11-04 21:45:26 UTC
> > Glom::Utils::sqlbuilder_get_full_query() would still be needed
> > to get the query to use with java.sql.Statement.executeQuery().
>
> That's a good point. I guess I need to make sure that it works properly then,
> though I would still like to remove as many of these hacks as possible.

This commit does that, but now that I think about it, you are maybe not using ConnectionPool, even indirectly, so maybe it won't be able to get the GdaConnection, so this might fail now.
http://git.gnome.org/browse/glom/commit/?id=a61a23dc0ecd83c8251ff34727597734bff0f25e

I'll try to make it work without the GdaConnection - feel free to revert that commit if necessary in the meantime.
Comment 7 Ben Konrath 2011-11-08 14:26:47 UTC
Just thinking out loud for a moment ... maybe we should consider using libgda for database access instead of java.sql / JDBC. For this to work, there needs to be a way to access the database with multiple threads concurrently. There should be a pool of connections to the database that can be reused so that the servlet doesn't have to create a new connection for each database request. The characteristics of the connection pool (number of connections, timeouts, etc) needs to be configurable so that sysadmins can configure the servlet for various loads and memory. Is this currently possible with libgda and/or libglom?

Using libgda / libglom for database access would probably be less efficient than using the Java solution but it would allow the the servlet to access the database in the same way that Glom does. This could allow more common code to be pushed into libglom.

Anyway, this is just an idea. Maybe we can discuss this further in a separate bug since it seems to be beyond the scope of this bug. I'll concentrate on removing the custom code in glom_db_utils.i.
Comment 8 Murray Cumming 2011-11-08 14:42:36 UTC
Overall, I think I'd like to use Java's database API to 
a) Make sure we can use whatever things that allows, such as data-bound widgets, connection pooling, and correct handling of threading issues.
b) Avoid having to do that in libgda.

I am concerned that my latest changes require you to make a database connection via libglom even though you don't directly use it, just so libglom can use the correct SQL syntax. Have you noticed that? I think I need to find a way to avoid that.
Comment 9 Ben Konrath 2011-11-09 09:55:05 UTC
Yeah, I agree that using the Java APIs for the database access is good. I just wanted to make sure we're exploring all of the options. :-)

I haven't used the latest code yet. But when I hit the problem I'll just revert that commit. Thanks for the heads up.
Comment 10 Ben Konrath 2011-11-09 20:33:17 UTC
Ok, so I just tried Online Glom with the latest from master. The change you made is working with the Debian Repository Analyzer, I just get this error message on every query:

static Glom::sharedptr<Glom::SharedConnection> Glom::ConnectionPool::get_and_connect(): m_backend is null.
Glib::RefPtr<Gnome::Gda::Connection> Glom::get_connection(): No connection yet.
std::string Glom::Utils::sqlbuilder_get_full_query(const Glib::RefPtr<const Gnome::Gda::SqlBuilder>&): There is no connection, so the SQL statement might not be created correctly.

Should I be concerned about these error messages? Do you know if it will work in all cases?
Comment 11 Murray Cumming 2011-11-09 20:39:55 UTC
(In reply to comment #10)
> Ok, so I just tried Online Glom with the latest from master. The change you
> made is working with the Debian Repository Analyzer, I just get this error
> message on every query:
> 
> static Glom::sharedptr<Glom::SharedConnection>
> Glom::ConnectionPool::get_and_connect(): m_backend is null.
> Glib::RefPtr<Gnome::Gda::Connection> Glom::get_connection(): No connection yet.
> std::string Glom::Utils::sqlbuilder_get_full_query(const Glib::RefPtr<const
> Gnome::Gda::SqlBuilder>&): There is no connection, so the SQL statement might
> not be created correctly.
> 
> Should I be concerned about these error messages? Do you know if it will work
> in all cases?

I'm glad that that's all you hit. There are other utility functions that really need the connection now, but maybe you don't use them.

It will still work, but it will not quote things properly, so table names and fields might not work properly if there are uppercase letters. And there might be a risk of SQL injection. But that's just the same as before.

I'm talking to the libgda maintainer about making this work without an actually-open libgda connection, then everything will be proper and correct.
Comment 12 Murray Cumming 2011-11-14 07:57:11 UTC
(In reply to comment #11)
> I'm talking to the libgda maintainer about making this work without an
> actually-open libgda connection, then everything will be proper and correct.

That is working with libgda and glom from git master. This test shows what you should do:
http://git.gnome.org/browse/glom/tree/tests/test_fake_connection.cc#n66

Hopefully there will be a libgda release soonish so I can get this all into packages.
Comment 13 Ben Konrath 2011-11-14 08:02:35 UTC
(In reply to comment #12)
> Hopefully there will be a libgda release soonish so I can get this all into
> packages.

No need to rush for on my account; I'm setup with a nice jhbuild environment.
Comment 14 Ben Konrath 2011-11-21 11:52:28 UTC
I've got Gnome::Gda::SqlBuilder and Glom:Utils wrapped. Before I commit the changes, I want to make sure that everything is working so I've started to port this test to Java: 

http://git.gnome.org/browse/glom/tree/tests/test_fake_connection.cc#n66

It seems that I need to also wrap the connection pool classes to be able to set the fake connection but they are not part of the libglom's public API (at least the header files aren't installed). Since we're not going to be using the connection pool classes in Online Glom, maybe a utility method to set the fake connection can be added to libglom. Presumably this only has to be set once, right? This also avoids dealing with std::auto_ptr in SWIG. Does this make sense?
Comment 15 Murray Cumming 2011-11-21 12:27:06 UTC
(In reply to comment #14)
> It seems that I need to also wrap the connection pool classes to be able to set
> the fake connection but they are not part of the libglom's public API (at least
> the header files aren't installed). Since we're not going to be using the
> connection pool classes in Online Glom, maybe a utility method to set the fake
> connection can be added to libglom.

Sorry. I didn't realise that ConnectionPool wasn't public API, and I'm happy to keep that non-public. Here is a function for you to use:
http://git.gnome.org/browse/glom/commit/?id=3cc85313bb90fe8d59900f1868bc63afdeda0d8f

> Presumably this only has to be set once,
> right?

Yes.

> This also avoids dealing with std::auto_ptr in SWIG. Does this make
> sense?

Yes.
Comment 16 Ben Konrath 2011-11-24 12:14:36 UTC
I get this error message when I call the set_fake_connection() method from Java:

Glom::Document* Glom::ConnectionPool::get_document(): m_slot_get_document is null.

Is this supposed to happen?
Comment 17 Murray Cumming 2011-11-24 12:24:39 UTC
I wouldn't worry about it. But I'll make sure that it doesn't happen. It's been polluting my test output too.
Comment 18 Ben Konrath 2011-11-25 13:17:00 UTC
Ok, this is mostly done in both java-libglom and gwt-glom. The only remaining issue to sort out is how to properly protect against an sql injection when loading the details view from a URL string (link or bookmark) with string primary key value. For example, this page:

http://bagu.org:8080/OnlineGlom/#details:document=debian_repository_analyzer&table=packages&value=libgcc1

The "libgcc1" string is currently being converted into a Gda Value (using the Value(String) constructor) before it is used in the Glom::Utils build_sql methods. What else do I need to do to ensure this isn't an SQL injection attack vector?
Comment 19 Murray Cumming 2011-11-25 20:42:16 UTC
(In reply to comment #18)
> how to properly protect against an sql injection when
> loading the details view from a URL string (link or bookmark) with string
> primary key value.
[snip]
> The "libgcc1" string is currently being converted into a Gda Value (using the
> Value(String) constructor) before it is used in the Glom::Utils build_sql
> methods.

Firstly, you can trust the libglom API (because it uses GdaSqlBuilder) to take care of escaping and quoting for you, so you should never need to check the string for SQL injection.

However, do be careful that most other IDs are actually numeric, and you should probably convert the URL parameter string to a numeric type in that case, after you have discovered the type of the specified field. If this is working with a string Gda Value then you have just been lucky so far - that arguably shouldn't work.
Comment 20 Ben Konrath 2011-11-25 20:53:23 UTC
(In reply to comment #19) 
> Firstly, you can trust the libglom API (because it uses GdaSqlBuilder) to take
> care of escaping and quoting for you, so you should never need to check the
> string for SQL injection.

OK.

> However, do be careful that most other IDs are actually numeric, and you should
> probably convert the URL parameter string to a numeric type in that case, after
> you have discovered the type of the specified field. If this is working with a
> string Gda Value then you have just been lucky so far - that arguably shouldn't
> work.

It first tries to convert the string to a numeric type and only uses a string type if the numeric conversion fails. This assumption is doubled checked by creating the Gda Value based on the type from the Glom document.

I think that covers things then.
Comment 21 Murray Cumming 2011-11-25 20:58:46 UTC
(In reply to comment #20)
> It first tries to convert the string to a numeric type and only uses a string
> type if the numeric conversion fails. This assumption is doubled checked by
> creating the Gda Value based on the type from the Glom document.

So it only tries to create a numeric type if the field is numeric?

Note that, even though a fallback to a string for a numeric type should not be useful, glom's tests do check for SQL injections via unexpected value types.
Comment 22 Ben Konrath 2011-11-25 21:46:26 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > It first tries to convert the string to a numeric type and only uses a string
> > type if the numeric conversion fails. This assumption is doubled checked by
> > creating the Gda Value based on the type from the Glom document.
> 
> So it only tries to create a numeric type if the field is numeric?

No, there's no way to tell the type without a method call to the server. I didn't want to add a specific call to the server for this because it would increase page load times. And there's no easy way do it with the way the Places framework is setup.

Here's how it works: the type is guessed based on if the string-to-numeric conversion works or not: 

https://gitorious.org/online-glom/gwt-glom/blobs/master/src/main/java/org/glom/web/client/place/DetailsPlace.java#line117

If the conversion works, the PrimaryKeyItem will be a numeric type. If the conversion doesn't work, the PrimaryKeyItem will be a string type. But if that conversion was set to the wrong type, the value will be ignored when it's converted to a Gda Value: 

https://gitorious.org/online-glom/gwt-glom/blobs/master/src/main/java/org/glom/web/server/Utils.java#line52

Note: I just committed a small a change to get this behaviour.

Does this look OK?
Comment 23 Murray Cumming 2011-11-26 09:46:43 UTC
> If the conversion works, the PrimaryKeyItem will be a numeric type. If the
> conversion doesn't work, the PrimaryKeyItem will be a string type

Does this mean, for instance, that in the same table for the same field, an ID of "abc" will be interpreted as "abc", but "123" will be interpreted as 123, and therefore the SQL query will fail?

For instance, maybe there are package IDs in that repository database that have IDs that would convert to a number.

Note also that the string-to-number conversion is locale dependent. That's just another reason to avoid it.
Comment 24 Ben Konrath 2011-11-26 17:12:33 UTC
(In reply to comment #23)
> > If the conversion works, the PrimaryKeyItem will be a numeric type. If the
> > conversion doesn't work, the PrimaryKeyItem will be a string type
> 
> Does this mean, for instance, that in the same table for the same field, an ID
> of "abc" will be interpreted as "abc", but "123" will be interpreted as 123,
> and therefore the SQL query will fail?

Yeah, this will fail loading from a bookmark or link. Good point. I should note that this is only about loading from a URL bookmark or link. Users navigating with the buttons within Online Glom won't see this problem because the URL string is not parsed with each navigation event. The URL is only a string representation of a bookmarkable place that is only used when Online Glom is first started.

I can add type information to the URL string or get the type information from the server before loading the page. Which do think is better? 

Personally, I think adding the type information to the URL string would be the easiest. The actual query would still use the type information from the glom document but parsing the primary key value would use the type from the URL. If the type from the URL string were to be edited, there would be a type mis-match in the server-side code and value would be thrown out. In this case, the first record from the default query of the details view would be displayed. What do you think?

> Note also that the string-to-number conversion is locale dependent. That's just
> another reason to avoid it.

Since a bookmark or link URL is stored as string, this conversion is hard to avoid. Unless you can think of a way to identify a specific record in the details view using a string without using the primary key value. I haven't been able to come up with anything but I'm definitely open to suggestions.

As for the string-to-number conversion being locale dependent, since Online Glom is doing the number-to-string, the string-to-number conversion should be ok. Neither conversion is using locale specific conversion methods.

Obviously it's possible for a user to edit the URL string to add whatever they want so more validation needs to be added:

* If the primary key value isn't parsed correctly, it can be ignored and the first record from the default query of the details view should be displayed.
* If the document isn't found, the document selection page should be loaded.
* If the URL isn't in a recognized format, the document selection page should be loaded.
* etc.

Note that the default table is already loaded when a table has not been found in a given document.

We could also just avoid having the primary key value represented as a URL string and not have details records specifically bookmarkable (only the document and table would be saved). Of course we'd loose this feature which seems useful.
Comment 25 Murray Cumming 2011-11-26 17:26:55 UTC
(In reply to comment #24)
> I can add type information to the URL string or get the type information from
> the server before loading the page. Which do think is better? 

Getting the type from libglom (I don't know what you mean by server - database server or web server) is the only way that would be correct.

> If
> the type from the URL string were to be edited, there would be a type mis-match

Generally it's fine for us to reject what cannot work, and showing the first record instead is fine for now if that happens.

> Since a bookmark or link URL is stored as string, this conversion is hard to
> avoid.

Well, please make sure that the conversion is independent of whatever locale might be set for the linux user on the server or the locale used by the client's browser. Hopefully there is a way to use standard ISO format.

> since Online
> Glom is doing the number-to-string, the string-to-number conversion should be
> ok.

Well, it depends on what part of the libglom API you are using, and if it's doing locale-dependent conversion then it depends on the locale being initialized and depends on what locale the linux user on the server has. It would be best to avoid the ambiguity.

> We could also just avoid having the primary key value represented as a URL
> string and not have details records specifically bookmarkable (only the
> document and table would be saved). Of course we'd loose this feature which
> seems useful.

No, I think it's an essential feature and nothing should be too hard about it.
Comment 26 Ben Konrath 2011-11-28 16:04:13 UTC
This is now fixed. There are a couple of commits after this one that fix a couple of small problems but this is the main commit:

https://gitorious.org/online-glom/gwt-glom/commit/e8e3bdcee81e4f468e79342fa4826f60e1cf63af
Comment 27 Ben Konrath 2011-12-01 15:26:34 UTC
I now have these error message when running with glom-1-20:

Glom::sharedptr<Glom::SharedConnection> Glom::ConnectionPool::connect(): update_meta_store_data_types() failed: Connection is closed
Glom::sharedptr<Glom::SharedConnection> Glom::ConnectionPool::connect(): update_meta_store_table_names() failed: Connection is closed

Is this a problem?
Comment 28 Murray Cumming 2011-12-01 15:43:56 UTC
Is that with a fairly recent libgda?
Comment 29 Ben Konrath 2011-12-01 16:20:30 UTC
I'm using libgda 5.0.2.
Comment 30 Murray Cumming 2011-12-02 09:55:29 UTC
Do you see those warnings during the java-libglom test?
Comment 31 Ben Konrath 2011-12-02 12:38:12 UTC
Yes, with FakeConnectionTest.
Comment 32 Murray Cumming 2011-12-02 12:41:27 UTC
I don't see that during java-libglom's make check, though I am using libgda from git master.

Anyway, I don't think it's a big issue.
Comment 33 Ben Konrath 2011-12-02 13:24:12 UTC
I get the error with libgda master as well. But I'll just ignore it for now. Thanks.
Comment 34 Murray Cumming 2011-12-08 20:39:41 UTC
Those warnings are fixed by this commit:
http://git.gnome.org/browse/glom/commit/?id=f63c2146b3a579eb5056382c898d33186724bb10

I guess that libgda might not be so able to complain or adapt if it does not know the real table structure, based on that meta data, so it would depend on us giving the correct GValue types. But OnlineGlom could only be given false field types via the .glom XML file, not via the browser user, so the risk seems minimal.