GNOME Bugzilla – Bug 721340
Set better names on places returned by forward search
Last modified: 2019-03-20 10:38:19 UTC
Created attachment 265135 [details] Searching for high street with current master I've been bothered by the results returned when searching for a common street name in London. I got results that was hard to distinguish from each others. See the attached image for what I mean. I got to debugging a bit to see what the data from Nominatim was for each item. Data to separate the results from each other is there, but not used by Geocode. Or rather the code filters them out. I will attach a patch that changes how the attributes are filtered in or out. So that results will have the general form of: [name], [suburb], [county], [postcode], [state], [country] If the attributes are available. Also, I will post a patch that fixes a typo that caused postcode never to be included. See image attached soon, of how the same search looks with the patch. The way geocode decides what attributes to include in the name attribute of GeocodePlace is still a bit shaky I feel. Maybe the best approach would just be to set a simple name, e.g. "High Street". And let the application that uses it construct the full name for it self using GeocodePlace attributes. Or maybe add some kind of "simple name" attribute to GeocodePlace for the same reason.
Created attachment 265136 [details] Searching for high street with patch below.
Created attachment 265137 [details] [review] forward: fix typo in nominatim result attribute The check for attribute 'postalcode' in Nominatim search results is wrong. The correct attribute is 'postcode'.
Created attachment 265138 [details] [review] forward: include more attributes in results Make sure to include information enough to distinguish between different search results. Set name on GeocodePlace to: [name], [suburb], [county], [postcode], [state], [country] If the attributes are available.
Created attachment 265258 [details] [review] forward: fix while cond. in insert_place_into_tree It is ok for child->data to be NULL since we insert dummy values of NULL, but it needs to be not-NULL for strcmp check.
Last one fixes cases where for example postcode was not included since it was NULL in some other place.
Created attachment 265261 [details] [review] forward: fix while cond. in insert_place_into_tree It is ok for child->data to be NULL since we insert dummy values of NULL, but it needs to be not-NULL for strcmp check.
Created attachment 265264 [details] [review] forward: fix while cond. in insert_place_into_tree It is ok for child->data to be NULL since we insert dummy values of NULL, but it needs to be not-NULL for strcmp check.
Created attachment 265265 [details] [review] Sorry, got my branches mixed up. forward: fix while cond. in insert_place_into_tree It is ok for child->data to be NULL since we insert dummy values of NULL, but it needs to be not-NULL for strcmp check.
Created attachment 265298 [details] [review] test-gcglib: fix up names to check against So when we change the way we construct the name, we need to change the way we check for them as well. This test did not work on master either. For instance while debugging I couldn't see that France was set as country at all. So that did not work before the patches above. Put the rest seem mandated.
Created attachment 265299 [details] [review] So when we change the way we construct the name, we need to change the way we check for them as well. This test did not work on master either. For instance while debugging I couldn't see that France was set as country at all. So that did not work before the patches above. Put the rest seem mandated by above patches --- test-gcglib: fix up names to check against Switching the way we put together the name of places changes how we must check for them.
Created attachment 265300 [details] [review] Also, this was needed for me to have the test pass. I don't know Czech good enough to know if it's correct. test-gcglib: Fix check against Moscow name
Review of attachment 265137 [details] [review]: ack
Review of attachment 265265 [details] [review]: looks right
Review of attachment 265138 [details] [review]: Other than my comments, looks fine to me but since this changes some code from Bastien that I still haven't completely grasped, I would like him to have a look. ::: geocode-glib/geocode-forward.c @@ -666,3 @@ "state", - "county", - "state_district", why remove this attribute? @@ -671,2 @@ "suburb", - "village", and this?
Review of attachment 265300 [details] [review]: ::: geocode-glib/test-gcglib.c @@ +341,3 @@ place = res->data; + g_assert_cmpstr (geocode_place_get_name (place), ==, "Москва, Ruská federace"); + g_assert_cmpstr (geocode_place_get_state (place), ==, "Москва"); That is not correct! The name of Moscow is "Moskva" in cs_CZ and I corrected this in OSM database. Maybe somebody messsed it up again. Anyway needs to be corrected in OSM.
The main goal of the "name" is to include the minimal amount of information necessary to differentiate the search results. So, in your example, if all the results are in "England, United Kingdom" that part of the name shouldn't appear. In this particular case, the only thing missing seems to be the postcode.
Review of attachment 265138 [details] [review]: ::: geocode-glib/geocode-forward.c @@ -890,3 @@ - - /* If there are other attributes with a different value, - * add those attributes to the string to differentiate them */ Removing this is definitely wrong. We want the search results to be as short as possible, while still allowing differentiation. So if I look for Paris, I don't get *all* the details for Paris, TX with the zipcode, etc. just "Paris, TX, US" and "Paris, France".
Review of attachment 265299 [details] [review]: ::: geocode-glib/test-gcglib.c @@ +247,3 @@ else if (g_strcmp0 (geocode_place_get_state (place), "Texas") == 0 && g_strcmp0 (geocode_place_get_country (place), "United States of America") == 0 && + g_strcmp0 (geocode_place_get_name (place), "Paris, Lamar County, Texas, United States of America") == 0 && That's definitely wrong. There's only one Paris in Texas, so we shouldn't need to include the county name. Ditto for the other test cases.
Please also make sure to use test-gcglib for testing ("./test-gcglib Paris" will show the results for paris). For Paris, we get things like: Paris, Ile-de-France @ 48.856506, 2.352133 Paris, Arkansas @ 35.286876, -93.735488 etc. As "Paris, Ile-de-France" is the only one in France, we probably should show "Paris, France" instead for that result.
(In reply to comment #16) > The main goal of the "name" is to include the minimal amount of information > necessary to differentiate the search results. So, in your example, if all the > results are in "England, United Kingdom" that part of the name shouldn't > appear. In this particular case, the only thing missing seems to be the > postcode. Yes, thank you. Now I understand the code better. The postcode is not the only thing missing tho. I was still bothered by the result while looking at the raw data and I think I found the reason. The way the code treats missing attributes. We add a dummy value of NULL and make that the start of the next attribute. This means that the children of that dummy node can never have siblings. I will post a patch that deals with that. And it will make the tests work (a part for Moscow, which needs OSM fix as Zeeshan said). The tests for Bonneville and Paris did not work on master either. The attributes given from Nominatim are as follows: Bonneville: state=Rhône-Alpes county=Bonneville city=Bonneville Paris: state=Ile-de-France county=Paris name: Paris Country is not set in either of them. And for Paris I needed to add the number of results in order for it to differentiate on state. Will post pics of the high street example with the new patch, as well as the patch itself.
Created attachment 265518 [details] Searching for high street with patch: equality for missing attributes in result.
Created attachment 265519 [details] [review] forward: equality for missing attributes in result The function insert_place_into_tree creates a GNode based tree that is built up by the search results attributes. The idea is that if places share attributes they share a path and we can later minimize what we include in the name. This patch fixes a bug where missing attributes where inserted as completly new nodes, creating subtrees that could never get siblings. It also fixes up the gclib test.
Created attachment 265520 [details] [review] forward: equality for missing attributes in result The function insert_place_into_tree creates a GNode based tree that is built up by the search results attributes. The idea is that if places share attributes they share a path and we can later minimize what we include in the name. This patch fixes a bug where missing attributes where inserted as completly new nodes, creating subtrees that could never get siblings. It also fixes up the gclib test.
(In reply to comment #20) <snip> > The tests for Bonneville and Paris did not work on master either. > > The attributes given from Nominatim are as follows: > Bonneville: > state=Rhône-Alpes > county=Bonneville > city=Bonneville > > > Paris: > state=Ile-de-France > county=Paris > name: Paris > > Country is not set in either of them. And for Paris I needed to add the number > of results in order for it to differentiate on state. Zeeshan can help you fix that in the Nominatim database.
Review of attachment 265520 [details] [review]: Much better, but I'd rather not take the test-gcglib.c changes and have them fixed in the upstream database.
Created attachment 266151 [details] [review] forward: equality for missing attributes in result The function insert_place_into_tree creates a GNode based tree that is built up by the search results attributes. The idea is that if places share attributes they share a path and we can later minimize what we include in the name. This patch fixes a bug where missing attributes where inserted as completly new nodes, creating subtrees that could never get siblings.
(In reply to comment #25) > Review of attachment 265520 [details] [review]: > > Much better, but I'd rather not take the test-gcglib.c changes and have them > fixed in the upstream database. Ok! Zeeshan, maybe you can teach me the way to edit these things correctly at FOSDEM?
(In reply to comment #27) > (In reply to comment #25) > > Review of attachment 265520 [details] [review] [details]: > > > > Much better, but I'd rather not take the test-gcglib.c changes and have them > > fixed in the upstream database. > > Ok! > > Zeeshan, maybe you can teach me the way to edit these things correctly at > FOSDEM? I am really no expert but I can show you the tools I've used in the past to edit stuff. If I understood correctly, you shouldn't need to explicitly set a country in the db but containment of the area in another should be enough. The tools I've used: * ID editor (in browser): You get that by default when you hit 'Edit' button on openstreetmap.org. Its very simple and thats what you should use when editing of small areas. * JOSM: http://josm.openstreetmap.de/: A java based client that I used when editing large areas. You gotta use its option to download a particular 'object' from db only, otherwise it will either refuse to download too large an area or it won't be able to handle it. Hope that helps. BTW are you sure the country is not provided at all and its not just that they've changed the name back to 'France Metropolitan' rather than just 'France. I'd check the reply string you get from Nominatim.
(In reply to comment #28) > (In reply to comment #27) > > (In reply to comment #25) > I am really no expert but I can show you the tools I've used in the past to > edit stuff. If I understood correctly, you shouldn't need to explicitly set a > country in the db but containment of the area in another should be enough. The > tools I've used: > > * ID editor (in browser): You get that by default when you hit 'Edit' button on > openstreetmap.org. Its very simple and thats what you should use when editing > of small areas. > * JOSM: http://josm.openstreetmap.de/: A java based client that I used when > editing large areas. You gotta use its option to download a particular 'object' > from db only, otherwise it will either refuse to download too large an area or > it won't be able to handle it. > Thanks! > Hope that helps. BTW are you sure the country is not provided at all and its > not just that they've changed the name back to 'France Metropolitan' rather > than just 'France. I'd check the reply string you get from Nominatim. Yes, I checked with g_prints and you can see here as well: http://nominatim.openstreetmap.org/search?q=paris&format=xml&addressdetails=1 There is some magic going on, Metropolitan France is in the display_name but none of the other attributes.
(In reply to comment #29) > (In reply to comment #28) > > (In reply to comment #27) > > > (In reply to comment #25) > > > I am really no expert but I can show you the tools I've used in the past to > > edit stuff. If I understood correctly, you shouldn't need to explicitly set a > > country in the db but containment of the area in another should be enough. The > > tools I've used: > > > > * ID editor (in browser): You get that by default when you hit 'Edit' button on > > openstreetmap.org. Its very simple and thats what you should use when editing > > of small areas. > > * JOSM: http://josm.openstreetmap.de/: A java based client that I used when > > editing large areas. You gotta use its option to download a particular 'object' > > from db only, otherwise it will either refuse to download too large an area or > > it won't be able to handle it. > > > > Thanks! > > > Hope that helps. BTW are you sure the country is not provided at all and its > > not just that they've changed the name back to 'France Metropolitan' rather > > than just 'France. I'd check the reply string you get from Nominatim. > > Yes, I checked with g_prints and you can see here as well: > http://nominatim.openstreetmap.org/search?q=paris&format=xml&addressdetails=1 > > There is some magic going on, Metropolitan France is in the display_name but > none of the other attributes. Ah, then its more likely a bug in Nominatim. I recall once encountering a bug where some field was available but not provided. Could you please file a ticket?
(In reply to comment #26) > Created an attachment (id=266151) [details] [review] > forward: equality for missing attributes in result > > The function insert_place_into_tree creates a GNode > based tree that is built up by the search results attributes. > The idea is that if places share attributes they share a path > and we can later minimize what we include in the name. > > This patch fixes a bug where missing attributes where inserted > as completly new nodes, creating subtrees that could never get > siblings. Are we waiting for Nominatim to be fixed before this can be applied?
Comment on attachment 265137 [details] [review] forward: fix typo in nominatim result attribute Attachment 265137 [details] pushed as f8b797e - forward: fix typo in nominatim result attribute
Review of attachment 266151 [details] [review]: ::: geocode-glib/geocode-forward.c @@ +891,3 @@ + sibling = g_node_first_sibling (node); + while (sibling) { + if (sibling != node && sibling->data) { Not sure if the sibling->data should be there.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/geocode-glib/issues/9.