GNOME Bugzilla – Bug 777731
Document missing characters
Last modified: 2017-06-10 08:44:19 UTC
Created attachment 344199 [details] xps file The attached document misses characters when opening it with evince.
Created attachment 344201 [details] xps small Reduced the data in the document to try to find out the problem. In this document you are suppose to see the word "ics" instead you will see only "i s". For what I can see it is not handling this line: <Path Data="{StaticResource R738}" RenderTransform="1,0,0,1,80.16,131.22" Fill="#ff000000" /> I guess StaticResource is not handled?
Created attachment 344355 [details] [review] resources path Initial support for resource dictionaries and adding support for paths. Based on the patch from Jason Crain.
Review of attachment 344355 [details] [review]: ::: libgxps/gxps-resources.h @@ +28,3 @@ + +#define GXPS_TYPE_RESOURCES (gxps_resources_get_type ()) +G_DECLARE_FINAL_TYPE (GXPSResources, gxps_resources, GXPS, RESOURCES, GObject) I just realized this would make us depend on glib 2.44. Is that ok or should I just use the classic way to declare classes?
Created attachment 344631 [details] [review] resources path 2 This fixes 2 issues in relation to the previous patch: - it uses the old way to create classes - it does not use sscanf anymore since that does not play well on windows
Review of attachment 344631 [details] [review]: This is the code in the wip/nacho/res-sep branch, right? See if Carlos comments, but it mostly looks good to me. I don't really like programatically generating XML this way, but it works, and it is certainly more general than what I was doing. A couple of comments, particularly regarding how the patches handle brushes. ::: libgxps/gxps-resources.c @@ +289,3 @@ + if (strcmp (element_name, resource_dict_ctx->first_element_name) == 0) { + g_string_append_printf (resource_dict_ctx->xml, "</%s>\n</Resource.%s>", + element_name, element_name); The way resource_dict_end_element finds the end of a resource by searching for the element name in a closing tag can cause problems if the resource is complex. For example, a VisualBrush contains nested elements and could contain another VisualBrush, though this is probably unlikely to happen in the wild. I would prefer if resource_dict_start_element/resource_dict_end_element instead pushed and popped another context for the string concatenation to avoid this. @@ +402,3 @@ + + parse_ctx = g_markup_parse_context_new (&remote_resource_parser, + 0, resources, NULL); File paths can be specified relative to the current file. If a remote resource dictionary contains an ImageBrush, the image file path can be specified relative to the remote resource dictionary. Right now, the code will only use the current page's location to look for relative files, so it may sometimes not find the file. In addition to saving the XML, a resource will probably also need to save the current directory.
Created attachment 345083 [details] NestedVisualBrush.xps This document contains a static resource VisualBrush, which itself contains another VisualBrush. If both the resource dictionary patch in this bug and the brush resource patch in bug #742630 are applied, this document will crash libgxps due to a null pointer.
Created attachment 345084 [details] ImageBrushRemotePath.xps This document contains a remote resource dictionary with an ImageBrush. The brush's image file is specified relative to the remote resource dictionary. Even with both the resource dictionary patch in this bug and the brush resource patch in bug #742630 applied, the document is still not rendered correctly because libgxps looks in the wrong location for the image.
Created attachment 350557 [details] [review] Add support for resource dictionaries It parses the resources and keeps a cache in the form of xml. Once we are parsing the real xml if we find a resource we get the cached xml resource and we parse it to properly handle it. For now only Path Data resources are supported. Based in a patch from Jason Crain.
This new patch fixes the seg fault as suggested by Jason. I think we should not block on the second issue though... that could be fixed with a separate patch.
Review of attachment 350557 [details] [review]: Thanks for working on this. I agree that the relative path issue is not worth blocking over. I confirm that this fixes the segfault. ::: libgxps/gxps-resources.c @@ +305,3 @@ + g_string_append_printf (resource_dict_ctx->xml, "<Resource.%s>\n", + element_name); + push_context = TRUE; I don't understand what this push_context is for. It looks like it's going to always be set to true because resource_dict_ctx->key is initially NULL and it's reset to NULL in _end_element. Under what circumstance will it be false?
Created attachment 350680 [details] [review] Add support for resource dictionaries It parses the resources and keeps a cache in the form of xml. Once we are parsing the real xml if we find a resource we get the cached xml resource and we parse it to properly handle it. For now only Path Data resources are supported. Based in a patch from Jason Crain.
(In reply to Jason Crain from comment #10) > Review of attachment 350557 [details] [review] [review]: > > Thanks for working on this. I agree that the relative path issue is not > worth blocking over. I confirm that this fixes the segfault. > > ::: libgxps/gxps-resources.c > @@ +305,3 @@ > + g_string_append_printf (resource_dict_ctx->xml, "<Resource.%s>\n", > + element_name); > + push_context = TRUE; > > I don't understand what this push_context is for. It looks like it's going > to always be set to true because resource_dict_ctx->key is initially NULL > and it's reset to NULL in _end_element. Under what circumstance will it be > false? you are right, that was a leftover from a previous try. I removed it now and seems to work fine for me.
Review of attachment 350557 [details] [review]: This looks good to me. The null pointer crash is fixed, this fixes the rendering in this bug and bug #703639, and after the patch in bug #742630 is added, the rendering on that document is fixed too.
Time to ping here
Review of attachment 350680 [details] [review]: Thanks for the patch, I've finally found some time to look at it. It looks good to me in general, but I have a few questions/concerns. I think we could merge the patch in bug #742630 into this one, to add also support for brush resources. ::: libgxps/gxps-page-private.h @@ +53,3 @@ GHashTable *anchors; + + GXPSResources *resources; This GXPSResources also includes remote resources, so maybe instead of being a page member, it should belong to GXPSArchive and the page would push/pop like the canvas does. ::: libgxps/gxps-page.c @@ +277,3 @@ gdouble opacity; cairo_pattern_t *opacity_mask; + gboolean pop_resource; I would use pop_resource_dict to clarify it's the dict. @@ +462,3 @@ + } + + resource = gxps_resources_get_resource (page->priv->resources, resource_key); We should emit an error in this case. @@ +471,3 @@ + ret = g_markup_parse_context_parse (context, resource, strlen (resource), NULL) && + g_markup_parse_context_end_parse (context, NULL); + g_markup_parse_context_free (context); I'm not sure we should parse here too. Maybe we need to split this methods in get_resource and expand_resource. See below. @@ +499,3 @@ + if (!expand_resource (ctx->page, values[i], path)) { + path->data = g_strdup (values[i]); + } I think using data directly or with a resource should be equivalent, but this is not, because in the case of resource we are parsing the data here, while in the other case we save the data that is parsed after all other properties have been parsed. So, in case of something like this: <Path Data="{StaticResource Foo}" RenderTransform="1,0,0,1,20,30"> The data geometry is expanded before the render transform is applied, no? I think this should be equivalent to: <Path RenderTransform="1,0,0,1,20,30"> <Path.Data> So, I think we should expand the resource after properties are handled. We could first get the resource name or data string. Then before gxps_path_parser_push, we resolve the resource and parse it. This way we can also emit an error in case the value is a StaticResource but we don't find it. It's equivalent to a missing required property in the end. @@ +1640,3 @@ + + page->priv->resources = gxps_resources_new (page->priv->zip, + page->priv->source); I would build this on demand, there are a lot of documents without resource dictionaries. ::: libgxps/gxps-resources.c @@ +35,3 @@ + gchar *source; + + gboolean remote; I'm not sure this is the best place for this. It's not about the GXPSResources object, but specific to the currently processed dictionary. If you don't wan to add a struct context only for this, at least we could renamed it as current_dictionary_is_remote or something like that. If we make the resource global, the source should also be context of the current parser. So, maybe it makes sense to add a struct with a ref to the resource, the source and is_remote and pass that to the parser instead. @@ +189,3 @@ + gxps_resources_push_dict (resources); + + ht = g_queue_peek_head (resources->queue); I guess that remote resources should be always added to the toplevel hash table, no? @@ +190,3 @@ + + ht = g_queue_peek_head (resources->queue); + if (g_hash_table_contains (ht, key)) Can this really happen? I guess, it could happen if we make resources global, and only for remote resources referenced from different pages. Note that value is leaked if this happens. @@ +202,3 @@ + + gchar *key; + gchar *first_element_name; This is unused. @@ +302,3 @@ + if (!resource_dict_ctx->xml) { + resource_dict_ctx->xml = g_string_new (NULL); + g_string_append_printf (resource_dict_ctx->xml, "<Resource.%s>\n", I guess this is a trick to know what parser to use later, right? I guess we don't need the Resource. prefix in any case, no? @@ +449,3 @@ + + resource_dict = gxps_resource_dict_context_new (resources); + g_markup_parse_context_push (context, &resource_dict_parser, resource_dict); This is duplicated in remote_resource_start_element, we could probably add helper functions to push pop the dictionary parser. @@ +472,3 @@ + if (!resources->remote) { + resource_dict = g_markup_parse_context_pop (context); + gxps_resource_dict_context_free (resource_dict); Same for the pop
Created attachment 353348 [details] [review] Add support for resource dictionaries It parses the resources and keeps a cache in the form of xml. Once we are parsing the real xml if we find a resource we get the cached xml resource and we parse it to properly handle it. For now only Path Data resources are supported. Based in a patch from Jason Crain.
(In reply to Carlos Garcia Campos from comment #15) > Review of attachment 350680 [details] [review] [review]: > > Thanks for the patch, I've finally found some time to look at it. It looks > good to me in general, but I have a few questions/concerns. I think we could > merge the patch in bug #742630 into this one, to add also support for brush > resources. > > ::: libgxps/gxps-page-private.h > @@ +53,3 @@ > GHashTable *anchors; > + > + GXPSResources *resources; > > This GXPSResources also includes remote resources, so maybe instead of being > a page member, it should belong to GXPSArchive and the page would push/pop > like the canvas does. Done > > ::: libgxps/gxps-page.c > @@ +277,3 @@ > gdouble opacity; > cairo_pattern_t *opacity_mask; > + gboolean pop_resource; > > I would use pop_resource_dict to clarify it's the dict. Done > > @@ +462,3 @@ > + } > + > + resource = gxps_resources_get_resource (page->priv->resources, > resource_key); > > We should emit an error in this case. > > @@ +471,3 @@ > + ret = g_markup_parse_context_parse (context, resource, strlen (resource), > NULL) && > + g_markup_parse_context_end_parse (context, NULL); > + g_markup_parse_context_free (context); > > I'm not sure we should parse here too. Maybe we need to split this methods > in get_resource and expand_resource. See below. > > @@ +499,3 @@ > + if (!expand_resource (ctx->page, values[i], path)) { > + path->data = g_strdup (values[i]); > + } > > I think using data directly or with a resource should be equivalent, but > this is not, because in the case of resource we are parsing the data here, > while in the other case we save the data that is parsed after all other > properties have been parsed. So, in case of something like this: > > <Path Data="{StaticResource Foo}" RenderTransform="1,0,0,1,20,30"> > > The data geometry is expanded before the render transform is applied, no? I > think this should be equivalent to: > > <Path RenderTransform="1,0,0,1,20,30"> > <Path.Data> > > So, I think we should expand the resource after properties are handled. We > could first get the resource name or data string. Then before > gxps_path_parser_push, we resolve the resource and parse it. This way we can > also emit an error in case the value is a StaticResource but we don't find > it. It's equivalent to a missing required property in the end. While it is true you could expand the resource after, it would be a pain in case you have several properties with resources, i.e imagine the case of: <Path Data="{StaticResource Foo}" RenderTransform="{StaticResource Bar}"> I changed the code a bit and now it handles the resources for all the properties. > > @@ +1640,3 @@ > + > + page->priv->resources = gxps_resources_new (page->priv->zip, > + page->priv->source); > > I would build this on demand, there are a lot of documents without resource > dictionaries. Done > > ::: libgxps/gxps-resources.c > @@ +35,3 @@ > + gchar *source; > + > + gboolean remote; > > I'm not sure this is the best place for this. It's not about the > GXPSResources object, but specific to the currently processed dictionary. If > you don't wan to add a struct context only for this, at least we could > renamed it as current_dictionary_is_remote or something like that. > If we make the resource global, the source should also be context of the > current parser. So, maybe it makes sense to add a struct with a ref to the > resource, the source and is_remote and pass that to the parser instead. Used a struct > > @@ +189,3 @@ > + gxps_resources_push_dict (resources); > + > + ht = g_queue_peek_head (resources->queue); > > I guess that remote resources should be always added to the toplevel hash > table, no? > > @@ +190,3 @@ > + > + ht = g_queue_peek_head (resources->queue); > + if (g_hash_table_contains (ht, key)) > > Can this really happen? I guess, it could happen if we make resources > global, and only for remote resources referenced from different pages. Note > that value is leaked if this happens. Fixed the leak > > @@ +202,3 @@ > + > + gchar *key; > + gchar *first_element_name; > > This is unused. Removed > > @@ +302,3 @@ > + if (!resource_dict_ctx->xml) { > + resource_dict_ctx->xml = g_string_new (NULL); > + g_string_append_printf (resource_dict_ctx->xml, "<Resource.%s>\n", > > I guess this is a trick to know what parser to use later, right? I guess we > don't need the Resource. prefix in any case, no? Removed > > @@ +449,3 @@ > + > + resource_dict = gxps_resource_dict_context_new (resources); > + g_markup_parse_context_push (context, &resource_dict_parser, > resource_dict); > > This is duplicated in remote_resource_start_element, we could probably add > helper functions to push pop the dictionary parser. > > @@ +472,3 @@ > + if (!resources->remote) { > + resource_dict = g_markup_parse_context_pop (context); > + gxps_resource_dict_context_free (resource_dict); > > Same for the pop Done
(In reply to Ignacio Casal Quinteiro (nacho) from comment #17) > > I think using data directly or with a resource should be equivalent, but > > this is not, because in the case of resource we are parsing the data here, > > while in the other case we save the data that is parsed after all other > > properties have been parsed. So, in case of something like this: > > > > <Path Data="{StaticResource Foo}" RenderTransform="1,0,0,1,20,30"> > > > > The data geometry is expanded before the render transform is applied, no? I > > think this should be equivalent to: > > > > <Path RenderTransform="1,0,0,1,20,30"> > > <Path.Data> > > > > So, I think we should expand the resource after properties are handled. We > > could first get the resource name or data string. Then before > > gxps_path_parser_push, we resolve the resource and parse it. This way we can > > also emit an error in case the value is a StaticResource but we don't find > > it. It's equivalent to a missing required property in the end. > > While it is true you could expand the resource after, it would be a pain in > case you have several properties with resources, i.e imagine the case of: > > <Path Data="{StaticResource Foo}" RenderTransform="{StaticResource Bar}"> In this case, to make it equivalent we would resolve the path, resolve and expand the transform, and then expand the path. I don't know if that's what we should really do, but it's definitely what we are currently doing. And I think the result should be exactly the same whether a resource is used or not. If you don't want to fix it now, at least leave a FIXME there explaining that what we do in case of resources is different.
Attachment 353348 [details] pushed as 29ddc50 - Add support for resource dictionaries
Thanks for the reviews!