After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 131585 - Problems with XML attributes containing dashes
Problems with XML attributes containing dashes
Status: RESOLVED FIXED
Product: intltool
Classification: Deprecated
Component: general
unspecified
Other Linux
: Normal major
: ---
Assigned To: intltool maintainers
intltool maintainers
Depends on:
Blocks: 130604
 
 
Reported: 2004-01-15 18:20 UTC by David Malcolm
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: 2.6.next
GNOME version: ---


Attachments
patch with fix (3.12 KB, patch)
2004-01-21 20:34 UTC, Brian Cameron
none Details | Review
patch fixing problem with spaces getting lost (6.45 KB, patch)
2004-01-22 20:15 UTC, Brian Cameron
none Details | Review

Description David Malcolm 2004-01-15 18:20:38 UTC
I've been seeing problems with CVS intltool; the intltool-merge seems to be
losing some XML attributes when merging.  There seemed to be problems with
attributes containing either internal dashes or underscores.  So I hacked
the test file tests/cases/merge4.xml.in to read:

<sect testatt1="foo" test_att2="bar" test-att3="baz">
    <_title.-_:>Fax</_title.-_:>
</sect>

The result, when I run tests/selftest is:

<?xml version="1.0" encoding="UTF-8"?>
<sect testatt1="foo" test_att2="bar">
    <title.-_:>Fax</title.-_:>
    <title.-_: xml:lang="test">Translated Fax</title.-_:>
</sect>

i.e. the attribute test-att3 has been lost.  I reversed the order of
test_att2 and test-att3 and tried again; this time both attributes were
lost (though testatt1 came through OK).  So it looks like it's throwing
away all attributes after (and including) one containing a dash in its name.  

I believe this is a bug: it's well-formed XML AFAIK; I can't read Perl so
hopefully someone else can look into this and check that I'm not going mad.
 I believe it worked a month or so ago (last time I grabbed intltool from CVS)

My doctype (for Conglomerate) uses attributes with names like these, so
it's a major pain for me, currently blocking a release.
Comment 1 Kenneth Rohde Christiansen 2004-01-20 19:01:27 UTC
Some extra related information:

On Tue, 2004-01-20 at 13:43 +0100, Alexander Larsson wrote:
I'm having some pretty bad problems with the new XML::Parser version
of intltool.

Its seems that its not really using an XML parser, but instead doing
things like:
 
     # Grab attribute key/value pairs and push onto @origlist array.
     #
     while ($source =~ s|^[ \t]*([\w:]+)[\t]*[=][\t]*["]([^"]*)["]||s)
     {
        push @origlist, $1;
        push @origlist, $2;
 
This has some issues for me when building shared-mime-info:
 a) It removes any attributes using ' as quote
 b) It strips out multiple whitespaces inside an attribute value,
breaking the mime file
 c) (part of a) really), doesn't handle " quotes inside '-quoted
attributes
Comment 2 Brian Cameron 2004-01-21 20:34:40 UTC
Created attachment 23609 [details] [review]
patch with fix
Comment 3 Brian Cameron 2004-01-21 20:37:07 UTC
This patch fixes the following problems:

1) Now allows attributes to be quoted with a single quote (')
2) Works fine if double-quotes are inside single-quotes 
   ('text "text" text') or if single-quotes are inside double-quotes
   ("text 'text' text")
3) Allows attributes to have the dash (-) character.
Comment 4 Kenneth Rohde Christiansen 2004-01-21 21:39:46 UTC
The patch is straigh forward - looks fine to commit, so please go
ahead! :-)
Comment 5 Brian Cameron 2004-01-21 21:42:44 UTC
comitted
Comment 6 Alexander Larsson 2004-01-22 08:08:10 UTC
It still:

* Removes comments
* reorders attributes (not horrible, but strange)
* adds space before \>, i.e. foo="bar"\> turns into foo="bar" \> (also
just strange)
* Compresses whitespace in attributes. E.g:
<match type="string" mask="0xffffffff000000000000000000000000ff"
value="\177ELF            \004" offset="0"/>
turns into:
<match type="string" offset="0"
mask="0xffffffff000000000000000000000000ff" value="\177ELF \004" />
Which is pretty bad.
Comment 7 Alexander Larsson 2004-01-22 08:13:27 UTC
Maybe it could just use the whole original string for tags that don't
start with underscore?
Comment 8 Kenneth Rohde Christiansen 2004-01-22 09:14:51 UTC
That would be an idea yes - but won't that just hide the issues. They
still need to get fixed.

Then you could still write <_tag a1="string" a2="\177ELF           
\004"/>

though, that might not be very well designed XML.
Comment 9 Brian Cameron 2004-01-22 20:15:22 UTC
Created attachment 23646 [details] [review]
patch fixing problem with spaces getting lost
Comment 10 Brian Cameron 2004-01-22 20:16:05 UTC
I've attached a patch that fixes the problem of spaces getting
lost in attributes and CDATA., which seems to be Alexander's only
remaining serious problem.
Comment 11 Kenneth Rohde Christiansen 2004-01-22 20:58:20 UTC
Looks good - please commit.
Comment 12 Brian Cameron 2004-01-22 21:19:49 UTC
Just committed fix so that intltool-merge no longer compresses 
whitespace in attribute values or CDATA sections.  I think this bug 
can probably be marked as fixed.  The only remaining problems 
(reordering of attributes and loss of comments) are probably not
significant enough to keep this bug open.  Whatever makes the most
sense, though.
Comment 13 Geert Stappers 2004-01-23 08:12:52 UTC
With CVS intltool files in the conglomerate tree,
I get a clean output from the validate script in the examples directory.
( previous I got lots of
    element property-dialog: validity error : Element property-dialog
does not carry attribute plugin-id
those are gone )

I consider the bug also FIXED and marking it so.

This bug has seen a REOPENED before, but I expect a VERIFIED next.
Comment 14 Danilo Segan 2004-02-18 03:29:06 UTC
Second of the patches breaks intltool-merge when building
gnome-desktop/gnome-version.xml (no translations are merged).

Discussed at
http://pdx.freedesktop.org/pipermail/intltool/2004-February/000029.html
as well.
Comment 15 Brian Cameron 2004-02-18 16:05:35 UTC
Hmmm.  Is the problem with not compressing spaces in attribute values
or in CDATA sections, or both?  Kenneth's original problem was just with
spaces being compressed in attribute values, but my patch perhaps went
a little too far by also removing them from CDATA sections.

So it would be nice if we only need to replace such compression in
CDATA sections.  Then I think everyone will be happy.  Let me know
and I am happy to fix the code.
Comment 16 Kjartan Maraas 2004-02-20 00:09:50 UTC
Adding keywords. This needs to be fixed for 2.6.0.
Comment 17 Danilo Segan 2004-02-20 00:19:05 UTC
The problem here seems to be twofold:
- intltool-extract is not updated in the same way (so it extracts
"compressed" strings into PO files)
- this change is *incompatible* change, so doing this at this point is
very bad -- we don't know what could all this affect, and
string-freeze time in Gnome 2.6 is really not the right time for that
(basically, untranslated XML files will pop out from all places)

My suggestion would be to simply revert this change, and reconsider
after 2.6.0 has been released (with a large *WARNING* sign to all
developers using intltool for translating XML files).
Comment 18 Kenneth Rohde Christiansen 2004-03-02 11:58:10 UTC
Danilo can you answer Brians question?

> Is the problem with not compressing spaces in attribute values
or in CDATA sections, or both? 
Comment 19 Brian Cameron 2004-03-02 16:10:02 UTC
Now that I reread it, perhaps I didn't state the question too clearly.
Kenneth wanted the code to not compress spaces in attributes, so I
created a patch so spaces are no longer compressed in both attributes
and CDATA sections.  We can put back the space compression for CDATA
sections and I think both Kenneth and Danilo would be happy.

It's simply a matter of reversing this part of the last patch.

@@ -850,9 +849,8 @@ sub parseTree
                 #
                 my $string = $sub;
 
-                $string =~ s/\s+/ /g;
-                $string =~ s/^ //;
-                $string =~ s/ $//;
+                $string =~ s/^[\s]+//;
+                $string =~ s/[\s]+$//;
 
                 push(@translation_strings, $string);
Comment 20 Kenneth Rohde Christiansen 2004-03-02 16:55:46 UTC
Please try that Danilo and if everything works for you feel free to
commit it to cvs.
Comment 21 Danilo Segan 2004-03-02 17:02:37 UTC
Yes, that would do the work for CDATA sections (that's exactly what I
posted in the mail referenced above).

The thing is that I *do* know that it breaks CDATA translations, but I
*don't* know if it breaks attribute translations.  That's why I wasn't
"brave" enough to say a definite "yes, only CDATA should not be
compressed", so I said "at least CDATA should not be compressed".

So, YES, we should revert that small part as soon as possible.

(I got a collision with Kenneth when posting to Bugzilla [first time I
saw this happen], so yes this does work -- I already said that on
intltool list :)
Comment 22 Brian Cameron 2004-03-02 17:11:21 UTC
I believe that reverting the portion of the patch quoted above
will return xml-merge to the behavior that it had before the XML
parser was added to the code.  So people who want it to behave
like "it used to" will appreciate the change.  It was my error to
"correct" the problem in both the CDATA and attribute sections
when the prior bug was only complaining about a problem in the
attribute section.
Comment 23 Danilo Segan 2004-03-02 18:21:09 UTC
Brian, it's not necessarilly an error assuming that, but it's an error
not changing intltool-extract to preserve spaces as well.  For
example, try the following with intltool-extract and -merge:

<test>
<_something>This  bails
out</_something>
</test>

intltool-extract would put "This bails out" into PO file, yet
intltool-merge would look for translations of "This  bails\nout" and
wouldn't find them -- that's where the problem was (and, adjusting
intltool-extract now would seriously damage the string freeze).

Do you want me to commit this, or you'll go ahead? ;)
Comment 24 Brian Cameron 2004-03-02 21:37:55 UTC
I wasn't suggesting that this needs to be fixed *right now*. 
Obviously, the commit may need to wait until various freezes are over
and the time
is right for such a change. 
Comment 25 Danilo Segan 2004-03-02 21:49:19 UTC
Brian,
I think we're misunderstanding each other -- this (reverting those
two/three lines) must be fixed right *NOW*, before 2.6 is released. 
intltool, as it stands, doesn't work correctly for gnome-desktop
module, file gnome-version.xml.in.in.  This means that Gnome 2.6 would
have untranslated gnome-version.xml, which means that data in program
gnome-about (right click on panel, then choose "About Gnome") will be
untranslated.  There are probably more instances of where this bug
(compressing spaces in intltool-extract, but not in intltool-merge)
comes up, but this is one very visible example.
Comment 26 Brian Cameron 2004-03-02 21:57:47 UTC
Could you please commit this.  I do not really have time at the moment.

Thanks!
Comment 27 Danilo Segan 2004-03-02 22:10:35 UTC
No problem, it's now fixed in CVS.
Comment 28 Alexander Larsson 2004-03-03 17:47:25 UTC
Is this supposed to be fixed now? HEAD still compresses whitespace for
me, breaking the build of shared-mime-info.

I.e. turning:
<match mask="0xffffffff000000000000000000000000ff" offset="0"
type="string" value="\177ELF            \004" />
Into:
<match mask="0xffffffff000000000000000000000000ff" offset="0"
type="string" value="\177ELF \004" />
Comment 29 Brian Cameron 2004-03-03 20:20:28 UTC
Alexander - I am confused.  The merge6.xml test (run "make check" in
intltool/tests directory) does a test to check this.  The
cases/merge6.xml.in file has this line:

<foo value="&lt;!DOCTYPE\ xbel" value-dash='testme' quotetest='a
"quote" test' valuetesttwo="\177ELF            \004">

And I notice that the spaces are not compressed in the output.  I'm
not sure why you are seeing this problem.  Are you sure you are using
CVS head?  Does this test pass for you?
Comment 30 Alexander Larsson 2004-03-04 09:29:06 UTC
Oh, I had HEAD built, but the intltool copy in the module i was
building was an old version. Sorry about that.
Comment 31 Luis Villa 2004-03-05 02:01:01 UTC
So is this resolved then? [It looks resolved to me, at least with the
gnome-about problem.]
Comment 32 Danilo Segan 2004-03-05 03:21:41 UTC
Yes, I think it's safe to mark it as "resolved/fixed" (I don't have
enough Bugzilla privileges to do that myself).
Comment 33 Luis Villa 2004-03-05 03:23:42 UTC
Whoa! No way. Marking the bug FIXED, and giving you the permissions
for bugzilla. Should have just asked, dude. :)