Bug 570592 - Cannot open or save files or paths with localized characters
Status: RESOLVED FIXED
Product: dia
Classification: Other
Component: general
Version: 0.97
Hardware: Other Windows
Importance: Normal major
Target Milestone: 0.97.1
Assigned To: Dia maintainers
QA Contact: Dia maintainers
Duplicates: 552463 561234 576737 584115 591302
Depends on: 522131 574393
Blocks:
 
 
Reported: 2009-02-05 06:07 UTC by Otto Kekäläinen
Modified: 2010-01-24 19:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Error message when opening a file having special characters in the filename (296 bytes, text/plain)
2009-11-26 12:56 UTC, Daniel Klär
Error message when opening a file having special characters in the filename (297 bytes, text/plain)
2009-11-26 12:58 UTC, Daniel Klär
Use win32 wide character API to support localized file names (4.89 KB, patch)
2009-12-04 20:20 UTC, Steffen Macke
Allow to pass wide character filenames to diaw.exe on the commandline (3.70 KB, patch)
2009-12-30 12:32 UTC, Steffen Macke
Additional patch to address the problems identified by Hans (1.69 KB, patch)
2009-12-31 14:12 UTC, Steffen Macke
Allow to pass wide character filenames to dia-win-remote.exe on the commandline (9.21 KB, patch)
2010-01-17 16:05 UTC, Steffen Macke
Unfinished patch interpreting URIs (1.07 KB, patch)
2010-01-20 22:23 UTC, Hans Breuer
needs-work
URI-encodes filenames passed from dia-win-remote.exe to diaw.exe (8.76 KB, patch)
2010-01-23 17:48 UTC, Steffen Macke

Description Otto Kekäläinen 2009-02-05 06:07:42 UTC
Please fix Dia so that it can open and save files that have any characters that the file system accepts in their filename or path.

For example, at the moment Dia can't open or save files like these:
- C:\Töllö\my-file.dia
- C:\My Files\määrittely.dia

It's really annoying..

I already wrote a very extensive bug report about this, but Bugzilla failed and lost everything I wrote, so my motivation dropped. Hence this brief bug report.
Comment 1 Hans Breuer 2009-02-06 17:00:37 UTC
There is a bug in Dia 0.96 with writing to "partially" write-protected directories, i.e. everything below "My Documents" - see bug #504469 (the error message I get is "Not allowed to write temporary files ...").
I'm not aware of problems with localized filenames, and I have just tested again with Dia-0.96.1-9. Of course your mileage may vary, but then you need to find out what's different on your system.
Comment 2 Otto Kekäläinen 2009-02-06 19:57:28 UTC
I'm sure this issue was due to localized characters since saving didn't work with a folder named "Projektikäsikirja", but when I changed the folder name to "Projektikasikirja" everything worked fine.

I'm using Dia-0.96.1-8. I'll upgrade sometime next week and try again.
Comment 3 Otto Kekäläinen 2009-02-11 10:04:41 UTC
I just installed the latest Dia-0.96.1-9 and tested it:

1. Opened the file "toimintakaavio.dia" => works
2. Saved the file with name "toimintakaavö.dia" => works
3. Closed Dia and reopened the file "toimintakaavö.dia" => Error: unknown filetype (translated from Finnish)

So the issue still persists partially. I'd be very happy if you can fix this in time for the release of 0.97!
Comment 4 Hans Breuer 2009-02-14 21:01:02 UTC
This may as well be fixed already, e.g. by:

2007-03-17  Hans Breuer  <hans@breuer.org>

	* app/app_procs.c app/autosave.c app/commands.c app/diaconv.c
	  app/export_png.c app/filedlg.c app/load_save.c 
	  app/paginate_psprint.c app/preferences.c app/render_eps.c 
	  app/sheets_dialog.c app/sheets_dialog_callbacks.c
	  lib/dia_dirs.c lib/dia_xml.c lib/diagdkrenderer.c 
	  plug-ins/cgm/cgm.c plug-ins/dxf/dxf-export.c 
	  plug-ins/dxf/dxf-import.c plug-ins/hpgl/hpgl.c 
	  plug-ins/metapost/render_metapost.c plug-ins/pgf/render_pgf.c 
	  plug-ins/pstricks/render_pstricks.c plug-ins/python/pydia-render.c 
	  plug-ins/shape/shape-export.c plug-ins/svg/render_svg.c 
	  plug-ins/vdx/vdx-export.c plug-ins/vdx/vdx-import.c 
	  plug-ins/wmf/wmf.cpp plug-ins/wpg/wpg.c 
	  plug-ins/xfig/xfig-export.c plug-ins/xfig/xfig-import.c
	  plug-ins/xslt/xslt.c : use <glib/gstdio.h> to match GLib's filename
	encoding to the io functions used, that is: g_open, g_fopen, g_stat, 
	g_unlink, g_mkdir, g_rename (, g_access, g_lstat, g_remove, g_freopen, 
	g_chdir, g_rmdir). Also replace gzopen() with gzdopen(g_open(), ...)
	to properly handle unicode filenames; finally use g_mkstemp().
	Fixes bug #131210 and bug #397159. To make this fully work on win32
	a recent enough version of libxml2 is required - tested with 2.6.27 -
	but anything from 2.6.24 should do.
Comment 5 Hans Breuer 2009-03-01 15:26:41 UTC
Strangely enough I can reproduce this with 0.96.1-9 from dia-installer, and I don't know what's going on here:

D:\graph\Dia-0.96.1-9\bin>dia --verbose

(dia.exe:2524): Gtk-WARNING **: GtkSpinButton: setting an adjustment with non-zero page size is deprecated
Redirecting output to win32trace remote collector
file:///C:/Dokumente%20und%20Einstellungen/hb/zu%20bl%C3%B6d.dia:1: parser error : Start tag expected, '<' not found
[Invalid UTF-8] \x1f\x8b\x08
^

My first assumption of a slightly broken libxml used in the setup does not hold (I checked with the same version which works for my build) but still I get the above message. 
BUT: when uncompressing the file to something still including the localized characters it can be loaded!
Steffen any idea what's wrong with your build?
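For reference, the bytes \x1f\x8b\x08 in the parser error above are the gzip signature followed by the deflate method byte, which suggests the XML parser was handed the still-compressed file content. A minimal sketch of such a signature check in plain C follows; the helper name looks_gzipped is hypothetical and not part of Dia or libxml2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer starts with the gzip
 * signature bytes 0x1f 0x8b followed by compression method 8 (deflate),
 * 0 otherwise. */
int looks_gzipped(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return len >= 3 && buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 0x08;
}
```

A loader could use a check like this to confirm that a file reaching the XML parser has actually been decompressed.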
Comment 6 Steffen Macke 2009-03-01 19:10:09 UTC
Unfortunately, not really.

I've recently added the Windows Platform SDK and noted that it adds things to the PATH. I'll try to rebuild with these things removed.

I tried with the latest libxml, zlib and iconv DLLs from Igor Zlatkovic, but the problem is still there.
Comment 7 Steffen Macke 2009-03-01 19:28:52 UTC
Hans, using your working set of binaries: Does the problem originate from dia-app.dll or libdia.dll?
Comment 8 Steffen Macke 2009-03-01 19:29:51 UTC
Rebuilding without any Windows Platform SDK locations in the path didn't help.
Comment 9 Hans Breuer 2009-03-06 16:55:22 UTC
Not sure why it ever worked for me before; after switching to libxml 2.7.3 (built from svn) I see the same problem. Reported as libxml2 bug #574393 - not sure yet if there needs to be a workaround in Dia.
Comment 10 Steffen Macke 2009-03-25 16:58:35 UTC
*** Bug 576737 has been marked as a duplicate of this bug. ***
Comment 11 Hans Breuer 2009-04-03 19:39:03 UTC
*** Bug 552463 has been marked as a duplicate of this bug. ***
Comment 12 Steffen Macke 2009-05-06 18:45:43 UTC
Hans, 
what do you think about the following workaround (win32 only):

* If the file is compressed, gunzip it into a memory buffer and use xmlParseMemory() instead of xmlParseFile()

Did I miss something?
Comment 13 Hans Breuer 2009-05-06 20:08:15 UTC
I don't think adding a work-around is leading us anywhere, and that one would certainly be much too huge. Why should everyone who does *not* use the broken combination suffer from the defect in libxml? 

The right thing would of course be to write a patch for libxml. 

But did you notice that I already gave an answer?

http://mail.gnome.org/archives/dia-list/2009-April/msg00062.html

At 17.04.2009 22:36, Hans Breuer wrote:
>> http://bugzilla.gnome.org/show_bug.cgi?id=570592
>>
> It certainly would be nice to have this fixed, but I will not make the
> dia-0.97 release depend on it. Apparently our definitions of "showstopper"
> are very different.
> For me a showstopper does not have a simple workaround. This issue has two:
> don't use localized filenames (or directories) and compressed diagrams
> together.
[...]
> The work-around I'm pondering could be to convert from utf-8 filenames to
> locale filename before talking to libxml2, but that still would not work if
> the chosen filename is not convertible into the locale encoding, e.g.
> saving with a japanese filename on a german windows. (I'm uncertain if the
> work-around would work at all on a japanese windows version.)
> 
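The limitation described in the quoted mail can be illustrated with a small check: a UTF-8 filename survives conversion to a single-byte locale encoding only if every character exists in that encoding. The sketch below uses ISO-8859-1 as a simplified stand-in for a Western European locale encoding (an assumption; the real Windows code page differs in detail), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence starting at s and store the code point in *cp.
 * Returns the number of bytes consumed, or 0 on malformed input.
 * Simplified: overlong forms and surrogates are not rejected. */
size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t n, i;

    if (s[0] < 0x80)                { *cp = s[0];        n = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; }
    else return 0;

    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)   /* also catches a premature NUL */
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return n;
}

/* Returns 1 if every character of the UTF-8 string has a code point up to
 * U+00FF and therefore fits into ISO-8859-1, 0 otherwise. */
int fits_latin1(const char *utf8)
{
    const unsigned char *p = (const unsigned char *)utf8;

    assert(utf8 != NULL);
    while (*p) {
        unsigned long cp;
        size_t n = utf8_decode(p, &cp);
        if (n == 0 || cp > 0xFF)
            return 0;
        p += n;
    }
    return 1;
}
```

With this stand-in, a Finnish name such as "määrittely.dia" passes the check, while a Japanese name does not, which is the case the mail is uncertain about.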
Comment 14 Steffen Macke 2009-05-28 17:11:00 UTC
*** Bug 561234 has been marked as a duplicate of this bug. ***
Comment 15 Steffen Macke 2009-05-28 17:15:56 UTC
*** Bug 584115 has been marked as a duplicate of this bug. ***
Comment 16 Hans Breuer 2009-09-21 20:41:33 UTC
libxml2 2.7.4 does include the fix for this issue (bug #574393)
Comment 17 Otto Kekäläinen 2009-10-20 10:28:20 UTC
I just tested this with the latest Dia 0.97 and the error still exists. This has _not_ been fixed yet.

The behavior is exactly the same as I described for version 0.96.1-9 in my post from 2009-02-11 10:04:41 UTC. Opening files with localized characters is impossible!
Comment 18 Steffen Macke 2009-10-20 18:27:06 UTC
The current Dia 0.97 release does not yet contain libxml2 2.7.4. 
I'm still waiting for the "official" win32 binaries of the new libxml2 versions.
Note that the problem does not occur if you do not use file compression (uncompressing the files also solves the problem).
Comment 19 Steffen Macke 2009-10-31 08:02:22 UTC
As there are still no "official" libxml2 win32 binaries with the fix available,
I've compiled the library myself:

http://sourceforge.net/projects/dia-installer/files/libxml2/2.7.6/libxml2-2.7.6-bin.zip/download

Just replace the file dia/bin/libxml2.dll of your Dia installation with the one from the zip file.

Feedback is welcome. Note that the new DLL seems to cause a problem with diashapes.exe (unable to download the sheets.xml file).

Sorry that this is taking so long.
Comment 20 Otto Kekäläinen 2009-11-06 14:37:00 UTC
I just tried with the new libxml2.dll from
http://sourceforge.net/projects/dia-installer/files/libxml2/2.7.6/libxml2-2.7.6-bin.zip/download
        
Saving files with localized characters now works!
        
However, opening still does not work.
Comment 21 Hans Breuer 2009-11-06 15:49:06 UTC
If you are opening the file via explorer, please see bug #591302. Otherwise I don't have any idea - and it is contradictory to other people's tests:
http://mail.gnome.org/archives/dia-list/2009-November/msg00000.html
Comment 22 Steffen Macke 2009-11-06 19:11:44 UTC
Could you describe in as much detail as possible how you try to open the file?

If you're familiar with the commandline, could you try to open the program
with dia.exe instead of diaw.exe as this might provide additional error messages.

Are there any error messages?

What is the exact path/filename you're using? Is this some kind of special drive?

Which OS do you use exactly?
Comment 23 Otto Kekäläinen 2009-11-06 20:25:56 UTC
The steps to reproduce the error are the same as I've described in this bug report before. I'll repeat them with a bit more detail:

1. Open Dia
2. Make graph
3. Select from menu File > Save, set the filename to "etäkäyttö.dia" and choose "My Documents" as the folder.
=> This works now with the new libxml2.dll!

4. Close Dia.
5. Open "My Documents" with the file browser, double click on the file with localized characters that you made in step 3.
=> Nothing happens. I suspect Dia launches in the background but is unable to open the file.

6. Open Dia.
7. Select from menu File > Open, browse to "My Documents"
8. Select the file you made in step 3.
=> Dia complains that the filetype is unknown and file opening fails.

My OS is Windows XP.
Comment 24 Daniel Klär 2009-11-26 12:56:47 UTC
Created attachment 148530
Error message when opening a file having special characters in the filename
Comment 25 Daniel Klär 2009-11-26 12:58:19 UTC
Created attachment 148531
Error message when opening a file having special characters in the filename
Comment 26 Daniel Klär 2009-11-26 13:15:01 UTC
(In reply to comment #22)
> Could you describe in as much detail as possible how you try to open the file?
> 
> If you're familiar with the commandline, could you try to open the program
> with dia.exe instead of diaw.exe as this might provide additional error
> messages.
> 
> Are there any error messages?

First I want to confirm this bug for both a French Windows XP and a German Windows 2000 with the current stable version (0.97). I can reproduce it on different machines.
As requested I started dia.exe from command line. See attachments for output.

> What is the exact path/filename you're using? Is this some kind of special
> drive?

The two attached examples were created in virtual machines, but the behaviour is the same on standard hardware (DELL PC, IDE/SATA harddisk).
I had admin permissions when testing.
 
> Which OS do you use exactly?

Windows XP Professionel SP3 (5.1, Build 2600) French
Windows 2000 Professional SP4 (5.0, Build 2195) German

Both are localized versions (not international + language pack) and both are fully patched.

HTH
Comment 27 Steffen Macke 2009-12-04 20:20:54 UTC
Created attachment 149117
Use win32 wide character API to support localized file names

I've tested this patch successfully with 
* German Windows XP
  * cmd.exe, German and Arabic filename (not displayed correctly in cmd.exe)
  * explorer.exe, German and Arabic filename (displayed correctly)
* Windows 7
  * explorer.exe
  * Powershell.exe 

What is still missing: Update dia-win-remote.exe accordingly
Comment 28 Steffen Macke 2009-12-04 20:33:20 UTC
Have to test on Linux...
Comment 29 Hans Breuer 2009-12-05 18:11:17 UTC
The patch does not look quite right to me, e.g. I think it will break with most of the other commandline options. I've just committed a different approach to trunk extending the use of GOption (ideas inspired by The GIMP's sources).

The filename given on the commandline needs to be representable in the locale encoding, so this won't work for Japanese filenames on German Windows. For these cases the GUI needs to be used, which does not share the commandline restrictions.

If some testing does not reveal regressions this could be merged to dia-0-97.
Comment 30 Steffen Macke 2009-12-06 13:24:30 UTC
You're right, I focused just on the filenames.
But I think that using the win32 wide character API is a must, because the ordinary user will take explorer.exe as the reference and not cmd.exe. After installation of the east-asian fonts, my German XP displays Japanese filenames nicely in explorer.exe. And handling/opening of these files worked with my patch, even when the fonts aren't installed.

And I think that we'll not be able to avoid the #ifdefs, because this is really Windows specific - e.g. Linux shells use UTF-8 by default.

The perception of the ordinary user that Dia is unable to open the saved file is a serious issue - we should definitely avoid this.
Comment 31 Hans Breuer 2009-12-08 23:18:10 UTC
There must be a way to make it work with explorer and cmd. But probably the win32 wchar version can not use GOption at all, because there every string is converted by g_locale_to_utf8() (see: glib/goption.c).
Maybe app_init() should be split further to make the necessary parts available for some reimplementation of WinMain - with its own wchar command line parsing and conversion to utf8. Thus dia.exe could work as is, but not support filenames with non-locale encoding. And diaw.exe would support the limited subset necessary for explorer 'integration': filenames and some fileformat conversions (the context menu created by the installer).
Although I'm not considering the non-locale filenames important, I will review possible patches.
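The wchar-to-utf8 step mentioned above can be sketched in portable C. On Windows the conversion would more likely be done with WideCharToMultiByte() using CP_UTF8, or with GLib's g_utf16_to_utf8(); the hand-written converter below is only an illustration, and utf16_to_utf8 is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-16 string to UTF-8.
 * Writes at most cap bytes into out, including the terminating NUL.
 * Returns the number of bytes written excluding the NUL, or -1 on an
 * unpaired surrogate or if out is too small. */
long utf16_to_utf8(const uint16_t *in, char *out, size_t cap)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    while (*in) {
        uint32_t cp = *in++;

        if (cp >= 0xD800 && cp <= 0xDBFF) {        /* high surrogate */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return -1;                         /* missing low surrogate */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                             /* stray low surrogate */
        }

        if (cp < 0x80) {
            if (o + 1 >= cap) return -1;
            out[o++] = (char)cp;
        } else if (cp < 0x800) {
            if (o + 2 >= cap) return -1;
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            if (o + 3 >= cap) return -1;
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            if (o + 4 >= cap) return -1;
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (o >= cap)
        return -1;
    out[o] = '\0';
    return (long)o;
}
```

Because the input never passes through the locale encoding, a Japanese filename on a German Windows would come through intact, which is the case the GOption path cannot handle.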
Comment 32 Steffen Macke 2009-12-22 19:37:23 UTC
I'm giving the whole thing another try:

* diaw.exe reading wide character commandline options and recoding everything to UTF-8

I'm not sure though, what's the best way to pass UTF-8 filenames to app_init() - do you have an idea?
Comment 33 Hans Breuer 2009-12-22 21:49:05 UTC
There already is a utf-8 list of strings in app_init() - you could make it an extra, optional (default: NULL) parameter of app_init. But still this sounds like asking for trouble, at least the filenames in that list need to be removed from argc/argv - I still like the splitting idea from above better...
Comment 34 Steffen Macke 2009-12-30 12:32:47 UTC
Created attachment 150573
Allow to pass wide character filenames to diaw.exe on the commandline

What do you think about the attached patch?
Comment 35 Hans Breuer 2009-12-30 19:56:52 UTC
BTW: a better way solving this would be an accepted patch for bug #522131.

I only looked at your patch, did not test it myself:

 * the current combination of WinMain() files and still passing the full argv into app_init() looks like every file found by the former will be opened twice by the latter (except the files which can not be represented in the local encoding, but they would spit a warning still)

 * every string conversion which is not a valid filename leaks 'utf8'

 * I'm uncertain if checking for existence is the right way to perform
   list adding, I think we need at least some command line parsing, i.e.:
   - ignore everything starting with a dash
   - remove converted filenames from argv and argc, but only if they are
     not a parameter of -e (is that used by your shell integration?)
Comment 36 Steffen Macke 2009-12-31 08:25:39 UTC
Currently the shell integration is only using -t, but still -e should be ignored.
I'll rework the patch to overcome the above problems. Do you know if it's "evil" to manipulate __argv and __argc directly? Should I make a copy instead?
Comment 37 Steffen Macke 2009-12-31 14:12:54 UTC
Created attachment 150611
Additional patch to address the problems identified by Hans

* Fixed utf8 leak
* Ignore arguments starting with "-"
* Ignore files passed after "-e"
* Remove "verified" files from args passed to app_init()

If you would prefer a single patch, let me know. This is how it came out of git.
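The argument handling listed above can be sketched roughly as follows. This is not the attached patch: it skips the file-existence check the patch relies on, treats only -e as an option that takes a value, and uses the hypothetical name split_args. It fills separate arrays instead of modifying argv in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split argv into filename candidates and remaining arguments.
 *  - anything starting with '-' stays in rest
 *  - the argument following "-e" is an export target and stays in rest
 *  - everything else is treated as an input filename
 * Both output arrays must have room for argc entries.
 * Returns the number of filenames; *nrest receives the number of
 * remaining arguments (including the program name). */
int split_args(int argc, char **argv, char **files, char **rest, int *nrest)
{
    int i, nfiles = 0, r = 0;

    assert(argv != NULL && files != NULL && rest != NULL && nrest != NULL);
    if (argc > 0)
        rest[r++] = argv[0];               /* program name always stays */

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-e") == 0) {
            rest[r++] = argv[i];
            if (i + 1 < argc)
                rest[r++] = argv[++i];     /* keep the export filename */
        } else if (argv[i][0] == '-') {
            rest[r++] = argv[i];
        } else {
            files[nfiles++] = argv[i];
        }
    }
    *nrest = r;
    return nfiles;
}
```

Other options that take a value, such as the -t used by the shell integration, would need the same treatment as -e in a real implementation.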
Comment 38 Steffen Macke 2010-01-17 16:05:44 UTC
Created attachment 151608
Allow to pass wide character filenames to dia-win-remote.exe on the commandline

Additional patch that adds wide character support to dia-win-remote.exe. 
From my point of view, this solves the localized character problem in Dia.
Of course, a fix for #522131 is the cleaner solution. 
I'll rework the code once glib is updated in this respect.
Comment 39 Hans Breuer 2010-01-20 22:19:01 UTC
The new patch (for app_init) looks like it has a problem converting the parameter directly after a conversion. If at all I think it would be easier to steadily fill a new array with non-filenames (better: not converted), rather than modifying __argv in place.

But I was pondering a different idea, namely allowing URIs on the standard command line. This should have multiple advantages:
 - the code in question could be tested by simple shell scripts
   (of course the percent sign needs proper escaping)
 - not platform specific and slightly simpler code
 - should also be possible to extend for the input and output directory 
   switches as well as -e

Given that non-locale filenames on the command line would (and can) only be created by dia-win-remote there should be no problem to produce correct URIs and the user would not need to be bothered with them.
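A rough sketch of the URI idea: percent-encoding a UTF-8 path so that it is plain ASCII on the command line. In Dia itself g_filename_to_uri() from GLib would do this; the hypothetical path_to_file_uri() below assumes forward slashes and an absolute path with a drive letter, and exists only to show the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Percent-encode a UTF-8 path into a file URI.
 * ASCII letters, digits and the characters - . _ ~ / : are copied as is;
 * every other byte becomes %XX.
 * Returns 0 on success, -1 if out (cap bytes) is too small. */
int path_to_file_uri(const char *path, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    static const char prefix[] = "file:///";
    const unsigned char *p = (const unsigned char *)path;
    size_t o = sizeof prefix - 1;

    assert(path != NULL && out != NULL);
    if (cap <= o)
        return -1;
    memcpy(out, prefix, o);

    for (; *p; p++) {
        int keep = (*p >= 'A' && *p <= 'Z') || (*p >= 'a' && *p <= 'z') ||
                   (*p >= '0' && *p <= '9') || strchr("-._~/:", *p) != NULL;
        if (keep) {
            if (o + 1 >= cap) return -1;
            out[o++] = (char)*p;
        } else {
            if (o + 3 >= cap) return -1;
            out[o++] = '%';
            out[o++] = hex[*p >> 4];
            out[o++] = hex[*p & 0x0F];
        }
    }
    out[o] = '\0';
    return 0;
}
```

For the path "C:/Dokumente und Einstellungen/hb/zu blöd.dia" this produces the same URI that appears in the parser error in comment 5.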
Comment 40 Hans Breuer 2010-01-20 22:23:05 UTC
Created attachment 151877
Unfinished patch interpreting URIs

The current implementation of the patch only deals with filenames. Also it has a problem when the GLib filename encoding is not utf-8.
Comment 41 Steffen Macke 2010-01-21 20:41:00 UTC
To fix this bug for the limited Dia installer/dia-win-remote interface,
the current URI patch is sufficient. 
I'm currently working on a dia-win-remote patch that passes URIs.

For "problem when the GLib filename encoding is not utf-8": why not simply pass the g_filename_from_uri() result through g_filename_to_utf8() - if that's not the problem, could you explain in more details?
Comment 42 Steffen Macke 2010-01-23 17:48:46 UTC
Created attachment 152096
URI-encodes filenames passed from dia-win-remote.exe to diaw.exe
Comment 43 Hans Breuer 2010-01-24 14:15:12 UTC
There is a leak when passing the result of g_filename_from_utf8() directly to g_filename_to_uri(), but the conversion is simply not necessary on win32, so I'll remove it when committing.
Regarding comment #41: the problem I see in my patch is nothing new. We were always converting the filename to utf8 and passing the result to g_file_test(). Now if the GLib file encoding is "on-disk file name bytes on Unix" [1] and that is not utf-8 we would need an additional conversion before the g_file_test(). Given that no one seems to have run into this problem I may do the "conversion" as you say - although it is quite pointless for the case where the GLib filename encoding is utf-8 (just a strdup) and wrong for the other case;)
Comment 45 Hans Breuer 2010-01-24 16:13:42 UTC
Steffen, something is missing with your patch (id=152096), it gives:

dia-win-remote.c(276) : warning C4133: 'function' : incompatible types - from 'char *' to 'unsigned short *'

and the glib-2.0.lib should not be hardcoded in makefile.msc. If the DnD case worked for you just adding (LPWSTR) should be enough.
Comment 46 Steffen Macke 2010-01-24 16:19:42 UTC
I also got the warning, things worked for me nevertheless.
The DnD case works - please use diaw.exe --integrated for the initial start.
I'll have a look again at the warning and the hardcoded glib-2.0.lib
Comment 47 Hans Breuer 2010-01-24 16:30:29 UTC
No need to look at makefile.msc. I've already changed it to use $(GLIB_LIBS)
Comment 48 Steffen Macke 2010-01-24 16:49:27 UTC
Thanks. The strange thing is that if I change line 276 to use LPWSTR instead of LPSTR, dia-win-remote.exe crashes on me with a NULL pointer. Switching back to LPSTR, things just work.
Comment 49 Hans Breuer 2010-01-24 17:03:45 UTC
Not strange but C;) With s/LPSTR/LPWSTR/ the resulting pointer gets incremented by twice as many bytes. I was just adding an extra cast to avoid the warning. Pushed to master and soon to dia-0-97 branch.
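The pointer arithmetic behind this can be shown in a few lines of plain C. This is an illustration, not the code from dia-win-remote.c: uint16_t stands in for the 16-bit element type behind LPWSTR, and all three function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes skipped when a char pointer is advanced by n elements. */
size_t advance_as_char(const void *buf, size_t n)
{
    const char *p = (const char *)buf;

    assert(buf != NULL);
    return (size_t)((p + n) - (const char *)buf);
}

/* Bytes skipped when a 16-bit pointer is advanced by n elements:
 * the same n now moves twice as far. */
size_t advance_as_wide(const void *buf, size_t n)
{
    const uint16_t *p = (const uint16_t *)buf;

    assert(buf != NULL);
    return (size_t)((const char *)(p + n) - (const char *)buf);
}

/* Do the arithmetic in bytes first and cast the result afterwards, so the
 * offset is unchanged. nbytes must be even to keep the result aligned. */
const uint16_t *offset_then_cast(const void *buf, size_t nbytes)
{
    assert(buf != NULL && nbytes % 2 == 0);
    return (const uint16_t *)(const void *)((const char *)buf + nbytes);
}
```

Changing the type of the pointer before the addition doubles the offset, while casting the finished expression only silences the type warning and leaves the address as it was.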
Comment 50 Hans Breuer 2010-01-24 17:43:04 UTC
*** Bug 591302 has been marked as a duplicate of this bug. ***
Comment 51 Hans Breuer 2010-01-24 19:14:29 UTC
I've just released dia-0.97.1 - windows version should follow soon;)