Bug 445316 - Read psd layer names longer than 31 characters.
Status: RESOLVED FIXED
Product: GIMP
Classification: Other
Component: Plugins
Version: 2.2.x
Hardware: Other All
Importance: Normal enhancement
Target Milestone: 2.4
Assigned To: GIMP Bugs
QA Contact: GIMP Bugs
Depends on:
Blocks:
 
 
Reported: 2007-06-07 22:54 UTC by Eric Ross
Modified: 2008-01-15 13:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch for gimp-2.2.15/plug-ins/common/psd.c (4.69 KB, patch)
2007-06-07 22:57 UTC, Eric Ross
needs-work
Patch against svn. Adds support for long layer names in Photoshop files. (8.62 KB, patch)
2007-06-11 19:54 UTC, Eric Ross
needs-work
Adds support for long layer names in Photoshop files (converts unicode correctly). (8.50 KB, patch)
2007-06-11 22:53 UTC, Eric Ross
committed
Patch to write long layer names to psd files. (2.61 KB, patch)
2007-06-13 03:11 UTC, Eric Ross
none
PSD with broken layer names in WinXP (730.42 KB, application/octet-stream)
2007-06-21 08:07 UTC, John Marshall
Screenshot of layer names under Windows (29.02 KB, image/png)
2007-06-21 21:28 UTC, John Marshall
Patch to fix layer name display under windows (961 bytes, patch)
2007-06-26 16:18 UTC, John Marshall
needs-work
patch to change psd-load and psd-save to use UTF16 instead of UCS-2 (2.04 KB, patch)
2007-07-06 12:57 UTC, Sven Neumann
committed

Description Eric Ross 2007-06-07 22:54:55 UTC
Currently, the psd plugin only reads the short layer name (31 chars max).
By parsing the long unicode name block contained within the layer extra data, we can get the longer layer name.

The layer extra data section also contains more blocks of info related to that layer. For example, layer color, layer id, layer effects, and layer folders are all contained in blocks similar to the long unicode name block.

I'll include a patch for psd.c that implements parsing a few of these blocks.
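
As a rough illustration of the block parsing described above, here is a minimal sketch in plain C. The layout is an assumption taken from Adobe's published file format notes rather than from this report: each block is taken to start with the signature "8BIM", followed by a 4-character key and a big-endian 32-bit payload length, and the long name block is assumed to use the key "luni". The helper names are hypothetical, and padding between blocks is ignored for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: scan the layer extra data for the block with the
 * given 4-character key. Returns a pointer to its payload and stores the
 * payload length, or returns NULL if the key is absent or data is malformed.
 * Assumed layout per block: "8BIM", key[4], length (BE32), payload[length].
 */
static const unsigned char *
find_extra_block(const unsigned char *data, size_t size,
                 const char *key, uint32_t *payload_len)
{
    size_t pos = 0;

    while (size - pos >= 12) {
        uint32_t len = read_be32(data + pos + 8);

        if (memcmp(data + pos, "8BIM", 4) != 0)
            return NULL;                /* not a block header: stop */
        if (len > size - pos - 12)
            return NULL;                /* payload runs past the buffer */
        if (memcmp(data + pos + 4, key, 4) == 0) {
            *payload_len = len;
            return data + pos + 12;
        }
        pos += 12 + (size_t)len;        /* skip to the next block */
    }
    return NULL;
}
```

Under the same assumption, the payload of the long name block would begin with a 32-bit count of 16-bit code units, followed by the name itself in big-endian order.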
Comment 1 Eric Ross 2007-06-07 22:57:06 UTC
Created attachment 89577
Patch for gimp-2.2.15/plug-ins/common/psd.c
Comment 2 Sven Neumann 2007-06-08 09:36:04 UTC
Please try to follow the GIMP coding style as defined in the file HACKING. We would also very much appreciate a patch against SVN trunk or against a recent 2.3 release as we will not add any new features to gimp-2.2.
Comment 3 Eric Ross 2007-06-11 19:52:26 UTC
Well, SVN would explain why the CVS server described in http://www.gimp.org/source/howtos/stable-cvs-get.html didn't work for me.

Ok, so I think that I've created a more useful patch this weekend. I tried to follow the style correctly but the file seems a little inconsistent so let me know if I've missed something.

I know how to parse the layer styles info found in the blocks lrFX and lfx2 but they're pretty complex and I think it may clutter up the code before gimp is ready to use the data from them.
Comment 4 Eric Ross 2007-06-11 19:54:37 UTC
Created attachment 89768
Patch against svn. Adds support for long layer names in Photoshop files.
Comment 5 Sven Neumann 2007-06-11 20:16:06 UTC
This looks a lot better. The routine getunicodepascalstring() does not seem to do the right thing though. I assume that the names are encoded as UCS-2 (see http://en.wikipedia.org/wiki/UTF-16). So instead of skipping the higher-order byte, you should use g_convert() to convert to UTF-8 (the character encoding used in GIMP). There's code in app/core/gimpbrush-load.c that does something similar. You may want to use that as an example.
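
To make the difference concrete, here is a small sketch in plain C that contrasts dropping the higher-order byte with a real conversion to UTF-8. In GIMP itself the conversion would go through g_convert() as suggested above; the hand-written converter below only spells out what such a conversion does for UCS-2 input, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert `units` big-endian 16-bit code units (UCS-2) to UTF-8.
 * `out` must hold at least units * 3 + 1 bytes. Returns the number of
 * bytes written, not counting the terminating NUL. Surrogate pairs are
 * not handled in this sketch.
 */
static size_t ucs2be_to_utf8(const unsigned char *in, size_t units, char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < units; i++) {
        uint16_t c = (uint16_t)((in[2 * i] << 8) | in[2 * i + 1]);

        if (c < 0x80) {                     /* 1 byte: ASCII */
            out[n++] = (char)c;
        } else if (c < 0x800) {             /* 2 bytes */
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else {                            /* 3 bytes */
            out[n++] = (char)(0xE0 | (c >> 12));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* The lossy approach: keep only the low byte of every code unit. */
static size_t drop_high_byte(const unsigned char *in, size_t units, char *out)
{
    for (size_t i = 0; i < units; i++)
        out[i] = (char)in[2 * i + 1];
    out[units] = '\0';
    return units;
}
```

For a name containing U+4E2D, the lossy version keeps only 0x2D (an ASCII hyphen), while the conversion yields the three bytes E4 B8 AD.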
Comment 6 Eric Ross 2007-06-11 22:53:41 UTC
Created attachment 89778
Adds support for long layer names in Photoshop files (converts unicode correctly).

You're right but I wasn't sure how to do that correctly. Thanks for the reference. Here's a new patch to correct getunicodepascalstring().
Comment 7 Sven Neumann 2007-06-11 22:58:50 UTC
Is the encoding really UTF-16? From my experience with PS, it's more likely UCS-2. This doesn't make a difference for characters in the Basic Multilingual Plane but I'd prefer if we could get this right. Is there any documentation on the file format that could help us to answer this question?
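
The distinction in question can be sketched as follows. Inside the Basic Multilingual Plane a 16-bit unit is the code point in both encodings; UTF-16 additionally combines a high and a low surrogate into one code point above U+FFFF, which UCS-2 cannot express. The function names are hypothetical and the code is plain C rather than GLib.

```c
#include <stddef.h>
#include <stdint.h>

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/*
 * Decode one code point from UTF-16 code units starting at s[*i] and
 * advance *i past it. The caller must ensure *i < len on entry.
 * Returns U+FFFD for an unpaired surrogate.
 */
static uint32_t utf16_next(const uint16_t *s, size_t len, size_t *i)
{
    uint16_t hi = s[(*i)++];

    if (is_high_surrogate(hi) && *i < len && is_low_surrogate(s[*i])) {
        uint16_t lo = s[(*i)++];
        return 0x10000 + (((uint32_t)hi - 0xD800) << 10) + (lo - 0xDC00);
    }
    if (is_high_surrogate(hi) || is_low_surrogate(hi))
        return 0xFFFD;                  /* lone surrogate */
    return hi;                          /* BMP: same value as in UCS-2 */
}
```

A reader that treats the data as UCS-2 would see the two surrogate units as two separate, invalid characters, which is why the choice only matters for names outside the BMP.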
Comment 8 Eric Ross 2007-06-11 23:27:46 UTC
I don't know of any public docs about it. I only use a hex editor to look for patterns in the psd files, so I can't answer that. I just picked the newer one on the assumption that Adobe would be adopting it as they release newer versions.
Comment 9 Sven Neumann 2007-06-12 06:38:36 UTC
They can't really adopt a newer encoding in a file format without introducing new tags. So it's safer to assume that the names are in UCS-2 encoding. I have applied your patch, changed the encoding to UCS-2 and did some minor coding style cleanups.

Thanks a lot for this contribution. Now I wonder if we should also add support for writing long layer names to psd-save.

2007-06-12  Sven Neumann  <sven@gimp.org>

	* plug-ins/common/psd-load.c: applied slightly modified patch from
	Eric Ross that adds support for loading long layer names from the
	extra layer data section (bug #445316).
Comment 10 Eric Ross 2007-06-12 08:54:28 UTC
Thanks for adding that. Shouldn't be too difficult to add support for saving them. I'll look into it this week.
Comment 11 Eric Ross 2007-06-13 03:11:47 UTC
Created attachment 89865
Patch to write long layer names to psd files.

Here's my first attempt at writing long layer names. Seems to work for everything I throw at it. Not quite sure about style yet.
Comment 12 Sven Neumann 2007-06-13 07:08:30 UTC
Should the long name always be written? It would perhaps make sense to only write it if the layer name is longer than 31 characters or contains non-ASCII characters. What does PS do?

Your code is problematic because it doesn't check if the UTF-8 to UCS-2 conversion has succeeded. Not all strings encoded in UTF-8 are representable in UCS-2 encoding.
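
Both points can be sketched with two small checks in plain C. This is only an illustration with hypothetical function names, not the code from the patch. It assumes the 31-character limit applies to the byte length of the name and that the input is well-formed UTF-8.

```c
#include <stddef.h>
#include <string.h>

#define PSD_SHORT_NAME_MAX 31   /* limit quoted in this report */

/* True if the name does not fit the short field or is not pure ASCII. */
static int needs_long_name(const char *utf8)
{
    if (strlen(utf8) > PSD_SHORT_NAME_MAX)
        return 1;
    for (const unsigned char *p = (const unsigned char *)utf8; *p; p++)
        if (*p >= 0x80)
            return 1;
    return 0;
}

/*
 * True if every character fits a single 16-bit unit (UCS-2).
 * In well-formed UTF-8, code points above U+FFFF are exactly those
 * whose lead byte is 0xF0 or higher.
 */
static int fits_ucs2(const char *utf8)
{
    for (const unsigned char *p = (const unsigned char *)utf8; *p; p++)
        if (*p >= 0xF0)
            return 0;
    return 1;
}
```

When going through g_convert() instead, the equivalent safeguard is to check for a NULL return value and inspect the GError before writing anything to the file.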
Comment 13 Sven Neumann 2007-06-13 07:09:08 UTC
Oh, and could you please open a new bug report for saving.
Comment 14 John Marshall 2007-06-21 08:02:31 UTC
Current versions of PS always save the layer name in the current character set in an image resource and as UTF-16 in a layer resource block. Attached below is an example of a file where the layer names do not display under Windows XP.
Comment 15 John Marshall 2007-06-21 08:07:27 UTC
Created attachment 90381
PSD with broken layer names in WinXP
Comment 16 Eric Ross 2007-06-21 19:35:06 UTC
(In reply to comment #14)
> Current versions of PS always save the layer name in the current character set
> in an image resource and as UTF-16 in a layer resource block. Attached below is
> an example of a file where the layer names do not display under Windows XP.
> 

I can see where the slice name is in the image resources but not the layer name. I viewed this file with PS in Windows and Gimp in Linux and I can see 2 layers in this image, one named "Background" and another named "Color Fill 1". What does gimp show in Windows for the layer names?
Comment 17 John Marshall 2007-06-21 21:27:50 UTC
Sorry, I was being dim. It's alpha channel names that are stored in the image resource; ASCII layer names are stored in the layer record. A screenshot of the layer names under Windows follows.
Comment 18 John Marshall 2007-06-21 21:28:30 UTC
Created attachment 90417
Screenshot of layer names under Windows
Comment 19 Eric Ross 2007-06-22 01:05:59 UTC
(In reply to comment #18)
> Created an attachment (id=90417)
> Screenshot of layer names under Windows
> 

Interesting. My patch is making a call to g_convert() to convert the Unicode string into a UTF-8 string. I think that maybe your font isn't supporting the UTF-8 string that's being returned. I'm not sure which part is at fault here but I'm inclined to think that it's the font being used. I'll have to look into it some more.
Comment 20 Sven Neumann 2007-06-22 06:07:06 UTC
The string only consists of ASCII characters so it's very unlikely that the font is to blame here.

More likely what's happening is that iconv on Windows doesn't support the conversion we are asking for and what you are seeing are the fallback characters (and apparently you don't have a font to render those).

Eric, please try to be more precise when it comes to Unicode and encodings. A UTF-8 encoded string is also a Unicode string.
Comment 21 Sven Neumann 2007-06-22 06:17:04 UTC
John, can you make out the numbers written into the boxes? They are unreadable on your screenshot. But perhaps if you increased the font size, you might be able to make out the numbers. These are the code points and they might give us a hint on what's going wrong here.
Comment 22 Eric Ross 2007-06-22 06:29:56 UTC
(In reply to comment #20)
> The string only consists of ASCII characters so it's very unlikely that the
> font is to blame here.
> 
> More likely what's happening is that iconv on Windows doesn't support the
> conversion we are asking for and what you are seeing are the fallback
> characters (and apparently you don't have a font to render those).
> 
> Eric, please try to be more precise when it comes to Unicode and encodings. A
> UTF-8 encoded string is also a Unicode string.
> 

I used 'Unicode string' to refer to the string provided by Photoshop and 'UTF-8 string' to refer to the string returned from g_convert(). I suppose that I should avoid such shortcuts since they're confusing.
Comment 23 John Marshall 2007-06-22 20:03:04 UTC
(In reply to comment #21)
> John, can you make out the numbers written into the boxes? They are unreadable
> on your screenshot. But perhaps if you increased the font size, you might be
> able to make out the numbers. These are the code points and they might give us
> a hint on what's going wrong here.
> 
They are all four zeros. If this is real and not just pretty pictures from Windows, it might suggest that the byte swapping is not working as it should for the double-byte Unicode characters. Also, I would suspect that recent versions of PS use UTF-16, not UCS-2, as the character encoding, as this is what is specified in the XMP documentation.
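
To illustrate the byte-order concern: PSD data is stored most significant byte first, so on a little-endian machine (a typical Windows PC) copying two bytes straight into a 16-bit integer yields the swapped value. The sketch below, with hypothetical helper names, shows a read that is independent of the host byte order. Whether byte order is really the cause here is not established at this point in the discussion.

```c
#include <stdint.h>

/* Read one 16-bit code unit stored most significant byte first. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Swap the two bytes of a 16-bit code unit. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

For the letter 'B' stored as the bytes 00 42, read_be16() gives 0x0042 on any host, whereas a unit read with the wrong byte order would come out as swap16(0x0042), that is 0x4200, a CJK code point rather than a Latin letter.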
Comment 24 Sven Neumann 2007-06-22 21:35:03 UTC
How does the XMP specification apply here? For the strings used here, it should also not make a difference.
Comment 25 John Marshall 2007-06-22 21:45:49 UTC
The XMP spec is the most recent freely available spec from adobe relating to PSD files, however as you say it would not make a difference with these strings.
Comment 26 Sven Neumann 2007-06-23 09:34:29 UTC
Can you please add a link to that spec to this bug-report then?
Comment 27 Kevin Cozens 2007-06-23 15:57:08 UTC
A google search turned up the URL http://www.adobe.com/devnet/xmp/ which has some information about the XMP format and a link to the PDF file containing the spec.
Comment 28 John Marshall 2007-06-26 16:18:11 UTC
Created attachment 90671
Patch to fix layer name display under windows

This patch fixes the display of Unicode layer names under Windows (also tested on Fedora 7) and fixes a crash if the UTF-8 representation of the short layer name contains multibyte characters (a regression from 2.2).
Comment 29 Sven Neumann 2007-07-06 12:47:04 UTC
If we use g_utf16_to_utf8() then we should probably also use g_utf8_to_utf16() in psd-save.c. For now I have committed the uncontroversial part of this patch:

2007-07-06  Sven Neumann  <sven@gimp.org>

	* plug-ins/common/psd-load.c (do_layer_record): applied part of a
	patch from John Marshall that fixes handling of the short layer
	name (bug #445316).
Comment 30 Sven Neumann 2007-07-06 12:55:20 UTC
Reopening this bug since we still need to fix the encoding issue.
Comment 31 Sven Neumann 2007-07-06 12:57:20 UTC
Created attachment 91299
patch to change psd-load and psd-save to use UTF16 instead of UCS-2

This is an untested patch that I propose as a solution for this issue. Please review it and comment.
Comment 32 Sven Neumann 2007-07-06 14:26:34 UTC
With this patch applied, I can save a PSD file with a long Chinese layer name. Opening this file in GIMP yields the same layer name. Now we need someone to open such a file in PS and to test the loader with a file written by PS.
Comment 33 John Marshall 2007-07-06 16:25:12 UTC
Sven,

With the patch applied to my Windows build of GIMP, I can save PSD files with layer names containing 2-byte characters, which display correctly in Photoshop CS3. I can also confirm that files saved from PS with 2-byte characters in the layer names load correctly in GIMP.
Comment 34 Sven Neumann 2007-07-06 18:03:36 UTC
Thanks for testing, I have committed it then.

2007-07-06  Sven Neumann  <sven@gimp.org>

	* plug-ins/common/psd-load.c 
	* plug-ins/common/psd-save.c: use UTF-16 encoding instead of UCS-2
	for layer names (bug #445316).