Bug 743642 – No way to see Unicode data



Summary:	No way to see Unicode data


Status:	RESOLVED OBSOLETE

Product:	gnome-characters
Classification:	Other
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	GNOME Characters maintainer(s)
QA Contact:	GNOME Characters maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2015-01-28 12:57 UTC by Bastien Nocera
Modified:	2018-02-08 13:13 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
screenshot of character dialog (500.90 KB, image/png) 2015-02-09 23:49 UTC, Daiki Ueno

Description Bastien Nocera 2015-01-28 12:57:14 UTC

As done in gucharmap, which is useful for programmers

Comment 1 Daiki Ueno 2015-02-09 23:49:03 UTC

Created attachment 296420
screenshot of character dialog

I'm working on this in wip/properties branch:
https://git.gnome.org/browse/gnome-characters/log/?h=wip/properties
and attaching a screenshot.

Comment 2 Allan Day 2015-02-10 18:09:33 UTC

(In reply to Daiki Ueno from comment #1)
...
> I'm working on this in wip/properties branch:
> https://git.gnome.org/browse/gnome-characters/log/?h=wip/properties
> and attaching a screenshot.

Thanks Daiki! I can help with the design. Looking at the screenshot, the Unicode code seems obviously useful, but I'm not too sure about the other pieces of information there. Can you explain what they might be used for?

Comment 3 Daiki Ueno 2015-02-11 01:37:19 UTC

Thanks for looking into it.  I agree that some fields are not useful.

"Script:" and "Block:" are there to provide a possible link to those pages (bug 743643).  They would be redundant if we had a better presentation of them.  Perhaps the simplest way might be to make "search" match against script name / block name.

"Decomposition:" would be useful for characters in complex scripts, where character composition is not so obvious (e.g. "한" = "ㅎ" + "ㅏ" + "ㄴ", in Korean).  However, for other simpler cases (e.g. "è" = "e" + "`"), they could be moved to "See also".
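For illustration, a minimal Python sketch of looking up a canonical decomposition with the standard unicodedata module. The helper name decompose is hypothetical and is not taken from gnome-characters. Note that canonical decomposition (NFD) yields conjoining jamo and combining marks, not the standalone compatibility forms written in the example above.

```python
import unicodedata

def decompose(ch: str) -> list:
    """Return the canonical decomposition (NFD) of one character as labelled code points."""
    return [
        f"U+{ord(c):04X} {unicodedata.name(c)}"
        for c in unicodedata.normalize("NFD", ch)
    ]

# U+D55C (Hangul syllable "han") and U+00E8 (e with grave)
for ch in ("\ud55c", "\u00e8"):
    print(ch, decompose(ch))
```

The Hangul syllable decomposes into three jamo (leading consonant, vowel, trailing consonant), while the accented Latin letter decomposes into a base letter plus U+0300 COMBINING GRAVE ACCENT.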

"Category:" (Unicode general category), however, would be useful for programmers working on Unicode text processing (for example, to determine whether the character may constitute a word).
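As a rough sketch of that use case, the general category can be queried in Python with unicodedata.category. The function may_form_word and its rule (letters, marks, and decimal digits count as word constituents) are assumptions made for this example only; real word segmentation follows more detailed rules.

```python
import unicodedata

def may_form_word(ch: str) -> bool:
    """Heuristic (assumed for this example): letters, marks and decimal digits can be part of a word."""
    cat = unicodedata.category(ch)
    return cat[0] in ("L", "M") or cat == "Nd"

# Print the general category next to the heuristic's verdict.
for ch in ("a", "\ud55c", "\u0300", "7", "-", " "):
    print(repr(ch), unicodedata.category(ch), may_form_word(ch))
```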

Comment 4 Allan Day 2015-02-11 16:04:45 UTC

Thanks for the information. What is the difference between script and block? In your screenshot, they are identical.

Also, it would be really nice if the Unicode character names could be in sentence case rather than all capitals. Would that be possible, do you think?
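A minimal sketch of one way to do this, in Python. The helper sentence_case_name is hypothetical; it simply lower-cases everything after the first letter, so words that arguably deserve a capital in the middle of a name (script names, for instance) would lose it.

```python
import unicodedata

def sentence_case_name(ch: str) -> str:
    """Return the Unicode character name with only its first letter capitalised."""
    return unicodedata.name(ch).capitalize()

print(sentence_case_name("\u00e8"))  # Latin small letter e with grave
```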

Comment 5 Allan Day 2015-02-12 17:26:48 UTC

(In reply to Bastien Nocera from comment #0)
> As done in gucharmap, which is useful for programmers

What Unicode data do you use from gucharmap?

Comment 6 Matthias Clasen 2015-08-18 12:43:01 UTC

The various other representations can be useful when you need to insert a Unicode character in different contexts (I have used octal and UTF-8 that way in the past).

The category can be useful when trying to figure out why things render the way they do (I've used this when trying to understand the various complications with the ratio character).

The Unicode version a character appeared in can be relevant when trying to decide whether it is safe to use on older platforms.

Comment 7 Jerry Casiano 2015-11-11 03:27:16 UTC

Hmm, I assumed gnome-characters was intended to be dead simple. Which is perfect for most users. They don't care about any of this.

If you do implement all this then you'll just end up with Gucharmap, but worse. So maybe users that require this should just use Gucharmap. Unless it's going away completely, hope not, I use the library.

Comment 8 Matthias Clasen 2015-11-11 13:46:03 UTC

I don't think 'character map' is a complex enough area to justify having a 'beginners' and an 'expert' tool as separate apps, tbh.

Comment 9 Jerry Casiano 2015-11-11 20:31:22 UTC

(In reply to Matthias Clasen from comment #8)
> I don't think 'character map' is a complex enough area to justify having a
> 'beginners' and an 'expert' tool as separate apps, tbh.

Completely agree.

But with the introduction of gnome-characters that is exactly what we have.

gnome-characters looks fantastic, and is perfect for average users. Someone who's just looking for an emoticon or something silly.

gucharmap, at least the default interface, really is overkill for that use case.

But for people who actually require something like gucharmap, who need to be able to find glyphs by Unicode Script or Block, see whether a particular font contains the glyphs they require, etc., gnome-characters completely misses the point and will likely never be a suitable replacement. I could be wrong about this, just my impression.

It's unfortunate that gucharmap and the library it provides weren't reworked to allow for a simplified view by default, while retaining the more advanced aspects. But I'm sure there were good reasons not to, not trying to question that.

It's also unfortunate gnome-characters is not built on a library that other programs could use. Would be great for consistency.

Just my 2c.

Comment 10 Bastien Nocera 2015-11-12 10:35:44 UTC

(In reply to Jerry Casiano from comment #9)
> (In reply to Matthias Clasen from comment #8)
> > I don't think 'character map' is a complex enough area to justify having a
> > 'beginners' and an 'expert' tool as separate apps, tbh.
> 
> Completely agree.
> 
> But with the introduction of gnome-characters that is exactly what we have.

That's definitely not the goal. gnome-characters is supposed to replace gucharmap with a UI that better matches GNOME 3.

<snip>
> But for people who actually require something like gucharmap, who need to
> be able to find glyphs by Unicode Script or Block,

This isn't what's requested.

> see whether a particular
> font contains the glyphs they require, etc.

Neither is that.

> gnome-characters completely
> misses the point and will likely never be a suitable replacement. I could be
> wrong about this, just my impression.

You're wrong. You're making assumptions based on UI that's not written yet, and not even designed.

> It's unfortunate that gucharmap and the library it provides weren't reworked
> to allow for a simplified view by default, while retaining the more advanced
> aspects. But I'm sure there were good reasons not to, not trying to question
> that.
> 
> It's also unfortunate gnome-characters is not built on a library that other
> programs could use. Would be great for consistency.

Whether or not it's a library has nothing to do with how we're going to integrate a feature in the UI.

> Just my 2c.

(In reply to Daiki Ueno from comment #3)
> Thanks for looking into it.  I agree that some fields are not useful.
<snip>

None of the properties added to the dialogue are useful to me (as a programmer). I'd need 4 things:
- the UTF-16 representation (for using the character in JavaScript)
- C octal escaped UTF-8 (for using in C)
- XML decimal entity (for using in HTML/XML)
- How to type the character based on the current keymap (that's going to be a harder one).

I don't need the block, the decomposition, or any of that. Somebody else might, but not my limited "programming".
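To make the first three items concrete, here is a minimal Python sketch that derives each representation from a single character. The function names are hypothetical and are not part of gnome-characters; the keymap item is not covered because it depends on the active keyboard layout.

```python
def utf16_escape(ch: str) -> str:
    """JavaScript-style escapes, one \\uXXXX per UTF-16 code unit."""
    data = ch.encode("utf-16-be")
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{u:04X}" for u in units)

def c_octal_utf8(ch: str) -> str:
    """C string escape: each UTF-8 byte as a three-digit octal escape."""
    return "".join(f"\\{b:03o}" for b in ch.encode("utf-8"))

def xml_decimal_entity(ch: str) -> str:
    """XML/HTML decimal numeric character reference."""
    return f"&#{ord(ch)};"

euro = "\u20ac"  # EURO SIGN
print(utf16_escape(euro))        # \u20AC
print(c_octal_utf8(euro))        # \342\202\254
print(xml_decimal_entity(euro))  # &#8364;
```

Characters outside the Basic Multilingual Plane produce a surrogate pair in the UTF-16 form, for example \uD83D\uDE00 for U+1F600.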

Comment 11 Jerry Casiano 2015-11-12 13:26:18 UTC

(In reply to Bastien Nocera from comment #10)
> (In reply to Jerry Casiano from comment #9)
> 
> > gnome-characters completely
> > misses the point and will likely never be a suitable replacement. I could be
> > wrong about this, just my impression.
> 
> You're wrong. You're making assumptions based on UI that's not written yet,
> and not even designed.

Actually, I'm making assumptions based on UI that's running on my desktop right now.

> Whether or not it's a library has nothing to do with how we're going to
> integrate a feature in the UI.

Of course not, I was just hoping it would be shared.


In any case, I'll see myself out. Was not my intention to annoy or derail.

Comment 12 Daiki Ueno 2015-11-12 20:19:41 UTC

(In reply to Bastien Nocera from comment #10)
> None of the properties added to the dialogue are useful to me (as a
> programmer). I'd need 4 things:
> - the UTF-16 representation (for using the character in JavaScript)
> - C octal escaped UTF-8 (for using in C)
> - XML decimal entity (for using in HTML/XML)

I feel it is a bit redundant to have them, since they can be easily calculated from a codepoint, and some text editors have support for that (I use encode-coding-region in Emacs).

> - How to type the character based on the current keymap (that's going to be
> a harder one).

Couldn't we always suggest Shift-Ctrl-u + codepoint?

Comment 13 Bastien Nocera 2015-11-17 11:02:13 UTC

(In reply to Daiki Ueno from comment #12)
> (In reply to Bastien Nocera from comment #10)
> > None of the properties added to the dialogue are useful to me (as a
> > programmer). I'd need 4 things:
> > - the UTF-16 representation (for using the character in JavaScript)
> > - C octal escaped UTF-8 (for using in C)
> > - XML decimal entity (for using in HTML/XML)
> 
> I feel it is a bit redundant to have them, since they can be easily calculated
> from a codepoint, and some text editors have support for that (I use
> encode-coding-region in Emacs).

Well, that's the only thing I'd use gnome-characters for, I have no use for any of the metadata you showed in the screenshot.

> > - How to type the character based on the current keymap (that's going to be
> > a harder one).
> 
> Couldn't we always suggest Shift-Ctrl-u + codepoint?

That'd be fairly useless, especially for accents or currency symbols which might already be available through AltGr, and similar modifier keys.

Comment 14 Daiki Ueno 2015-11-17 12:26:45 UTC

(In reply to Bastien Nocera from comment #13)
> Well, that's the only thing I'd use gnome-characters for,

I see your point, but I personally think it is not a good idea to present an encoded byte sequence to a user, even if she is a programmer, since modern programming languages (including C11) provide \u or \U escapes.  The decimal representation for XML might be useful though.
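A minimal Python sketch of producing such an escape directly from the code point, with no encoding step. The function name is hypothetical. It emits \u with four hex digits for code points up to U+FFFF and \U with eight digits otherwise; it does not check language-specific restrictions on which code points may be written this way.

```python
def universal_character_name(ch: str) -> str:
    """Format a character as a \\uXXXX or \\UXXXXXXXX escape based on its code point."""
    cp = ord(ch)
    return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"

print(universal_character_name("\u20ac"))      # \u20AC
print(universal_character_name("\U0001F600"))  # \U0001F600
```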

> I have no use for any of the metadata you showed in the screenshot.

Certainly, such information might be useless for some scripts, e.g. Latin.  However, it is useful for CJK, where there are even input methods based on "radicals":
https://en.wikipedia.org/wiki/Cangjie_input_method

I think the useful set of properties depends on the character itself.  We can hide or show them according to circumstances.

> > Couldn't we always suggest Shift-Ctrl-u + codepoint?
> 
> That'd be fairly useless, especially for accents or currency symbols which
> might already be available through AltGr, and similar modifier keys.

Yes, we can suggest the shortest key sequence, depending on the current keymap.

Comment 15 GNOME Infrastructure Team 2018-02-08 13:13:24 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gnome-characters/issues/1.