Bug 784246 - Unset encoding in text-highlight module output
Status: RESOLVED FIXED
Product: evolution
Classification: Applications
Component: general
Version: 3.24.x (obsolete)
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Evolution Shell Maintainers Team
QA Contact: Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2017-06-27 12:56 UTC by Volker Sobek (weld)
Modified: 2017-06-29 10:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Preview of UTF-8 encoded file with scrambled chars (24.63 KB, image/png)
2017-06-27 12:56 UTC, Volker Sobek (weld)
Effect of different file endings (29.88 KB, image/png)
2017-06-27 13:07 UTC, Volker Sobek (weld)

Description Volker Sobek (weld) 2017-06-27 12:56:40 UTC
Created attachment 354566 [details]
Preview of UTF-8 encoded file with scrambled chars

When attaching any UTF-8 encoded plain text file, the attachment preview does not recognize UTF-8; it decodes the file with some other encoding and scrambles the text. See the attachment for an example. Saving the file from Evolution yields a valid UTF-8 file again.

evolution-3.24.3-1.fc26.x86_64
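
For illustration, the kind of scrambling described above can be reproduced outside Evolution. This is a minimal sketch that assumes the wrong encoding is ISO-8859-1 (the one identified later in comment 4); the sample string is made up and is not taken from the attached screenshot.

```python
# Illustration only: a made-up sample string, not the reporter's file.
original = "café"

# The bytes as they would be stored in a UTF-8 encoded attachment.
utf8_bytes = original.encode("utf-8")

# A renderer that wrongly assumes ISO-8859-1 turns every byte into one
# character, so the two-byte sequence for "é" becomes two characters.
misread = utf8_bytes.decode("iso-8859-1")

# Decoding the same bytes as UTF-8 recovers the text, which is consistent
# with the saved file still being valid UTF-8.
recovered = utf8_bytes.decode("utf-8")

print(misread)
print(recovered)
```

The bytes themselves are never damaged in this sketch; only their interpretation changes, which would explain why saving the attachment gives back an intact file.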
Comment 1 Volker Sobek (weld) 2017-06-27 13:07:16 UTC
Created attachment 354567 [details]
Effect of different file endings

It seems to be related to the file ending: Evolution applies syntax highlighting according to the language it guesses from the file ending, and in some cases it then seems to use a different encoding. This screenshot shows the same UTF-8 text file attached several times; the only difference between the copies is the file name ending.
Comment 2 Volker Sobek (weld) 2017-06-27 13:08:50 UTC
(In reply to Volker Sobek (weld) from comment #1)

Forgot to mention that the test.txt preview shows the correct text.
Comment 3 Milan Crha 2017-06-29 09:45:29 UTC
Thanks for the bug report. I can reproduce it too. I agree with you that the issue may lie in the way Evolution calls highlight. Interestingly, right-clicking in the body, choosing Format As->Plain Text, and then switching back to the previous format shows the text properly.
Comment 4 Milan Crha 2017-06-29 10:28:38 UTC
highlight also emits a <meta charset=""> element in the <head> of its HTML output, and when no --encoding argument is given it picks ISO-8859-1 for some reason. I set the encoding to 'none', so the 'meta' element is omitted from the output and the content is shown properly. I also made the module detect whether highlight wrote anything out; if not, it outputs the original content. That is useful for cases where some highlight version does not understand the parameters given by the Evolution module.

Created commit 5e4a3d1 in evo master (3.25.4+)
Created commit 449bf48 in evo gnome-3-24 (3.24.4+)
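
The two changes described in comment 4 can be sketched roughly as follows. This is not the code from the commits above (the Evolution module is not written in Python); it is a hedged illustration. The function names `build_highlight_args` and `render_or_fallback` are hypothetical, and the argument list is an assumption: only `--encoding` set to 'none' is confirmed by this report, while `--out-format` and `--syntax` are highlight options chosen here for the example.

```python
import subprocess
import sys


def build_highlight_args(syntax):
    """Build a highlight command line (hypothetical helper).

    Only --encoding=none comes from the bug report; the other options
    are assumptions made for this sketch.
    """
    return [
        "highlight",
        "--out-format=html",
        "--syntax=" + syntax,
        "--encoding=none",  # omit the charset declaration from the output
    ]


def render_or_fallback(command, content):
    """Pipe content (bytes) through command; fall back to the input.

    If the command writes nothing to stdout, the original content is
    returned unchanged, mirroring the fallback described in comment 4.
    """
    try:
        result = subprocess.run(
            command,
            input=content,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # Extra assumption beyond the report: also fall back when the
        # command cannot be started at all.
        return content
    return result.stdout if result.stdout else content


if __name__ == "__main__":
    source = "/* café */\n".encode("utf-8")
    sys.stdout.buffer.write(render_or_fallback(build_highlight_args("c"), source))
```

Checking for empty output rather than the exit status follows the wording of the comment ("whether anything had been written out"); a real implementation might consider both.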