Bug 754887 - Loading a binary file causes a crash
Status: RESOLVED DUPLICATE of bug 738504
Product: gtksourceview
Classification: Platform
Component: File loading and saving
Version: 3.17.x
Hardware: Other Linux
Importance: Normal minor
Target Milestone: ---
Assigned To: GTK Sourceview maintainers
QA Contact: GTK Sourceview maintainers
Depends on:
Blocks:
 
 
Reported: 2015-09-11 15:25 UTC by Mantas Mikulėnas (grawity)
Modified: 2015-09-12 09:42 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
example file (2.80 KB, application/octet-stream)
2015-09-11 15:25 UTC, Mantas Mikulėnas (grawity)

Description Mantas Mikulėnas (grawity) 2015-09-11 15:25:20 UTC
Created attachment 311156 [details]
example file

Some programs crash when trying to display non-UTF-8 files (mainly, binary files):

> Gtk:ERROR:gtktextsegment.c:195:_gtk_char_segment_new: assertion failed: (gtk_text_byte_begins_utf8_char (text))
> Aborted (core dumped)

This affects gedit and, more importantly, gnome-terminal (where it's not uncommon to accidentally `cat` a binary file).

gtk3 3.17.8.r116.g4993b02 (git master)
Comment 1 Emmanuele Bassi (:ebassi) 2015-09-11 15:34:40 UTC
The crash is inside GtkCharSegment, which is used by the GtkTextBTree, i.e. GtkTextBuffer — something that is not used by gnome-terminal, so it has no bearing on any gnome-terminal crash.

GtkTextView is a GTK+ widget, and as such it has to receive UTF-8 text only. It's up to the caller to ensure that the text is UTF-8, as adding UTF-8 validation to all text-related API would be excessively expensive.

If gedit is crashing because of an invalid UTF-8 file, then the issue lies in gedit and/or GtkSourceView.

Re-assigning to the right component.
Comment 2 Sébastien Wilmet 2015-09-11 16:06:16 UTC
g_utf8_validate() is called in gtk_text_buffer_emit_insert() and returns TRUE.
But later, gtk_text_byte_begins_utf8_char() returns FALSE.

Anyway, opening binary files with GtkSourceFileLoader (and thus gedit) is known to be buggy. Re-assigning to GtkSourceView.

The backtrace:
  • #0 raise
    from /lib64/libc.so.6
  • #1 abort
    from /lib64/libc.so.6
  • #2 g_assertion_message
  • #3 g_assertion_message_expr
  • #4 _gtk_char_segment_new
    at gtktextsegment.c line 195
  • #5 _gtk_text_btree_insert
    at gtktextbtree.c line 1182
  • #6 gtk_text_buffer_real_insert_text
    at gtktextbuffer.c line 894
  • #7 gtk_source_buffer_real_insert_text
    at gtksourcebuffer.c line 1052
  • #8 _gtk_marshal_VOID__BOXED_STRING_INT
    at gtkmarshalers.c line 3255
  • #9 g_type_class_meta_marshal
    at gclosure.c line 994
  • #10 g_closure_invoke
    at gclosure.c line 801
  • #11 signal_emit_unlocked_R
    at gsignal.c line 3654
  • #12 g_signal_emit_valist
    at gsignal.c line 3372
  • #13 g_signal_emit
    at gsignal.c line 3428
  • #14 gtk_text_buffer_emit_insert
    at gtktextbuffer.c line 917
  • #15 gtk_text_buffer_insert
    at gtktextbuffer.c line 948
  • #16 validate_and_insert
    at gtksourcebufferoutputstream.c line 680
  • #17 gtk_source_buffer_output_stream_write
    at gtksourcebufferoutputstream.c line 1046
  • #18 g_output_stream_write
    at goutputstream.c line 219
  • #19 write_file_chunk
    at gtksourcefileloader.c line 516
  • #20 read_cb
    at gtksourcefileloader.c line 621
  • #21 async_ready_callback_wrapper
    at ginputstream.c line 529
  • #22 g_task_return_now
    at gtask.c line 1104
  • #23 complete_in_idle_cb
    at gtask.c line 1118
  • #24 g_idle_dispatch
    at gmain.c line 5441
  • #25 g_main_dispatch
    at gmain.c line 3154
  • #26 g_main_context_dispatch
    at gmain.c line 3769
  • #27 g_main_context_iterate
    at gmain.c line 3840
  • #28 g_main_context_iteration
    at gmain.c line 3901
  • #29 g_application_run
    at gapplication.c line 2311
  • #30 main
    at gedit/gedit.c line 146

Comment 3 Mantas Mikulėnas (grawity) 2015-09-11 18:01:26 UTC
Seems like GtkSourceView is fine – bisected down to 3188b8e in glib:

commit 3188b8ee791a38ac3dd7e477f30761344442f745
Author: Mikhail Zabaluev <mikhail.zabaluev@gmail.com>
Date:   Tue Oct 14 01:18:57 2014 +0300

    Optimized branching in g_utf8_validate()
Comment 4 Sébastien Wilmet 2015-09-11 18:41:17 UTC
Ok thanks for the git bisect.

See bug #738504.
Comment 5 Mikhail Zabaluev 2015-09-11 19:57:11 UTC
Can you isolate the sequence passed to g_utf8_validate() that's causing this?
Comment 6 Mantas Mikulėnas (grawity) 2015-09-11 20:52:24 UTC
(In reply to Mikhail Zabaluev from comment #5)
> Can you isolate the sequence passed to g_utf8_validate() that's causing this?

I don't know which sequence causes the crash, but here are a few examples that are mistakenly accepted as valid UTF-8:

 * d2 a1 2f 03 b2 88
 * 55 9d b7 85 86 58
 * 5b 01 28 88 91 24
 * a6 a3 30 64 06 03
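
For reference, the four sequences above can be checked against a strict validator. The following is a minimal sketch in plain C, not GLib's implementation: the function name `utf8_first_invalid` and the table `reported_sequences` are made up for illustration, and the byte ranges follow the well-formed UTF-8 table in the Unicode Standard. Unlike g_utf8_validate() with an explicit length, it does not treat NUL bytes specially.

```c
#include <stddef.h>

/* The four 6-byte sequences quoted in the comment above. */
const unsigned char reported_sequences[4][6] = {
    {0xd2, 0xa1, 0x2f, 0x03, 0xb2, 0x88},
    {0x55, 0x9d, 0xb7, 0x85, 0x86, 0x58},
    {0x5b, 0x01, 0x28, 0x88, 0x91, 0x24},
    {0xa6, 0xa3, 0x30, 0x64, 0x06, 0x03},
};

/* Hypothetical helper: returns the offset of the first byte that breaks
 * UTF-8 well-formedness, or n if the whole buffer is well-formed. */
size_t utf8_first_invalid(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char b = s[i];
        size_t need;                        /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF; /* range of the first one */

        if (b <= 0x7F) {                    /* ASCII */
            i++;
            continue;
        } else if (b >= 0xC2 && b <= 0xDF) {
            need = 1;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2;
            if (b == 0xE0) lo = 0xA0;       /* reject overlong forms */
            if (b == 0xED) hi = 0x9F;       /* reject surrogates */
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3;
            if (b == 0xF0) lo = 0x90;       /* reject overlong forms */
            if (b == 0xF4) hi = 0x8F;       /* reject values above U+10FFFF */
        } else {
            return i;   /* stray continuation byte or invalid lead byte */
        }

        if (n - i <= need)
            return i;   /* sequence truncated by end of buffer */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return i;
        for (size_t k = 2; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return i;

        i += need + 1;
    }
    return n;
}
```

Calling `utf8_first_invalid(reported_sequences[k], 6)` for k = 0..3 returns 4, 1, 3 and 0 respectively. Each sequence contains a continuation byte (0x80 to 0xBF) where a lead byte is expected, so none of them is well-formed UTF-8.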
Comment 7 Sébastien Wilmet 2015-09-12 09:29:46 UTC
From the backtrace, the sequence is: "\232\251I"
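
A small sketch of why that sequence trips the assertion, assuming gtk_text_byte_begins_utf8_char() performs the usual continuation-byte test (a byte of the form 10xxxxxx cannot start a character). The names `byte_begins_utf8_char` and `backtrace_seq` are illustrative and are not the GTK+ symbols.

```c
#include <stdbool.h>

/* The sequence from the backtrace, using the same octal escapes:
 * \232 is 0x9A, \251 is 0xA9, and 'I' is 0x49. */
const unsigned char backtrace_seq[] = "\232\251I";

/* Assumed equivalent of the check behind the failed assertion:
 * true unless the byte is a UTF-8 continuation byte (10xxxxxx). */
bool byte_begins_utf8_char(unsigned char b)
{
    return (b & 0xC0) != 0x80;
}
```

Under that assumption, the check is false for the first two bytes (0x9A and 0xA9 are both continuation bytes) and true only for the trailing 'I', so text starting with this sequence does not begin on a character boundary.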
Comment 8 Sébastien Wilmet 2015-09-12 09:42:17 UTC

*** This bug has been marked as a duplicate of bug 738504 ***