Bug 612082 – Do not expose off_t in public API, use goffset instead



Summary:	Do not expose off_t in public API, use goffset instead


Status:	RESOLVED FIXED

Product:	evolution
Classification:	Applications
Component:	Mailer
Version:	2.28.x (obsolete)
Hardware:	Other All

Importance:	Normal critical
Target Milestone:	---
Assigned To:	evolution-mail-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Duplicates:	612174 614600 614824 615200 616755 617305 617817 617886 619059 619108 619125 619159 619166 619178 619259 619375 619427 619582 620597 621101 621104
Depends on:
Blocks:

Reported:	2010-03-07 12:52 UTC by paultirk
Modified:	2010-10-01 13:07 UTC

See Also:
GNOME target:	---
GNOME version:	2.27/2.28

Attachments
Fix NULL pointer dereference (1.29 KB, patch) 2010-03-24 16:07 UTC, Michel Dänzer	committed
Two mails that are causing a segfault (218.17 KB, application/mbox) 2010-04-28 15:55 UTC, Gert Kulyk
screenshot when opening one of the mails (125.16 KB, image/png) 2010-04-28 17:39 UTC, Gert Kulyk
Valgrind log (444.71 KB, application/octet-stream) 2010-04-28 19:52 UTC, Gert Kulyk
test app (1.80 KB, text/plain) 2010-07-13 11:42 UTC, Milan Crha
eds patch (22.18 KB, patch) 2010-07-13 12:46 UTC, Milan Crha	committed

Description paultirk 2010-03-07 12:52:52 UTC

Version: 2.30.x

What were you doing when the application crashed?
I just changed from one local mail folder to another.


Distribution: Debian squeeze/sid
Gnome Release: 2.28.2 2009-12-18 (Debian)
BugBuddy Version: 2.28.0

System: Linux 2.6.32 #5 PREEMPT Fri Feb 19 14:20:50 CET 2010 i686
X Vendor: The X.Org Foundation
X Vendor Release: 10604000
Selinux: No
Accessibility: Disabled
GTK+ Theme: QtCurve
Icon Theme: Mist
GTK+ Modules: globalmenu-plugin, globalmenu-gnome, gnomebreakpad, canberra-gtk-module

Memory status: size: 183586816 vsize: 183586816 resident: 35954688 share: 23240704 rss: 35954688 rss_rlim: 18446744073709551615
CPU usage: start_time: 1267966142 rtime: 502 utime: 455 stime: 47 cutime:39 cstime: 8 timeout: 0 it_real_value: 0 frequency: 100

Backtrace was generated from '/usr/bin/evolution'

[Thread debugging using libthread_db enabled]
[New Thread 0xadc9bb70 (LWP 4653)]
[New Thread 0xb17feb70 (LWP 4652)]
[New Thread 0xae49cb70 (LWP 4559)]
[New Thread 0xaec9db70 (LWP 4558)]
[New Thread 0xaf7fab70 (LWP 4530)]
[New Thread 0xb0ffdb70 (LWP 4529)]
[New Thread 0xafffbb70 (LWP 4528)]
[New Thread 0xb07fcb70 (LWP 4527)]
[New Thread 0xb29d6b70 (LWP 4523)]
[New Thread 0xb31d7b70 (LWP 4522)]
[New Thread 0xb39fcb70 (LWP 4521)]
[New Thread 0xb41fdb70 (LWP 4520)]
0xffffe424 in __kernel_vsyscall ()

Thread 2 (Thread 0xadc9bb70 (LWP 4653))

#0 __kernel_vsyscall
#1 __lll_lock_wait
from /lib/i686/cmov/libpthread.so.0
#2 _L_lock_881
from /lib/i686/cmov/libpthread.so.0
#3 pthread_mutex_lock
from /lib/i686/cmov/libpthread.so.0
#4 segv_redirect
at main.c line 284
#5 <signal handler called>
#6 em_format_snoop_type
at em-format.c line 2021
#7 em_format_part_as
at em-format.c line 659
#8 make_part_attachment
at prefer-plain.c line 72
#9 export_as_attachments
at prefer-plain.c line 110
#10 org_gnome_prefer_plain_multipart_alternative
at prefer-plain.c line 191
#11 plugin_lib_invoke
at e-plugin-lib.c line 116
#12 e_plugin_invoke
at e-plugin.c line 692
#13 emfh_format_format
at em-format-hook.c line 78
#14 em_format_part_as
at em-format.c line 675
#15 em_format_part
at em-format.c line 704
#16 efh_format_message
at em-format-html.c line 2775
#17 efh_format_exec
at em-format-html.c line 216
#18 mail_msg_proxy
at mail-mt.c line 471
#19 g_thread_pool_thread_proxy
at /build/buildd-glib2.0_2.22.4-1-i386-jRfNZE/glib2.0-2.22.4/glib/gthreadpool.c line 265
#20 g_thread_create_proxy
at /build/buildd-glib2.0_2.22.4-1-i386-jRfNZE/glib2.0-2.22.4/glib/gthread.c line 635
#21 start_thread
from /lib/i686/cmov/libpthread.so.0
#22 clone
from /lib/i686/cmov/libc.so.6



---- Critical and fatal warnings logged during execution ----

** evolution **: categories_icon_theme_hack: assertion `filename != NULL && *filename != '\0'' failed 


----------- .xsession-errors (9 sec old) ---------------------
(epiphany:3063): GLib-GObject-CRITICAL **: g_object_ref: assertion `object->ref_count > 0' failed
(epiphany:3063): GLib-GObject-CRITICAL **: g_object_ref: assertion `object->ref_count > 0' failed
(gnome-panel:3029): GLib-GObject-CRITICAL **: g_object_ref: assertion `object->ref_count > 0' failed
empathy: /usr/lib/libxslt.so.1: no version information available (required by /usr/lib/libwebkit-1.0.so.2)
empathy: /usr/lib/libxslt.so.1: no version information available (required by /usr/lib/libwebkit-1.0.so.2)
empathy: /usr/lib/libxslt.so.1: no version information available (required by /usr/lib/libwebkit-1.0.so.2)
empathy: /usr/lib/libxslt.so.1: no version information available (required by /usr/lib/libwebkit-1.0.so.2)
Gtk-Message: Failed to load module "globalmenu-gnome": libglobalmenu-gnome.so: cannot open shared object file: No such file or directory
** (empathy:4549): WARNING **: _nm_object_get_property: Error getting 'WwanHardwareEnabled' for /org/freedesktop/NetworkManager: (16) No such property WwanHardwareEnabled
kdeinit4: preparing to launch /usr/lib/kde4/kio_pop3.so
Gtk-Message: Failed to load module "globalmenu-gnome": libglobalmenu-gnome.so: cannot open shared object file: No such file or directory
--------------------------------------------------

Comment 1 paultirk 2010-03-07 20:49:41 UTC

I realized that it happens when I have the "Unwanted" or "Spam" folder selected and then try to change to another.

Comment 2 Michel Dänzer 2010-03-24 16:07:51 UTC

Created attachment 156995 [details] [review]
Fix NULL pointer dereference

This patch fixes these crashes here. Can you confirm?

Comment 3 Peter Sääf 2010-03-28 12:18:37 UTC

I can confirm that the patch works. Also, bug #612174 appears to be a dupe.

Comment 4 Akhil Laddha 2010-04-02 04:37:40 UTC

*** Bug 614600 has been marked as a duplicate of this bug. ***

Comment 5 Akhil Laddha 2010-04-14 03:37:43 UTC

*** Bug 615200 has been marked as a duplicate of this bug. ***

Comment 6 Milan Crha 2010-04-27 13:12:56 UTC

(In reply to comment #3)
> I can confirm that the patch works.

Peter, are you sure that this patch is the reason for the fix? From what I see in the source code, there is pretty much no difference between before and after the patch, because mem->buffer shouldn't be NULL at any point in the lifetime of the memory stream. Note that:
> CamelStream *
> camel_stream_mem_new (void)
> {
> 	return camel_stream_mem_new_with_byte_array (g_byte_array_new ());
> }
> ...
> CamelStream *
> camel_stream_mem_new_with_byte_array (GByteArray *buffer)
> {
> 	CamelStreamMem *stream_mem;
>
> 	stream_mem = CAMEL_STREAM_MEM (camel_object_new (CAMEL_STREAM_MEM_TYPE));
> 	stream_mem->buffer = buffer;
> 	stream_mem->owner = TRUE;
>
> 	return CAMEL_STREAM (stream_mem);
> }

Comment 7 Michel Dänzer 2010-04-27 16:53:26 UTC

(In reply to comment #6)
> Peter, are you sure that this patch is a reason for the fix? What I see in the
> source code is that there is pretty much no difference between before and after
> the patch, because the mem->buffer shouldn't be NULL for all the life time of
> the memory stream.

With the patch, mem->buffer isn't used in this case. I fully suspect that this merely papers over the real issue, but it definitely fixes these crashes.

Comment 8 Gert Kulyk 2010-04-28 05:51:22 UTC

I can confirm that the patch works, too. It resolves the issue described in Bug 616755 (where you'll find a backtrace with debugging symbols).

Comment 9 Milan Crha 2010-04-28 10:26:43 UTC

*** Bug 616755 has been marked as a duplicate of this bug. ***

Comment 10 Milan Crha 2010-04-28 10:27:14 UTC

*** Bug 612174 has been marked as a duplicate of this bug. ***

Comment 11 Milan Crha 2010-04-28 11:53:30 UTC

Could any of you give me the exact message structure with which this is reproducible, please? A whole test message would be best, but as it can contain confidential information, I do not want to ask for it. I would like to know what's wrong here, because it doesn't make much sense to me. Also, on what account type are you seeing this? (IMAP/Local On This Computer/...)

I tried with IMAP with this message structure and it works just fine for me, with no issues so far, not even from valgrind:

To: xxx
...
Content-Type: multipart/mixed; boundary="_005_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_"

--_005_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_
Content-Type: multipart/alternative; boundary="_000_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_"


--_000_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable


--_000_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"urn:schemas-micr=
...
</html>

--_000_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_--

--_005_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_
Content-Type: application/pdf; name="DefaultID.pdf"
Content-Description: DefaultID.pdf
Content-Disposition: attachment; filename="DefaultID.pdf"; size=84253; creation-date="Wed, 16 Aug 2006 10:08:50 GMT"; modification-date="Wed, 16 Aug 2006 10:08:50 GMT"
Content-Transfer-Encoding: base64

JVBERi0xLjYNJeLjz9MNCjEzIDAgb2JqDTw8L0xpbmVhcml6ZWQgMS9MIDgwNjUxL08gMTYvRSA5
...
MDAwIG4NCnRyYWlsZXINCjw8L1NpemUgMTM+Pg0Kc3RhcnR4cmVmDQoxMTYNCiUlRU9GDQo=

--_005_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_
Content-Type: application/pdf; name="AdobeID.pdf"
Content-Description: AdobeID.pdf
Content-Disposition: attachment; filename="AdobeID.pdf"; size=85672; creation-date="Wed, 16 Aug 2006 10:09:20 GMT"; modification-date="Wed, 16 Aug 2006 10:09:20 GMT"
Content-Transfer-Encoding: base64

JVBERi0xLjYNJeLjz9MNCjEzIDAgb2JqDTw8L0xpbmVhcml6ZWQgMS9MIDgyMDcwL08gMTYvRSAx
...
CnRyYWlsZXINCjw8L1NpemUgMTM+Pg0Kc3RhcnR4cmVmDQoxMTYNCiUlRU9GDQo=

--_005_19A951631061BE428ED3215371653D6A7000windows2003r2exchan_--

Comment 12 Gert Kulyk 2010-04-28 15:55:07 UTC

Created attachment 159803 [details]
Two mails that are causing a segfault

Here are two mails that are causing a segfault/crash. These are not the only ones, but they are the only ones I can post for reasons of privacy. The account type does not matter; it always segfaults/crashes when opening such mails.

Comment 13 Milan Crha 2010-04-28 16:34:59 UTC

Thanks for the test messages. I just tried with 2.30.1 and I do not see any crash with them, no matter what I have set in Edit->Preferences->Mail Preferences->tab "HTML Messages", section "Plain Text Mode". Either I'm doing something wrong, or it got fixed by something else in the meantime.

Comment 14 Gert Kulyk 2010-04-28 17:20:56 UTC

I can always reproduce it, running evo 2.30.1.2. Only with the patch applied am I able to open the above-mentioned messages.

E.g. Dienstagsbrief Nr. 16 causes the following:

[Thread debugging using libthread_db enabled]
[New Thread 0xae2acb70 (LWP 12097)]
[New Thread 0xafcfeb70 (LWP 12090)]
[New Thread 0xb0e27b70 (LWP 12088)]
[New Thread 0xb1628b70 (LWP 12087)]
[New Thread 0xb1e4eb70 (LWP 12086)]
[New Thread 0xb264fb70 (LWP 12085)]
0xb7829424 in __kernel_vsyscall ()

Thread 2 (Thread 0xae2acb70 (LWP 12097))

#0 __kernel_vsyscall
#1 __lll_lock_wait
at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S line 142
#2 _L_lock_881
from /lib/i686/cmov/libpthread.so.0
#3 __pthread_mutex_lock
at pthread_mutex_lock.c line 61
#4 <signal handler called>
#5 em_format_snoop_type
at em-format.c line 2021
#6 em_format_part_as
at em-format.c line 659
#7 em_format_part
at em-format.c line 704
#8 emf_multipart_mixed
at em-format.c line 1435
#9 em_format_part_as
at em-format.c line 675
#10 em_format_part
at em-format.c line 704
#11 efh_format_message
at em-format-html.c line 2782
#12 efh_format_exec
at em-format-html.c line 216
#13 mail_msg_proxy
at mail-mt.c line 471
#14 g_thread_pool_thread_proxy
at /build/buildd-glib2.0_2.24.0-1-i386-o5zIuQ/glib2.0-2.24.0/glib/gthreadpool.c line 315
#15 g_thread_create_proxy
at /build/buildd-glib2.0_2.24.0-1-i386-o5zIuQ/glib2.0-2.24.0/glib/gthread.c line 1893
#16 start_thread
at pthread_create.c line 300
#17 clone
at ../sysdeps/unix/sysv/linux/i386/clone.S line 130

Comment 15 Gert Kulyk 2010-04-28 17:39:37 UTC

Created attachment 159815 [details]
screenshot when opening one of the mails

Sometimes evo does not segfault, it simply freezes...

Comment 16 Gert Kulyk 2010-04-28 19:52:49 UTC

Created attachment 159824 [details]
Valgrind log

Here is a valgrind log from opening the mail mentioned; maybe it helps you to track the bug down. It is strange that you can't reproduce the segfault/crash with the attached mails, because I'm testing evo compiled from upstream tarballs without 3rd-party patches applied, and for me it is always reproducible.

Comment 17 Milan Crha 2010-04-29 18:50:43 UTC

Thanks for the update. It's really accessing invalid memory, but the memory wasn't allocated for some reason (see the last quoted line):
> Thread 7:
> Invalid read of size 4
>    at 0x63183C0: em_format_snoop_type (em-format.c:2021)
>    by 0x631A21A: em_format_part_as (em-format.c:659)
>    by 0x631A354: em_format_part (em-format.c:704)
>    by 0x631AD98: emf_multipart_mixed (em-format.c:1435)
>    by 0x631A20A: em_format_part_as (em-format.c:675)
>    by 0x631A354: em_format_part (em-format.c:704)
>    by 0x64FE238: efh_format_message (em-format-html.c:2782)
>    by 0x64FC623: efh_format_exec (em-format-html.c:216)
>    by 0x6510EC7: mail_msg_proxy (mail-mt.c:471)
>    by 0x517F62B: g_thread_pool_thread_proxy (gthreadpool.c:315)
>    by 0x517D71E: g_thread_create_proxy (gthread.c:1893)
>    by 0x47CE584: start_thread (pthread_create.c:300)
>    by 0x52B029D: clone (clone.S:130)
>  Address 0x3 is not stack'd, malloc'd or (recently) free'd

Do you think your distribution is using any patches to the official release? This is for evolution-data-server and I suppose you do not compile it yourself, do you? Maybe some level of compiler optimization involved here? I'll try and report back tomorrow or so.

Comment 18 Gert Kulyk 2010-04-29 21:57:04 UTC

> Do you think your distribution is using any patches to the official release?
The distro (debian) does and I use the debian config as a base for compiling. 
The patches to eds are not worth mentioning; the only one I'm using from debian is the one that relocates camel-provider-dir to be compatible with the distro as a whole.

> This is for evolution-data-server and I suppose you do not compile it yourself,
> do you?
As said, I do. 

> Maybe some level of compiler optimization involved here?
No, only defaults.

But after you've pointed me to eds, I've started to play around with the configure switches. Debian passes "--enable-largefile" to evolution-data-server. 

After building an eds package without that switch, evo stopped segfaulting. So there seems to be something wrong with largefile-support.

Comment 19 Andreas Proschofsky 2010-04-29 22:23:37 UTC

(In reply to comment #18)

> But after you've pointed me to eds, I've started to play around with the
> configure switches. Debian passes "--enable-largefile" to
> evolution-data-server. 
> 
> After building an eds package without that switch, evo stopped segfaulting. So
> there seems to be something wrong with largefile-support.

Good catch, same situation here with Gentoo. After rebuilding without largefile support, my crashes on sending Exchange messages are gone, too. See:

https://bugzilla.gnome.org/show_bug.cgi?id=612178#c9

Comment 20 Matthew Barnes 2010-04-30 00:04:35 UTC

Well, that answers the question of whether turning on large file support breaks existing installs.  Apparently it does.  Damn.

For the record, large file support itself is not broken.  As I understand it, one of our binary cache files (I still haven't figured out which) has a field or fields whose byte size depends on sizeof(off_t).  Toggling large file support changes the result of sizeof(off_t).  If my understanding is correct, it boils down to one of the binary files being misread and not having sufficient input validation.

Which binary file has this issue is the million-dollar question.  That's what we need to hunt down.
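
For illustration only, here is a minimal sketch of how a single translation unit can pin down and report the width of off_t. The helper names `off_t_width` and `require_off_t_width` are hypothetical and are not part of Camel or eds; the only assumption about the platform is that it honours the `_FILE_OFFSET_BITS` feature test macro (as glibc does).

```c
/* Request a 64-bit off_t; this must precede every system header. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Width of off_t, in bytes, as seen by this translation unit. */
size_t off_t_width(void)
{
    return sizeof(off_t);
}

/* Hypothetical guard: abort if off_t does not have the width the
 * caller was built to expect. */
void require_off_t_width(size_t expected)
{
    assert(off_t_width() == expected);
}
```

Without the `#define`, a 32-bit glibc build would typically report 4 instead of 8, which is the size change described in the comment above.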

Comment 21 Milan Crha 2010-04-30 14:33:04 UTC

Hrm, it probably has something to do with optimisation too, because I enabled large file support and nothing went wrong; everything works as well as before. I use -O0.
This is Fedora 12, i686.

Comment 22 Gert Kulyk 2010-04-30 16:22:42 UTC

I've done a jhbuild of evolution (gnome 2.30.1 tarball release moduleset) with CFLAGS set to '-O0 -g', so only upstream tarballs, no patches, no optimization, nothing skipped in favour of system libs, the only change is to enable largefile support in eds. 

What shall I say: evo segfaults for me (yes, "Dienstagsbrief" opens now, but I still have a lot of mails that don't). As I said, it does not happen when either applying Michel Dänzer's patch or omitting "--enable-largefile" from the eds build.

Comment 23 Sven Arvidsson 2010-05-05 20:57:26 UTC

*** Bug 617817 has been marked as a duplicate of this bug. ***

Comment 24 Sven Arvidsson 2010-05-05 20:59:51 UTC

*** Bug 617305 has been marked as a duplicate of this bug. ***

Comment 25 Akhil Laddha 2010-05-06 12:15:02 UTC

*** Bug 617886 has been marked as a duplicate of this bug. ***

Comment 26 Claudio Saavedra 2010-05-18 11:14:19 UTC

*** Bug 614824 has been marked as a duplicate of this bug. ***

Comment 27 Claudio Saavedra 2010-05-18 11:17:16 UTC

I'm hitting this in Debian since they upgraded to 2.30.1.2.

Comment 28 Giuseppe Sacco 2010-05-18 19:45:08 UTC

I got the same SEGV on Debian unstable today.
What I was doing: clicking on an unread email in the email list window. The crash is repeatable using the very same message. The specific message is available.

More information and details are available at the Debian bug #582087: http://bugs.debian.org/582087

Comment 29 Yves-Alexis Perez 2010-05-19 06:18:46 UTC

It's worth noting that, as far as I know, all reports are on i386 installs. Nothing about x86_64 (or other arches where evolution might be installed, like ppc).

Comment 30 Michel Dänzer 2010-05-19 10:07:27 UTC

(In reply to comment #29)
> It's worth noting that, as far as I know, all reports are on i386 installs.
> Nothing about x86_64 (or other arches where evolution might be installed, like
> ppc).

I'm on powerpc.

Comment 31 Yves-Alexis Perez 2010-05-19 10:14:34 UTC

(In reply to comment #30)
> I'm on powerpc.

Thanks for letting us know :)

Comment 32 André Klapper 2010-05-19 11:41:51 UTC

*** Bug 619059 has been marked as a duplicate of this bug. ***

Comment 33 Matthew Barnes 2010-05-19 12:22:53 UTC

I'm applying Michel's patch to gnome-2-30 because with Camel's API being sealed up for 3.0, the patch is essentially what the code looks like now in master.  You can't say "mem->buffer" anymore because the "buffer" member is now private.

I think this only fixes a symptom of a deeper problem though, so leaving the bug open until we get to the bottom of this.

http://git.gnome.org/browse/evolution/commit/?h=gnome-2-30&id=cacfd2114e7dd56cc12613d625bac450cc69b4ae

Comment 34 Matthew Barnes 2010-05-19 20:41:05 UTC

*** Bug 619108 has been marked as a duplicate of this bug. ***

Comment 35 Fabio Durán Verdugo 2010-05-20 00:26:05 UTC

*** Bug 619125 has been marked as a duplicate of this bug. ***

Comment 36 Fabio Durán Verdugo 2010-05-20 15:06:57 UTC

*** Bug 619166 has been marked as a duplicate of this bug. ***

Comment 37 Fabio Durán Verdugo 2010-05-20 15:07:27 UTC

*** Bug 619159 has been marked as a duplicate of this bug. ***

Comment 38 Matthew Barnes 2010-05-22 13:02:09 UTC

I found a couple spots in Camel where we might be getting into trouble with large file support. Again, I don't think the problem is with large file support itself, but with the fact that on a 32-bit system, sizeof(off_t) changes from 4 bytes to 8 bytes.

The first thing I spotted is probably insignificant but I'll mention it anyway. The mbox backend was converting an off_t value to a string improperly:

http://git.gnome.org/browse/evolution-data-server/commit/?id=777c55b67ea450834e53faf72fa6b325c9347071

The second is probably more significant. Camel has a set of functions for encoding and decoding values for use when saving to and loading from binary files:

http://library.gnome.org/devel/camel/stable/camel-camel-file-utils.html

We leaned on these functions much more heavily prior to the introduction of SQLite message summary databases. Of particular note are camel_file_util_encode_off_t() and camel_file_util_decode_off_t(), which use sizeof(off_t) to write to or read from a binary file. The problem here is that if you encode an off_t value with large file support disabled on a 32-bit system, it will write 4 bytes to the file. Then if you rebuild Camel with large file support enabled, decoding an off_t value will read 8 bytes from the file.

However, after grepping for and examining places where these functions are called, it turns out most of the call sites are in dead code -- code for the old disk-based message summaries that's still present but appears to be disabled. After Evolution 2.31.2 ships on Monday, I'll be ripping out all this dead code so I can get a better look at the situation.

If we still have binary files that are holding off_t values, and (hopefully) if those files have a file format version identifier embedded in them, then the solution will likely be to bump the file format version and rewrite the off_t encode/decode functions to always convert off_t values to 64 bits.

All of this is theoretical, however. I still don't have any direct evidence that links the crashes reported here to the bugs I just described.
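
As a rough sketch of the "always convert to 64 bits" idea, the following encodes an offset as exactly 8 big-endian bytes, regardless of sizeof(off_t) in the build that wrote or reads it. The names `encode_offset64`, `decode_offset64` and `OFFSET_WIRE_SIZE` are hypothetical; this is not the Camel implementation, and it works on a byte buffer rather than on a FILE stream to stay self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_WIRE_SIZE 8  /* bytes on disk, independent of sizeof(off_t) */

/* Write value as 8 big-endian bytes. */
void encode_offset64(int64_t value, unsigned char out[OFFSET_WIRE_SIZE])
{
    uint64_t bits = (uint64_t) value;

    for (size_t i = 0; i < OFFSET_WIRE_SIZE; i++)
        out[i] = (unsigned char) (bits >> (8 * (OFFSET_WIRE_SIZE - 1 - i)));
}

/* Read 8 big-endian bytes back into a 64-bit offset. */
int64_t decode_offset64(const unsigned char in[OFFSET_WIRE_SIZE])
{
    uint64_t bits = 0;

    for (size_t i = 0; i < OFFSET_WIRE_SIZE; i++)
        bits = (bits << 8) | in[i];

    return (int64_t) bits;
}

/* Self-check: an offset above 2^32 survives the round trip. */
void offset64_selfcheck(void)
{
    unsigned char buf[OFFSET_WIRE_SIZE];

    encode_offset64(INT64_C(5000000000), buf);
    assert(decode_offset64(buf) == INT64_C(5000000000));
}
```

Because both sides agree on 8 bytes, a reader built with a 4-byte off_t and a writer built with an 8-byte off_t would still consume the same number of bytes from the file.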

Comment 39 Fabio Durán Verdugo 2010-05-22 19:11:21 UTC

*** Bug 619375 has been marked as a duplicate of this bug. ***

Comment 40 Fabio Durán Verdugo 2010-05-22 19:11:57 UTC

*** Bug 619178 has been marked as a duplicate of this bug. ***

Comment 41 Milan Crha 2010-05-24 09:33:55 UTC

(In reply to comment #38)
> http://git.gnome.org/browse/evolution-data-server/commit/?id=777c55b67ea450834e53faf72fa6b325c9347071

Did the above commit introduce this warning? I see it when compiling current master. Maybe some of my libraries are old?

camel-mbox-summary.c: In function ‘message_info_to_db’:
camel-mbox-summary.c:436: warning: format ‘%lli’ expects type ‘long long int’, but argument 2 has type ‘off_t’

Comment 42 Matthew Barnes 2010-05-24 11:14:05 UTC

Yeah, looks like it did.  The value needs to be cast to a goffset.

I guess you're seeing it because you're on a 64-bit machine, and I'm on 32-bit.
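
A minimal sketch of the cast being discussed, written in plain C so it does not need GLib: the off_t is widened to a fixed 64-bit type before formatting, so the format string matches the argument whatever sizeof(off_t) is. Here `int64_t` and `PRId64` stand in for goffset and `G_GINT64_FORMAT`; the function names `format_offset` and `format_offset_selfcheck` are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Format an off_t portably: widen to 64 bits first, then use the
 * matching 64-bit conversion specifier. */
int format_offset(char *buf, size_t len, off_t offset)
{
    return snprintf(buf, len, "%" PRId64, (int64_t) offset);
}

/* Self-check: the formatted text matches the numeric value. */
void format_offset_selfcheck(void)
{
    char buf[32];

    format_offset(buf, sizeof buf, (off_t) 1234);
    assert(strcmp(buf, "1234") == 0);
}
```

Passing an uncast off_t to a `%lli` conversion is only correct when off_t happens to be `long long`, which is why the warning shows up on some builds and not on others.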

Comment 43 Matthew Barnes 2010-05-24 11:21:16 UTC

Following up on my earlier analysis, I've now removed all the unused methods from CamelFolderSummary and the various providers.  The only remaining call to camel_file_util_decode_off_t() is in some mbox migration method, so it looks like that function is not the source of these crashes.

So either it has something to do with the printf thing that I didn't think was relevant or I'm back to the drawing board.

It would be helpful if Debian and any other distro that's enabled large file support could apply the commit in comment #41 (along with a (goffset) type cast) and see if it makes any difference.

Comment 44 Yves-Alexis Perez 2010-05-24 17:48:16 UTC

Hey,

I plan to upload to debian an evolution patched with bc054c94cb46e4f8f8881c2a1b0268e2f05b307b and 4a2343cb34498c701e71679e3c50c9fc81dd5b80 to fix the segfault on 32 bits. Is there anything else I should apply? 777c55b67ea450834e53faf72fa6b325c9347071 I guess, and then?

Comment 45 Yves-Alexis Perez 2010-05-24 17:49:26 UTC

(In reply to comment #44)
> Hey,
> 
> I plan to upload to debian an evolution patched with
> bc054c94cb46e4f8f8881c2a1b0268e2f05b307b and
> 4a2343cb34498c701e71679e3c50c9fc81dd5b80 to fix the segfault on 32 bits. Is
> there anything else I should apply? 777c55b67ea450834e53faf72fa6b325c9347071 I
> guess, and then?

Sorry, the two first ones are unrelated, I meant cacfd2114e7dd56cc12613d625bac450cc69b4ae

Comment 46 Yves-Alexis Perez 2010-05-24 17:50:59 UTC

(And cacfd211 is against evolution while 777c55b6 is against eds, too)

Comment 47 Akhil Laddha 2010-05-25 09:53:41 UTC

*** Bug 619582 has been marked as a duplicate of this bug. ***

Comment 48 Matthew Barnes 2010-05-25 10:36:36 UTC

(In reply to comment #46)
> (And cacfd211 is against evolution while 777c55b6 is against eds, too)

Yes, those two would be the ones to try.

I think I may also enable large file support in Fedora for 2.31.x and see if we can get some more statistics from early adopters.

Comment 49 Akhil Laddha 2010-05-26 05:27:55 UTC

*** Bug 619427 has been marked as a duplicate of this bug. ***

Comment 50 Milan Crha 2010-05-26 18:22:36 UTC

(In reply to comment #42)
> Yeah, looks like it did.  The value needs to be cast to a goffset.
> 
> I guess you're seeing it because you're on a 64-bit machine, and I'm on 32-bit.

Nono, I'm on 32 bit as well, but I do not compile eds with --enable-largefile (for now). It's a big issue, though, as regenerating the folders.db summary stores incorrect offsets in it (some negative number, too large to write here), and although you can see messages on the first run, on the second run they are not viewable. Thus I created commit 4f700fb in eds master (2.31.3+).

Comment 51 Fabio Durán Verdugo 2010-06-04 20:24:16 UTC

*** Bug 620597 has been marked as a duplicate of this bug. ***

Comment 52 Jean-François Fortin Tam 2010-06-05 02:36:02 UTC

Hey, I just wanted to say that I notified the maintainer of https://launchpad.net/~jacob/+archive/evo230 about this problem (having segfaults on startup with evolution 2.30.x with imap+). 

He then released a patched version of evolution (2.30.1.2-2ubuntu1~ppa2 *), and this has immediately fixed my problem (evo started normally without segfaulting). Since then, everything has been smooth sailing on my end. For the record, I'm running on 32 bits PAE.

I hope this additional info helps.

*: http://launchpadlibrarian.net/49179092/evolution_2.30.1.2-2ubuntu1~ppa1_2.30.1.2-2ubuntu1~ppa2.diff.gz

Comment 53 Fabio Durán Verdugo 2010-06-09 15:29:20 UTC

*** Bug 621101 has been marked as a duplicate of this bug. ***

Comment 54 Akhil Laddha 2010-06-10 03:44:51 UTC

*** Bug 621104 has been marked as a duplicate of this bug. ***

Comment 55 Robin Cook 2010-06-20 00:47:27 UTC

I am having a similar problem on an X86_64 system that used to be 32bit but converted to 64bit.  Now evolution crashes constantly whether largefile support is compiled in or not.

Comment 56 Robin Cook 2010-06-20 21:11:56 UTC

Is there a utility somewhere that will create clean files for evolution and migrate the old ones into the new?

Comment 57 Milan Crha 2010-07-13 11:40:24 UTC

Robin: see Help->FAQ

OK, after all the similar bugs, some investigation, and some chatting, it turned out that the real problem is large file support being enabled on the eds side but not in the consumer, which is Evolution in our case. Because off_t then has a different size on each side, and it is used in the public API and influences structure sizes, the compiler calculates wrong memory offsets into the structure and the application crashes (I suppose). There are two possible ways to fix this:
a) when eds has large file support enabled, every consumer of it must enable it too (which is pretty unlikely to happen)
b) do not expose off_t in the public API and use goffset instead

I have a little test application which demonstrates the issue. It fails both ways: when eds has large file support enabled and the consumer does not, and when eds is built without it but the consumer has it enabled. The crash usually happens on lines like:
>  ((CamelStreamMem *)mem)->buffer->...
This is mainly with 2.30.x; it does not show up with git master (2.31.x) because of the API changes and the GObject-ification of Camel there, at least for CamelStreamMem, which is used quite extensively. Changing the API is worth doing anyway.
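
To make the layout mismatch concrete, here is a simplified model, not the real Camel structures. The two structs below are hypothetical stand-ins for how a library built with a 64-bit off_t and a consumer built with a 32-bit off_t would each see "the same" public struct; `int64_t` and `int32_t` simulate the two off_t widths so the sketch behaves the same on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The struct as the library sees it (off_t is 8 bytes). */
struct stream_lib_view {
    int64_t position;
    int64_t bound_start;
    int64_t bound_end;
    void   *buffer;
};

/* The same declaration as the consumer sees it (off_t is 4 bytes). */
struct stream_consumer_view {
    int32_t position;
    int32_t bound_start;
    int32_t bound_end;
    void   *buffer;
};

size_t buffer_offset_lib(void)
{
    return offsetof(struct stream_lib_view, buffer);
}

size_t buffer_offset_consumer(void)
{
    return offsetof(struct stream_consumer_view, buffer);
}

/* Self-check: the two sides disagree about where `buffer` lives. */
void layout_mismatch_selfcheck(void)
{
    assert(buffer_offset_lib() != buffer_offset_consumer());
}
```

When the consumer dereferences `buffer` at its own offset inside an object the library laid out, it reads bytes that belong to the offset fields, which would be consistent with the invalid read at a tiny address such as 0x3 in the valgrind log.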

Comment 58 Milan Crha 2010-07-13 11:42:50 UTC

Created attachment 165791 [details]
test app

Test application demonstrating the issue outside of Evolution itself. Whether it crashes depends on whether -D_FILE_OFFSET_BITS=64 is used when building eds and when compiling this test, on a 32-bit system.

Comment 59 Milan Crha 2010-07-13 12:46:50 UTC

Created attachment 165795 [details] [review]
eds patch

for evolution-data-server;

Do not expose off_t in public API, use goffset instead. Two functions are left, but they shouldn't be used outside of eds anyway.
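
The general shape of such a change can be sketched as follows, under the assumption that the public header uses a fixed 64-bit type and only the implementation touches off_t. The type `api_offset` is a hypothetical stand-in for goffset, and `api_offset_to_native` is an illustrative helper, not a function from the eds patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Public API type: always 64 bits, whatever the build flags are. */
typedef int64_t api_offset;

_Static_assert(sizeof(api_offset) == 8, "public offset type must be 64-bit");

/* Convert to the native off_t at the implementation boundary.
 * Returns false if the value does not fit into this build's off_t. */
bool api_offset_to_native(api_offset in, off_t *out)
{
    off_t narrowed = (off_t) in;

    if ((api_offset) narrowed != in)
        return false;

    *out = narrowed;
    return true;
}

/* Self-check: a small offset always converts unchanged. */
void api_offset_selfcheck(void)
{
    off_t native = 0;

    assert(api_offset_to_native(42, &native));
    assert(native == 42);
}
```

With this split, structure sizes and function signatures seen by consumers no longer depend on whether they were compiled with large file support.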

Comment 60 Milan Crha 2010-07-13 12:48:49 UTC

Created commit 4b28fdd in eds master (2.31.6+)

Comment 61 Milan Crha 2010-07-13 12:57:18 UTC

*** Bug 500591 has been marked as a duplicate of this bug. ***

Comment 62 Milan Crha 2010-10-01 13:07:32 UTC

*** Bug 619259 has been marked as a duplicate of this bug. ***