GNOME Bugzilla – Bug 720156
Excessive memory consumption
Last modified: 2013-12-21 08:53:22 UTC
It seems that frogr 0.8 is taking a lot of memory. And, I don't know if that's a coincidence or not, the amount seems proportional to the amount of data being transferred to flickr. I'm transferring ~2000 pics, which sum to ~4 GiB (I can't tell exact values since frogr updates them along the way). I didn't notice anything unusual at first since my machine has 16 GiB, but ~12h after frogr began the upload I saw that it was consuming quite a bit of memory:

$ ps up 23510
USER       PID %CPU %MEM     VSZ     RSS TTY      STAT START   TIME COMMAND
felipe   23510  2.8 29.3 7810948 4809800 pts/1    SLl+ 09:43  20:10 frogr

That's ~4.5 GiB of resident memory, which coincidentally looks like the whole data I'm uploading plus ~10% overhead. At the time I ran ps above there was still ~600 MiB to go.
So, about ten minutes later the memory usage is still climbing:

$ ps up 23510
USER       PID %CPU %MEM     VSZ     RSS TTY      STAT START   TIME COMMAND
felipe   23510  2.8 29.6 7885700 4870068 pts/1    SLl+ 09:43  20:21 frogr

That's perhaps ~5 MiB/min? It is still possible that it's related to my uploads (perhaps it's keeping a reference to every buffer ever sent?), especially since I remember that some files had to be sent more than once due to connection problems. OTOH, I wouldn't rule out the possibility of another kind of leak.
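As a rough check of that estimate, the growth rate can be computed from the two RSS values that ps reported (in KiB). This is only a sketch: the function name is made up for illustration, and it assumes the interval between the two samples was exactly ten minutes, which the comment above only gives approximately.

```c
#include <assert.h>

/*
 * Growth rate of the resident set size, in MiB per minute.
 * rss_before_kib / rss_after_kib: RSS column from ps, in KiB.
 * minutes: elapsed wall-clock time between the two samples (assumed).
 */
double rss_growth_mib_per_min(long rss_before_kib, long rss_after_kib,
                              double minutes)
{
    assert(minutes > 0.0);
    double delta_mib = (double)(rss_after_kib - rss_before_kib) / 1024.0;
    return delta_mib / minutes;
}
```

With the values above, rss_growth_mib_per_min(4809800, 4870068, 10.0) comes out at roughly 5.9 MiB/min, in line with the ~5 MiB/min guess.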
It seems to have topped at almost 5 GiB:

$ ps up 23510
USER       PID %CPU %MEM     VSZ     RSS TTY      STAT START   TIME COMMAND
felipe   23510  2.7 31.4 8297476 5154984 pts/1    SLl+ 09:43  20:56 frogr

And now it's telling me "Error reading file for upload." while it prints at the terminal:

** (frogr:23510): WARNING **: Unable to get contents for file

Now it really looks like a leak while uploading, maybe a missing free() somewhere. I've tried to attach gdb to the running process and called malloc_stats():

Arena 0:
system bytes     =  841326592
in use bytes     =  833919648
Arena 1:
system bytes     =    3710976
in use bytes     =     388448
Arena 2:
system bytes     =     368640
in use bytes     =      68640
Arena 3:
system bytes     =     139264
in use bytes     =      15904
Arena 4:
system bytes     =    3948544
in use bytes     =     247120
Arena 5:
system bytes     =     139264
in use bytes     =      10336
Arena 6:
system bytes     =     643072
in use bytes     =      76416
Arena 7:
system bytes     =     716800
in use bytes     =     182208
Arena 8:
system bytes     =     434176
in use bytes     =      55632
Arena 9:
system bytes     =     139264
in use bytes     =      29808
Arena 10:
system bytes     =     294912
in use bytes     =     170784
Arena 11:
system bytes     =     139264
in use bytes     =      20912
Arena 12:
system bytes     =     462848
in use bytes     =      99776
Arena 13:
system bytes     =   33619968
in use bytes     =    2943280
Total (incl. mmap):
system bytes     = 2731618304
in use bytes     = 2683763632
max mmap regions =         10
max mmap bytes   = 1845534720

However, it seems I must have stepped on frogr's toes while doing so because it segfaulted :). It seems that systemd didn't save the coredump, though, probably because of its sheer size:

$ LC_ALL=C sudo systemd-coredumpctl gdb
TIME                           PID   UID   GID SIG EXE
Mon 2013-12-09 22:23:37 BRST 23510  1000   100  11 /usr/bin/frogr
Failed to retrieve COREDUMP field: No such file or directory
After a run of 74 images:

$ valgrind frogr
==3897== Memcheck, a memory error detector
==3897== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3897== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==3897== Command: frogr
==3897==
==3897== Invalid write of size 4
==3897==    at 0x8674F5B: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8675581: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x861C076: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x868FAF7: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8662833: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8667401: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8667DB6: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8662833: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x862432B: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x861D808: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8616EA4: cairo_fill (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x4F5A781: ??? (in /usr/lib/libgtk-3.so.0.1000.6)
==3897==  Address 0xffeffc5d8 is on thread 1's stack
==3897==
==3897== Invalid read of size 4
==3897==    at 0x86724CE: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8674243: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8674F8B: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8675581: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x861C076: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x868FAF7: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8662833: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8667401: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8667DB6: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x8662833: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x862432B: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==    by 0x861D808: ??? (in /usr/lib/libcairo.so.2.11200.16)
==3897==  Address 0xffeffc5d8 is on thread 1's stack
==3897==
==3897==
==3897== HEAP SUMMARY:
==3897==     in use at exit: 291,590,185 bytes in 118,384 blocks
==3897==   total heap usage: 5,001,589 allocs, 4,883,205 frees, 6,076,381,277 bytes allocated
==3897==
==3897== LEAK SUMMARY:
==3897==    definitely lost: 69,632 bytes in 264 blocks
==3897==    indirectly lost: 40,643,754 bytes in 2,994 blocks
==3897==      possibly lost: 243,363,785 bytes in 1,412 blocks
==3897==    still reachable: 7,218,190 bytes in 112,033 blocks
==3897==         suppressed: 0 bytes in 0 blocks
==3897== Rerun with --leak-check=full to see details of leaked memory
==3897==
==3897== For counts of detected and suppressed errors, rerun with: -v
==3897== ERROR SUMMARY: 304 errors from 2 contexts (suppressed: 2 from 2)
After investigating this issue quite deeply, I think the main reason for this memory leak was that the SoupBuffer created to form the multipart message was not being freed, so that memory was lost forever :/

Fortunately, it should be fixed now:
https://git.gnome.org/browse/frogr/commit/?id=03889efc5aafbc60505e57c159bcd3ef2961ac87

However, that was not the only issue. I also found some ref counting problems with the pictures: in many cases instances of FrogrPicture never reached ref count 0 when they were removed from the UI (either manually or as a result of uploading them), which certainly was another important problem in terms of memory management. Again, this should be fixed now too:
https://git.gnome.org/browse/frogr/commit/?id=77709a158f34ec5a2044b7b46a0a600b645fd296

Last, there was another problem with the ref counting of photosets and groups, which were not unreffed when closing the related dialogs. That should be fixed now as well:
https://git.gnome.org/browse/frogr/commit/?id=e3f5863c3c2823d009b01290136fee22766998fc

So, I'm resolving this bug now because I cannot spot that memory problem anymore after testing frogr with ~100 pictures.

Thanks a lot for this bug report, and apologies both for the delay in fixing this issue and for the issues themselves. The good news is that the next stable release of frogr (which I hope to make 2-3 weeks from now) will hopefully be better than ever, at least in terms of memory management :)

Thanks!
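For readers who want to see the shape of the first problem, here is a minimal, self-contained sketch of a per-upload buffer that is never released, next to the corrected version. It is not the frogr or libsoup code: UploadBuffer, append_to_multipart() and the upload_file_*() functions are hypothetical stand-ins built on plain malloc()/free(), and the sketch assumes the multipart message keeps its own copy of the data, so the caller still owns the buffer after appending it. The live_buffers counter exists only to make the leak visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a SoupBuffer: owns a private copy of the data. */
typedef struct {
    char  *data;
    size_t length;
} UploadBuffer;

/* Buffers currently allocated; only here to make leaks observable. */
int live_buffers = 0;

UploadBuffer *upload_buffer_new(const char *data, size_t length)
{
    assert(data != NULL);
    UploadBuffer *buf = malloc(sizeof *buf);
    if (buf == NULL)
        return NULL;
    buf->data = malloc(length ? length : 1);
    if (buf->data == NULL) {
        free(buf);
        return NULL;
    }
    memcpy(buf->data, data, length);
    buf->length = length;
    live_buffers++;
    return buf;
}

void upload_buffer_free(UploadBuffer *buf)
{
    if (buf == NULL)
        return;
    free(buf->data);
    free(buf);
    live_buffers--;
}

/* Pretends to add the file to a multipart body. Assumption: the message
 * keeps its own copy, so ownership of buf stays with the caller.
 * Returns the number of bytes appended. */
size_t append_to_multipart(const UploadBuffer *buf)
{
    return buf->length;
}

/* Leaky variant: the buffer is never released, so one full copy of the
 * file contents is lost on every upload. */
size_t upload_file_leaky(const char *contents, size_t length)
{
    UploadBuffer *buf = upload_buffer_new(contents, length);
    if (buf == NULL)
        return 0;
    return append_to_multipart(buf); /* buf goes out of scope: leaked */
}

/* Fixed variant: release the buffer once the message has its copy. */
size_t upload_file_fixed(const char *contents, size_t length)
{
    UploadBuffer *buf = upload_buffer_new(contents, length);
    if (buf == NULL)
        return 0;
    size_t appended = append_to_multipart(buf);
    upload_buffer_free(buf);
    return appended;
}
```

Calling upload_file_leaky() once per picture leaves one live buffer behind for each call, which would match memory use growing with the amount of data uploaded, while upload_file_fixed() leaves none.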