GNOME Bugzilla – Bug 689912
"Too many open files" after recognizing a book
Last modified: 2021-05-17 13:31:49 UTC
After importing a PDF file containing a complete ebook of 227 pages and choosing "Automatically detect and recognize all pages", I cannot navigate to any page other than the first page. Running ocrfeeder from the command line shows the following after trying to navigate from the front page to another page:
$ ocrfeeder
Traceback (most recent call last):
+ Trace 231271
After this error, the UI becomes mostly unresponsive: opening menus still works, but clicking on anything has no effect.
Running `$ sudo ls -l /proc/$(pgrep ocrfeeder)/fd` does indeed show a very large number of open file descriptors; all but the last three (fds 1021-1023) refer to files which have been deleted. An excerpt:
lrwx------ 1 miki users 64 Dec 9 03:13 1016 -> /tmp/ocrfeeder_9uKwYQ/tmpQq5bhS (deleted)
lrwx------ 1 miki users 64 Dec 9 03:13 1017 -> /tmp/ocrfeeder_9uKwYQ/tmpnDPsXo.tif (deleted)
lrwx------ 1 miki users 64 Dec 9 03:13 1018 -> /tmp/ocrfeeder_9uKwYQ/tmpzyDM5F (deleted)
lrwx------ 1 miki users 64 Dec 9 03:13 1019 -> /tmp/ocrfeeder_9uKwYQ/tmppLg_M5.tif (deleted)
lrwx------ 1 miki users 64 Dec 9 03:01 102 -> /tmp/ocrfeeder_9uKwYQ/tmp4XVe4T (deleted)
lrwx------ 1 miki users 64 Dec 9 03:13 1020 -> /tmp/ocrfeeder_9uKwYQ/tmpSUM4MG (deleted)
lrwx------ 1 miki users 64 Dec 9 03:13 1021 -> /tmp/ocrfeeder_9uKwYQ/tmpPTlNep.tif
lrwx------ 1 miki users 64 Dec 9 03:13 1022 -> /tmp/ocrfeeder_9uKwYQ/tmpV0he4R
lrwx------ 1 miki users 64 Dec 9 03:13 1023 -> /tmp/ocrfeeder_9uKwYQ/tmpLE3UF4.tif
Evidently those fds are never being closed.
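For anyone who wants to repeat this check from Python rather than with ls, here is a minimal sketch. It assumes a Linux system with /proc mounted; the function name `deleted_open_files` is made up for illustration and is not part of OCRFeeder.

```python
import os


def deleted_open_files(pid="self"):
    """Return sorted (fd, target) pairs for open fds whose file was unlinked.

    Assumes Linux: /proc/<pid>/fd holds one symlink per open descriptor,
    and the link target of an unlinked file ends in " (deleted)".
    """
    fd_dir = "/proc/%s/fd" % pid
    leaked = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(" (deleted)"):
            leaked.append((int(name), target))
    return sorted(leaked)


if __name__ == "__main__":
    for fd, target in deleted_open_files():
        print("%d -> %s" % (fd, target))
```

Passing the pid of a running ocrfeeder process instead of "self" should give the same picture as the listing above, provided you have permission to read that process's fd directory.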
The culprit seems to be that, in several places, calls to tempfile.mkstemp discard the returned file descriptor without ever closing it. Since all that's actually wanted in these cases is a filename, I've been testing a fix which simply replaces all uses of 'mkstemp' with 'mktemp'. I still get lots of "rm: cannot remove <tempfile>: No such file or directory" messages, which appear to be harmless, because the pages do get converted without problems. I haven't figured out yet where those errors are coming from.
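To illustrate the leak described above: tempfile.mkstemp returns a tuple of an already-open OS-level file descriptor and the path, so code that keeps only the path leaves the descriptor open until the process exits. A minimal sketch of an alternative to switching to mktemp (which the Python documentation marks as deprecated because of a race between choosing the name and creating the file) is to keep mkstemp and close the descriptor right away. The helper name `temp_path` is hypothetical and not taken from the OCRFeeder source.

```python
import os
import tempfile


def temp_path(suffix="", dir=None):
    """Create an empty temp file and return only its path.

    mkstemp() hands back an open descriptor together with the path;
    closing it here means callers that only want a filename do not leak it.
    """
    fd, path = tempfile.mkstemp(suffix=suffix, dir=dir)
    os.close(fd)
    return path


if __name__ == "__main__":
    path = temp_path(suffix=".tif")
    print(path)
    os.remove(path)  # the caller is still responsible for deleting the file
```

Unlike mktemp, this leaves an empty file on disk under the returned name, so a later cleanup step has something to remove.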
Hi, thanks for the report. Could you tell me which version of OCRFeeder you're using? I will look into this bug very soon. Cheers,
I was able to reproduce this bug. The PDF I am using contains 608 scanned pages. OCRFeeder will correctly import the entire document. Beyond that, when I try to view any page beyond the first few, OCRFeeder will fail to respond. I was able to get it to successfully run "recognize document" one time without this error.
Also, if I try to Save As or Export the document, I get the following error message:
"Could not read the contents of [username here]"
"Too many open files"
I've captured my terminal output from the time I open OCRFeeder to the point where it becomes unresponsive. The terminal output can be found here: http://pastebin.com/FkRPdUsA
If you need a copy of this document for the purpose of reproducing the bug for yourself, the document is under copyright. Having said that, it can be found here: https://thepiratebay.se/torrent/6605206/The_Art_and_Science_of_Java.An_Introduction_To_Computer_Science%B
Also, I am using OCRFeeder 0.7.11
Created attachment 258941
Fix for "too many open files" bug
File descriptors referring to temp files now get closed properly.
Created attachment 258946
Fix for "too many open files" bug, 2nd try
Changes mkstemp to NamedTemporaryFile in ocrEngines.py and gets rid of hanging temp files in the process. The previous fix caught only one of the two incorrect mkstemp uses in ocrEngines.py.
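The attachment itself is not reproduced in this thread, so the following is only a sketch of the general NamedTemporaryFile pattern it refers to, not the actual patch. Used as a context manager, tempfile.NamedTemporaryFile closes its descriptor and (by default) deletes the file when the block exits. The function name `with_temp_file` and the `work` callback are hypothetical.

```python
import os
import tempfile


def with_temp_file(work, suffix=""):
    """Call work(path) with a temp file that is closed and deleted afterwards.

    NamedTemporaryFile owns both the descriptor and the file on disk, so
    leaving the with-block releases the fd and removes the file in one step.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix) as tmp:
        return work(tmp.name)


if __name__ == "__main__":
    # Stand-in for handing the path to an external OCR engine.
    print(with_temp_file(os.path.getsize, suffix=".tif"))
```

One consequence of this design is that the file must not be needed after `work` returns, and any separate cleanup code must tolerate it already being gone.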
I came across this bug today. After OCRing an imported PDF image of 60 pages, I have:
$ lsof | grep ocrfeeder | wc
5321 56844 740423
200 or so are in /usr, /lib and /var. The rest are like:
ocrfeeder 19105 andre 13u REG 8,17 120047 10747912 /tmp/ocrfeeder_7_V0sy/tmpUA_9uY.tif (deleted)
ocrfeeder 19105 andre 14u REG 8,17 0 10748128 /tmp/ocrfeeder_7_V0sy/tmppPHEI4 (deleted)
ocrfeeder 19105 andre 15u REG 8,17 1804 10753178 /tmp/ocrfeeder_7_V0sy/tmpY3lQyl.tif (deleted)
[...]
ocrfeeder 19105 andre 1290u REG 8,17 0 10790786 /tmp/ocrfeeder_7_V0sy/tmpJsLoT_ (deleted)
ocrfeeder 19105 andre 1291u REG 8,17 17058 10790787 /tmp/ocrfeeder_7_V0sy/tmp09XCHR.tif (deleted)
ocrfeeder 19105 andre 1292u REG 8,17 0 10790788 /tmp/ocrfeeder_7_V0sy/tmpRI8e7V (deleted)
dconf 19105 19106 andre 13u REG 8,17 120047 10747912 /tmp/ocrfeeder_7_V0sy/tmpUA_9uY.tif (deleted)
dconf 19105 19106 andre 14u REG 8,17 0 10748128 /tmp/ocrfeeder_7_V0sy/tmppPHEI4 (deleted)
dconf 19105 19106 andre 15u REG 8,17 1804 10753178 /tmp/ocrfeeder_7_V0sy/tmpY3lQyl.tif (deleted)
[...]
dconf 19105 19106 andre 1290u REG 8,17 0 10790786 /tmp/ocrfeeder_7_V0sy/tmpJsLoT_ (deleted)
dconf 19105 19106 andre 1291u REG 8,17 17058 10790787 /tmp/ocrfeeder_7_V0sy/tmp09XCHR.tif (deleted)
dconf 19105 19106 andre 1292u REG 8,17 0 10790788 /tmp/ocrfeeder_7_V0sy/tmpRI8e7V (deleted)
gdbus 19105 19107 andre 13u REG 8,17 120047 10747912 /tmp/ocrfeeder_7_V0sy/tmpUA_9uY.tif (deleted)
gdbus 19105 19107 andre 14u REG 8,17 0 10748128 /tmp/ocrfeeder_7_V0sy/tmppPHEI4 (deleted)
gdbus 19105 19107 andre 15u REG 8,17 1804 10753178 /tmp/ocrfeeder_7_V0sy/tmpY3lQyl.tif (deleted)
[...]
gdbus 19105 19107 andre 1290u REG 8,17 0 10790786 /tmp/ocrfeeder_7_V0sy/tmpJsLoT_ (deleted)
gdbus 19105 19107 andre 1291u REG 8,17 17058 10790787 /tmp/ocrfeeder_7_V0sy/tmp09XCHR.tif (deleted)
gdbus 19105 19107 andre 1292u REG 8,17 0 10790788 /tmp/ocrfeeder_7_V0sy/tmpRI8e7V (deleted)
gmain 19105 19110 andre 13u REG 8,17 120047 10747912 /tmp/ocrfeeder_7_V0sy/tmpUA_9uY.tif (deleted)
gmain 19105 19110 andre 14u REG 8,17 0 10748128 /tmp/ocrfeeder_7_V0sy/tmppPHEI4 (deleted)
gmain 19105 19110 andre 15u REG 8,17 1804 10753178 /tmp/ocrfeeder_7_V0sy/tmpY3lQyl.tif (deleted)
[...]
gmain 19105 19110 andre 1290u REG 8,17 0 10790786 /tmp/ocrfeeder_7_V0sy/tmpJsLoT_ (deleted)
gmain 19105 19110 andre 1291u REG 8,17 17058 10790787 /tmp/ocrfeeder_7_V0sy/tmp09XCHR.tif (deleted)
gmain 19105 19110 andre 1292u REG 8,17 0 10790788 /tmp/ocrfeeder_7_V0sy/tmpRI8e7V (deleted)
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/ocrfeeder/-/issues/49.