Bug 765847 - UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 63: ordinal not in range
Status: VERIFIED FIXED
Product: ocrfeeder
Classification: Other
Component: general
Version: 0.8.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: ocrfeeder-maint
QA Contact: ocrfeeder-maint
Depends on:
Blocks:
 
 
Reported: 2016-04-30 13:22 UTC by Alberto Garcia
Modified: 2016-05-01 18:31 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Sample PDF file to reproduce this bug (65.53 KB, application/pdf)
2016-04-30 13:22 UTC, Alberto Garcia

Description Alberto Garcia 2016-04-30 13:22:05 UTC
Created attachment 327072
Sample PDF file to reproduce this bug

Steps to reproduce this bug:

1. Launch ocrfeeder (I've done this with 0.8.1)
2. File -> Import PDF, and import the attached PDF file.
3. Select all text
4. Choose the 'Tesseract' engine and press 'OCR'
5. File -> Export -> Plain Text, and choose an output file.

Result: OCRFeeder doesn't save the text; instead it prints this error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 298, in exportDialog
    self.EXPORT_FORMATS[format][1])
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
    name)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 606, in exportPagesWithGenerator
    document_generator.save()
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 371, in save
    f.write(self.text.encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 63: ordinal not in range(128)
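
A likely reading of the traceback above: a UnicodeDecodeError raised from an .encode() call is typical of Python 2 when the value is already a byte string, because Python 2 first decodes it implicitly with the ASCII codec and only then encodes it. The sketch below imitates that two-step behaviour explicitly in Python 3. It assumes that self.text held UTF-8 encoded bytes, which the traceback does not show directly; the helper name py2_style_encode and the sample text are hypothetical.

```python
# Python 3 sketch of what Python 2 did implicitly for str.encode('utf-8').
# Assumption: the text being exported was a byte string, not unicode text.

def py2_style_encode(byte_text, encoding="utf-8"):
    """Mimic Python 2's str.encode(): implicit ASCII decode, then encode."""
    as_unicode = byte_text.decode("ascii")  # fails on any byte >= 0x80
    return as_unicode.encode(encoding)

# Hypothetical OCR output already encoded as UTF-8 ("é" is the bytes 0xc3 0xa9).
ocr_bytes = "caf\u00e9".encode("utf-8")

try:
    py2_style_encode(ocr_bytes)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

Pure ASCII input passes through this path unchanged, which would explain why the export only fails for documents whose recognized text contains accented characters.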

Comment 1 Joaquim Rocha 2016-04-30 14:59:14 UTC
Hey! Thanks for the report Berto!

I think I have fixed it now: https://git.gnome.org/browse/ocrfeeder/commit/?id=691f54618ed17a2553f154af07a6cfb4bf887e09
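
For readers who hit the same error elsewhere, one common guard is to encode only when the value is text and to pass byte strings through unchanged. This is a generic Python 3 sketch and does not reproduce the linked commit; the helper name to_utf8_bytes is hypothetical.

```python
def to_utf8_bytes(text):
    """Return UTF-8 bytes whether `text` is already bytes or is unicode text."""
    if isinstance(text, bytes):
        # Already encoded: check that it is valid UTF-8 instead of re-encoding.
        text.decode("utf-8")
        return text
    return text.encode("utf-8")

# Both forms of the same hypothetical OCR result yield identical bytes.
from_text = to_utf8_bytes("caf\u00e9")
from_bytes = to_utf8_bytes("caf\u00e9".encode("utf-8"))
print(from_text == from_bytes)
```

The result can then be written to a file opened in binary mode, so no further implicit conversion takes place at write time.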
Comment 2 Alberto Garcia 2016-05-01 18:31:51 UTC
Awesome, that was fast!

I tested the fix, and it seems to work fine.

Thanks!