Bug 654712 - Poor results of copying a table as text from the PDF
Status: RESOLVED OBSOLETE
Product: evince
Classification: Core
Component: PDF
Version: 2.32.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Depends on:
Blocks:
 
 
Reported: 2011-07-16 00:28 UTC by pedrum
Modified: 2018-05-22 14:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
screenshot of table in document. (67.75 KB, image/png)
2011-07-16 00:29 UTC, pedrum

Description pedrum 2011-07-16 00:28:45 UTC
The expectation is that a table copied from a PDF preserves the order of its rows, columns, and text. For example, page 68 of the Programming Ruby book has a table listing regular expression characters. The Acrobat results below show what I would expect from Evince as well.

Operation: I select the table region with the mouse (click and drag to highlight), copy it, and paste it into another text document.

OS: Ubuntu 11.04

Evince 2.32.0 results:
\d
\D
\s
\S
\w
\W
As [ . . . ] Meaning
[0-9] Digit character
[^0-9] Any character except a digit
[\s\t\r\n\f] Whitespace character
[^\s\t\r\n\f] Any character except whitespace
[A-Za-z0-9_] Word character
[^A-Za-z0-9_] Any character except a word character
POSIX Character Classes
[:alnum:]
[:alpha:]
[:blank:]
[:cntrl:]
[:digit:]
[:graph:]
[:lower:]
[:print:]
[:punct:]
[:space:]
[:upper:]
[:xdigit:]
Alphanumeric
Uppercase or lowercase letter
Blank and tab
Control characters (at least 0x00–0x1f, 0x7f)
Digit
Printable character excluding space
Lowercase letter
Any printable character (including space)
Printable character excluding space and alphanumeric
Whitespace (same as \s)
Uppercase letter
Hex digit (0–9, a–f, A–F)

Acrobat Reader 9 results:
\d [0-9] Digit character
\D [^0-9] Any character except a digit
\s [\s\t\r\n\f] Whitespace character
\S [^\s\t\r\n\f] Any character except whitespace
\w [A-Za-z0-9_] Word character
\W [^A-Za-z0-9_] Any character except a word character
POSIX Character Classes
[:alnum:] Alphanumeric
[:alpha:] Uppercase or lowercase letter
[:blank:] Blank and tab
[:cntrl:] Control characters (at least 0x00–0x1f, 0x7f)
[:digit:] Digit
[:graph:] Printable character excluding space
[:lower:] Lowercase letter
[:print:] Any printable character (including space)
[:punct:] Printable character excluding space and alphanumeric
[:space:] Whitespace (same as \s)
[:upper:] Uppercase letter
[:xdigit:] Hex digit (0–9, a–f, A–F)
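
To make the expected ordering concrete, here is a minimal Python sketch of the difference between the two results. It is not Evince or Poppler code, and the helper name rows_from_columns is hypothetical. It assumes the copied text arrives column by column, as in the Evince output above, and stitches the cells back together row by row, using the first three table rows from this report.

```python
from itertools import zip_longest


def rows_from_columns(columns):
    """Interleave column-ordered cells into row-ordered lines.

    `columns` is a list of lists; each inner list holds the cells of one
    column from top to bottom. Shorter columns are padded with empty cells.
    """
    return [
        " ".join(cell for cell in row if cell)
        for row in zip_longest(*columns, fillvalue="")
    ]


# Cells in the order Evince 2.32.0 produced them: the whole first column,
# followed by the remaining text of each row.
first_column = [r"\d", r"\D", r"\s"]
remaining = [
    "[0-9] Digit character",
    "[^0-9] Any character except a digit",
    r"[\s\t\r\n\f] Whitespace character",
]

for line in rows_from_columns([first_column, remaining]):
    print(line)
```

The printed lines match the first three rows of the Acrobat result, which is the row-by-row order I would expect from a table selection.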
Comment 1 pedrum 2011-07-16 00:29:21 UTC
Created attachment 192066
screenshot of table in document.
Comment 2 GNOME Infrastructure Team 2018-05-22 14:18:02 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/evince/issues/232.