Bug 166830 - better support for reading multi-column documents
Status: RESOLVED OBSOLETE
Product: evince
Classification: Core
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Low enhancement
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Depends on:
Blocks:
 
 
Reported: 2005-02-09 18:27 UTC by Dan Winship
Modified: 2018-05-22 12:51 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Dan Winship 2005-02-09 18:27:52 UTC
[Note: this is a distant future/wishlist/blue sky kind of idea. It's just
something I'd been thinking about before evince even started and I wanted
to throw it out there and see what other people think.]


All existing PDF viewers suck for reading multi-column documents
(assuming you can't comfortably fit an entire page on your monitor all
at once). This really sucks given that multi-column layouts are
extremely common in PDF files.

The way you end up having to read them currently is that first you see
the top 2/3 or so of the page, and you read the top of the first
column:

+---------------------------------+
|  The licenses     the software  |
|  for most         is free for   |
|  software are     all its       |    
|  designed to      users. This   |
|  take away        General       |    
|  your freedom     Public        |
|  to share and     License       |    
|  change it.       applies to    |
|  By contrast,     most of the   |
|  the GNU          Free          |
|  General          Software      |
|  Public           Foundation's  |
|  License is       software and  |
|  intended to      to any other  |
+---------------------------------+

Then you scroll down and read the final bit of the first column:

+---------------------------------+
|  guarantee        program       |    
|  your freedom     whose         |
|  to share and     authors       |    
|  change free      commit to     |
|  software--to     using it.     |
|  make sure        (Some other   |
|                1                | <- end of page 1
|   ----------------------------  |
|                                 | <- start of page 2
|  Free             software, we  |
|  Software         are           |
|  Foundation       referring to  |
|  software is      freedom, not  |
|  covered by       price. Our    |
+---------------------------------+

But then you have to scroll backwards to get back to the top of the
page so you can start reading the second column. And you have to jump
back and forth like this through every single page of the document.


It would be easier if the viewer could present the text in a single
flow somehow. One possibility would be to display each page twice,
first with the right column grayed out, and then with the left column
grayed out:

+---------------------------------+
|  The licenses     ... ........  | "..."s mean dimmed text
|  for most         .. .... ...   |
|  software are     ... ...       |    
|  designed to      ...... ....   |
|  take away        .......       |    
|  your freedom     ......        |
|  to share and     .......       |    
|  change it.       ....... ..    |
|  By contrast,     .... .. ...   |
|  the GNU          ....          |
|  General          ........      |
|  Public           ............  |
|  License is       ........ ...  |
|  intended to      .. ... .....  |
+---------------------------------+

+---------------------------------+
|  guarantee        .......       |    
|  your freedom     .....         |
|  to share and     .......       |    
|  change free      ...... ..     |
|  software--to     ..... ...     |
|  make sure        ..... .....   |
|                1                | <- end of page 1
|   ----------------------------  |
|                                 | <- start of page 1 again
|  ... ........     the software  |
|  ... ....         is free for   |
|  ........ ...     all its       |    
|  ........ ..      users. This   |
|  .... ....        General       |    
+---------------------------------+

+---------------------------------+
|  .... .......     Public        | rest of page 1...
|  .. ..... ...     License       |
|  ...... ...       applies to    |
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
|  .... ....        (Some other   |
|                1                | <- end of page 1
|   ----------------------------  |
|                                 | <- start of page 2
|  Free             ......... ..  |
|  Software         ...           |
|  Foundation       ......... ..  |
|  software is      ........ ...  |
|  covered by       ...... ...    |
+---------------------------------+

Things that didn't seem to be exclusively part of either column (page
numbers, diagrams, sidebars, etc.) could be ungrayed in both views.
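
A rough sketch of the "display each page twice" idea, assuming the viewer
already knows (or has been told) how many columns each page has. The names
column_views and is_dimmed are made up for illustration and are not part of
evince or any backend:

```python
from typing import Iterator, List, Optional, Tuple


def column_views(columns_per_page: List[int]) -> Iterator[Tuple[int, int]]:
    """Yield (page_index, active_column) in reading order.

    columns_per_page[i] is the number of text columns on page i
    (a hypothetical input; detecting it is a separate problem).
    """
    for page, ncols in enumerate(columns_per_page):
        # A page with no detected columns is still shown once.
        for col in range(max(1, ncols)):
            yield page, col


def is_dimmed(region_column: Optional[int], active_column: int) -> bool:
    """Decide whether a region is grayed out in the current view.

    region_column is None for things that belong to no single column
    (page numbers, diagrams, sidebars); those stay ungrayed in every view.
    """
    return region_column is not None and region_column != active_column


# A two-column page, another two-column page, then a single-column page.
views = list(column_views([2, 2, 1]))
```

Each page then appears once per column in the scroll sequence, so a
two-column page takes up two "slots" and a single-column page only one.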


I don't know how much work it is for either the frontends or backends
to automatically figure out that a given document uses multiple
columns. It would probably need to be figured out on a per-page basis
anyway, since some papers use a single-column layout on the front page
and then switch to two-column. Maybe this would have to be something
the user turned on explicitly, rather than being figured out
automatically.


Maybe people think this particular idea is awful, but the fact remains
that it's currently really annoying to read multi-column PDFs, and it
would be very cool if we could find a way to make it not be.
Comment 1 Martin Kretzschmar 2005-02-09 19:01:34 UTC
Did you write the multi column ascii by hand?

1) In gv (with one g), you can page through a document with <space>,
scrolling down one screen at a time. If the document is wider than the viewport
and you have reached the bottom of the page, it doesn't go to the next page, but
to the top of the current page, and scrolls to the right. A low-tech idea. And not
doable once we have continuous scrolling.

2) PDF files can have "Threads". Don't know exactly what they are, but I think
they describe how the text flows on a page (say, a linked list of rectangles
that represent the columns). This information has to be embedded explicitly, so
I wouldn't expect this in many files. Maybe in output from FrameMaker or other
expensive DTP products.

But as you say, that's all quite blue sky for evince.

3) Tagged PDF. Again, not something I know well. This is used in Acrobat for
Palm to reflow text to fit on a PDA screen. Again, this has to be embedded at
distillation time. The Palm reader was the reason why OpenOffice.org 2.0 will
produce tagged PDF (optionally).
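
The "linked list of rectangles" picture from point 2 could be sketched
roughly as below. This only mirrors the description in this comment; the
Region class and its field names are invented for illustration and are not
the names used in the PDF format or in any PDF library:

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates


@dataclass
class Region:
    """One rectangle of a text flow (hypothetical structure)."""
    page: int
    rect: Rect
    next: Optional["Region"] = None


def reading_order(first: Optional[Region]) -> Iterator[Tuple[int, Rect]]:
    """Walk the list from its first rectangle, yielding (page, rect)."""
    seen = set()
    region = first
    # Stop at the end of the list, or if the list loops back on itself.
    while region is not None and id(region) not in seen:
        seen.add(id(region))
        yield region.page, region.rect
        region = region.next
```

A viewer that had such a list could scroll from one rectangle to the next
instead of from one screenful to the next.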
Comment 2 Dan Winship 2005-02-09 21:35:05 UTC
> Did you write the multi column ascii by hand?

I used "C-x r k" / "C-x r y" in emacs (rectangular cut and paste).

The gv way is nice if you don't have continuous scrolling. But continuous
scrolling is nice too... :)

Tagged PDF definitely sounds like something that might be useful. Though
one advantage of having an implementation that didn't depend on any
special PDF features would be that it would work even for documents that
were just scans from hardcopy. (The ACM digital library has a lot of
old articles like that.)
Comment 3 Bryan W Clark 2005-02-22 17:13:17 UTC
Dan, it's possible to do something like what the articles on http://www.iht.com/
do: have the Next/Previous buttons take you to the left or right page.
With continuous document scrolling you'd get a similar effect to the web site.

/me wonders if this would introduce i18n problems WRT left->right reading.
Comment 4 Nathaniel Smith 2005-08-03 21:56:23 UTC
What about just having a keybinding for "jump to top of current page"?  That
alone would help reading multi-column documents a lot.  PageUp followed by
PageDown sorta works, but...

Maybe "." (period/full stop)?
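
For what it's worth, with continuous scrolling the target of such a
keybinding is easy to compute. A minimal sketch, assuming the viewer keeps a
list of page heights and a fixed gap between pages (both names are
assumptions, not evince internals), and taking "current page" to mean the
page that contains the top edge of the viewport:

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List


def top_of_current_page(scroll_y: float, page_heights: List[float],
                        gap: float = 0.0) -> float:
    """Return the scroll offset of the top of the page containing scroll_y."""
    # Offsets at which each page starts in the continuous view.
    tops = [0.0] + list(accumulate(h + gap for h in page_heights))[:-1]
    # Last page whose top is at or above the current scroll position.
    i = bisect_right(tops, scroll_y) - 1
    return tops[max(i, 0)]
```

In the situation from the description (viewport showing the end of page 1
and the start of page 2) this returns the top of page 1, which is where the
second column starts.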
Comment 5 Philip Ganchev 2007-02-22 00:35:31 UTC
I don't see why continuous scrolling precludes the idea implemented in gv.  Pressing PageDown would scroll to the next page, while pressing Space or some other, specific shortcut would scroll to the start of the page and one screen width to the side.

Which side would depend on the text direction at the end of the current column. That would work correctly in most cases; it would break when most of the text is, say, right-to-left and only the last sentence is left-to-right, but it's still a good solution.

If it's not possible to detect the page direction in general, make the pan direction configurable.
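
A minimal sketch of that key handler, assuming all pages have the same size
and that the pan direction is passed in as a flag. View, page_forward and
the rtl parameter are hypothetical names, not anything from gv or evince:

```python
from dataclasses import dataclass


@dataclass
class View:
    page: int
    x: float  # left edge of the viewport, in page coordinates
    y: float  # top edge of the viewport, in page coordinates


def page_forward(view: View, page_w: float, page_h: float,
                 view_w: float, view_h: float, n_pages: int,
                 rtl: bool = False) -> View:
    """Return the view after one press of the gv-style key."""
    # 1. More of the current column below: scroll down one screen,
    #    without going past the bottom of the page.
    if view.y + view_h < page_h:
        return View(view.page, view.x, min(view.y + view_h, page_h - view_h))

    # 2. At the bottom: jump to the top of the same page and pan one
    #    screen width in the reading direction, clamped to the page edge.
    max_x = max(0.0, page_w - view_w)
    if not rtl and view.x < max_x:
        return View(view.page, min(view.x + view_w, max_x), 0.0)
    if rtl and view.x > 0.0:
        return View(view.page, max(view.x - view_w, 0.0), 0.0)

    # 3. Nothing left on this page: go to the next one, starting at the
    #    side where reading begins.
    if view.page + 1 < n_pages:
        return View(view.page + 1, max_x if rtl else 0.0, 0.0)
    return view
```

PageDown would keep its usual meaning; only the dedicated shortcut would
call something like page_forward.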
Comment 6 Brian Ewins 2008-05-12 16:47:26 UTC
I think it's worth pointing this out:
http://bugs.freedesktop.org/show_bug.cgi?id=15906

This was a fix to expose info in poppler that allows a reader to zoom & follow columns, in the same order that poppler would extract text. The behaviour had been hacked into the community version of the iRex Iliad's ipdf viewer. Fix is now in poppler master:
http://cgit.freedesktop.org/poppler/poppler/commit/?id=e3e4113c73128f49f99289b592446d4382b5d65c

More info about the ipdf hack:
http://forum.irexnet.com/viewtopic.php?t=1953&sid=5e5e77200c56ad254dbd9d9531cf77f3
Comment 7 Jonas Kölker 2009-12-19 20:40:42 UTC
I think having a shortcut for "Go to the top of the page and one screen width to the right" would be a reasonable approximation, but it would probably break down.

Typically, I zoom in enough that the text of the left column is at a comfortable size and alignment, which leaves some of the right column visible.  That would make scrolling one screen width to the right chop off some of the right column.

... Except in the case where there's less than one whole screen width of space to scroll into; in that case, I guess a reasonable behaviour would be to show me the right-most screen width of the document.

I think a more useful (and, sadly, probably more laborious) thing to do is to compute bounding boxes of the columns and have some heuristic say how many columns there are; then, it should be easy to go to the top of the next column.
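
One possible shape for such a heuristic, as a sketch only: take the
bounding boxes of the text lines on a page, ignore anything that spans most
of the page width, and merge what is left along the x axis; each merged run
is a column. The function name, the box format and the two thresholds are
assumptions chosen for illustration, and this is not what poppler does:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) of one text line


def detect_columns(boxes: List[Box], page_width: float,
                   min_gap: float = 10.0,
                   max_frac: float = 0.6) -> List[Tuple[float, float]]:
    """Return the (x0, x1) extent of each detected column, left to right."""
    # Skip boxes wider than max_frac of the page (titles, wide figures);
    # they would otherwise glue all the columns together.
    spans = sorted((x0, x1) for x0, _, x1, _ in boxes
                   if (x1 - x0) <= max_frac * page_width)
    columns: List[List[float]] = []
    for x0, x1 in spans:
        if columns and x0 <= columns[-1][1] + min_gap:
            # Overlaps (or nearly touches) the current column: extend it.
            columns[-1][1] = max(columns[-1][1], x1)
        else:
            # A horizontal gap wider than min_gap starts a new column.
            columns.append([x0, x1])
    return [(a, b) for a, b in columns]
```

The number of columns is then just the length of the result, and "go to the
top of the next column" means scrolling to the top of the page with the
next extent in view.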

As a side benefit, it would (probably) be easier to implement a feature that lets the user center-align the current column independent of the zoom level.  Right now, unless I zoom in too far for my tastes, I have the left column left-aligned and the right column right-aligned, which makes me all frowny and such ;)
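
Given a column's horizontal extent, the centering itself is a one-liner. A
sketch, with centered_scroll_x and allow_overscroll as made-up names and all
values in the same (already zoomed) units:

```python
def centered_scroll_x(col_x0: float, col_x1: float, view_w: float,
                      page_w: float, allow_overscroll: bool = True) -> float:
    """Horizontal scroll offset that centers a column in the viewport."""
    x = (col_x0 + col_x1) / 2.0 - view_w / 2.0
    if allow_overscroll:
        # May be negative or exceed page_w - view_w: the viewer would have
        # to show blank space beyond the page edge to honor it.
        return x
    # Clamping to the page is what leaves the left column left-aligned
    # and the right column right-aligned.
    return max(0.0, min(x, max(0.0, page_w - view_w)))
```

The clamped branch reproduces the behaviour complained about above; true
centering needs the viewer to scroll slightly past the page edge.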

Happy hacking :)
Comment 8 GNOME Infrastructure Team 2018-05-22 12:51:51 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/evince/issues/2.