After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 448739 - Evince cannot select text in djvu documents
Evince cannot select text in djvu documents
Status: RESOLVED FIXED
Product: evince
Classification: Core
Component: backends
2.29.x
Other Linux
: Normal normal
: ---
Assigned To: Evince Maintainers
Evince Maintainers
Depends on: 669022
Blocks: 691172 703108
 
 
Reported: 2007-06-18 11:10 UTC by Sebastien Bacher
Modified: 2013-06-29 09:10 UTC
See Also:
GNOME target: ---
GNOME version: 2.19/2.20


Attachments
Implements hightlighing in djvu text selection (8.38 KB, patch)
2013-05-23 11:54 UTC, Jonas Danielsson
needs-work Details | Review
Screenshot (24.65 KB, image/png)
2013-05-23 19:05 UTC, Carlos Garcia Campos
  Details
Updated after review (14.53 KB, patch)
2013-05-24 09:14 UTC, Jonas Danielsson
none Details | Review
Implement get_text_mapping (2.48 KB, patch)
2013-05-24 09:16 UTC, Jonas Danielsson
none Details | Review
Implement text_get_text (1.62 KB, patch)
2013-05-26 19:51 UTC, Jonas Danielsson
none Details | Review
strdup => g_strdup (1.62 KB, patch)
2013-05-27 06:52 UTC, Jonas Danielsson
none Details | Review
Enable selection highlighting (7.14 KB, patch)
2013-06-18 06:01 UTC, Jonas Danielsson
needs-work Details | Review
Implement get_text_mapping (2.48 KB, patch)
2013-06-18 06:01 UTC, Jonas Danielsson
none Details | Review
Implement get_text (1.62 KB, patch)
2013-06-18 06:02 UTC, Jonas Danielsson
none Details | Review
Updated after review (10.81 KB, patch)
2013-06-26 08:18 UTC, Jonas Danielsson
committed Details | Review

Description Sebastien Bacher 2007-06-18 11:10:37 UTC
The bug has been opened on https://bugs.launchpad.net/ubuntu/+source/evince/+bug/120860

"Binary package hint: evince

Evince does not appear to be able to select text in djvu documents, even though djview can select text when viewing the same document.
...
http://research.microsoft.com/~simonpj/Papers/pj-lester-book/student.djvu

with djview (which I installed using apt-get) it is possible to use one of the selection tools to select text within the document. evince can select text in PDF (and I believe PostScript documents) but it does not recognize that there is any selectable text in a djvu document (such as the above).
..."
Comment 1 Cosimo Cecchi 2007-07-03 14:28:31 UTC
can confirm this, but evince is actually able to select and copy text, but the normal selection rectangle is not displayed. infact, if you mimic the selection gesture on a word, right-click -> copy and paste it somewhere, it is actually pasted! i'll investigate on this.
Comment 2 Carlos Garcia Campos 2007-09-03 12:36:18 UTC
Text extraction is implemented, but not selection rendering. Confirmed. 
Comment 3 Emmanuel Fleury 2009-11-20 13:58:18 UTC
Still here in Evince 2.29.2 (git master). 

Updating version number.
Comment 4 Joachim Breitner 2011-10-10 13:19:34 UTC
Still in 2.32.0 (though I cannot update the version number).
Comment 5 Germán Poo-Caamaño 2013-01-05 20:20:36 UTC
*** Bug 691172 has been marked as a duplicate of this bug. ***
Comment 6 Jonas Danielsson 2013-05-23 11:54:59 UTC
Created attachment 245118 [details] [review]
Implements hightlighing in djvu text selection

Hi,

This patch implements highlighting when selecting text in djvu documents.
It implements the interface functions:
    iface->render_selection
and
    iface->get_selection_region

What do you think of the approach?

Thank you
Jonas
Comment 7 Carlos Garcia Campos 2013-05-23 19:01:18 UTC
Review of attachment 245118 [details] [review]:

Looks pretty good in general, thanks a lot.

::: backend/djvu/djvu-document.c
@@ +609,3 @@
+		r->y2 = height - tmp   * 72.0 / dpi;
+
+		cairo_set_source_rgba (cr, base->red, base->green, base->blue,

You can't use the given color components directly, GdkColor uses values between 0 and 65535, while cairo uses double values between 0 and 1, so they should be converted by dividing by 65535.

@@ +615,3 @@
+		cairo_set_line_width (cr, 0.5);
+		cairo_set_source_rgb (cr, base->red, base->green, base->blue);
+		cairo_stroke (cr);

Instead of doing this, similar to what we do for search highlights, I think it might look better if we render the selection on top of the page using the multiply operator. For that we would need to modify the selection interface to receive the page surface too, other backends can ignore the page surface. Here we could paint the page surface into the selection surface and then the selection rectangles on top using the multiply operator, and rectangle + fill.

@@ +673,3 @@
+		r->x2 *= 72.0 / dpi;
+		r->y1  = height - r->y2 * 72.0 / dpi;
+		r->y2  = height - tmp   * 72.0 / dpi;

This is duplicated in several place, we might add a helper function to convert a doc rect to view rect.
Comment 8 Carlos Garcia Campos 2013-05-23 19:05:45 UTC
Created attachment 245175 [details]
Screenshot

I've made a dirty hack to quickly check how it looks using the multiply blend mode. This is a screenshot showing the current patch (with the colors fixed) and my hack to use multiply blend mode. I'm using 0.5 alpha in both cases, but I think it looks even better using 0.75 for example. It looks quite similar in the end, but with the current patch you can see the selection is actually over the text while with multiply you can see it blended. What do you guys think?
Comment 9 José Aliste 2013-05-23 19:09:50 UTC
There is one thing with the patch that I don't get the cursor change with djvu whil I do get it with pdf, maybe we need to implement another func?
Comment 10 Jonas Danielsson 2013-05-23 19:17:54 UTC
(In reply to comment #9)
> There is one thing with the patch that I don't get the cursor change with djvu
> whil I do get it with pdf, maybe we need to implement another func?

Yes, I am working on the get_text_mapping implementation as well :)
Comment 11 Jonas Danielsson 2013-05-23 19:19:23 UTC
Thanks a lot for the review and comments!

(In reply to comment #7)
> Review of attachment 245118 [details] [review]:
> 
> Looks pretty good in general, thanks a lot.
> 
> ::: backend/djvu/djvu-document.c
> @@ +609,3 @@
> +        r->y2 = height - tmp   * 72.0 / dpi;
> +
> +        cairo_set_source_rgba (cr, base->red, base->green, base->blue,
> 
> You can't use the given color components directly, GdkColor uses values between
> 0 and 65535, while cairo uses double values between 0 and 1, so they should be
> converted by dividing by 65535.
> 

Oh, thanks!

> @@ +615,3 @@
> +        cairo_set_line_width (cr, 0.5);
> +        cairo_set_source_rgb (cr, base->red, base->green, base->blue);
> +        cairo_stroke (cr);
> 
> Instead of doing this, similar to what we do for search highlights, I think it
> might look better if we render the selection on top of the page using the
> multiply operator. For that we would need to modify the selection interface to
> receive the page surface too, other backends can ignore the page surface. Here
> we could paint the page surface into the selection surface and then the
> selection rectangles on top using the multiply operator, and rectangle + fill.
> 

Thans, that sounds like a better thing to do!

> @@ +673,3 @@
> +        r->x2 *= 72.0 / dpi;
> +        r->y1  = height - r->y2 * 72.0 / dpi;
> +        r->y2  = height - tmp   * 72.0 / dpi;
> 
> This is duplicated in several place, we might add a helper function to convert
> a doc rect to view rect.

Agreed.
Comment 12 Jonas Danielsson 2013-05-23 19:21:59 UTC
(In reply to comment #8)
> Created an attachment (id=245175) [details]
> Screenshot
> 
> I've made a dirty hack to quickly check how it looks using the multiply blend
> mode. This is a screenshot showing the current patch (with the colors fixed)
> and my hack to use multiply blend mode. I'm using 0.5 alpha in both cases, but
> I think it looks even better using 0.75 for example. It looks quite similar in
> the end, but with the current patch you can see the selection is actually over
> the text while with multiply you can see it blended. What do you guys think?

I don't have any strong feelings towards one or the other, I'll try to implement it myself and see it in action.

Thanks!
Comment 13 Jonas Danielsson 2013-05-24 09:14:44 UTC
Created attachment 245216 [details] [review]
Updated after review

This patch implements the changes suggested in review, it also prepares for
implementing get_text_mapping by breaking out selected_region with a helper function.

See patch for get_text_mapping below.
Comment 14 Jonas Danielsson 2013-05-24 09:16:28 UTC
Created attachment 245217 [details] [review]
Implement get_text_mapping

I'm not 100% sure what the other functions in this interface are for.

The get_text and get_text_layout.

What would be the effect of implementing them?
Comment 15 José Aliste 2013-05-24 14:08:09 UTC
Jonas, I think (not sure) they are needed for accesibility, that is so Orca or another screen reader can read the contents of a djvu file.
Comment 16 Jonas Danielsson 2013-05-26 19:51:33 UTC
Created attachment 245357 [details] [review]
Implement text_get_text

Hi, this implements text_get_text of the text interface.
This allows Orca to read the djvu-document on my machine.

I have not implemented text_get_text_layout, and I'm not sure It's possible with the current backend. But I haven't looked really hard. And I'm also not really sure what it's used for. Tried reading some Atk docs that I found googling, but didn't get wiser.

Anyone knows?
Comment 17 Jonas Danielsson 2013-05-26 20:04:40 UTC
Ah, maybe for reading from where the user presses in the document?
Comment 18 Jonas Danielsson 2013-05-27 06:52:31 UTC
Created attachment 245370 [details] [review]
strdup => g_strdup
Comment 19 Jonas Danielsson 2013-06-18 06:00:59 UTC
Hi, since Jasons patch has been commited (https://bugzilla.gnome.org/show_bug.cgi?id=669022), I've respun these patches do use the fallback method of rendering the selection in the view.

I have three patches.

One that enables selection highlighting by implementing get_selection_region
for djvu.
One that implements text_get_text_mapping. To get the nice mouse pointer effect
when hovering over text. And one that implements text_get_text to enable screen readers to get the content of the text.

Is that configuration ok, or do you want it as one patch, or two? or not at all?

Thanks
Jonas
Comment 20 Jonas Danielsson 2013-06-18 06:01:32 UTC
Created attachment 247087 [details] [review]
Enable selection highlighting
Comment 21 Jonas Danielsson 2013-06-18 06:01:50 UTC
Created attachment 247088 [details] [review]
Implement get_text_mapping
Comment 22 Jonas Danielsson 2013-06-18 06:02:10 UTC
Created attachment 247089 [details] [review]
Implement get_text
Comment 23 Carlos Garcia Campos 2013-06-20 13:23:13 UTC
Review of attachment 247087 [details] [review]:

Looks great, there are a few minor details, and I have some doubts about the implementation.

::: backend/djvu/djvu-document.c
@@ +550,3 @@
+	rectangle->x2 *= 72.0 * scale / dpi;
+	rectangle->y1 = height * scale - rectangle->y2 * 72.0 * scale / dpi;
+	rectangle->y2 = height * scale - tmp * 72.0 * scale / dpi;

This is confusing, and it's actually my fault because I suggested it. We normally use a GdkRectangle for view rects. But here we still want a EvRectangle, so what we really want is to apply the dpi and invert the Y axis, I thin it's clearer to leave the previous code in djvu_document_find_find_text, since the 3 conversions we need here are actually different to each other. Sorry for the confusion.

@@ +554,3 @@
+
+static GList *
+djvu_selection_get_highlight_rects (DjvuDocument    *djvu_document,

why highlight? djvu_selection_get_selection_rects sounds better to me.

@@ +568,3 @@
+	rectangle.y1 = (height - points->y2) * dpi / 72.0;
+	rectangle.x2 = points->x2 * dpi / 72.0;
+	rectangle.y2 = (height - points->y1) * dpi / 72.0;

This would be done by the helper function I mentioned before.

@@ +578,3 @@
+
+		djvu_text_page_highlight (tpage, page_text, &rectangle);
+		matches  = tpage->results;

Why matches? looks like we are searching something

@@ +597,3 @@
+
+	document_get_page_size (djvu_document, page, &width, &height,
+				&dpi);

This could be a single line, next line is even longer.

@@ +609,3 @@
+		rect.x      = (gint) ((r->x1 * scale) + 0.5);
+		rect.y      = (gint) ((r->y1 * scale) + 0.5);
+		rect.width  = (gint) (((r->x2 - r->x1) * scale) + 0.5);

Don't line up this expressions, please.

@@ +625,3 @@
+				     EvRectangle     *points)
+{
+	DjvuDocument   *djvu_document = DJVU_DOCUMENT (selection);

Extra spaces between DjvuDocument and *djvu_document

@@ +660,3 @@
 {
+	iface->get_selected_text    = djvu_selection_get_selected_text;
+	iface->get_selection_region = djvu_selection_get_selection_region;

Do not line up these either.

::: backend/djvu/djvu-text-page.c
@@ +374,3 @@
 
+void
+djvu_text_page_highlight (DjvuTextPage *page,

This should be someting like djvu_text_page_get_selection, and it could return a GList instead of using results

@@ +388,3 @@
+
+	djvu_text_page_limits (page, page->text_structure, rectangle);
+	djvu_text_page_selection (page, page->text_structure, 0);

We are not interested in the page itself, what we need is a function like djvu_text_page_selection but that only fills the array of DjvuTextLink

@@ +397,3 @@
+		if (!(needle = strchr (haystack, '\n'))) {
+		    needle = strchr (haystack, '\0');
+		    at_end = 1;

Why not iterating the DjvuTextLink array

@@ +407,3 @@
+			result = djvu_text_page_box (page, start, end);
+			page->results = g_list_prepend (page->results, result);
+			next_word_start = end_p + 1;

I think this is based on search implementation where we have a text to find, but here we have coordinates and we want the bounding box the the text in those coordinates, the actual text itself is not important.
Comment 24 Jonas Danielsson 2013-06-26 08:17:39 UTC
(In reply to comment #23)
> Review of attachment 247087 [details] [review]:
> 
> Looks great, there are a few minor details, and I have some doubts about the
> implementation.

Thanks for review! A new patch will be attached below.
I will obsolete the old patches and start a new issue with the text interface.

> 
> ::: backend/djvu/djvu-document.c
> @@ +550,3 @@
> +    rectangle->x2 *= 72.0 * scale / dpi;
> +    rectangle->y1 = height * scale - rectangle->y2 * 72.0 * scale / dpi;
> +    rectangle->y2 = height * scale - tmp * 72.0 * scale / dpi;
> 
> This is confusing, and it's actually my fault because I suggested it. We
> normally use a GdkRectangle for view rects. But here we still want a
> EvRectangle, so what we really want is to apply the dpi and invert the Y axis,
> I thin it's clearer to leave the previous code in djvu_document_find_find_text,
> since the 3 conversions we need here are actually different to each other.
> Sorry for the confusion.

No problem I'm way more confused anyhow. New suggestion and new helper function in patch below.

> 
> @@ +554,3 @@
> +
> +static GList *
> +djvu_selection_get_highlight_rects (DjvuDocument    *djvu_document,
> 
> why highlight? djvu_selection_get_selection_rects sounds better to me.
> 

Agreed.

> @@ +568,3 @@
> +    rectangle.y1 = (height - points->y2) * dpi / 72.0;
> +    rectangle.x2 = points->x2 * dpi / 72.0;
> +    rectangle.y2 = (height - points->y1) * dpi / 72.0;
> 
> This would be done by the helper function I mentioned before.
> 

Sure.

> @@ +578,3 @@
> +
> +        djvu_text_page_highlight (tpage, page_text, &rectangle);
> +        matches  = tpage->results;
> 
> Why matches? looks like we are searching something
> 

Agreed.

> @@ +597,3 @@
> +
> +    document_get_page_size (djvu_document, page, &width, &height,
> +                &dpi);
> 
> This could be a single line, next line is even longer.
> 

Yes.

> @@ +609,3 @@
> +        rect.x      = (gint) ((r->x1 * scale) + 0.5);
> +        rect.y      = (gint) ((r->y1 * scale) + 0.5);
> +        rect.width  = (gint) (((r->x2 - r->x1) * scale) + 0.5);
> 
> Don't line up this expressions, please.
> 

Ok, sorry about that, personal style I guess.

> @@ +625,3 @@
> +                     EvRectangle     *points)
> +{
> +    DjvuDocument   *djvu_document = DJVU_DOCUMENT (selection);
> 
> Extra spaces between DjvuDocument and *djvu_document

Thanks.

> 
> @@ +660,3 @@
>  {
> +    iface->get_selected_text    = djvu_selection_get_selected_text;
> +    iface->get_selection_region = djvu_selection_get_selection_region;
> 
> Do not line up these either.
> 

Ok!

> ::: backend/djvu/djvu-text-page.c
> @@ +374,3 @@
> 
> +void
> +djvu_text_page_highlight (DjvuTextPage *page,
> 
> This should be someting like djvu_text_page_get_selection, and it could return
> a GList instead of using results
> 

I will use djvu_text_page_get_selection and make it process text or boxes depending on selection variable. See patch suggestion below.


> @@ +388,3 @@
> +
> +    djvu_text_page_limits (page, page->text_structure, rectangle);
> +    djvu_text_page_selection (page, page->text_structure, 0);
> 
> We are not interested in the page itself, what we need is a function like
> djvu_text_page_selection but that only fills the array of DjvuTextLink
> 

See patch below for other suggestion.

> @@ +397,3 @@
> +        if (!(needle = strchr (haystack, '\n'))) {
> +            needle = strchr (haystack, '\0');
> +            at_end = 1;
> 
> Why not iterating the DjvuTextLink array
> 
> @@ +407,3 @@
> +            result = djvu_text_page_box (page, start, end);
> +            page->results = g_list_prepend (page->results, result);
> +            next_word_start = end_p + 1;
> 
> I think this is based on search implementation where we have a text to find,
> but here we have coordinates and we want the bounding box the the text in those
> coordinates, the actual text itself is not important.

Agreed.

Thanks!
Comment 25 Jonas Danielsson 2013-06-26 08:18:19 UTC
Created attachment 247795 [details] [review]
Updated after review
Comment 26 Carlos Garcia Campos 2013-06-29 08:34:49 UTC
Review of attachment 247795 [details] [review]:

Pushed to git master with some minor modifications, thanks!