Bug 703108 - Implement the get_text interface for djvu backend
Status: RESOLVED FIXED
Product: evince
Classification: Core
Component: backends
Version: unspecified
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Evince Maintainers
QA Contact: Evince Maintainers
Depends on: 448739
Blocks:
 
 
Reported: 2013-06-26 10:03 UTC by Jonas Danielsson
Modified: 2013-06-29 09:34 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Implementation of the get_text interface for djvu (3.56 KB, patch)
2013-06-26 10:05 UTC, Jonas Danielsson
none
V2. (5.35 KB, patch)
2013-06-28 06:36 UTC, Jonas Danielsson
committed

Description Jonas Danielsson 2013-06-26 10:03:15 UTC
The djvu backend can implement the text_get_text and text_get_text_mapping methods of the get_text interface.

The get_text_layout method needs bounding boxes around the individual characters of each word. The djvu backend is unable to provide those at the moment.

I have a patch to implement get_text_mapping and get_text and will post it shortly.

The patch depends on: https://bugzilla.gnome.org/show_bug.cgi?id=448739
Comment 1 Jonas Danielsson 2013-06-26 10:05:41 UTC
Created attachment 247805
Implementation of the get_text interface for djvu
Comment 2 Christian Persch 2013-06-27 16:38:01 UTC
Is there any guarantee that the returned text is valid UTF-8?
Comment 3 José Aliste 2013-06-28 01:50:17 UTC
I just reread the DjVuLibre API and implementation, and it looks to me like the text is guaranteed to be UTF-8.
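
For illustration only: if that guarantee could not be relied on, the text could be checked before it is handed to the rest of Evince. The sketch below is a standalone validator in plain C, not code from the patch; the name utf8_is_valid is hypothetical. Inside Evince, GLib's g_utf8_validate() would be the more natural call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the first len bytes of s are
 * well-formed UTF-8 (no overlong forms, no surrogates, nothing above
 * U+10FFFF, no truncated sequences). */
static bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned int cp;   /* code point being decoded */
        unsigned int min;  /* smallest code point allowed for this length */

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or invalid lead byte */
        }

        if (len - i <= n)
            return false;               /* sequence is cut off at the end */

        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;               /* overlong, out of range, or surrogate */

        i += n + 1;
    }
    return true;
}
```

A caller would pass the page text and its byte length, and fall back to returning no text (or a sanitized copy) when the check fails.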
Comment 4 Jonas Danielsson 2013-06-28 06:36:21 UTC
Created attachment 247943
V2.

The function djvu_text_prepare_search is somewhat badly named if it is to be used in the get_text function as well.

The v2 patch renames it to djvu_text_page_index_text.
Comment 5 Carlos Garcia Campos 2013-06-29 09:34:36 UTC
Review of attachment 247943:

Split into two patches and pushed to git master, thanks!

::: backend/djvu/djvu-document.c
@@ +692,3 @@
+		djvu_text_page_index_text (tpage, TRUE);
+		text = g_strdup (tpage->text);
+		djvu_text_page_free (tpage);

Since we are going to free the page here, we can steal the text instead of duplicating it.
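
A minimal sketch of what "stealing" means here, in plain C rather than the backend's GLib-based code. The TextPage struct and the text_page_* functions are hypothetical stand-ins for the djvu text page, and the sketch assumes the page's free function releases the text field, so the field must be cleared once ownership moves to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the backend's text page; only the text field matters here. */
typedef struct {
    char *text;
} TextPage;

static TextPage *text_page_new(const char *contents)
{
    TextPage *page = malloc(sizeof *page);
    assert(page != NULL);
    page->text = malloc(strlen(contents) + 1);
    assert(page->text != NULL);
    strcpy(page->text, contents);
    return page;
}

/* Assumed behaviour: freeing the page also frees whatever text it still owns. */
static void text_page_free(TextPage *page)
{
    if (page == NULL)
        return;
    free(page->text);   /* free(NULL) is a no-op, so a stolen text is safe */
    free(page);
}

/* Hand the text buffer to the caller and clear the field, so the later
 * text_page_free() neither frees nor touches the returned buffer. */
static char *text_page_steal_text(TextPage *page)
{
    char *text = page->text;
    page->text = NULL;
    return text;
}
```

Compared with duplicating the string, this saves one allocation and one copy per page; the caller still owns the result and frees it when done.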