Bug 321568 – HEAD fails to compile with CHM support

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 321568 - HEAD fails to compile with CHM support


Summary:	HEAD fails to compile with CHM support


Status:	RESOLVED FIXED

Product:	beagle
Classification:	Other
Component:	General
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Beagle Bugs
QA Contact:	Beagle Bugs

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2005-11-16 00:49 UTC by Ryan P Skadberg
Modified:	2006-06-15 21:30 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
FilterChm.cs ChmFile.cs update (11.15 KB, patch) 2006-05-18 01:20 UTC, Miguel Cabrera	committed	Details \| Review

Description Ryan P Skadberg 2005-11-16 00:49:57 UTC

Version details: HEAD
Distribution/Version: Fedora Core Rawhide

Get this when trying to compile CVS HEAD with CHM support.

./FilterChm.cs(57,21): error CS0103: The name `WalkChildNodesForText' does not
exist in the context of `Beagle.Filters.FilterChm'
./FilterChm.cs(160,5): error CS0103: The name `WalkNodes' does not exist in the
context of `Beagle.Filters.FilterChm'

I'm running chmlib-0.37.4-1

Compiled fine when I removed chmlib.

Comment 1 Debajyoti Bera 2005-11-16 01:28:48 UTC

Oops. FilterChm needs to be changed due to the recent changes in FilterHtml.
Didnt knew that there was a customer of FilterHtml :P.

Comment 2 Miguel Cabrera 2006-05-18 01:20:07 UTC

Created attachment 65742 [details] [review]
FilterChm.cs ChmFile.cs update

Comment 3 Miguel Cabrera 2006-05-18 01:26:22 UTC

ooops :P
I've updated the FilterChm and ChmFile classes (see above the patch).  before this update FilterChm only indexed the toc file and the default page. Now it filters all the html files inside an chm file pretty fast thanks to the improvements made by D Bera to the HtmlDocument in the HtmlAgilityPack.
I also update the glue to libchm which was obsolete due to changes in the libchm API.

Comment 4 Lukas Lipka 2006-05-18 13:05:07 UTC

diff -Naur beagle/Filters/HtmlAgilityPack/HtmlDocument.cs beagle-mike/Filters/HtmlAgilityPack/HtmlDocument.cs
--- beagle/Filters/HtmlAgilityPack/HtmlDocument.cs	2006-04-26 20:03:07.000000000 -0500
+++ beagle-mike/Filters/HtmlAgilityPack/HtmlDocument.cs	2006-05-17 19:55:24.000000000 -0500
@@ -1928,7 +1928,7 @@
 					   _lastparentnode.AppendChild(_currentnode);
 					}
 
-	ReadDocumentEncoding(_currentnode);
+	//ReadDocumentEncoding(_currentnode);

Why don't we want to call the ReadDocumentEncoding ()?

Comment 5 Miguel Cabrera 2006-05-18 14:52:31 UTC

jeje
I forgot to mention that!
Well,  It's because it raises a NullPointer Exception. I don't know why, but it's related to the modifications made by D Bera to HtmlDocument.  I've tested the filter with this line commented out and it work properly. The FilterHtml don't use this interfece to parse the Html, use another overloaded function, so It wont affect the Html Filter.  As a note, when I tried to detect the encoding by hand it also raised the same exception, so i commented out that line. I think HtmlDocument will use the default ASCCI encoding.
I think the only one who can clarfify this to us is D Bera.

Comment 6 Joe Shaw 2006-06-01 14:50:16 UTC

From what I see on the CHM files I've tried, leaving the ReadDocumentEncoding() line causes an EncodingFoundException to be thrown.  In other places in the HtmlAgilityPack code, that exception is caught and then the file is reparsed with the detected encoding.

In looking through the code, I'd suggest that the function passed into ChmFile.ParseContents() not take a TextReader, but instead take a string, since you are just converting a string into a StringReader anyway.  That way, you can properly catch the EncodingFoundException and re-call doc.Load() with the given encoding.  (Because you can't pass in an encoding to doc.Load(TextReader), bleh)

Once that's fixed, I think it's probably fine to go in.

Comment 7 Debajyoti Bera 2006-06-01 19:13:20 UTC

Sorry for the long silence. I was out of town.
The last time I checked, I was unable to use FilterChm due to some library problems. But if the exception is due to EncodingFoundException, then it should be fixed by the commit I made just now. That should also fix the occassional EncodingFoundExceptions I saw in log files but didnt know where it was coming from.
Now you can either use your Load(TextReader) code or use the suggestion by Joe.

Comment 8 Miguel Cabrera 2006-06-01 21:15:15 UTC

Well, Joe, what should I do? 
Should modify the source or it's fine the way it is?

Comment 9 Debajyoti Bera 2006-06-01 21:38:37 UTC

If my commits removed the crashes/exceptions (which should be the case but I was not able to test), then I'd suggest using the current code. All the logic for detecting encoding etc. is (supposed to be) handled correctly by Load(TextReader). The logic for detecting encoding separately and then calling Load() with the encoding is used in Html filter for different reasons; it shouldnt be needed for Chm files.

Comment 10 Joe Shaw 2006-06-09 22:21:18 UTC

This looks good to check in sans the HtmlDocument.cs change.

Comment 11 Joe Shaw 2006-06-15 21:30:25 UTC

Checked this in without the HtmlDocument.cs change.  Thanks!