GNOME Bugzilla – Bug 78723
strip html markup out of articles
Last modified: 2004-12-22 21:47:04 UTC
It seems there's a particular user whose posts I cannot view in Pan. Here is one example of his post that does not show any text in Pan 0.11.2.91 (as of 4/13/2002). There are other posts by the same author in the same group which show up as blanks:

Path: eagle.america.net!falcon.america.net!newsfeeds-atl2!news-out.visi.com!hermes.visi.com!news.maxwell.syr.edu!feed.cgocable.net!feed.tor.primus.ca!feed.nntp.primus.ca!news.tor.primus.ca!not-for-mail
Message-ID: <3CB2B783.F4C04391@iprimus.com.au>
From: Spidy <spidy@iprimus.com.au>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.sys.sgi.admin
Subject: Re: PC CDs on SGI
References: <a5v3buse3cij8i44i5nm1lvmpaof9n8akc@4ax.com> <lfqs8.59331$GF1.8961393@typhoon.nyroc.rr.com>
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Original-NNTP-Posting-Host: 203.134.131.141
Lines: 33
X-Original-NNTP-Posting-Host: 127.0.0.1
Organization: iPrimus Customer - reports relating to abuse should be sent to abuse@iprimus.com.au
Date: Tue, 09 Apr 2002 09:47:19 GMT
NNTP-Posting-Host: 203.134.67.67
X-Complaints-To: news@primus.ca
X-Trace: news.tor.primus.ca 1018345639 203.134.67.67 (Tue, 09 Apr 2002 05:47:19 EDT)
NNTP-Posting-Date: Tue, 09 Apr 2002 05:47:19 EDT
Xref: falcon.america.net comp.sys.sgi.admin:167567
X-Received-Date: Tue, 09 Apr 2002 08:21:26 EDT (eagle.america.net)
This person is posting HTML to Usenet, and Pan is looking for a text/plain body, not a text/html one. Maybe Pan should put in the text window "[Pan] Some bozo's posting HTML to Usenet" so that the user will know _why_ the message is blank. Or maybe we should dump the text to the screen. Or maybe we should strip out all the html and dump what's left. Opinions?
Other people have asked for Pan to strip out html markup; that would probably be the nicest resolution.
Created attachment 7826: the testcase mentioned by the original bug reporter
Partial fix:
http://cvs.gnome.org/bonsai/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=pan/pan&command=DIFF_FRAMESET&file=text.c&rev1=1.277&rev2=1.278&root=/cvs/gnome

We now show the raw html. I'm leaving this open because it would be nice to also strip out the html markup, at least at a primitive level. Pan shouldn't grok <table></table> markup, but might be able to replace <p> with "\n\n", <br> with "\n", and <li> with "\n*".
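For illustration only, the primitive substitution described above could look roughly like the sketch below. This is not the code in text.c: the function name strip_html_primitive and the helper tag_is are made up for this example, and the sketch assumes that every tag other than <p>, <br> and <li> is simply dropped. It does not decode entities such as &amp;, and a literal '<' that is followed later by a '>' will be eaten as if it were a tag.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitively compare a tag name of length `len` against the
 * lowercase literal `want`. Hypothetical helper for this sketch. */
static int tag_is(const char *name, size_t len, const char *want)
{
    size_t i;

    if (strlen(want) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)name[i]) != want[i])
            return 0;
    return 1;
}

/* Hypothetical sketch, not Pan's implementation.
 * Replaces <p> with "\n\n", <br> with "\n", <li> with "\n*" and drops
 * every other tag. Returns a newly allocated string that the caller
 * must free(), or NULL if allocation fails. */
char *strip_html_primitive(const char *html)
{
    /* No replacement is longer than the tag it replaces, so the
     * output never exceeds the input length. */
    char *out = malloc(strlen(html) + 1);
    char *dst = out;
    const char *p = html;

    if (out == NULL)
        return NULL;

    while (*p != '\0') {
        const char *close, *name, *end;

        /* Plain text, or a '<' with no closing '>': copy literally. */
        if (*p != '<' || (close = strchr(p, '>')) == NULL) {
            *dst++ = *p++;
            continue;
        }

        /* The tag name is the run of alphanumerics right after '<'.
         * Closing tags start with '/', so their name is empty here
         * and they fall through to being dropped. */
        name = p + 1;
        for (end = name; end < close && isalnum((unsigned char)*end); end++)
            ;

        if (tag_is(name, (size_t)(end - name), "p")) {
            *dst++ = '\n';
            *dst++ = '\n';
        } else if (tag_is(name, (size_t)(end - name), "br")) {
            *dst++ = '\n';
        } else if (tag_is(name, (size_t)(end - name), "li")) {
            *dst++ = '\n';
            *dst++ = '*';
        }

        p = close + 1;  /* skip past the whole tag */
    }

    *dst = '\0';
    return out;
}
```

Given the input "<html><p>Hi<br>there<ul><li>one</li></ul></html>", this returns "\n\nHi\nthere\n*one". Tags with attributes or a trailing slash (<BR/>, <p align="left">) are matched on the tag name alone.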
Since the bug is fixed, what remains is how much we want to clean up html messages in Pan. Accordingly, I'm re-marking this as a low-priority enhancement.
Created attachment 8042: backport to gtk1 (this was a quick one!). Apply with -p1.
Moving feature requests to 0.13.0. The 0.12.x releases will be used for bugfixes.
Nah, fuckit, let's just leave that ugly html formatting in there.