GNOME Bugzilla – Bug 119285
Posts over ~ 13KB (220 lines) fail
Last modified: 2006-06-18 05:04:08 UTC
Long posts will not post. The error log indicates it might be a library problem, not PAN directly. (Note the wrapping.)
Wed, 06 Aug 2003 12:23:01 - Article "Re: Ugh" not posted.
Wed, 06 Aug 2003 12:23:01 - Usenet posting failed. Check Tools|Log Viewer for more information. Your message was saved in the folder "pan.sendlater"
Wed, 06 Aug 2003 12:23:01 - pan - Usenet posting failed. Check Tools|Log Viewer for more information. Your message was saved in the folder "pan.sendlater"
Wed, 06 Aug 2003 12:23:01 - GLib - giounix.c:397Error while getting flags for FD: Bad file descriptor (9)
Wed, 06 Aug 2003 12:23:01 - GLib - Invalid file descriptor.
Wed, 06 Aug 2003 12:23:01 - GLib - Invalid file descriptor.
Wed, 06 Aug 2003 12:23:02 - GLib - Error flushing string: Bad file descriptor
My glib2 version:
$ rpmq libglib2.0_0
libglib2.0_0-2.2.2-3mdk
I guess it might be a library problem, but it really just looks like you're losing your network connection. FWIW, I've just posted a large chunk of Macbeth to alt.test without a hiccup. Have you been having network troubles lately?
I can reproduce this with my ISP's server, so I don't think it's a network problem (as in an unstable line).
What's happening is that the whole article is sent to the server in a single call to pan_socket_putline(). The full buffer is written on the socket in a single call to g_io_channel_write_chars(): bytes_written equals the full article size on the first call. Pan then attempts to get a response from the server (ack of the post command), but this times out. The socket is closed, which causes the glib errors.
So, two likely reasons:
1. the server doesn't support articles of that size and drops some content.
2. g_io_channel_write_chars() is dropping some of the data (e.g. full TCP buffers) but fails to detect that.
Will do a tcpdump tonight to see what goes on the wire.
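For reference, the failure mode suspected in reason 2 (a write that accepts fewer bytes than requested without the caller noticing) is usually guarded against with a loop like the one below. This is a minimal sketch using plain POSIX write(), not Pan's or GLib's code; write_all() is a hypothetical helper name introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not pan_socket_putline()): keep calling write()
 * until every byte has been accepted by the kernel or a real error occurs.
 * Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    assert(buf != NULL || len == 0);

    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* e.g. EBADF once the socket is closed */
        }
        sent += (size_t)n;      /* short write: advance and send the rest */
    }
    return 0;
}
```

A caller would use write_all(sock_fd, article, article_len) and treat a -1 return as a failed post. Calling it on an already closed descriptor fails with EBADF, the same "Bad file descriptor (9)" seen in the log above.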
Connection has been solid as a rock. The group I was testing in is the cox...test private group, which takes far larger binaries, so server raw size restrictions aren't the problem. In addition, it was Lenroc, also from Cox (and who I just turned on to the list, so you see his postings =:^) that brought it to my attention. He finally saved off the message to an MSWormOS file system and rebooted that, where he used OE to successfully post his missive to the (non-test, but unknown to me name) group he was attempting to post to. Thus, it's not the server, if OE can post it. Hmm.. I wonder if PAN on MSWormOS would have posted it..
Duncan: do you compile from source? Could I send a patch for you to test?
braindump, since I'm away for the weekend: pan_socket_putline works a lot better under these conditions when replacing g_io_channel_write_chars() with g_io_channel_write() (or its gnet wrapper, gnet_io_channel_writen()). The main differences between write_chars and write are a) write_chars supports encoding (which we don't use) and b) write_chars is buffered. Yeah, the buffered version drops data and the non-buffered version doesn't. Go figure. :)
> Duncan: do you compile from source?
> Could I send a patch for you to test?
Yes, I compile from source (tarball). However, I haven't gotten much into patching, tho I know the basics. I can probably do it, but it's not yet second-nature to me, which means tho I'm willing, I may not get to it in a real timely manner, since I tend to put off stuff like that if I'm too tired, and I am working more hours again now and am still recovering from being sick earlier this week. However, bring on the patch, and we'll see how it goes! There's someone else doing some testing on the problem in the cox...test group, and either getting around it, or it doesn't occur with repeated info.. I pointed them here, and will log any new info I get off that if they don't.
Chris: I didn't get around to this last weekend; do you want to keep looking at this?
Created attachment 19123: Patch for unbuffered writes
Duncan: if you have some time, could you try this patch? It's a one file patch for pan/sockets.c, so this should be fairly straightforward. Charles: I wouldn't mind an extra pair of eyes. The above patch works better in my ISP's server, though I don't really understand why. :) Mind you, if you're not planning to do another beta, I'd punt this to 0.14.2.
No, we've had far too many changes to go out without another beta. I'm planning on doing another beta later this week.
Duncan: ping (Yes, the ping lag times are getting smaller... I'm itching to put out another release. I don't think Pan's bugzilla bin has ever been this empty! :) ((knocks wood))
Patched successfully.. recompiling now.. (and hoping my old Athlon o/ced for too long while running 100% cpu doing distributed.net doesn't start faulting like it tends to do if it gets too overworked for too long.. despite the fact I don't o/c it any more.. <g>) Compiled fine.. launches.. testing..
The patch DOES seem to fix it!! The full original post that wouldn't post before does now, as does the same thing doubled to nearly 500 lines! (Sorry for the delay, there. As I said, patching isn't second nature enough for me to do it when I'm tired, yet, and... A day off once in a while definitely helps there -- I've been sick most of them lately, but I enjoyed this one, even if I DID sleep 14 hours of it! <g>)
Chris: the patch looks OK. Any idea why GIOChannel behaves that way?
Not sure. I don't think write_chars() should drop data (though tcpdump confirmed it does), but perhaps we're using the buffered API incorrectly. I need to set up an isolated testcase to check why this is happening. For now, we can use the workaround in the patch. BTW, I'm out (again) till Tuesday, so feel free to commit if you plan to release the next beta.
Committed the workaround and bumped a review for 0.14.2. http://cvs.gnome.org/bonsai/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=pan/pan&command=DIFF_FRAMESET&file=sockets.c&rev1=1.122&rev2=1.123&root=/cvs/gnome
Mass-bumping of 0.14.2 features to 0.14.3 to make way for an emergency 0.14.2 release.
Chris: is this still an issue?
The workaround hasn't introduced any side effects. As for the core issue: I filed bug #122291, but haven't gotten any feedback. Feel free to close: I can track it through 122291.