GNOME Bugzilla – Bug 575183
pass_fd function now fails on Solaris
Last modified: 2011-05-04 13:33:55 UTC
I notice that in GNOME 2.25 (with vte 0.19.4) gnome-pty-helper does not seem to be working properly. With the older GNOME 2.24 version (with vte 0.17.4), I would see this output when I built VTE with debug enabled and set VTE_DEBUG=pty:

---
Sent request to helper.
Received response from helper.
Helper returns success.
Tag = 80682b0.
Got master pty 21 and slave pty 22.
Setting up child pty: name = /dev/pts/3, fd = 22
Starting new session
Setting size on fd 22 to (80,24).
Returning ptyfd = 21, child = 13860.
Setting size on fd 21 to (80,24).
Size on fd 21 is (80,24).
---

However, with 2.25 (with vte 0.19.4) I see this output (with some comments added to explain the difference):

---
Sent request to helper.
Received response from helper.
Helper returns success.
Tag = 8068c18.
# Note that there was a failure here and instead of getting the "Got master
# pty # and slave pty #" message, it instead falls back and uses the
# _vte_pty_open_unix98 function.
#
Allocated pty on fd 19.
PTY slave is `/dev/pts/3'.
Setting up child pty: name = /dev/pts/3, fd = -1
Starting new session
Setting size on fd 19 to (80,24).
Returning ptyfd = 19, child = 127.
Setting size on fd 19 to (80,24).
Size on fd 19 is (80,24).
---

Digging into the code, I found that gnome-pty-helper was exiting when the open_ptys() function called pass_fd() and pass_fd() returned -1. This was happening because the pass_fd function has these lines:

  if (sendmsg (client_fd, &msg, 0) != 1)
    return -1;

I notice that on Solaris it is falling into the "return -1" and errno is 9 (EBADF).

If I revert the pass_fd code so it is the same as in the older 0.17.4 version of VTE, then the problem goes away. I'm unsure what the right fix is here, though. The newer code seems cleaner, so it would be nicer to fix it than to revert to the older code. I am hoping the maintainers might have some suggestions on how to approach fixing this.
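For readers who have not seen descriptor passing before, here is a minimal sketch of the general technique that a function like pass_fd implements: sending an open file descriptor over a Unix-domain socket as SCM_RIGHTS ancillary data, alongside a single byte of ordinary payload. This is not the VTE source. The names pass_fd_sketch and recv_fd_sketch are hypothetical, and the layout of the control buffer is an assumption. It only illustrates why a check like "sendmsg (...) != 1" is reasonable: sendmsg returns the number of payload bytes sent, which is 1 when the iovec holds a single byte.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer big enough for one descriptor; the union keeps it
 * aligned for struct cmsghdr. */
union fd_control {
        struct cmsghdr align;
        char buf[CMSG_SPACE (sizeof (int))];
};

/* Hypothetical helper: send descriptor `fd` over the Unix-domain
 * socket `client_fd`. Returns 0 on success, -1 on failure. */
static int
pass_fd_sketch (int client_fd, int fd)
{
        char dummy = 0;                 /* one byte of ordinary payload */
        struct iovec iov;
        struct msghdr msg;
        union fd_control control;
        struct cmsghdr *cmsg;

        iov.iov_base = &dummy;
        iov.iov_len = 1;

        memset (&msg, 0, sizeof msg);
        memset (&control, 0, sizeof control);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control.buf;
        msg.msg_controllen = sizeof control.buf;

        cmsg = CMSG_FIRSTHDR (&msg);
        assert (cmsg != NULL);          /* buffer holds exactly one header */
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN (sizeof (int));
        memcpy (CMSG_DATA (cmsg), &fd, sizeof (int));

        /* sendmsg returns the payload byte count, which is 1 here */
        if (sendmsg (client_fd, &msg, 0) != 1)
                return -1;
        return 0;
}

/* Hypothetical counterpart: receive one descriptor from `sock`.
 * Returns the new descriptor, or -1 on failure. */
static int
recv_fd_sketch (int sock)
{
        char dummy;
        int fd = -1;
        struct iovec iov;
        struct msghdr msg;
        union fd_control control;
        struct cmsghdr *cmsg;

        iov.iov_base = &dummy;
        iov.iov_len = 1;

        memset (&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = control.buf;
        msg.msg_controllen = sizeof control.buf;

        if (recvmsg (sock, &msg, 0) != 1)
                return -1;

        cmsg = CMSG_FIRSTHDR (&msg);
        if (cmsg == NULL
            || cmsg->cmsg_level != SOL_SOCKET
            || cmsg->cmsg_type != SCM_RIGHTS)
                return -1;

        memcpy (&fd, CMSG_DATA (cmsg), sizeof fd);
        return fd;
}
```

A small round trip over socketpair() (send a descriptor on one end, receive it on the other, then write through the received descriptor) is a convenient way to exercise code of this shape outside the helper.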
That would be the patch from bug 562385 then. I wonder if maybe the CMSG* macros are misdeclared? I simply copied the Linux definitions here... In fact, I'm not even sure the /* Solaris doesn't define these */ comment is true; I simply added it based on some Google Code Search results that pointed in this direction. I'd also be interested in the config.log and config.h files from the gnome-pty-helper/ subdir (not the ones from the main vte directory).
Created attachment 130591: config.h file from gnome-pty-helper subdirectory
Created attachment 130592: config.log from gnome-pty-helper subdir
I notice that on the latest Solaris Nevada/OpenSolaris builds /usr/include/sys/socket.h does define CMSG_LEN and CMSG_SPACE, but does not define CMSG_ALIGN. Note that CMSG_ALIGN is only used by the other two macros, so I don't think the #defines in the VTE code are being used on the latest Solaris Nevada/OpenSolaris builds. That said, I suspect that older versions of Solaris probably do need these.

Anyway, as I have dug further into this problem, I notice that it goes away if I build gnome-pty-helper with optimization turned on (with Sun Studio's -xO4 flag), but it exists if I build without an optimization flag specified or if I build with debug (-g). That seems odd, and makes me think that this problem is likely a bug in the Sun Studio compiler and not in the VTE code.
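To make the macro discussion concrete, here is a sketch of what per-macro fallback definitions can look like. It is not copied from the VTE tree. The bodies follow the shape of the glibc definitions, and the use of size_t alignment is an assumption that may not match what a given Solaris release expects. Guarding each macro separately means a system definition always wins, so on a platform that provides CMSG_LEN and CMSG_SPACE but not CMSG_ALIGN, only the CMSG_ALIGN fallback would be defined, and nothing would use it.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Fallbacks for platforms whose <sys/socket.h> lacks these macros.
 * Assumption: control data is aligned to sizeof (size_t), as in glibc. */
#ifndef CMSG_ALIGN
#define CMSG_ALIGN(len) \
        (((len) + sizeof (size_t) - 1) & ~(sizeof (size_t) - 1))
#endif

#ifndef CMSG_SPACE
#define CMSG_SPACE(len) \
        (CMSG_ALIGN (len) + CMSG_ALIGN (sizeof (struct cmsghdr)))
#endif

#ifndef CMSG_LEN
#define CMSG_LEN(len) \
        (CMSG_ALIGN (sizeof (struct cmsghdr)) + (len))
#endif

/* Hypothetical helper: bytes of control buffer needed to carry one
 * file descriptor. */
static size_t
fd_control_space (void)
{
        return CMSG_SPACE (sizeof (int));
}
```

Whichever definitions end up in effect, CMSG_LEN (n) should never exceed CMSG_SPACE (n), which makes for a cheap sanity check when porting.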
Can you follow up on this? Is this definitely a Solaris compiler bug (-> NOTGNOME)?
Sorry for letting this sit idle for so long. I had not reported this problem to the Sun Studio compiler team before, mostly because I could not figure out how to recreate it in a way that would demonstrate the problem to them. However, I think I was able to work out a fairly standalone test for them and filed a bug today. They should provide some feedback soon, and I will update this bug report with their response. If it turns out that there is any change that should be made to the VTE code to avoid the problem, I will make sure to highlight the issue. If it is a Sun Studio compiler bug, then I'll just close this bug report.
Any update here?
Yes, this was a Sun Studio compiler bug and has been fixed in the compiler. Sorry for the noise.