Bug 747046 – cleanup pty.c




Summary:	cleanup pty.c


Status:	RESOLVED FIXED

Product:	vte
Classification:	Core
Component:	general
Version:	0.39.x
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	VTE Maintainers
QA Contact:	VTE Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2015-03-30 17:19 UTC by Christian Persch
Modified:	2018-11-25 15:49 UTC

See Also:	https://bugzilla.opensuse.org/show_bug.cgi?id=948773 https://launchpad.net/bugs/1516440
GNOME target:	---
GNOME version:	---

Description Christian Persch 2015-03-30 17:19:27 UTC

pty.c is #ifdef hell in need of simplification.

That means to at least remove gnome-pty-helper, since it doesn't do anything over directly opening the ptys in-process apart from the mostly useless utmp logging.

Then have a look at all the defines, decide which ones are still relevant on modern systems, and remove the rest.

Comment 1 Christian Persch 2015-03-30 18:16:33 UTC

https://git.gnome.org/browse/vte/log/?h=wip/pty-cleanup

pty-helper removal is a nice win:  22 files changed, 22 insertions(+), 2944 deletions(-)

Comment 2 Egmont Koblinger 2015-03-30 18:32:08 UTC

> apart from the mostly useless utmp logging

Big +1 from me :)

I never understood the rationale behind utmp/wtmp.  I open/close browser windows/tabs, tcp/udp ports, documents, pictures; start and terminate all kinds of apps etc. all the time, yet there's not a single log record of any of these.  I open/close terminal windows/tabs hundreds of times a day – I can't see why it should be any different and keep a log of them.

But if someone really wants to, I bet he can voluntarily maintain such logs using helper scripts.  (By the way, "ls -l /dev/pts" is already a much better replacement for utmp.)

Let alone bug 52319, bug 317312, bug 465036...

I'd be happy to see [uw]tmp gone for good :)

Comment 3 Christian Persch 2015-03-31 20:52:15 UTC

I pushed the branch to master, but leaving open since there's still much to cleanup.

Comment 4 Dominique Leuenberger 2015-11-18 14:00:07 UTC

(In reply to Christian Persch from comment #0)
>  apart from the mostly useless utmp logging.
> 

Not that useless... there are a bunch of tools relying on it.

Most notably, pam_wheel apparently needs the information to decide who is coming its way (see downstream bug https://bugzilla.suse.com/show_bug.cgi?id=948773 )

Comment 5 Egmont Koblinger 2015-11-18 15:14:45 UTC

How about fixing pam_wheel then? :)

utmp is inherently unreliable. E.g. user A's terminal emulator, which does utmp logging, might perform an unclean exit for whatever reason, leaving the record behind. Then user B can get access to that pts (and not update utmp). No one wants user B to be able to impersonate user A, correct?

Therefore, security sensitive components should never rely on utmp. If they need to figure out who owns a tty, the ownership of the /dev/pts* file is a reasonable choice.
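
As a rough illustration of the ownership check suggested above, here is a minimal Python sketch. The function names `device_owner` and `stdin_tty_owner` are made up for this example; the only facts relied on are that `stat()` reports the owning uid of a device node and that `ttyname()` maps a file descriptor to its terminal path.

```python
import os
import pwd


def device_owner(path):
    """Return (uid, user name) of the owner of a device node such as /dev/pts/3."""
    uid = os.stat(path).st_uid
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry (possible in minimal containers).
        name = None
    return uid, name


def stdin_tty_owner():
    """Owner of the terminal on stdin, or None when stdin is not a terminal."""
    if not os.isatty(0):
        return None
    return device_owner(os.ttyname(0))
```

Unlike a utmp lookup, this reads the kernel's view of the device, so a stale record left behind by a crashed terminal emulator cannot influence the answer.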

The error message in the linked suse bug is "pam_wheel(su:auth): who is running me ?!", so does it really try to use utmp to figure out a piece of information that it already knows straight away? The user ID is, as far as I recall, available to PAM as a pam environment variable, and even if it wasn't, it could do a get[e]uid().

Also, passwordless "su" shouldn't even require an underlying terminal, so I don't see how the terminal is involved in this game at all. It should be about user ID and nothing more – or am I missing something?

> a bunch of tools

What else?

Comment 6 Dominique Leuenberger 2015-11-18 16:16:45 UTC

(In reply to Egmont Koblinger from comment #5)
> How about fixing pam_wheel then? :)

with pleasure... the question will be what will be acceptable by pam
 
> The error message in the linked suse bug is "pam_wheel(su:auth): who is
> running me ?!", so does it really try to use utmp to figure out a piece of
> information that it already knows straight away. The user ID is as far as I
> recall available to PAM as a pam environment variable, and even if it
> wasn't, it could do a get[e]uid().

yes, pam_wheel does rely on utmp:
* pam_wheel calls pam_modutil_getlogin()
and pam_modutil_getlogin is defined in ./libpam/pam_modutil_getlogin.c

that code is probably just so old - that nobody dares touching it - last commit on this file: 2005-11-23
 
will see what I can do with the pam maintainers on that case.

Comment 7 Egmont Koblinger 2015-11-20 00:23:36 UTC

I'm wondering if we could/should bring back optional utmp support via the libutempter library, for those relying on this brain-damaged misfeature :)

Comment 8 Christian Persch 2015-11-20 09:28:44 UTC

I don't think so. So far we've only seen one problem, and that was doing it wrong anyway (basing a security decision on untrustable data; and easily fixed by using the use_uid option to pam_wheel).

Comment 9 Dominique Leuenberger 2015-11-20 09:32:45 UTC

(In reply to Christian Persch from comment #8)
> I don't think so. So far we've only seen one problem, and that was doing it
> wrong anyway (basing a security decision on untrustable data; and easily
> fixed by using the use_uid option to pam_wheel).

I don't mind it - but do we have a full doc available I can smack over a pam maintainer? Maybe they would consider switching to use_uid by default instead of utmp.

Comment 10 Egmont Koblinger 2015-11-20 09:48:11 UTC

Oops, forgot to link https://bugs.launchpad.net/terminator/+bug/1516440.

Scripts relying on "who am i" or "whoami" (one of the two, can't remember which) or "logname" might break. Again these are probably badly written scripts, but still. There's no security involved here.

The comment about "guake" there makes me wonder though if this belongs to vte or the app.

I'm not pro or con bringing it back, just wondering, looking around...

Comment 11 Christian Persch 2015-11-20 10:57:14 UTC

Since the removal was a nice win in code size and maintainability, I'd rather not bring this back.

What exactly are the use cases for [uw]tmp ?
* pam_wheel: Add "use_uid" option. About the unreliability, comment 5 explains one problem, and also there's bug 317312 .
* want to know which pty you're currently on? tty(1), or readlink /proc/$$/fd/0
* want a list of all ptys of your user? ls /dev/pts
* anything else?
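
For scripts that want the same answers programmatically, the two lookups in the list above can be sketched in Python roughly as follows. The helper names `current_tty` and `ptys_owned_by` are illustrative only, and the sketch assumes a Linux-style devpts mount where each pty appears as a numbered entry under /dev/pts.

```python
import os


def current_tty(fd=0):
    """Like tty(1): name of the terminal on fd, or None if fd is not a terminal."""
    try:
        return os.ttyname(fd)
    except OSError:
        return None


def ptys_owned_by(uid, pts_dir="/dev/pts"):
    """Like `ls /dev/pts`, filtered to the numbered entries owned by uid."""
    owned = []
    try:
        names = os.listdir(pts_dir)
    except OSError:
        return owned  # no devpts mounted here
    # Skip non-numeric entries such as "ptmx" and sort 2 before 10.
    for name in sorted((n for n in names if n.isdigit()), key=int):
        path = os.path.join(pts_dir, name)
        try:
            if os.stat(path).st_uid == uid:
                owned.append(path)
        except OSError:
            pass  # the pty went away while we were looking
    return owned
```

Calling `ptys_owned_by(os.getuid())` would then list the current user's ptys without consulting utmp at all.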

Comment 12 Laurent Bigonville 2015-11-26 00:46:26 UTC

I think it's related, with gnome-terminal, getlogin(3) (POSIX) function returns NULL

With xterm, terminator,... the function return my login name.

Comment 13 Matthias Clasen 2015-11-26 01:13:06 UTC

If you read the man page, it is pretty clear that getlogin is not a very reliable or useful function

Comment 14 Laurent Bigonville 2015-11-26 08:02:24 UTC

Well, the fact that getlogin() is not useful doesn't mean it's not used by some applications or that it's not part of the POSIX API, and I don't think it's up to GNOME to declare that kind of interface useless or obsolete.

And if some applications are using it for security purposes, the bug exists in those applications and it should be fixed there.

Comment 15 Egmont Koblinger 2015-11-26 17:17:42 UTC

I have no firm opinion here.

On one hand, I hate the concept of utmp and I'd be happy to see it gone. It's clearly not suitable for pam, and even for other uses it'd be desirable to use another, more reliable source for the data. I'm wondering if it'd be feasible to change getlogin()'s implementation to return the owner of the tty device instead of looking at utmp, and whether glibc folks would be open to such a change. This might break certain other things too, and might go against POSIX.

On the other hand, I understand that some libraries and apps rely on this legacy crap and it would be nice if we didn't break them. Folks writing scripts relying on e.g. "who am i"'s output might not know anything about utmp and not care at all, just expect things to work.

We should take a look if we could add libutempter support with just a few lines of code, as an optional feature (decided at configure time).

Still, it wouldn't be clear to me if it belongs to vte or g-t. I tend to vote for g-t. If we add it to vte, we either make it mandatory (I wouldn't want to do that), or the API becomes quite complex and we make the life of vte-based apps' creators quite complicated, since they'd need to query whether vte supports utempter, only offer the corresponding UI if it does, and compile/install both versions for themselves to be able to develop/verify this option. Provided of course that they want to make it configurable. Life's probably simpler if it's up to the apps. Not sure, though.

Comment 16 Egmont Koblinger 2015-11-29 21:35:06 UTC

libutempter's interface is damn simple... however:

NOTES
     During execution of the privileged process spawned by these functions, SIGCHLD signal handler will be temporarily set to the default action.

vte/g-t don't directly tamper with SIGCHLD, but does glib? I don't want to get into a race condition nightmare.

Comment 17 Christian Persch 2015-11-29 21:58:20 UTC

IMHO we should NOT bring back [uw]tmp support. 

I also don't see what's stopping utempter or anything else from providing a wrapper that makes the utmp entry, then takes its remaining argv[] to exec, waitid() on that, and then removes the utmp entry before exiting. That could be used in the g-t 'custom command' to prefix the real command.
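
The control flow of such a wrapper might look roughly like the Python sketch below. It only demonstrates the bracket-exec-wait-cleanup shape: `add_record` and `remove_record` are hypothetical callbacks standing in for whatever actually writes and removes the utmp entry, and no utmp access happens in this code. It assumes a platform where `os.waitid()` is available (Linux).

```python
import os


def run_wrapped(argv, add_record, remove_record):
    """Run argv as a child process, bracketed by two caller-supplied hooks.

    add_record and remove_record are hypothetical placeholders for the code
    that would create and delete the utmp entry.
    """
    add_record()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: replace ourselves with the real command.
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached when exec failed
        # Parent: block until the child has exited.
        info = os.waitid(os.P_PID, pid, os.WEXITED)
    finally:
        remove_record()
    if info.si_code == os.CLD_EXITED:
        return info.si_status
    return 128 + info.si_status  # killed by a signal, shell-style exit code
```

The cleanup hook sits in a `finally` block so that the record is removed on the normal path and on errors in the parent; it still cannot help if the wrapper itself is killed, which is the same stale-record weakness discussed earlier in this thread.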

(In reply to Laurent Bigonville from comment #14)
> Well, the fact that getlogin() is not useful doesn't mean it's not used by
> some applications or that it's not part of the POSIX API, and I don't think
> it's up to GNOME to declare that kind of interface useless or obsolete.

It is however up to the vte developers to *observe* that utmp is in fact obsolete and useless, and act accordingly in removing support for it.

(In reply to Christian Persch from comment #3)
> I pushed the branch to master, but leaving open since there's still much to
> cleanup.

I have now done all the cleanups I wanted to do.

Comment 18 Laurent Bigonville 2015-11-29 22:36:02 UTC

> (In reply to Laurent Bigonville from comment #14)
> > Well, the fact that getlogin() is not useful doesn't mean it's not used by
> > some applications or that it's not part of the POSIX API, and I don't think
> > it's up to GNOME to declare that kind of interface useless or obsolete.
> 
> It is however up to the vte developers to *observe* that utmp is in fact
> obsolete and useless, and act accordingly in removing support for it.

I guess the glibc should first be modified to not rely on this file to implement some POSIX API then.

Comment 19 Egmont Koblinger 2015-11-30 19:29:17 UTC

(In reply to Christian Persch from comment #17)

> I also don't see what's stopping utempter or anything else from providing a
> wrapper that makes the utmp entry, then takes its remaining argv[] to exec,
> waitid() on that, and then removes the utmp entry before exiting. That could
> be used in the g-t 'custom command' to prefix the real command.

Yup.

Or have a line in .profile that calls the utility to write a record, and a line in .bash_logout to call that utility again to remove it. If your bash crashes, you'll be left with an outdated record, which can happen anyways (if the terminal emulator crashes, or if you intentionally misuse the utmp-writer utility), so it's not a big deal.

Distributions should go ahead and implement this, perhaps as part of libutempter, and get it accepted by mainstream...

Comment 20 Mantas Mikulėnas (grawity) 2015-12-07 11:19:09 UTC

(In reply to Egmont Koblinger from comment #19)
> (In reply to Christian Persch from comment #17)
> 
> > I also don't see what's stopping utempter or anything else from providing a
> > wrapper that makes the utmp entry, then takes its remaining argv[] to exec,
> > waitid() on that, and then removes the utmp entry before exiting. That could
> > be used in the g-t 'custom command' to prefix the real command.
> 
> Yup.
> 
> Or have a line in .profile that calls the utility to write a record, and a
> line in .bash_logout to call that utility again to remove it. If your bash
> crashes, you'll be left with an outdated record, which can happen anyways
> (if the terminal emulator crashes, or if you intentionally misuse the
> utmp-writer utility), so it's not a big deal.

That only works when the terminal is configured to launch shells in "login" mode, doesn't it? (Otherwise there is no .bash_logout equivalent aside from bash-specific voodoo.) Plus it might conflict with such layers as tmux or Screen.

I feel like this is actually a kernel problem. Besides utmp, getlogin() actually already has another, reliable source of information – /proc/$$/loginuid [updated by pam_loginuid]. Unfortunately that /proc entry is hidden behind CONFIG_AUDITSYSCALL, which many distros disable due to the performance issues it causes... If someone were to split loginuid&sessionid support into a separate Kconfig option, such utmp hacks would no longer be necessary.
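
For reference, reading that /proc entry is straightforward where it exists. The Python sketch below is an illustration, not how any particular libc does it; the function name `read_loginuid` is made up, and the value 4294967295 is the kernel's "not set" marker, i.e. (uid_t)-1.

```python
UNSET_LOGINUID = 4294967295  # (uid_t)-1: no login uid has been set


def read_loginuid(path="/proc/self/loginuid"):
    """Return the audit login uid as an int, or None if unavailable or unset."""
    try:
        with open(path) as f:
            value = int(f.read().strip())
    except (OSError, ValueError):
        # File missing (kernel built without audit support, or not Linux)
        # or content not a number.
        return None
    return None if value == UNSET_LOGINUID else value
```

A caller would have to treat `None` as "fall back to some other source", which is exactly the portability gap described above.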

Comment 21 Mantas Mikulėnas (grawity) 2015-12-07 11:20:01 UTC

(In reply to Mantas Mikulėnas from comment #20)
> (In reply to Egmont Koblinger from comment #19)
> > (In reply to Christian Persch from comment #17)
> > 
> > > I also don't see what's stopping utempter or anything else from providing a
> > > wrapper that makes the utmp entry, then takes its remaining argv[] to exec,
> > > waitid() on that, and then removes the utmp entry before exiting. That could
> > > be used in the g-t 'custom command' to prefix the real command.
> > 
> > Yup.
> > 
> > Or have a line in .profile that calls the utility to write a record, and a
> > line in .bash_logout to call that utility again to remove it. If your bash
> > crashes, you'll be left with an outdated record, which can happen anyways
> > (if the terminal emulator crashes, or if you intentionally misuse the
> > utmp-writer utility), so it's not a big deal.
> 
> That only works when the terminal is configured to launch shells in "login"
> mode, doesn't it? (Otherwise there is no .bash_logout equivalent aside from
> bash-specific voodoo.) Plus it might conflict with such layers as tmux or
> Screen.
> 
> I feel like this is actually a kernel problem. Besides utmp, getlogin()
> actually already has another, reliable source of information –
> /proc/$$/loginuid [updated by pam_loginuid]. Unfortunately that /proc entry
> is hidden behind CONFIG_AUDITSYSCALL, which many distros disable due to the
> performance issues it causes... If someone were to split loginuid&sessionid
> support into a separate Kconfig option, such utmp hacks would no longer be
> necessary.

Ah, nevermind, I forgot that there's a world beyond Linux. :(

Comment 22 Christian Persch 2015-12-07 16:45:22 UTC

It's perfectly fine to depend on Linux for a feature; the feature just won't work on other kernels then.

Comment 23 Christian Persch 2015-12-13 19:31:37 UTC

I've done all the cleanup on pty.cc that I wanted to do, so closing.

Comment 24 Trevor Cordes 2017-07-02 23:17:26 UTC

"* anything else?"

You guys broke "write" and "wall", neither of which works with g-t anymore.
https://bugzilla.redhat.com/show_bug.cgi?id=1466993

Comment 25 Egmont Koblinger 2017-07-20 21:56:12 UTC

> You guys broke "write" and "wall", neither of which works with g-t anymore.
> https://bugzilla.redhat.com/show_bug.cgi?id=1466993

There's also another report of breaking write/wall/etc. at https://bugzilla.xfce.org/show_bug.cgi?id=13710 so let me respond here.

Disclaimer: I'm not the one who decided on the removal of this feature, and I'm sure I'm not aware of all the reasons behind it. Below is solely my personal opinion, not necessarily matching the main VTE maintainer's or the VTE project's or GNOME's (if it makes sense at all to talk about these latter ones).

TL;DR: IMO we stopped supporting something that was totally broken to begin with. As such, I don't mind it at all.

#include everything I've written earlier on this thread.

---

It was mentioned that we removed 2200 lines of code "just because". Well, not "just because", it had tons of troubles including security ones. And 2200 lines is a freaking lot. I've checked the toughest ones of my contributions to VTE, including rewrapping on resize, compressed and encrypted scrollback, as well as hyperlink support. In each of them the unified diff (including removed lines and context lines) is below 2200 lines. So 2200 lines is a freaking lot, removing this much is indeed a noticeable maintainability win.

---

In Ubuntu, after logging in to their default Unity7 and opening an xterm (no gnome-terminal whatsoever), a "write egmont" silently pretends to succeed, i.e. the "write" client doesn't report any errors. I just don't receive the message. So it's broken here as well, in one of the leading distros, unrelated to gnome-terminal. A "write egmont pts/0" works as expected. Guess what: according to "w" there's also a "/bin/sh /usr/lib/gnome-session/run-systemd-session ubuntu-session.target" running on tty7 and this is where "write" writes to. Pretty useful, ah?

---

The world keeps changing, and these changes aren't only additions over time, sometimes old things break or get retired, or old habits get out of fashion. I could come up with many examples, but let's just look at X Window. It was designed with a server-client architecture that was in heavy use with X terminal servers (not to be confused with xterm or terminal emulators!!) decades ago, but now I believe its usage has significantly shifted to local ones. Old-fashioned server-side fonts are ugly as hell, they have practically been replaced by the network-intensive client-side rendering. All the fancy looks, blended colors, animations are network-intensive. Acceleration, essential for movies, gaming, animations etc., is only available locally. Not sure about compositing, perhaps that too. Sound is only available locally (maybe a simple beep can be transferred by X11, dunno). I've tried a couple of times, but especially with nice-looking widget sets (like GTK+) I could hardly ever get useful behavior with remote X. And by the way it's all being replaced by Wayland which I'm not familiar with, but I really doubt it has this client-server thingy.

The computer world has moved a lot towards usability and visually pleasing appearance in the last few decades. Two-three decades ago people were okay with the Xaw look. The bar is much higher now.

---

What's the use case for the current bug to occur?

Using your own computer (I believe by far the most typical usage) - no, no other person reboots that for you or wants to message you.

Running gnome-terminal and then ssh'ing to a remote computer? No, in that case it's ssh doing the utmp logging so it's still fine.

Using local X on a computer (let's say in a computer lab) that's remotely managed by an admin? Could be, but in that case something's really fundamentally wrong in the process if the admin wants to reboot it underneath you.

Using remote X, that is, a traditional X terminal? Does anyone really still do that? I think the amount of folks doing this is really marginal. Or rdesktop/vnc/alike? Still probably not that typical, and anyway, the sysadmin shouldn't just reboot as he pleases, reboots should be scheduled and announced way ahead of time by some other means.

Having not a complete graphical session, but just gnome-terminal running remotely (e.g. over an "ssh -X"?) I'm wondering why anyone would do this over a local terminal emulator...

What else am I missing?

---

Now let's look at "write" and "wall". Let me mention that I did use these intensively (along with talk and ytalk) in the late 90's when I had dialup net (either net or phone line), no mobile etc. Needless to say, the world has changed a lot. Home dirs of others are no longer readable by default. Mesg defaults to no rather than yes. It no longer occurs to me to "write" to a user, instead I IM them in a browser (gtalk/facebook/etc.).

But here are two biggest things that truly bother me.

---

First:

You're working in a terminal emulator and suddenly you get a message. Depending on what you're doing, the message could quickly scroll out and remain unnoticed (if you have an app running that spits out tons of output), could quickly get corrupted and only partially readable or not readable at all (e.g. you're scrolling in an editor by holding down e.g. PgDn; by the time you realize you got a message and release the key it's gone), these are not acceptable. If a message is important, it should always be readable.

And of course it could easily corrupt the display of whatever you're doing (your editor, file manager etc.), in turn making you randomly hit Ctrl+L or Ctrl+R or other keycombos because of course every app has a different one for repainting the screen. The message can even be injected into the middle of an escape sequence, causing the terminal emulator to receive utter garbage.

Maybe 20-30 years ago this was the state of the art or at least good enough. Today I find it a UX disaster that shouldn't be allowed. I'd even go as far as questioning why terminal devices are accessible at all under /dev/pts/* rather than being inaccessible externally, the way network ports, unnamed pipes etc. are.

Maybe a small screen/tmux-like software built into the kernel which decreases the height seen by the app and scrolls the notification at the bottom, or a special escape sequence designated for sysadmin messages only (still subject to breaking other escape sequences in the middle, tho'), or something like this could be an okay-ish UX in terminals.

---

Second:

Why is the terminal emulator special? Why would you get notified there and only there? Why is this and only this supposed to be the client software for receiving and displaying such messages?

If a user has one terminal emulator running, he or she gets notified once.

If one has twenty gnome-terminal tabs open, you expect them to be notified twenty times... ugh. Kinda compensates against some of these notifications not being readable, as mentioned above, geez.

But if one has no terminal emulator open, because they use a graphical text editor, word processor, CAD utility, photo editor, web browser, music player, chat software, rdesktop/vnc to connect remotely, vmware for local emulation, play games, do all such kinds of other things, is it then absolutely okay not to notify them and just reboot the system under them???

I totally don't buy this concept. Desktop environments (window managers) should have a unified means of getting system-wide notifications sent by root (maybe over dbus), and display them appropriately for every user. I'm not sure if such a system exists, but if being able to notify users is really a demand, developers should work towards creating one. There should be a way to notify _logged in users_, not tty lines.

Oh, and by the way, the error message:

write: egmont is not logged in on pts/0

is absolutely correct, I'm not *logged in* on pts/0. Opening a new terminal emulator tab/window, and a new tty pseudo-device for that behind the scenes is not *logging in*, just as much as opening a new TCP port or starting to play a new song is not logging in either. Logging in is where you authenticate against the system, which is never gnome-terminal. It could be "outside" of gnome-terminal, e.g. on X display level, or "inside" gnome-terminal, e.g. by ssh.

Why would the terminal emulator be the only piece of software responsible for receiving and displaying such messages? Why not browsers, for example, by injecting them to a pretty much random place in the DOM (or, analogously to the terminal world where the message can interrupt an escape sequence, I should rather say at a random place within the HTML source, possibly breaking the entire HTML)? Why don't gvim or a graphical emacs, or libreoffice, gimp etc. show you root's such important messages?

---

To summarize, my 2 cents is that gnome-terminal stopped supporting something that was technically problematic, and more importantly, utterly broken by design anyways. As such, I don't mind this being gone. Unfortunately there's no reliable alternative at this moment. However, if it's found to be a common requirement (which I'm really unsure about, since I don't quite see the scenario where a root should perform unscheduled reboots under a logged in user) then a new method with a 21st century user-friendly approach should be designed and implemented. Corrupting the display of one particular type of application, while potentially not even delivering the message in a readable way, is just so freaking 1900's that I'm not looking forward to reviving it.

Comment 26 Trevor Cordes 2017-07-21 06:25:36 UTC

Hi, I sincerely appreciate Egmont and Igor's comments here and elsewhere. For sure the devs' opinions/desires get more weight and demand our respect (well that or we abandon using their work). And as a dev myself, I understand the need for maintainability. I really do. So please take what I say in a soft, friendly advice, things to ponder, way.

I think the best way to think of this whole problem is as Linus repeats loudly every so often: "don't break userland". Similar to the principle of least surprise. To me this means: if something has worked in XYZ way for 25 years in nearly every single instance, leave it working in XYZ way. Improve upon it, sure, find better solutions, sure, but don't just break it.

Does this axiom carry from kernel to vte? I think it should a bit. No one ever gets angry when "new feature X" doesn't get implemented, but people sure get angry (and file bugs) when "25 year old feature Y" gets yanked. (Similar to the removal of word-selection chars last year or two.)

I agree the use cases are limited (but they are still there). I agree it seems like lame 1990 technology (but people still use it). I agree the original ideas of utmp/write/wall are sub-optimal (but it worked well enough to be "standard" for 30 years). But you can't possibly imagine all the weird uses people have found for this interesting, de facto standard, technology over those 30 years (like mine and the other posters'). To even try smacks of hubris.

I'm shocked there isn't a library that does all this for you and terminal devs only have to put 10 lines of C code in to use it. Shocked. What do all the other 800 terminals out there (minus the 3 we've found that don't) do to make this work? Are they all having to maintain 2200 lines of code? Why is it that 99% of terminals' devs think it's still worthwhile?

I won't pick this as my hill to die on for the simple fact that I was lucky enough to have a simple workaround (echo > pts file), but other users may not be so lucky. In fact, my use case, where I use this to identify which terminal of 100 open is the one with a certain file open in it, is not one you identified. Worse still, the proffered idea that pts's shouldn't be real files at all would have meant even my workaround wouldn't have worked. I hazard a guess that if you're a *NIX dev and you think *less* things should be files, in an "everything's a file" OS, you're missing something fundamental of the whole philosophy. *NIX shouldn't be about hiding things, it should be about exposing them. *NIX shouldn't be about yanking functionality, it should be about adding it. That's what makes it so useful, malleable and timeless. After all, we're not Windows.

Bottom line: if your terminal is lacking a feature that junky xterm has, you may not be barking up the right tree... just sayin'

Comment 27 Egmont Koblinger 2017-07-21 07:19:33 UTC

(In reply to Trevor Cordes from comment #26)

> No one ever gets angry when "new feature X" doesn't get implemented

I, for one, joined VTE development because the lack of rewrap on resize made me angry on a daily basis :)

> but people
> sure get angry (and file bugs) when "25 year old feature Y" gets yanked.

If I only had a dollar for every time an old feature that my daily routine relied on broke... :)

> I'm shocked there isn't a library that does all this for you and terminal
> devs only have to put 10 lines of C code in to use it.

There's one, discussed above, although I'm not sure if we're okay with the mentioned drawback thereof.

Or maybe it should rather be a wrapper executable that g-t launches. Not sure if that one exists. If it does, though, you can set it up for yourself as a workaround (still not as ideal as g-t doing it out of the box, I understand that).

> I hazard a guess that if you're a *NIX dev and you think *less* things
> should be files, in an "everything's a file" OS, you're missing something
> fundamental of the whole philosophy.

Maybe they just shouldn't be so prominent. E.g. a special entry under /proc/12345/fd/12 (such as with unnamed pipes) but no direct entry under /dev/pts. /dev/pts/xx is useful for a few special corner cases (and as a dev, I do use them a lot), but is totally useless for an average user, and IMO also encourages people coming up with broken designs such as "write"/"wall".

Why don't, let's say, open TCP connections have such an entry so that I could just "echo ... > /dev/tcp/from-12345-to-192.168.0.1-80" to send a message (and f.ck up the protocol, just as echoing to /dev/pts/* might also do)? :)

> Bottom line: if your terminal is lacking a feature that junky xterm has

We do lack loads and loads of features from xterm.

> Linus repeats loudly every so often: "don't break userland".

Message taken. I'm not arguing with this one. Will keep in mind for my future design decisions.

Comment 28 Egmont Koblinger 2017-07-21 07:23:45 UTC

Back to the "wall" use case:

I guess someone should write a utility that:

- has no UI, no X dependency
- runs in the background
- allocs a tty
- writes its utmp record
- when it receives a message (bit of heuristics needed, e.g. a simple read() call, or timing to see when the message ends) launches a custom command (e.g. zenity) with that message.

A big popup appearing in the middle of the screen, in the foreground with the message from root, without corrupting any of my windows, could be quite useful.
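
A minimal Python sketch of the core of such a utility is shown below, under stated assumptions: the names `relay_one_message` and `listen_on_new_pty` are invented for this example, the "one read() call equals one message" heuristic is the simplest possible one, and the utmp registration step is deliberately left out because it needs privileges or a helper such as libutempter.

```python
import os


def relay_one_message(fd, handler, max_bytes=4096):
    """Do a single read() on fd and hand the decoded text to handler.

    Returns False on end-of-file, True after a message was delivered.
    """
    data = os.read(fd, max_bytes)
    if not data:
        return False
    handler(data.decode("utf-8", errors="replace"))
    return True


def listen_on_new_pty(handler):
    """Allocate a pty and relay everything written to its slave side.

    Not included: writing the utmp record that would make wall(1) find
    this pty in the first place.
    """
    master, slave = os.openpty()
    print("listening on", os.ttyname(slave))
    try:
        while relay_one_message(master, handler):
            pass
    except OSError:
        pass  # reading the master can fail once the pty is torn down
    finally:
        os.close(master)
        os.close(slave)
```

The handler is where the custom command would be launched; for instance it could start a dialog program such as zenity with the message text, which keeps the listener itself free of any UI or X dependency.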

Comment 29 Egmont Koblinger 2017-07-21 07:24:43 UTC

Typo: bit of heuristics needed, e.g. a *single* read() call...

Comment 30 Mantas Mikulėnas (grawity) 2017-07-21 08:10:55 UTC

(In reply to Trevor Cordes from comment #26)
> I think the best way to think of this whole problem is as Linus repeats
> loudly every so often: "don't break userland".  Similar to the principle of
> least surprise.  To me this means: if something has worked in XYZ way for 25
> years in nearly every single instance, leave it working in XYZ way.  Improve
> upon it, sure, find better solutions, sure, but don't just break it.

`write` is not broken for your use case – since you know the pty, you can use `write` with it. (It's true that it can no longer *guess* the tty correctly, but that's also been true for longer than you think.)

`wall` has been broken for a long time, both because it requires terminal emulators to be setgid to 'utmp' (or use a setgid helper) which isn't always true anyway, and because [as already mentioned] an X11 session can exist without any terminal emulators running in which case the user will never get notified *anyway*.

A more sensible approach would be to consider the purpose of `wall` (notifying all users on all sessions) instead of a specific implementation (having an open terminal). KDE used to solve the latter problem by having a dedicated pty and showing a graphical notification whenever something is written to it; I also had once implemented the same for GNOME with a few lines of Perl. [Let me check if I still have the code...]

> I agree the use cases are limited (but they are still there).  I agree it
> seems like lame 1990 technology (but people still use it).  I agree the
> original ideas of utmp/write/wall are sub-optimal (but it worked well enough
> to be "standard" for 30 years).  But you can't possibly imagine all the
> weird uses people have found for this interesting, de facto standard,
> technology over those 30 years (like mine and the other posters').  To even
> try smacks of hubris.

On the other hand, that seems to imply that if nobody has improved some technology in 30 years, then it becomes untouchable and nobody else can ever try to improve it. If "well enough" means "still works better than the attempted modern replacements", that's one thing. But this seems to be merely a case of "works well enough that nobody *bothers* to invent something better." (<cough> systemd </cough>)

Let's say, if the goal is to notify all users, see my earlier comment about how KDE does it without a terminal at all. Even `wall` itself could very well be updated to use notify-send for graphical sessions, and fall back to /dev/pts for console ones. (I had done just that in ConsoleKit days; it'd be even easier now.)
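As an illustration only, the "notify-send for graphical sessions, terminal device for console ones" idea could look something like the following. The function names and the way a session is described (a boolean plus an optional tty path) are assumptions made for this sketch. It also assumes the code runs inside the target user's session, because notify-send needs access to that session's notification service:

```python
import subprocess


def plan_delivery(message, graphical, tty_path=None):
    """Decide how to deliver a broadcast message to one session.

    graphical: True if the session has a desktop notification service.
    tty_path:  terminal device of a console session, e.g. "/dev/pts/3".
    """
    if graphical:
        return ("run", ["notify-send", "Broadcast message", message])
    if tty_path is not None:
        return ("write", tty_path)
    return ("skip", None)


def deliver(message, graphical, tty_path=None):
    """Carry out the plan and return which kind of delivery was used."""
    kind, target = plan_delivery(message, graphical, tty_path)
    if kind == "run":
        subprocess.run(target, check=False)
    elif kind == "write":
        # Writing to someone else's terminal needs permission on the device.
        with open(target, "w") as tty:
            tty.write(message + "\n")
    return kind
```

Keeping the decision separate from the side effects makes the fallback order easy to check without touching any real session.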

As for the remaining weird uses... they may be interesting and useful so I'm not going to dismiss them with an Xkcd link, but how far should they take priority over explicit goals (like maintainability)?

> I'm shocked there isn't a library that does all this for you and terminal
> devs only have to put 10 lines of C code in to use it.  Shocked.  What do
> all the other 800 terminals out there (minus the 3 we've found that don't)
> do to make this work?  Are they all having to maintain 2200 lines of code? 

There are a few libraries, e.g. utempter.

> Why is it that 99% of terminals' devs think it's still worthwhile?

Do you know for sure that they think so, and aren't merely afraid to remove it?

> I won't pick this as my hill to die on for the simple fact that I was lucky
> enough to have a simple workaround (echo > pts file),

Are you saying the absence of a utmp record makes `write $user $tty` stop working? Not in my tests.

> Bottom line: if your terminal is lacking a feature that junky xterm has, you
> may not be barking up the right tree... just sayin'

Even cruft like the Tek4014 mode, remotely-controlled window movement, &c.?

(For that matter, xterm is missing quite a few features that vte has. Just sayin'.)

Comment 31 Christian Persch 2017-07-23 11:18:03 UTC

The way to get the utmp logging back is by using a wrapper as outlined in the 2nd paragraph of comment 17.

Please do not add any more comments to this bug.

Comment 32 Conrad Hughes 2017-07-24 19:22:24 UTC

So the reason more people are commenting on this is that this "feature" has just been pushed out to Debian stable. For me it breaks email notification from remote hosts, a feature implemented on top of who and rwho.

It's really helpful that you've proposed a possible solution for us, but for all that it may be straightforward for you, it may not be so for users who're now finding that stuff they've depended on for (in my case) 29 years doesn't work any more. I appreciate the arguments presented here, that the entire concept of utmp and writing to ttys is totally broken, but until I'm offered an alternative that works as conveniently and universally as utmp used to, it would be nice not to break the old system. Which worked until I "upgraded" to Debian stretch a month ago.

I have attempted to implement the suggested wrapper, and for all that the code in principle looks trivial, I'm finding that invocations of libutempter fail silently. Rebuilding it in debug mode, it appears that ptsname() is unhappy with being asked to get a ptsname for stdin when run from within a gnome-terminal (or anywhere else, from my experiments). My thought mirrors the author of libutempter: since stdin maps to the pts (as in "ls -l /proc/self/fd/0"), this is a sensible route to finding the pty for which we're trying to create a utmp entry. But the errno returned when calling ptsname(STDIN_FILENO) (ENOTTY: Inappropriate ioctl for device) indicates (according to ptsname()'s manpage) that stdin "does not refer to a pseudoterminal master device". This comes as maybe not such a surprise since we're providing it with the slave, not the master, right? However at this point I'm at a loss: "man pty" tells me that for BSD pseudoterminals there's a clear way of translating from slave to master, but the implication is that for UNIX 98 the master devices are all invisible clones of /dev/ptmx. And we're clearly in the UNIX 98 model. And ptsname() seems to want a master, not a slave. And giving it /dev/ptmx doesn't help.

Any hints would be greatly appreciated. My code is here:

https://github.com/ConradHughes/utmpwrap

Comment 33 Christian Persch 2017-07-24 19:46:58 UTC

ttyname() instead of ptsname() should work. Or readlink on /proc/$$/fd/N.
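For anyone scripting this rather than writing C, the same two lookups can be sketched in Python. The function name is made up, and the /proc fallback is Linux-specific:

```python
import os


def terminal_path(fd=0):
    """Return the device path of the terminal open on fd, or None."""
    if not os.isatty(fd):
        return None
    try:
        # ttyname() accepts the slave side, which is what a child process
        # has on stdin; ptsname() expects the master side instead.
        return os.ttyname(fd)
    except OSError:
        # Fallback: resolve the symlink the kernel exposes on Linux.
        return os.readlink(f"/proc/self/fd/{fd}")
```

Called with no argument from a shell inside a terminal emulator, this should give something like /dev/pts/N, the same name tty(1) prints.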

Comment 34 Conrad Hughes 2017-07-24 21:33:26 UTC

Thanks Christian — that was really helpful and I've now got the workaround working using a patched version of libutempter.  Bug report on *that* submitted to the Debian maintainers here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=869617

I've updated my GitHub code, and it can successfully be used to create and destroy utmp entries for those who need or want them.  Improvements will no doubt be necessary:

https://github.com/ConradHughes/utmpwrap

Comment 35 Egmont Koblinger 2018-02-04 13:17:38 UTC

In addition to utmpwrap from the previous comment, I happened to come across this tool (mentioned in st's FAQ):

https://git.suckless.org/utmp/

which also seems to do exactly this. (I haven't tried.)

Comment 36 Peter Selinger 2018-11-24 02:43:06 UTC

(In reply to Christian Persch from comment #11)
> Since the removal was a nice win in code size and maintainability, I'd
> rather not bring this back.
> 
> What exactly are the use cases for [uw]tmp ?
> * pam_wheel: Add "use_uid" option. About the unreliability, comment 5
> explains one problem, and also there's bug 317312 .
> * want to know which pty you're currently on? tty(1), or readlink
> /proc/$$/fd/0
> * want a list of all ptys of your user? ls /dev/pts
> * anything else?

The use case you didn't mention was to see what *other* users are currently logged into terminals. The 'who' command uses /var/run/utmp to get that information. Is there another convenient way to get it?

Comment 37 Egmont Koblinger 2018-11-24 03:15:20 UTC

(In reply to Peter Selinger from comment #36)

> The use case you didn't mention was to see what *other* users are currently
> logged into terminals. The 'who' command uses /var/run/utmp to get that
> information. Is there another convenient way to get it?

ls -l /dev/pts

It's a reliable source, as opposed to utmp which is unreliable, maintained (or not) voluntarily as someone opens a graphical terminal emulator.
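If the listing needs to be consumed by a script rather than read by eye, roughly the same information can be collected like this. It is a minimal sketch: the function name is made up, and it reports only the owner of each device, not login times or hosts:

```python
import os
import pwd


def pty_owners(root="/dev/pts"):
    """Return (device_path, username) pairs for the ptys under root."""
    owners = []
    for name in os.listdir(root):
        if not name.isdigit():  # skip "ptmx" and anything else non-numeric
            continue
        path = os.path.join(root, name)
        try:
            uid = os.stat(path).st_uid
        except FileNotFoundError:
            continue  # the pty was closed while we were listing
        try:
            user = pwd.getpwuid(uid).pw_name
        except KeyError:
            user = str(uid)  # no passwd entry; fall back to the numeric id
        owners.append((int(name), path, user))
    # Sort numerically so that pts/10 comes after pts/2.
    return [(path, user) for _, path, user in sorted(owners)]
```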

Comment 38 Mantas Mikulėnas (grawity) 2018-11-24 17:20:48 UTC

(In reply to Peter Selinger from comment #36)
> (In reply to Christian Persch from comment #11)
> The use case you didn't mention was to see what *other* users are currently
> logged into terminals. The 'who' command uses /var/run/utmp to get that
> information. Is there another convenient way to get it?

But GNOME Terminal windows don't even correspond to logins. I mean, people can be logged into GNOME *without* having a terminal window open!

To see people logged into the system, you should instead make the actual *login app*, e.g. GDM, create those utmp records. This could be done through a custom PAM module nicely. I might try writing a pam_utmp if there isn't one already.

(Note that systemd-logind already does this – it keeps internal track of users via PAM, and you can see who's logged in by running `loginctl`.)

Comment 39 Peter Selinger 2018-11-25 02:51:12 UTC

(In reply to Mantas Mikulėnas (grawity) from comment #38)
> But GNOME Terminal windows don't even correspond to logins. I mean, people
> can be logged into GNOME *without* having a terminal window open!
> 
> To see people logged into the system, you should instead make the actual
> *login app*, e.g. GDM, create those utmp records. This could be done through
> a custom PAM module nicely. I might try writing a pam_utmp if there isn't
> one already.
> 
> (Note that systemd-logind already does this – it keeps internal track of
> users via PAM, and you can see who's logged in by running `loginctl`.)

That is correct, but I was concerned with the 'who' command, which is a POSIX standard and is currently broken by gnome-terminal. I completely understand that the whole rationale of 'who' doesn't correspond to many modern use cases: there are better ways to get the information, and more relevant information one might want to get. The command goes back to 1970s multi-user systems where users typically connected locally or remotely via a tty. Nevertheless, it's a POSIX standard that the 'who' command, if present, should output certain types of information, so I am a bit concerned that gnome-terminal removed cooperation with this feature for no especially good reason.

I was thinking about whether it would make sense to re-implement 'who' to print the information it is supposed to print, but without using /var/run/utmp. On the positive side, POSIX makes 'who' optional, and gives lots of leeway about the output, as well as how it is implemented. The output is primarily intended to be human-readable, so it is hopefully unlikely that many scripts (especially portable ones) would rely on the output of 'who'.

Comment 40 Mantas Mikulėnas (grawity) 2018-11-25 15:49:37 UTC

(In reply to Peter Selinger from comment #39)
> (In reply to Mantas Mikulėnas (grawity) from comment #38)
> > But GNOME Terminal windows don't even correspond to logins. I mean, people
> > can be logged into GNOME *without* having a terminal window open!
> > 
> > To see people logged into the system, you should instead make the actual
> > *login app*, e.g. GDM, create those utmp records. This could be done through
> > a custom PAM module nicely. I might try writing a pam_utmp if there isn't
> > one already.
> > 
> > (Note that systemd-logind already does this – it keeps internal track of
> > users via PAM, and you can see who's logged in by running `loginctl`.)
> 
> That is correct, but I was concerned with the 'who' command, which is a
> POSIX standard and is currently broken by gnome-terminal. [...] Nevertheless,
> it's a POSIX standard that the 'who' command, if present, should output
> certain types of information, so I am a bit concerned that gnome-terminal
> removed cooperation with this feature for no especially good reason.

Broken in what way? As far as I can see, the POSIX standard *does not* require every single tty/pty to be listed; the description only talks about "accessible users" and leaves defining that to the implementation.

So if the Linux implementation of `who` just lists one entry per (graphical) session rather than one per pty, that still seems within the spec *and* the spirit of the command. I mean, it tells you who is logged in, and that's generally more useful than telling you who's got an xterm open.

You don't even need that PAM module I mentioned – GDM already inserts a utmp entry for every user logging in graphically. For example, I have an X11 login active, and sure enough – the utmp and `who` say there's one person logged in:

    $ who -H
    NAME     LINE         TIME             COMMENT
    grawity  :0           2018-11-25 01:08 (:0)

Now if the fact that xdm & gdm have been adding POSIX-nonconforming entries with tty ':0' into utmp for 15 years hasn't been bothering you until today, I don't see why that should suddenly become a problem now either.
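To make the distinction between display-manager entries and terminal entries concrete, here is a small sketch that parses `who -H` output of the shape shown above. The function name is made up, and the parsing assumes the two-token date and time format from that example; other locales print the time differently:

```python
def parse_who(output):
    """Parse `who -H` style output into a list of dicts, skipping the header."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[:2] == ["NAME", "LINE"]:
            continue
        name, tty = fields[0], fields[1]
        entries.append({
            "name": name,
            "line": tty,
            "time": " ".join(fields[2:4]),
            "comment": " ".join(fields[4:]),
            # A "line" such as ":0" names an X display, not a tty device,
            # so it marks an entry written by the display manager.
            "graphical": tty.startswith(":"),
        })
    return entries
```

Filtering on the "graphical" key gives one entry per graphical login, which is the reading of `who` argued for above.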