GNOME Bugzilla – Bug 646098
vte uses too many file descriptors
Last modified: 2015-01-17 12:06:35 UTC
Originally reported at https://bugzilla.redhat.com/show_bug.cgi?id=667539 and possibly other places. People with more than 100 tabs run out of fds. Here's why: normally, vte uses two fds, one for /dev/pts/X and another for /dev/ptmx. Not sure if we need two; will investigate. Then the scrollback buffer uses three more to store three different streams (text stream, attributes stream, and line-breaks stream). And unless you are using unlimited scrollback, those streams will each allocate another fd to implement a ring. So each widget can use up to 8 fds. With a 1024 per-process limit, we run out somewhere between 120 and 130 tabs. That's way too fast. A workaround would be to use unlimited scrollback; just make sure you have room under /tmp.
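(A minimal standalone illustration of the arithmetic above, assuming the worst case of 8 fds per widget as described; this is not vte code, and FDS_PER_WIDGET is just a name for that estimate.)

#include <stdio.h>
#include <sys/resource.h>

#define FDS_PER_WIDGET 8   /* worst case, per the breakdown above */

int main (void)
{
    struct rlimit rl;

    if (getrlimit (RLIMIT_NOFILE, &rl) != 0) {
        perror ("getrlimit");
        return 1;
    }

    /* Some of the soft limit is already taken by X, GLib, etc.,
     * so the real ceiling is a bit lower than this estimate. */
    printf ("soft fd limit: %ld, worst-case terminals: ~%ld\n",
            (long) rl.rlim_cur, (long) (rl.rlim_cur / FDS_PER_WIDGET));
    return 0;
}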
Hmm, on my Ubuntu the limit is set to 32k instead of 1024.
I don't dispute that there are other FDs used by each widget and that these may be freed when the widget is closed, meaning that when the limit is reached, closing a few windows may allow you to open some again. I do, however, dispute that there is no leak of PTY-related descriptors. The experiment I conducted and detailed in my posts on the Red Hat bug makes that absolutely clear, I believe. Basically, what I did was open a single window, run lsof and note which PTY descriptors were open, then open a second window and use lsof again to see that two more had appeared. Finally, I closed the second window and noted that lsof still showed the second set of PTY descriptors as open.
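(For reference, a small standalone C sketch that approximates the lsof check described above by counting a process's open fds that point at pty devices; it is my illustration, not part of the original report. Run it against the gnome-terminal pid before and after opening and closing windows; a leak shows up as a count that never goes back down.)

#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main (int argc, char **argv)
{
    char dirpath[PATH_MAX], linkpath[PATH_MAX], target[PATH_MAX];
    struct dirent *ent;
    DIR *dir;
    int count = 0;

    if (argc != 2) {
        fprintf (stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    snprintf (dirpath, sizeof dirpath, "/proc/%s/fd", argv[1]);
    dir = opendir (dirpath);
    if (dir == NULL) {
        perror ("opendir");
        return 1;
    }

    while ((ent = readdir (dir)) != NULL) {
        ssize_t len;

        snprintf (linkpath, sizeof linkpath, "%s/%s", dirpath, ent->d_name);
        len = readlink (linkpath, target, sizeof target - 1);
        if (len < 0)
            continue;       /* skips ".", ".." and anything unreadable */
        target[len] = '\0';

        /* count fds pointing at pty slaves or the pty master multiplexer */
        if (strncmp (target, "/dev/pts/", 9) == 0 ||
            strcmp (target, "/dev/ptmx") == 0)
            count++;
    }
    closedir (dir);

    printf ("pty-related fds: %d\n", count);
    return 0;
}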
Here is the detail of what I observed: https://bugzilla.redhat.com/show_bug.cgi?id=667539#c1
(In reply to comment #3) That leak has been fixed already (bug 632257). The issue in comment 0 is a different problem.
In the meantime, maybe g-t should increase the fd limit from its default to the hard limit. getrlimit()/setrlimit() can do that. ChPe?
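(A minimal sketch of that suggestion, assuming plain POSIX calls and trimmed error handling; a real patch would live in gnome-terminal's server startup code and use its usual error reporting.)

#include <stdio.h>
#include <sys/resource.h>

static void
raise_fd_limit (void)
{
    struct rlimit rl;

    if (getrlimit (RLIMIT_NOFILE, &rl) != 0) {
        perror ("getrlimit");
        return;
    }

    if (rl.rlim_cur < rl.rlim_max) {
        rl.rlim_cur = rl.rlim_max;   /* raise the soft limit up to the hard limit */
        if (setrlimit (RLIMIT_NOFILE, &rl) != 0)
            perror ("setrlimit");
    }
}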
Ah, ok. So these are two separate issues. Thanks for the clarifications.
We should find a way to close the slave fd in the parent process. We don't need it after the fork, but we just don't know when the fork happens. Maybe add API for that and rely on the user to call it?
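(A minimal sketch of the usual pattern, not vte code: once the child has been forked, the parent only needs the master fd, so the slave can be closed on the parent side. The complication described above is that vte doesn't perform the fork itself, so it can't tell when this is safe without new API, or without relying on the caller to do it.)

#include <unistd.h>

static pid_t
spawn_on_pty (int master_fd, int slave_fd, const char *path, char *const argv[])
{
    pid_t pid = fork ();

    if (pid == 0) {
        /* child: wire the slave to stdio and exec */
        close (master_fd);
        dup2 (slave_fd, STDIN_FILENO);
        dup2 (slave_fd, STDOUT_FILENO);
        dup2 (slave_fd, STDERR_FILENO);
        if (slave_fd > STDERR_FILENO)
            close (slave_fd);
        execv (path, argv);
        _exit (127);
    }

    if (pid > 0)
        close (slave_fd);   /* parent: drop the slave, keep only the master */

    return pid;
}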
Also, I agree with you. In vte_pty_child_setup():

  /* Try to reopen the pty to acquire it as our controlling terminal. */
  /* FIXMEchpe: why not just use the passed fd in TTY_OPEN_BY_FD mode? */

I think that's there to make the fd the controlling terminal of the child session. We already do the TIOCSCTTY ioctl if it's available. Maybe do the reopening only if that fails?
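(A sketch of that alternative, assuming POSIX plus the Linux TIOCSCTTY ioctl, with error handling trimmed: in the child, start a new session, try to claim the already-open slave fd as the controlling terminal, and only fall back to reopening the pty by name if the ioctl is unavailable or fails. This is an illustration, not the actual vte_pty_child_setup() code.)

#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

static int
acquire_ctty (int slave_fd, const char *slave_name)
{
    setsid ();   /* new session, no controlling terminal yet */

#ifdef TIOCSCTTY
    if (ioctl (slave_fd, TIOCSCTTY, 0) == 0)
        return slave_fd;     /* the passed fd is now the controlling tty */
#endif

    /* Fallback: reopening the slave by name acquires it as the controlling
     * terminal on systems without (or failing) TIOCSCTTY. */
    int fd = open (slave_name, O_RDWR);
    if (fd >= 0) {
        close (slave_fd);
        return fd;
    }
    return slave_fd;
}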
Is using 100 tabs a supported use case? (Of course, if there is a leak, it should be plugged... but that is a completely different issue)
It's not just 100 tabs - because gnome-terminal uses one process for all the terminals on a desktop, no matter how many windows are open, it could be 100 windows, or 20 windows with 5 tabs each, or any other combination that adds up to a large total number of terminals.
With 30" monitors and 8 workspaces being norm in sysadmin land, I don't see why not...
The scrollback buffer can use up to 12 (!) fds per vte: six for the normal screen and six for the alternate screen. It should be quite easy to fix the alternate screen to grow ring->mask to accommodate all the rows, so that we don't hit the disk here.
Created attachment 256022 [details] [review] fix for comment 12

This fixes the bug in my recent comment. The patch applies on top of bug 415277's fix.

Guaranteeing ring->mask >= terminal->row_count is a double win:

1. As mentioned in the previous comment, the alternate screen never uses any file descriptors for its scrollback. This can save up to 6 fds per vte widget.

2. On the normal screen, do this: output 35+ lines of text, then resize the terminal to be 35 lines tall, then keep producing output. Previously, after each output line printed (after the resize), vte would read() from the disk (thaw and cache rows) to draw the top 4-5 lines, as ring->mask stayed at 31. Now vte doesn't need to access the disk for this anymore; it always stores at least terminal->row_count rows in memory.

The code also contained a harmless off-by-one: the ring's array would use one less entry than it could. E.g. with the initial mask==31 it would normally store 30 lines; right after storing a 31st, the top one would be kicked out (frozen). I modified it to normally store 31 (i.e. ring->mask) lines, and upon storing one extra it either freezes the top one or increases the mask.

The new code ensures that ring->mask >= terminal->row_count always holds, increasing the mask if necessary. With the new behavior of ring->mask (after fixing that off-by-one) this is exactly what we want.

Please test and review:

1. Start vte with a relatively small font size (e.g. 'Monospace 6') and produce tons of output on the normal screen. Start mc, maximize vte. Check the fds under /proc/xxx/fd. Previously there'd be 11-12 /tmp/vte... deleted files; now there should be no more than 6.

2. Repeat the steps of bullet point 2 from above, with VTE_DEBUG=ring. See those millions of Caching/Thawing debug lines eventually disappear (as soon as the ring with the increased mask is filled up).

3. No new bugs should arise :)
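(A minimal sketch of the invariant described above; this is not the actual vte ring code and the field names are simplified. The ring keeps its newest rows in a power-of-two array indexed through ring->mask, and older rows get frozen out to the disk-backed stream; growing the mask whenever the terminal is taller than the array means the visible screen always fits in memory, so the alternate screen never needs the streams at all.)

#include <stdlib.h>

typedef struct {
    void   **array;   /* array[pos & mask] holds a row */
    unsigned mask;    /* array length minus one, always 2^n - 1 */
} Ring;

static void
ring_ensure_mask (Ring *ring, unsigned long row_count)
{
    unsigned new_mask = ring->mask;

    /* keep the invariant mask >= row_count by growing to the next 2^n - 1 */
    while (new_mask < row_count)
        new_mask = new_mask * 2 + 1;

    if (new_mask != ring->mask) {
        void **new_array = calloc (new_mask + 1, sizeof (void *));
        /* (re-indexing of the existing entries with the new mask omitted) */
        free (ring->array);
        ring->array = new_array;
        ring->mask = new_mask;
    }
}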
Thanks Egmont. I'm cleaning this up and committing. Please don't use "//"-style comments in future patches.
Created attachment 288830 [details] [review] server: Increase RLIMIT_NOFILE to the maximum allowed
LGTM. Mainly for future reference: I can see 7 FDs per terminal now (ptmx + 6 for the ring; not sure what the 8th would have been). Bug 738601 comment 11 suggests a way to store the ring in 3 FDs rather than 6, which would allow 200-250 terminals with the default nofile=1024. (Plus we have ideas on how to make that 2 instead of 3.) I think it's best to have both improvements (fewer FDs plus a larger limit) at the same time; that'd easily allow us 800+ terminals :)
... although your patch makes me wonder what the purpose of having soft ulimits is at all - why not just the hard ones? Is it maybe about playing nice across apps and providing a safety cap for emergencies? Maybe g-t could leave a bit of a gap, i.e. increase the limit only up to, let's say, hardlimit-64? Would that make any sense? Dunno. (After all, you can start closing those hundreds of tabs in case of such an emergency.)
Now we use at most 4 FDs per terminal (ptmx + 3 for the stream). Each FD of the stream is only opened after ~64k of data, so in practice many terminals probably end up using fewer. This, along with Christian's change, should be good enough, so I'm marking this as fixed. Bug 741520 contains an idea for further improvement.
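(A rough sketch of that lazy-open behavior, offered as my own illustration rather than the actual vte stream code; the 64k threshold comes from the comment above, and the /tmp/vte-stream-XXXXXX name is an assumption. The stream buffers recent data in memory and only creates its backing temp file once the buffer would overflow, so quiet terminals never pay for the extra fd.)

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define STREAM_BUF_SIZE (64 * 1024)

typedef struct {
    char   buf[STREAM_BUF_SIZE];
    size_t used;
    int    fd;        /* -1 until the in-memory buffer overflows */
} Stream;

static void
stream_append (Stream *s, const char *data, size_t len)
{
    /* Switch to a backing file the first time the buffer would overflow. */
    if (s->fd < 0 && s->used + len > sizeof s->buf) {
        char name[] = "/tmp/vte-stream-XXXXXX";

        s->fd = mkstemp (name);
        if (s->fd >= 0)
            unlink (name);   /* keep it anonymous, like vte's deleted temp files */
    }

    if (s->fd >= 0) {
        /* flush the in-memory prefix once, then write straight to the file */
        if (s->used > 0) {
            write (s->fd, s->buf, s->used);
            s->used = 0;
        }
        write (s->fd, data, len);
    } else if (s->used + len <= sizeof s->buf) {
        memcpy (s->buf + s->used, data, len);
        s->used += len;
    }
    /* (if mkstemp failed, the data would have to be dropped or reported; omitted) */
}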