minibase/doc/vtmux.txt at 0.6.2 · arsv/minibase

286 lines (219 loc) · 13.1 KB
What is vtmux
~~~~~~~~~~~~~
Virtual Terminal MUltipleXer allows running several processes willing to
access /dev/input devices and/or become KMS masters concurrently by proxying
their access to those devices, and multiplexing data streams in sync with
regular VT switching.
Or in other words, vtmux is needed to run one wayland compositor on tty1,
another one on tty2, and a regular text mode console on tty3, so that C-A-Fn
VT switching would work as expected.
Why the need for some extra tools to do that?
Because of the two problems outlined below.
Problem 1: concurrent access
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Whenever a process reads from, or writes to, /dev/ttyN, it only gets the keys
pressed and its output only appears on the screen if ttyN is active.
This is handled within the kernel.
Display servers (as in, X servers or Wayland compositors) do not read their
input from /dev/ttyN, and do not write there to get their output onto
the screen. Instead, they open numerous devices from /dev/input/ to get input
events, and use one or more nodes in /dev/dri/ to set up their video output.
None of those devices are multiplexed within the kernel.
If two processes are reading from /dev/input/event0, both will get the same
keystrokes no matter which VT is active. Doing anything related to the actual
output on DRI (setting modes, flipping frames and such) requires the process
to be a "DRI master". Only one process may be DRI master for a given device
at a time. The second process will only get EBUSY or EAGAIN.
Problem 2: access restriction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Imagine the following process tree:
  weston -+
          +- wlterm
          +- chromium
Say the user types something, weston reads keystrokes from /dev/input/event0
and passes them to wlterm. Chromium should not be able to (a) get wlterm's
keystrokes from weston and (b) open /dev/input/event0 and get the same
keystrokes directly from the device, bypassing weston.
Making sure (a) holds is one of the key design features of Wayland. But (b)
is not that simple. Weston compositors are expected to run with the same uid
and the same gids as their clients. If weston can open event0, how to prevent
chromium from doing the same?
Note this problem is often posed as running "non-privileged" display server,
which is somewhat misleading. It is really about making display server *more*
privileged than its clients, but without going as far as giving it all (root)
privileges.
Solution: userspace multiplexer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A dedicated highly privileged process opens devices on behalf of processes
that use them (weston, X, whatever) and manages data flows so that only one
process controls the devices at any given time.
Client processes possess some sort of token that they use to request device
access from the multiplexer but do not pass to their children.
In the example above, weston has the token, chromium and wlterm do not,
so weston can request access to event0 but the other two cannot, so they
must rely on weston to provide them with the events.
The token vtmux uses is an open file descriptor, specifically a half
of a socketpair(2); vtmux itself keeps the other half. The socket doubles
as communication channel for the requests.
Technical background
~~~~~~~~~~~~~~~~~~~~
Linux is not very well suited for proxying device access in userspace.
A process cannot listen(2) on a device node and accept(2) connections.
Even if it were possible (think FUSE), multiple context switches would
occur exactly where they should not: in latency-sensitive input code,
and potentially very fast frame-flipping code.
Instead, the client processes get actual open device fds and communicate
directly with the kernel.
Opening a device file from userspace makes the kernel allocate its internal
file-block structure. Re-opening the same device allocates another block, but
dup(2)ing a fd does not: the copy of the descriptor references the same shared
file-block structure, even if the fds end up in different processes.
Passing a fd through a unix domain socket is equivalent to duping it.
Both DRI and input devices have dedicated ioctls to "disable" the file-block
structure. Disabled inputs never become ready for reading and read nothing.
Disabled DRIs do not allow any output (kind of, see below on mastering).
The ioctls can be performed via any of the fds referring to the block, as long
as the calling process has enough privileges to do so. The block becomes
disabled for *all* fds referring to it. Disabling one block does not affect
other blocks referring to the same device.
Upon client request, vtmux opens the device and sends the fd over the socket.
The client gets a copy of the fd and uses it for device access. Vtmux keeps
its copy and uses it to disable the underlying file-block whenever the client
becomes inactive.
Notes on DRI masters
~~~~~~~~~~~~~~~~~~~~
DRI devices (/dev/dri/cardN) handle both benign ops like GPU-accelerated
rendering to off-sreen buffers and privileged ops like modesetting (KMS),
video output configuration, frame buffer allocation, frame flipping.
Any process willing to use GPU must be able to open /dev/dri/cardN.
Only display servers should be able to do modesetting, and only one at a time.
And display servers are not supposed to have root privileges.
So the devices files themselves are not protected, and can be opened by almost
anyone. For each DRI device, at most one file-block at a time may be designated
a "master". Privileged ops may only be performed on a fd referring to
the master block. The process performing them does not need any additional
privileges, only a fd referring to the master block.
Setting or removing master status from a block requires root privileges and
a fd referring to the block. Multiplexer revokes master from inactive clients,
and sets it for client on the active VT. The clients themselves cannot change
their master status, but can do master-only ioctl when they are active.
Clients may get the master status suddenly revoked from underneath them and
must be ready to deal with unexpected EAGAINs from privileged ioctls.
VT switches are usually signalled, but signal delivery is inherently racy.
Clients that get the master status suddenly bestowed upon them must be prepared
to re-initialize the device, or at least check its configuration.
Notes on input device revocation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There is no such thing as input mastering. Inputs are single-purpose and
should not be readable for non-privileged processes.
Unlike DRIs, disabled input devices cannot be re-enabled. The reasons for this
are not clear at all. The clients are supposed to re-open all inputs each time
their VT becomes active.
Like with DRIs, disabling file-block requires root privileges.
VTs in KMS environments
~~~~~~~~~~~~~~~~~~~~~~~
Display servers that use KMS and input devices do not really need VTs at all.
It's enough to have DRI and input multiplexing in place for them to work.
Once the process becomes DRI master, it can do anything with the video outputs,
and inputs are not affected by VT switching at all.
If all clients would use KMS and input, it would be possible to throw out
VT code from the kernel, remove VT_* ioctl from vtmux, and only do fd
multiplexing. This would also work well with kmscon. VT system, at this point,
is a kind of in-kernel KMS display manager providing services for non-KMS-aware
userspace clients, and does not belong in kernel space.
In a mixed environment, with clients that may still rely on VTs interfaces,
vtmux cannot tell for sure whether any given client is KMS-aware or not, so
it allocates a VT for all clients uniformly. VT switching in this case is
like engaging/disengaging this in-kernel display manager.
All clients also get their stdin, stdout and stderr redirected to respective
/dev/ttyN device, whether it makes sense or not. In KMS-only environment, it
would probably be better for vtmux to leave standard fds as is.
Why explicit keyboard input / VT_LOCKSWITCH?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C-A-Fn key combos are usually handled within the kernel. During startup,
vtmux disables this in-kernel handling, and waits to be commanded a VT
switch via its control socket instead.
There are several reasons for this, the primary being that it makes the
switching code within vtmux somewhat saner. When the command comes from
within vtmux, the sequence is clear: disengage current client, switch VT,
engage new client. This would also work exactly the same way on a pure KMS
kernel with no VT support.
With the switching completely controlled by vtmux, we can also decide
whether we want to switch or not, effectively implementing disabled VTs.
Conventional method of doing the switch is reactive in nature and depends
heavily on the in-kernel VT subsystem. The kernel informs us that it wants
to switch, we disengage current client and report that to the kernel, the
kernel switches and informs up, we engage another client.
Starting new sessions / what is greeter?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Opening a new VT implies several steps:
0. deciding to start a new session
1. allocating a new VT
2. querying user credentials (login)
3. calling setuid/setgid
4. spawning UI
Steps 0 and 2 are often free-form and require heavy user interaction. Think
graphical login screen. Steps 1 and 3 are privileged, but depend on the data
obtained from steps 0 and 2, and so does step 4.
Traditional tty login process relies on pre-determined answers to 0+1
("we need 6 VTs") and does steps 2+3 in a single highly privileged process
called "login". This approach is not viable if anything graphical is involved
on steps 1 (like starting X or weston) or 2 (login screen).
X-based DMs typically split step 2 into a dedicated, non-privileged process
called greeter, but otherwise follow the same structure: DM gets started on
a pre-allocated VT, starts greeter to query user credentials, gets the data
it needs from the greeter, *stops the greeter*, and proceeds with steps 3+4
on the same VT. For unrelated reasons, it is also the privileged process (DM)
from step 3 that verifies the user credentials from step 2.
In vtmux, the sequence is a bit different. The greeter process runs on a
dedicated VT, and does 0+2: takes user's command to start a new session,
ask for password if necessary, and (unlike the X case) verifies the password.
The greeter then commands vtmux (which is a privileged process) to proceed
with steps 1+3+4, passing it a single token, a "session name" $user. Which
may accidentally match the name of the user, but it's not necessary.
Vtmux takes the token, allocates a new VT (step 1), and spawns there
a (still privileged) script called /etc/vts/$user. The script is expected
to do 3+4, i.e. call setuid and spawn X or weston or bash, but vtmux does
not really care about that. Once the process exists, vtmux closes the VT.
From vtmux point of view, greeter itself is just another session name
(and thus a script in /etc/vts/, likely with setuid and exec calls) which
vtmux can start on its own, either during startup or on C-A-Esc key combo.
More on greeter
~~~~~~~~~~~~~~~
Unlike with common X configurations, the greeter may continue to run
after the session has been allocated. It may exit as well, no big deal,
vtmux will restart it on request, but it is not required to exit for
another client to be started. In vtmux terms, it just becomes a background
It is probably a good idea for greeter to remain active right after login,
but exit after some time of inactivity. If so, it is up to greeter itself
to decide when exactly.
To avoid interference with regular user session, greeter is spawned on
some high VT, typically 10-11-12. This way VT 1 will likely be the first
user session, VT 2 second user session and so on.
In contrast with common configurations, it is up to greeter to perform user
credentials check. This allows stuff like smart-card authentication or
fingerprint reading to be done in non-privileged code right next to the
password prompt. This also means the usual hash password database (/etc/shadow,
but more likely something entirely different) must be available to the greeter,
which will likely run with its own uid.
The scripts in /etc/vts are privileged and should not be writable for anyone
but root. Greeter may be denied from reading them, or even entering
the directory; all it needs is a list of session names, which should probably
be kept somewhere else along with the passwords.
See ../../doc/login.txt on something ideas behind this design, and ./howto.c
on how to write greeters.
(if either of those files is missing, it has not been written yet).
Trying vtmux
~~~~~~~~~~~~
Upon startup vtmux will lock VT switching, and may NOT unlock it in case
of abnormal exit. Make sure to have something like "sleep 30; ./unlock"
running somewhere, or at least some way to run it on request (maybe via
an ssh session).
Unlocking VTs while vtmux is running may interfere with its operation.
Switching VTs in a way that's not visible to vtmux (e.g. with C-A-Left
while in text mode, or with chvt(8)) will result in DRI/input devices being
engaged to a background VT.
Any switch performed by vtmux will lock VTs back again.
For any sort of debugging, run vtmux -n. This will prevent it from using
the tty is started on for any of the clients.
Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

vtmux.txt

Latest commit

History

vtmux.txt

File metadata and controls