-
Notifications
You must be signed in to change notification settings - Fork 15
Expand file tree
/
Copy pathgidcaps.txt
More file actions
209 lines (153 loc) · 8.62 KB
/
gidcaps.txt
File metadata and controls
209 lines (153 loc) · 8.62 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
Groups IDs as userspace capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Talking to certain kernel subsystems is only possible if the calling
process has the right capability bit set, as defined by capabilities(7).
For instance, reconfiguring kernel IP stack requires CAP_NET_ADMIN.
What if the subsystem happens to reside in the user space?
Take for instance a high-level network configuration manager (wimon, connman,
NetworkManager). Makes sense to allow non-privileged users to issue commands
to the manager. But there should be at least some restrictions. Not all users,
not in all cases. In particular, it is highly desirable to distinguish
between the human operator and the stuff running on his behalf, like say
Steam or Chromium.
In modern (and not so modern) Linux systems this problem is usually solved
with tools like sudo, which is fundamentally very different from capability
based approach the kernel takes. There is also dbus with its own set of issues.
And there is a simple elegant solution to this problem that does not involve
any tools and only relies on the basic kernel interfaces: using auxiliary
group ids as userspace capability tokens. If implemented, it would provide
reliable, flexible IPC permission control. However, current kernels lack
two relatively minor functions crucial for this approach to work well, and
cripples it so much it's not viable compared even to sudo.
So this is an outline of how the solution would work, which changes are
necessary for it to work, and why.
How it should work
~~~~~~~~~~~~~~~~~~
On login, the top spawned user process gets several auxiliary gids
representing user's capabilities. Whenever the whenever a process
tries to contact a service that requires certain capabilities, its
current auxiliary gids are checked against service's requirements.
For instance, having "network" group among gids means the process
can ask high-level network configuration manager to switch networks,
but cannot directly configure the kernel IP stack (which would require
CAP_NET_ADMIN).
Like with kernel capabilities, the user is free to set up the environment
to drop unnecessary capabilities before spawning certain processes.
In the example above, the user drops "network" gid before spawning Steam,
preventing the latter from reconfiguring network. However, desktop applet
that shows networking status is spawned with "network" still in gids, so
it is still allowed to issue network configuration commands.
Problem 1. GID checks are too difficult to configure in current kernels
Problem 2. Users *cannot* drop gids in current kernels
Context: passing commands to a daemon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Let's use wimon as an example. The long-running, highly privileged manager
process (wimon) creates a control socket on startup and listens for
connections:
srwxr-xr-x 1 root root 0 ... /run/wimon
A control tool, wictl, opens the socket and sends some command there.
How should wimon ensure that the command is ok to execute?
Case 1: sudo (privilege escalation for client code)
Let wimon check effective uid of the connecting process, and reject
the connection unless it's 0. Then use sudo to run wictl, so that all
permitted commands would be issues with effective uid 0.
It is up to sudo to decide whether the user is allowed to control wimon,
ask some sort of password if necessary and so on.
Case 2: dbus (dedicated authentication broker).
Instead of connecting to wimon directly, wictl, running with non-privileged
uid and euid=uid, connects to a privileged broker service and asks it to
send given command to wimon. The broker checks permissions based on wictl
euid, and relays the command to wimon. Like in case 1, wimon only accepts
commands sent from euid 0, or some other uid reserved for the broker.
It is up to the broker to check user's permission to run a given command.
Case 3: server-side credentials check
wictl sends its uid/gid(s) over the socket, or wimon uses getpeerinfo
to get them. Either way, wimon decides whether to allow the command
based on the ids it gets.
This allows really fine control over permissions since the checks
happen as close to the command parsing code as possible. Also, there's
no privilege escalation and no additional services.
Case 4: kernel-mediated permissions check
File permissions on /run/wimon are used to control access to wimon.
The socket gets chmod srwxr-x--- early, and its gid gets set somehow.
When wictl attempts to connect to the socket, the kernel checks its
uids and gids and either allows or disallows connection.
If implemented, this would require almost no effort from the userspace
once the file ownership has been set.
Problem: setting permissions for a socket
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Because of the weird ways local sockets work, the only reliable way to
do it is from the service process, with a sequence like this:
unlink(path); /* or otherwise make sure $path does not exist */
fd = socket(...);
bind(fd, { AF_UNIX, path })
chown(path, uid, gid); /* note: not fchown(fd) or fchmod(fd) */
chmod(path, mode);
listen(fd);
For this to work, the service itself must know uid and gid.
How do we configure that?
One option is to hard-code them. It's not a good option.
Another option is pass them to the daemon somehow, naturally as names
and not as ids:
./server -u user -g group
The service -- any service -- will then need to convert "user" and "group"
into uid and gid by parsing /etc/passwd and /etc/group. This may be viable
for regular large-libc settings, but for minitools, it's just not right.
The root of the problem above is the unlink() call at the top.
Were the service capable of binding on a pre-made socket, it would
be possible to use chown(1) to set up permissions, and the service
would not need to care about that at all.
Solution: sticky sockets
~~~~~~~~~~~~~~~~~~~~~~~~
Linux already allows creating sockets with mknod, even though the resulting
FS nodes are completely useless. The idea is to fix this, in a way that won't
break existing code too much.
Let's call a local socket with the sticky bit set a "sticky socket".
This combination is meaningless and should never happen with conventional
use of local sockets, so we can assign pretty much semantics to it.
Trying to bind() a sticky socket should
* fail with EBUSY if it is already bound by another process
* fail with EPERM unless the calling process owns the socket
(euid = uid of the socket), or it has CAP_CHOWN
* succeed otherwise
With sticky sockets, permission setup becomes simple and straightforward:
mknod /run/wimon s
chown wimon:network /run/wimon
exec wimon +/run/wimon
The service only needs to know the name of the socket to bind.
All passwd/group parsing code remains in chown where it belong.
Problem: no way to drop gids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This one is simple, setgroups(2) unconditionally requires CAP_SETGID.
In contrast, capset(2) allows dropping bits for non-privileged users.
No real options either, setgroups should be allowed for non-privileged
users as long as no attempt is made to gain gids.
For a process having groups C, setgroups(G) should succeed
if (g in C) for all (g in G).
This sounds simple, but turns much worse than expected upon closer
examination. In Linux, gid space is huge (2^32) and allowed size
of both sets is quite large as well (2^16). For comparison, there
are at most 32 capability bits and no more that 32 of them per set.
Luckily setgroups is rarely used even outside of minitools context,
and even then, it's mostly to impersonate a user, which means it reads
and applies whatever is written in /etc/groups.
Solution: setgroups() semantics change
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Let setgroups() work as is for privileged users. For non-privileged users,
compare C and G, assuming both are sorted, and return
* -EPERM, if inversion is detected in either C or G
* -EPERM if exists g in G: g not in C
* 0, and update C, otherwise
Those privileged uses of setgroups that matter (i.e. may be followed by
non-privileged use) will be in minitools-specific code anyway, so making
sure C is ordered should be possible.
Sub-socket level permission control
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What if the service needs finer access control than just connect/no-connect?
Something like allowing some commands to all users and some to those in
specific group only.
Sure it can't be done with a single sticky socket, but it's not like sticky
sockets prevent the service from using any of the cases above. Send a set
of pre-configured groups over the socket in particular should work well
and may be freely combined with socket permissions.
Another possible alternative is to make one sticky socket per role.