Y2038, glibc and utmp/utmpx on 64bit architectures
On January 19th, 2038, one second after 03:14:07 UTC, the signed 32bit time_t counter will overflow. For more information about this I suggest starting with the Wikipedia article on the Year 2038 problem. The problem has long been known and several groups are working on a solution for 32bit systems, but many people don’t know that pure 64bit systems can be affected, too.
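The date is easy to verify yourself. The following is only a small sketch, not code from any of the affected projects; the helper names `utc_from_secs` and `print_utc` are made up for this illustration. It converts the largest and the smallest value of a signed 32bit counter into a UTC date.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Convert seconds since the epoch into a broken-down UTC time.
   Returns 0 on success, -1 if the value cannot be represented. */
static int utc_from_secs(int64_t secs, struct tm *out)
{
    assert(out != NULL);

    time_t t = (time_t)secs;
    /* On a platform with a 32bit time_t not every int64_t value fits. */
    if ((int64_t)t != secs)
        return -1;

    /* gmtime() uses static storage, which is fine for a single-threaded sketch. */
    const struct tm *tm = gmtime(&t);
    if (tm == NULL)
        return -1;

    *out = *tm;
    return 0;
}

/* Print one labelled timestamp in UTC. */
static void print_utc(const char *label, int64_t secs)
{
    struct tm tm;

    if (utc_from_secs(secs, &tm) != 0) {
        printf("%s: not representable\n", label);
        return;
    }
    printf("%s: %04d-%02d-%02d %02d:%02d:%02d UTC\n", label,
           tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
           tm.tm_hour, tm.tm_min, tm.tm_sec);
}
```

Calling `print_utc("last valid second", INT32_MAX)` gives 2038-01-19 03:14:07 UTC, and `print_utc("after the wrap", INT32_MIN)` gives 1901-12-13 20:45:52 UTC, which is where a wrapped 32bit counter ends up.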
The general statement so far has always been that 64bit systems with a 64bit time_t are safe with respect to the Y2038 problem. But for compatibility with 32bit userland applications, glibc uses a 32bit time field in some places even on 64bit systems:
/* The ut_session and ut_tv fields must be the same size when compiled
   32- and 64-bit. This allows data files and shared memory to be
   shared between 32- and 64-bit applications. */
#if __WORDSIZE_TIME64_COMPAT32
  int32_t ut_session; /* Session ID, used for windowing. */
  struct
  {
    int32_t tv_sec; /* Seconds. */
    int32_t tv_usec; /* Microseconds. */
  } ut_tv; /* Time entry was made. */
#else
  long int ut_session; /* Session ID, used for windowing. */
  struct timeval ut_tv; /* Time entry was made. */
#endif
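To see which branch of this #if applies on your own system, you can ask the compiler for the width of the seconds field. This is a minimal sketch; the function names `utmpx_tv_sec_bits`, `utmpx_time_is_y2038_safe` and `report_utmpx_time_width` are my own and not part of glibc.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <utmpx.h>

/* Width in bits of the seconds field stored in every utmpx record. */
static size_t utmpx_tv_sec_bits(void)
{
    /* sizeof does not evaluate its operand, so no pointer is dereferenced. */
    return sizeof(((struct utmpx *)0)->ut_tv.tv_sec) * CHAR_BIT;
}

/* Returns 1 if the seconds field has at least 64 bits, 0 otherwise. */
static int utmpx_time_is_y2038_safe(void)
{
    size_t bits = utmpx_tv_sec_bits();

    assert(bits > 0);
    return bits >= 64;
}

/* Print the width of the utmpx time field next to the width of time_t. */
static void report_utmpx_time_width(void)
{
    printf("time_t: %zu bits, utmpx ut_tv.tv_sec: %zu bits (%s)\n",
           sizeof(time_t) * CHAR_BIT, utmpx_tv_sec_bits(),
           utmpx_time_is_y2038_safe() ? "Y2038 safe" : "overflows in 2038");
}
```

Going by the header excerpt above, on x86-64 with glibc I would expect this to report a 64bit time_t next to a 32bit ut_tv.tv_sec.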
Everything around utmp/utmpx, wtmp/wtmpx and lastlog is affected.
In this article I will concentrate on the utmp/utmpx implementation of glibc on Linux. There are two more blog posts about wtmp and lastlog:
- Y2038, glibc and wtmp on 64bit architectures
- Y2038, glibc and /var/log/lastlog on 64bit architectures
From the utmp manual page:
The utmp file allows one to discover information about
who is currently using the system.
What are the problems?
There are two main problems with utmp/utmpx with glibc on e.g. x86-64:
- A 32bit time_t field is used for the time, which will overflow in 2038
- There are design issues which allow a DoS attack (utmp/wtmp locking allows non-privileged user to deny service)
An analysis of the second problem by the glibc developers showed that an extra daemon handling all utmp/utmpx access would be necessary.
There are some more problems:
Due to the fixed format, the length of the username is limited to 32 characters, while nowadays all other tools allow more or less unlimited usernames. In addition, strings are NUL-terminated only if they do not use the full length of the field. This complicates parsing the data and leads to errors in applications every now and then.
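The missing terminator is the part that bites most often. A reader has to treat every string field as a fixed-width buffer and terminate it itself. Here is a minimal sketch of how that could look; `copy_utmpx_field` and `utmpx_user` are hypothetical helpers written for this article, not glibc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <utmpx.h>

/* Copy a fixed-width utmpx string field, which may lack a terminating NUL,
   into dst and always terminate the result. Returns the copied length. */
static size_t copy_utmpx_field(char *dst, size_t dst_len,
                               const char *field, size_t field_len)
{
    assert(dst != NULL && field != NULL && dst_len > 0);

    /* Stop at the first NUL, or at the end of the field if there is none. */
    const char *end = memchr(field, '\0', field_len);
    size_t n = end != NULL ? (size_t)(end - field) : field_len;

    /* Truncate if the destination is too small. */
    if (n >= dst_len)
        n = dst_len - 1;

    memcpy(dst, field, n);
    dst[n] = '\0';
    return n;
}

/* Convenience wrapper for the user name of one record. */
static size_t utmpx_user(const struct utmpx *ut, char *dst, size_t dst_len)
{
    assert(ut != NULL);
    return copy_utmpx_field(dst, dst_len, ut->ut_user, sizeof ut->ut_user);
}
```

A buffer of `sizeof ut->ut_user + 1` bytes is always large enough for the user name. Passing `ut->ut_user` directly to functions like `strlen()` or to `printf("%s")` reads past the field whenever the name uses the full width.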
Additionally, there is a generic problem with utmp/utmpx: the usage, especially who should create which entry, is not defined.
As a result: if you use GNOME and start 5 terminals, who will report one user. If you use KDE or xdm and start 5 terminals (konsole or xterm), who will report six users. The same applies if you use screen or tmux: with screen, who will report every session as an extra user; with tmux you will only see one user.
So you can use the data for informational purposes only; you cannot rely on it for monitoring or similar tasks.
Another problem is the age of the format and how data is stored.
Who is really using utmp/utmpx today?
The applications which use utmp/utmpx can be sorted into three categories:
- Init daemon for process management. systemd is not using it for this.
- Applications writing utmp/utmpx entries for others. These are applications like login, sshd or display managers (xdm, gdm, sddm,…), which record which user logged in on which terminal, from which remote machine, and at which time. They need special access rights for this, which is a big problem especially for X11 applications.
- Applications using the information from utmp/utmpx for informative messages or monitoring systems. These are tools like w, who, uptime, but also tools like wall or write, which use utmp to get the tty of the user session. The most common reasons for these applications to use utmp are listed below in order of frequency; all other usages are isolated cases and can be ignored:
  - Counting logged in users
  - Display logged in users with tty, login time and from where
  - Write messages to the tty of a user
More information about which applications use utmp/utmpx and for which reasons can be found in my Y2038 document.
At least with systemd, utmp fields like runlevel or dead processes are not used.
Is utmp really needed today with systemd?
If you use systemd as init process, you don’t need utmp. systemd comes with a PAM module which collects all the data required by the category 3 applications above. And there are two interfaces which allow querying systemd-logind for this data:
- libsystemd (sd-login.h, sd_*() functions)
- DBUS
Since systemd 254 at the latest, all required information is accessible via sd_*() functions.
Proposal
- Change all applications which read utmp to query systemd-logind instead
- Stop writing utmp entries after we are sure nobody uses them anymore
Were other alternatives evaluated?
Yes, we looked closely at alternative solutions.
Adjust glibc to use 64bit time_t on all architectures
This was of course the first idea, but it is not trivial, as changing the utmp format is a massive ABI break. And since we are talking about structs and variables, this cannot be solved with symbol versioning. These ideas and first implementations were rejected by the glibc developers, because the development effort and the problems arising from the ABI change are disproportionate to the benefits. The security problems around utmp would not be solved: applications would still need special rights to write to the file, and the DoS problems cannot be solved this way either. And last but not least, the data would still not be trustworthy.
There is still some free space in utmp for an additional time_t field
Yes, there is enough free space left in the struct utmp of glibc for future enhancements, which could be used for an additional 64bit time field. But the migration would be pretty complicated. We would have utmp entries with only the old time field, and utmp entries with both the new and the old time field. And we would have applications which only know about the old time field, while others use the new one.
Hiding this in glibc alone is not possible, since this is a struct and applications access the members of the struct directly. So every reading application would need to be adjusted to check which field was used by the writing application.
This idea does not solve the security and trust problems either.
Why not write our own daemon?
That was considered, but it would mean creating a secure design which does not have the current problems, and implementing it in a really secure way. But we already have such a daemon: systemd-logind.
So why have the infrastructure twice? It would mean that applications need to submit the same data to two daemons, and that applications can query two daemons for the same data. This does not make much sense. And while this would solve the security problems, it would not solve the trustworthiness of the data.
What if I don’t use systemd?
This heavily depends on the libc and init system you are using. For s6, for example, a secure utmp implementation (utmps) exists, while on the other hand musl libc has no support for utmp/wtmp at all.
There is also the elogind project, a standalone version of the systemd project’s logind. It’s designed for users who prefer a non-systemd init system, but still want to use popular software that otherwise hard-depends on systemd.
I haven’t evaluated it yet, but since loginctl can query it, it should be possible for other tools to get the information from elogind in the same way as they get it from logind. This would then be the "own daemon" solution.
Example code
Using the sd_*() functions is most of the time much simpler than parsing utmp. For the most common use case, counting the currently logged in users, typical code often looks like this:
...
#include <utmpx.h>
...
int users = 0;
struct utmpx *ut;
setutxent();
while ((ut = getutxent()))
  if (ut->ut_type == USER_PROCESS)
    users++;
endutxent();
With systemd-logind this would look like:
...
#include <systemd/sd-login.h>
...
/* Returns the number of sessions, or a negative errno-style code on failure. */
int users = sd_get_sessions(NULL);
Current status
For all packages, libsystemd v254 or greater is required.
- coreutils >= 9.4 has support for logind (--enable-systemd) and should be Y2038 safe.
- Linux-PAM >= 1.5.3 resolved all Y2038 issues (--enable-logind).
- openssh requires patches, PR: logind-set-tty.patch, Add wtmpdb support as Y2038 safe wtmp replacement.
- procps-ng >= 4.0.4 contains support for logind (--with-systemd).
- psutil requires patches, PR: Use logind instead of utmp because of Y2038.
- qemu: WIP, upstream status unclear.
- rsyslog >= 8.2312.0 (December 2023), accepted PR: use logind instead of utmp for wall messages with systemd.
- samba: upstream status unclear, patch exists.
- shadow >= 4.14.0 has systemd-logind support (--enable-logind).
- systemd >= v254 has all necessary interfaces, systemd >= v255 does not use utmp internally anymore if you compile with -Dutmp=false.
- util-linux accepted first patches for this (Y2038 and utmp/wtmp/lastlog on bi-arch systems like x86-64 with glibc), 2.40 should be Y2038 safe.
Presentations
There are recordings of two presentations explaining this in detail: