Y2038, glibc and utmp/utmpx on 64bit architectures

March 3, 2023 - Last modified: November 13, 2023

On January 19, 2038, at 03:14:07 UTC, a signed 32bit time_t counter reaches its maximum value; one second later it overflows. For more information, I suggest starting with the Wikipedia article on the Year 2038 problem. The problem has long been known and several groups are working on solutions for 32bit systems, but many people don’t know that pure 64bit systems can be affected, too.

The general statement so far has always been that 64bit systems with a 64bit time_t are safe with respect to the Y2038 problem. But for compatibility with 32bit userland applications, glibc uses a 32bit time field in some places even on 64bit systems:

/* The ut_session and ut_tv fields must be the same size when compiled
   32- and 64-bit. This allows data files and shared memory to be
   shared between 32- and 64-bit applications. */
#if __WORDSIZE_TIME64_COMPAT32
  int32_t ut_session;       /* Session ID, used for windowing. */
  struct
  {
    int32_t tv_sec;         /* Seconds. */
    int32_t tv_usec;        /* Microseconds. */
  } ut_tv;                  /* Time entry was made. */
#else
  long int ut_session;      /* Session ID, used for windowing. */
  struct timeval ut_tv;     /* Time entry was made. */
#endif

This affects everything around utmp/utmpx, wtmp/wtmpx and lastlog.

In this article I will concentrate on the utmp/utmpx implementation of glibc on Linux. wtmp and lastlog are covered in two separate blog posts.

From the utmp manual page:

The utmp file allows one to discover information about
who is currently using the system.

What are the problems?

There are two main problems with utmp/utmpx in glibc on, for example, x86-64:

  1. A 32bit field is used for the time, which will overflow in 2038
  2. There are design issues which allow a DoS attack (utmp/wtmp locking allows non-privileged user to deny service)

An analysis of the second problem by the glibc developers showed that an extra daemon would be needed to handle all utmp/utmpx access.

There are some more problems:

Due to the fixed format, the length of the username is limited to 32 characters, while nowadays all other tools allow more or less unlimited usernames. In addition, strings are NUL-terminated only if they do not use the full length of the field. This complicates parsing the data and leads to errors in applications every now and then.

Additionally, there is a generic problem with utmp/utmpx: its usage, especially who should create which entry, is not defined. As a result, if you use GNOME and start 5 terminals, who will report one user. If you use KDE or xdm and start 5 terminals (konsole or xterm), who will report six users. The same applies to screen and tmux: with screen, who reports every session as an extra user; with tmux, you will only see one user. So the data can only be used for informational purposes; you cannot rely on it for monitoring or anything similar.

Another problem is the age of the format and how data is stored.

Who is really using utmp/utmpx today?

The applications which use utmp/utmpx can be sorted into three categories:

  1. Init daemon for process management. systemd does not use it for this.
  2. Applications writing utmp/utmpx entries for others. These are applications like login, sshd or display managers (xdm, gdm, sddm,…), which record which user logged in on which terminal, from which remote machine, and at what time. They need special access rights for this, which is a big problem especially for X11 applications.
  3. Applications using the information from utmp/utmpx for informative messages or monitoring systems. These are tools like w, who and uptime, but also tools like wall or write, which use utmp to get the tty of the user session. The most common reasons these applications use utmp are listed below, in order of frequency. All other usages are isolated cases and can be ignored:
    1. Counting logged in users
    2. Display logged in users with tty, login time and from where
    3. Write messages to the tty of a user

More information about which applications use utmp/utmpx, and for which reasons, can be found in my Y2038 document.

At least with systemd, utmp fields like runlevel or dead processes are not used.

Is utmp really needed today with systemd?

If you use systemd as the init process, you don’t need utmp. systemd comes with a PAM module which collects all the data required by the category 3 applications above. And there are two interfaces which allow querying systemd-logind for this data:

  1. libsystemd (sd-login.h, sd_*() functions)
  2. D-Bus

With systemd 254 at the latest, all required information is accessible via sd_*() functions.

Proposal

  1. Change all applications, which read utmp, to query systemd-logind instead
  2. Stop writing utmp entries after we are sure nobody uses them anymore

Were other alternatives evaluated?

Yes, we looked closely at alternative solutions.

Adjust glibc to use 64bit time_t on all architectures

This was of course the first idea, but it is not trivial, as changing the utmp format is a massive ABI break. And since we are talking about structs and variables, this cannot be solved with symbol versioning. These ideas and first implementations were rejected by the glibc developers, because the development effort and the problems arising from the ABI change are disproportionate to the benefits. The security problems around utmp would not be solved either: applications would still need special rights to write to the file, and the DoS problems would remain. And last but not least, the data would still not be trustworthy.

There is still some free space in utmp for an additional time_t field

Yes, there is enough free space left in glibc’s struct utmp for future enhancements, which could be used for an additional 64bit time field. But the migration would be pretty complicated. We would have utmp entries with only the old time field, and utmp entries with both the new and the old time field. And we would have applications which only know about the old time field, while others use the new one. Hiding this in glibc alone is not possible, since this is a struct and applications access its members directly. So every reading application would need to be adjusted to check which field was used by the writing application.

This idea does not solve the security and trust problems either.

Why not write our own daemon?

That was considered, but it would mean creating a secure design that does not have the current problems, and implementing it in a really secure way. But we already have such a daemon: systemd-logind.

So why have the infrastructure twice? It would mean that applications need to submit the same data to two daemons, and that applications can query two daemons for the same data. This does not make much sense. And while a new daemon would solve the security problems, it would not solve the trustworthiness of the data.

What if I don’t use systemd?

This heavily depends on the libc and init system you are using. For s6, for example, there is a secure utmp implementation (utmps), while on the other hand musl libc has no support for utmp/wtmp at all.

There is also the elogind project, a standalone version of the systemd project’s logind. It’s designed for users who prefer a non-systemd init system, but still want to use popular software that otherwise hard-depends on systemd. I haven’t evaluated it yet, but since loginctl can query it, it should be possible for other tools to get the information from elogind in the same way as they get it from logind. This would then be the “own daemon” solution.

Example code

Using the sd_*() functions is most of the time much simpler than parsing utmp. For the most common use case, counting the currently logged-in users, typical code often looks like this:

...
#include <utmpx.h>
...
int users = 0;
struct utmpx *ut;

setutxent();
while ((ut = getutxent()) != NULL)
    if (ut->ut_type == USER_PROCESS)   /* count only user login entries */
        users++;
endutxent();

With systemd-logind this would look like:

...
#include <systemd/sd-login.h>
...
/* Returns the number of sessions, or a negative errno-style value on failure. */
int users = sd_get_sessions(NULL);

Current status

For all packages, libsystemd v254 or greater is required.

Presentations

There are recordings of two presentations explaining this in detail:

Further documentation