Systemd – systemctl list-unit-files timeouts

Published on Author gryzli

The problem

Recently I’ve started to occasionally see systemctl time out while executing “list-unit-files”. The problem is very annoying, and my first thought was: “systemd is bugged out, the server must be restarted”.

In my case, the problem was observed on a CentOS 7 box.

If you “restart” the server you will be happily surprised that the problem seems to be gone, but in fact it isn’t.

After another couple of hours or days or weeks (strongly depends on your server usage), you will start to encounter this problem again.

As a consequence of these systemctl timeouts, you may observe all kinds of “weird” behavior, such as slow SSH logins.

After digging some time, I found the following thread:

https://github.com/systemd/systemd/issues/1961

 

If you read enough, you will understand what really happens:

1) There is a process, “systemd-logind”, which handles all kinds of logins to your system and generates a session for every login. These sessions are stored in /run/systemd/system

2) Each session is composed of 2 parts:

  • One file that looks like this: /run/systemd/system/session-XXXXXX.scope
  • One directory that looks like this: /run/systemd/system/session-XXXXXX.scope.d 

3) Because of some weird bug in the interaction between D-Bus and systemd-logind, some of these session files and directories don’t get cleaned up properly

4) The result is a directory filled with leftover session files and directories, which causes systemctl to time out when executing “list-unit-files”, because that command also traverses this directory
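To check whether a host is affected, it can help to count how many session entries have piled up. The following is only a minimal sketch: the function name count_session_units is made up for illustration, and it assumes the session-XXXXXX.scope naming described above.

```shell
#!/bin/bash
# Hypothetical helper: count leftover session scope files and their
# drop-in directories. Defaults to the path named in the article;
# pass another directory as the first argument to try it elsewhere.
count_session_units() {
    local dir="${1:-/run/systemd/system}"
    local files dirs
    # Top level only: regular files named session-*.scope
    files=$(find "$dir" -maxdepth 1 -type f -name 'session-*.scope' 2>/dev/null | wc -l)
    # Top level only: directories named session-*.scope.d
    dirs=$(find "$dir" -maxdepth 1 -type d -name 'session-*.scope.d' 2>/dev/null | wc -l)
    # $(( )) strips any padding that some wc implementations add
    echo "scope files: $((files)) scope.d dirs: $((dirs))"
}

# Usage on a suspect host:
#   count_session_units
#   time systemctl list-unit-files > /dev/null
```

If the counts keep growing between runs while only a handful of users are logged in, the host is probably hitting the leak described in the thread.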

 

As reported in the GitHub thread, the problem seems to affect Debian Jessie, Arch Linux, and CentOS 7.

 

The Solution / Fix

As far as I understand from the thread above, the D-Bus developers have to fix this problem and release the fix, but until then you could use the following:

 

Add a regularly executed cron job (for example, once a day) that will delete the session files and directories:

find /run/systemd/system -type f \( -name "*.conf" -o -name "*.scope"  \)  -exec rm -f {} \; ;rmdir /run/systemd/system/session-*.scope.d

 

My cron file looks like this:

root@server# crontab -l 

0 20 * * *  find /run/systemd/system -type f \( -name "*.conf" -o -name "*.scope"  \)  -exec rm -f {} \; ;rmdir /run/systemd/system/session-*.scope.d
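Before putting the one-liner into cron, it may be worth previewing what it would match. Note that the "*.conf" pattern is not limited to session directories, so anything else ending in .conf under /run/systemd/system would be matched too. The sketch below uses the same find expression with -print instead of rm; the function name preview_cleanup is an assumption made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print what the cleanup one-liner would remove,
# without deleting anything. Pass a directory to try it on a copy.
preview_cleanup() {
    local dir="${1:-/run/systemd/system}"
    # Same file match as the article's find command, but only printed.
    find "$dir" -type f \( -name '*.conf' -o -name '*.scope' \) -print
    # The rmdir step targets these directories (rmdir removes only empty ones).
    find "$dir" -maxdepth 1 -type d -name 'session-*.scope.d' -print
}

# Usage:
#   preview_cleanup | less
```

If the preview lists only session-* entries, the cron job removes nothing beyond what the article intends.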

 

Side effects?

I was very skeptical about deleting systemd session files from the directory mentioned above, because I was not sure what this was going to break.

Then I found that on one of my servers the “systemd-logind” process had crashed and no session files were being generated, yet all of the server’s services worked just fine.

So I thought there wouldn’t be a problem with deleting these files and directories, and so far I haven’t observed any bad side effects.
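If deleting everything still feels too aggressive, one more conservative option is to leave alone any session that logind still knows about. This is only a sketch under assumptions: the function name stale_session_scopes is hypothetical, and it assumes the scope file name carries the session ID, as in session-XXXXXX.scope.

```shell
#!/bin/bash
# Hypothetical helper: list session scope files whose session ID is NOT
# in the given list of active IDs.
# Arguments: directory, then zero or more active session IDs.
stale_session_scopes() {
    local dir="$1"; shift
    local active=" $* "   # space-padded so each ID matches as a whole word
    local f id
    for f in "$dir"/session-*.scope; do
        [ -f "$f" ] || continue      # skips the unexpanded glob when nothing matches
        id="${f##*/session-}"        # strip the path and the "session-" prefix
        id="${id%.scope}"            # strip the ".scope" suffix
        case "$active" in
            *" $id "*) ;;            # session still listed as active: keep it
            *) echo "$f" ;;          # not listed: candidate for removal
        esac
    done
}

# Usage, taking the active IDs from the first column of loginctl:
#   stale_session_scopes /run/systemd/system \
#       $(loginctl list-sessions --no-legend | awk '{print $1}')
```

The function only prints candidates; piping its output to a removal command is left as a separate, deliberate step.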

One Response to Systemd – systemctl list-unit-files timeouts

  1. Thank you for this useful hint.
    Recent Amazon Linux 2 instances have this issue.

    I’ve adjusted your script a bit to make sure not to remove files belonging to running services. In addition I restricted the search to the top level directory.

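    #!/bin/bash

    # Find run-<number> unit files older than one day, top level only.
    find /run/systemd/system -mindepth 1 -maxdepth 1 -mtime +1 -type f -name 'run-*' \
    | perl -p -e 's#^/run/systemd/system/run-(\d+).+$#$1#g' \
    | sort -u \
    | while read -r line
    do
    # Treat the number as a PID; delete only if no such process is running.
    ps -o pid= -o comm= "$line" \
    | grep -q "$line" \
    || rm -rf /run/systemd/system/run-"$line".*
    done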