Detailed CPU Usage Overview With psacct

Published on Author gryzli

If you ever asked yourself the following questions, this might be the right place for you:

How to check how much cpu is consuming each system user ?

How to check how much cpu is consumed by each command ?

Check when a command has been executed and how much cpu it did ?

How to check all commands executed by a given user ?

What is the total cpu my system is consuming ?

These are pretty standard sysadmin questions, which arise soon or later in your day-to-day practice.

There is a pretty nice interface inside your linux kernel, called “Process accounting“, which you can benefit a lot by using “psacct“.

 

What information Process Accounting could give us ?

If you are using recent Linux distribution, for example Centos 6 or Centos 7, the structure of each process accounting record, contains the following fields (it’s the acct_v3 one):

 

struct acct_v3 {
    char      ac_flag;      /* Flags */
    char      ac_version;   /* Always set to ACCT_VERSION (3) */
    u_int16_t ac_tty;       /* Controlling terminal */
    u_int32_t ac_exitcode;  /* Process termination status */
    u_int32_t ac_uid;       /* Real user ID */
    u_int32_t ac_gid;       /* Real group ID */
    u_int32_t ac_pid;       /* Process ID */
    u_int32_t ac_ppid;      /* Parent process ID */
    u_int32_t ac_btime;     /* Process creation time */
    float     ac_etime;     /* Elapsed time */
    comp_t    ac_utime;     /* User CPU time */
    comp_t    ac_stime;     /* System time */
    comp_t    ac_mem;       /* Average memory usage (kB) */
    comp_t    ac_io;        /* Characters transferred (unused) */
    comp_t    ac_rw;        /* Blocks read or written
                               (unused) */
    comp_t    ac_minflt;    /* Minor page faults */
    comp_t    ac_majflt;    /* Major page faults */
    comp_t    ac_swaps;     /* Number of swaps (unused) */
    char      ac_comm[ACCT_COMM]; /* Command name */ };  

By enabling process accounting, you will get such structure for each process that has finished it’s execution.

So basically you have access to pretty much information about each process, which could be effectively used for debugging,troubleshooting or performance optimizations later on.

In order to make your life easier, there is the ‘psacct’ package, which makes handling this information pretty easy.

 

Installing psacct under Centos 6/7

The first thing you need to do is to install psacct and enable it (which will also enable the process accounting inside your kernel).

The process is easy enough:

# Install

yum install psacct

 

# Enable Centos 7

systemctl enable psacct

# Enable Centos 6

chkconfig psacct on

 

Finally start the process

# Start Centos 7/ Centos 6

service psacct start

 

Effectively Using psacct To Monitor CPU Usage

Now that you have installed psacct, you will have access to some nice commands, most important of which are:

sa, lastcomm, ac and dump-acct

Before going any further, you should know the following:

*) Process accounting is made on a daily basis. Whatever information you see it is for the current day

*) By default linux kernel is using “/var/account/pacct” file for writing the process information. Which means, that all the tools provided by ‘psacct’ are working with this file

*) There is logrotate config installed, which makes sure to rotate your ‘pacct’ file on a daily basis and keep the older files in the same directory “/var/account/” for a historic analysis when needed

 

Using sa  to check cpu usage summary

`sa` is the tool to use for checking summary information for your cpu consumption. Most of the time you will use it to get summary either by ‘user’ or by ‘command’

Showing cpu usage summary by command

[root@server ~]# sa 
   82979   31029.47re     381.61cp         0avio     11952k
     101    3555.96re     302.73cp         0avio     54342k   php-fpm*
    2092    1959.50re      45.41cp         0avio     48581k   php
     184       5.14re      23.37cp         0avio     39925k   pigz
     478      20.16re       4.74cp         0avio     78762k   perl*
      21       3.26re       2.57cp         0avio     31403k   mysqldump
     348       0.81re       0.53cp         0avio     66208k   curl
       2    1851.99re       0.51cp         0avio         0k   kworker/u16:2*
     184       5.14re       0.43cp         0avio     24062k   tar
     125      10.53re       0.37cp         0avio     94240k   sw-engine
      64    2434.23re       0.23cp         0avio     31670k   smtpd
      11     799.94re       0.23cp         0avio         0k   kworker/0:1*
      24     171.21re       0.22cp         0avio     29684k   ***other*
    5232      76.31re       0.11cp         0avio     10674k   nrpe*
      75       0.08re       0.07cp         0avio     63293k   pmm-ras
     134       4.73re       0.02cp         0avio     29635k   sshd
      13       0.03re       0.02cp         0avio     58929k   pmmcli
      17       9.01re       0.01cp         0avio     94963k   sw-engine-fpm*
     348      35.25re       0.01cp         0avio      2390k   check_cpu.sh
     767      12.92re       0.01cp         0avio     32624k   crond*
      72     149.41re       0.01cp         0avio     22960k   trivial-rewrite
    1859    4314.23re       0.00cp         0avio     22497k   smtp
      22       0.00re       0.00cp         0avio     38304k   ps

The most important stats from this output are:

column1:  Total number of executions for the given command

column2:  Total number of time elapsed (the time, the process was running) in minutes

column3:  Total number of CPU time consumed by the command ( this is the sum of sys time + user time) in minutes

column6:  The command executed by execve() syscall, limited to 16 chararcters

 

You should know, that when you are executing scripts like perl, python, bash, php , the command that will be accounted is going to be the binary executable.

For example running “php some_script.php” will be seen just as “php”.

This makes life a bit harder, because you can’t see the actual script name, that has been executed.

 

Showing cpu usage summary by user

There is one more interesting execution variation of sa, which is sa -m

Here is how the output looks like:

[root@server ~]# sa -m 
                                    86165   31682.06re     396.90cp         0avio     11923k
ftp_4444                              579    3746.03re     316.02cp         0avio     22352k
ftp_333331836                       17406    4038.40re      47.38cp         0avio     30482k
root                                 7398   18564.31re      32.53cp         0avio     23228k
postnex                              2174       2.58re       0.55cp         0avio     33967k
postfix                              1142    5058.41re       0.24cp         0avio     23434k
nagios                              56882     152.43re       0.12cp         0avio      3508k
psaadm                                 50      39.20re       0.04cp         0avio     52283k

Most important columns are:

Column1:  System username

Column4:  CPU time consumed by the user (for today) in minutes

 

Using lastcomm to check each command executed

Now as you know, how to see the CPU summaries, you may want to dig deeper, by observing when and what each user has executed. This is really nice technique for debugging the cpu usage by a given user or by command.

 

Showing All Commands By User (–user option)

Let’s say we want all commands executed by user “root”

[root@server ~]# lastcomm  --user root 
gdbus            S   X root     __         0.02 secs Sat Apr 27 15:20
lastcomm               root     pts/0      0.08 secs Sat Apr 27 15:22
pipe             S     root     __         0.01 secs Sat Apr 27 15:20
lastcomm               root     pts/0      0.03 secs Sat Apr 27 15:22
sa                     root     pts/0      0.03 secs Sat Apr 27 15:22
man                    root     pts/0      0.01 secs Sat Apr 27 15:20
troff                  root     pts/0      0.01 secs Sat Apr 27 15:20
man                    root     pts/0      0.02 secs Sat Apr 27 15:20
lastcomm             X root     pts/0      0.54 secs Sat Apr 27 15:20
gdbus                X root     pts/0      0.01 secs Sat Apr 27 15:20
crond            SF    root     __         0.01 secs Sat Apr 27 15:16
smtp             S     root     __         0.02 secs Sat Apr 27 15:05
php_session_cle  S     root     __         0.02 secs Sat Apr 27 15:01
pgrep                  root     __         0.01 secs Sat Apr 27 15:01
crond            SF    root     __         0.01 secs Sat Apr 27 14:59
crond            SF    root     __         0.01 secs Sat Apr 27 14:55
smtp             S     root     __         0.01 secs Sat Apr 27 14:40
crond            SF    root     __         0.01 secs Sat Apr 27 14:33
smtp             S     root     __         0.02 secs Sat Apr 27 14:30
smtp             S     root     __         0.02 secs Sat Apr 27 14:20
crond            SF    root     __         0.01 secs Sat Apr 27 14:22
php_session_cle  S     root     __         0.01 secs Sat Apr 27 14:01
pgrep                  root     __         0.01 secs Sat Apr 27 14:01

The output contains the following:

Col1: Command nam

Col2: Flag. These are the meanings of the flags:
S — command executed by super-user
F — command executed after a fork but without a following exec
C — command run in PDP-11 compatibility mode (VAX only)
D — command terminated with the generation of a core file
X — command was terminated with the signal SIGTERM

Col3:  username

Col5: CPU time consumed in seconds (both system + user time)

Col6: Date/Time of the command execution

 

 

Showing All Commands By Command (–command option)

We could also print commands by command name, which is done with the –command option. 

Let say we need who and when executed tar:

[root@server ~]# lastcomm --command tar
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.01 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar              S     root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar              S     root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar              S     root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    root     __         0.00 secs Sat Apr 27 03:44
tar                    apache   __         0.00 secs Sat Apr 27 03:44

 

You could also add multiple “–command” options if you like to view information for multiple commands at the same time. Same applies for the “–user” option as well.

Using dump-acct to dump information from /var/account/pacct

As I mentioned earlier, ‘/var/account/pacct’ is generated directly by the kernel and is in binary format. Sometimes you may need to parse/dump the contents of this file. One easy way to do this (without writing software) is to use the psacct provided dump-acct command.

 

Here is an example:

[root@server ~]# dump-acct  /var/account/pacct | head 
accton          |v3|     0.00|     0.00|     0.00|     0|     0|  4216.00|     0.00|   12459    12457|Sat Apr 27 03:35:07 2019
cat             |v3|     0.00|     0.00|     0.00|     0|     0|107968.00|     0.00|   12461    12457|Sat Apr 27 03:35:07 2019
grep            |v3|     0.00|     0.00|     0.00|     0|     0|112832.00|     0.00|   12460    12457|Sat Apr 27 03:35:07 2019
sh              |v3|     0.00|     0.00|     0.00|     0|     0|113152.00|     0.00|   12457    12438|Sat Apr 27 03:35:07 2019
logrotate       |v3|     0.00|     4.00|    89.00|     0|     0|125248.00|     0.00|   12438    12436|Sat Apr 27 03:35:07 2019
logrotate       |v3|     0.00|     0.00|    89.00|     0|     0|113152.00|     0.00|   12436     2197|Sat Apr 27 03:35:07 2019
awk             |v3|     0.00|     0.00|    89.00|     0|     0|113536.00|     0.00|   12437     2197|Sat Apr 27 03:35:07 2019
basename        |v3|     0.00|     0.00|     0.00|     0|     0|107968.00|     0.00|   12462     2197|Sat Apr 27 03:35:07 2019
logger          |v3|     0.00|     0.00|     0.00|     0|     0|107968.00|     0.00|   12463     2197|Sat Apr 27 03:35:07 2019

 

The output from the dump-acct is in the following format:

`ac_comm'
     name of the executed program

`ac_version'
     version of the `acct' file format

`ac_utime'
     user time

`ac_stime'
     system time

`ac_etime'
     elapsed time

`ac_uid'
     user id

`ac_gid'
     group id

`ac_mem'
     (average) memory usage

`ac_io'
     number of characters transferred on input/output

`ac_pid'
     process id

`ac_ppid'
     parent's process id

The cpu related timings “ac_utime, ac_stime, ac_etime” are all measured in CPU ticks. On most systems , 1 cpu second is equal to 100 CPU ticks.

 

Pitfalls

There are some important pitfalls, you should consider before/during using accounting tools.

Command names shortening

Currently all command names are shortened to 16 characters (at least on Centos 6/7 systems).  The limitation comes from the acct v3 structure itself.

 

CPU by long-running processes

As stated earlier, accounting works only for processes, that has finished their execution. It is important to understand, that when you are checking CPU summary with (sa or sa -m), you don’t see the CPU time consumed, by still or long-running processes, like daemons are.

Usually such processes are: php, httpd, nginx, mysqld, dovecot, exim, ….etc….

Also keep in mind, that when long-running process has terminated, the accounting subsystem, will account all the CPU that has been made by this process. So even that you know, you are seeing accounted cpu times, for the current day, this doesn’t apply for terminated long-running processes.

 

Using sa/lastcomm too frequently is expensive !

If you are planning on automating and scraping the results from sa/lastcomm for some reason, keep in mind that running sa/sa -m/ lastcomm is expensive, especially on a systems with huge accounting files.

Before doing such thing, make sure to calculate the CPU time, that your sa command will consume (time sa -m );