newspaint

Documenting Problems That Were Difficult To Find The Answer To

Configuring RAS Daemon for Lenovo ThinkServer TS140

The RAS daemon (rasdaemon) can be installed on Ubuntu Linux 16.04 using the command:

$ sudo apt-get install rasdaemon

Once this is installed, the ThinkServer TS140 requires a custom label file to be created in /etc/ras/dimm_labels.d/ with the following format:

# Vendor: <vendor-name>
#   Model: <model-name>
#     <label>:  <mc>.<top>.<mid>.<low>

… although it seems the old EDAC label format is also valid:

# Vendor: <vendor-name>
#   Model: <model-name>
#     <label>:  <mc>.<csrow>.<channel>

To discover the csrow and channel values for each DIMM location, one can run the following commands:

$ for i in /sys/devices/system/edac/mc/mc0/rank*; do \
  echo ==$i==; \
  cat "$i/dimm_location"; echo -ne "\n"; \
done

==/sys/devices/system/edac/mc/mc0/rank0==
csrow 0 channel 0 
==/sys/devices/system/edac/mc/mc0/rank1==
csrow 0 channel 1 
==/sys/devices/system/edac/mc/mc0/rank2==
csrow 1 channel 0 
==/sys/devices/system/edac/mc/mc0/rank3==
csrow 1 channel 1 
==/sys/devices/system/edac/mc/mc0/rank4==
csrow 2 channel 0 
==/sys/devices/system/edac/mc/mc0/rank5==
csrow 2 channel 1 
==/sys/devices/system/edac/mc/mc0/rank6==
csrow 3 channel 0 
==/sys/devices/system/edac/mc/mc0/rank7==
csrow 3 channel 1 

The board vendor and name, needed for the Vendor and Model lines, can also be found:

$ cat /sys/devices/virtual/dmi/id/board_vendor
LENOVO
$ cat /sys/devices/virtual/dmi/id/board_name
ThinkServer TS140

Then a file (e.g. “lenovo-thinkserver-ts140.txt”) can be created in /etc/ras/dimm_labels.d/:

Vendor: LENOVO
  Model: ThinkServer TS140
    DIMM1_0: 0.0.0
    DIMM1_1: 0.0.1
    DIMM2_0: 0.1.0
    DIMM2_1: 0.1.1
    DIMM3_0: 0.2.0
    DIMM3_1: 0.2.1
    DIMM4_0: 0.3.0
    DIMM4_1: 0.3.1
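
Typing these lines by hand is error-prone, so here is a minimal sketch that prints a file in the same shape from the sysfs values shown earlier. The function name gen_dimm_labels is made up for illustration, and the DIMM<csrow+1>_<channel> naming simply copies the file above; it is an assumption, not something rasdaemon or the board defines, so check the output against the silk screen on the motherboard.

```shell
#!/bin/sh
# Sketch only: gen_dimm_labels is a hypothetical helper, and the
# DIMM<csrow+1>_<channel> naming copies the hand-written file in this post.

# Print a labels file for one memory controller.
#   $1 = DMI id directory, e.g. /sys/devices/virtual/dmi/id
#   $2 = EDAC controller directory (no trailing slash),
#        e.g. /sys/devices/system/edac/mc/mc0
gen_dimm_labels() {
    dmi_dir=$1
    mc_dir=$2
    mc_num=${mc_dir##*mc}    # ".../mc0" -> "0"

    printf 'Vendor: %s\n' "$(cat "$dmi_dir/board_vendor")"
    printf '  Model: %s\n' "$(cat "$dmi_dir/board_name")"

    for rank in "$mc_dir"/rank*; do
        [ -r "$rank/dimm_location" ] || continue
        # dimm_location holds e.g. "csrow 2 channel 1"
        read -r word1 csrow word2 channel < "$rank/dimm_location"
        printf '    DIMM%d_%d: %d.%d.%d\n' \
            "$(($csrow + 1))" "$channel" "$mc_num" "$csrow" "$channel"
    done
}

# Example (review the output before installing it):
#   gen_dimm_labels /sys/devices/virtual/dmi/id /sys/devices/system/edac/mc/mc0
```

Redirect the output to a file and inspect it before copying it into /etc/ras/dimm_labels.d/.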

Then the rasdaemon service can be checked and restarted:

$ sudo systemctl status rasdaemon
$ sudo systemctl restart rasdaemon

Then ras-mc-ctl can be run to show the memory layout and the labels:

$ ras-mc-ctl --layout
          +-----------------------------------------------+
          |                      mc0                      |
          |  csrow0   |  csrow1   |  csrow2   |  csrow3   |
----------+-----------------------------------------------+
channel1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
channel0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
----------+-----------------------------------------------+

$ ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 csrow 0 channel 0               DIMM1_0              DIMM1_0             
mc0 csrow 0 channel 1               DIMM1_1              DIMM1_1             
mc0 csrow 1 channel 0               DIMM2_0              DIMM2_0             
mc0 csrow 1 channel 1               DIMM2_1              DIMM2_1             
mc0 csrow 2 channel 0               DIMM3_0              DIMM3_0             
mc0 csrow 2 channel 1               DIMM3_1              DIMM3_1             
mc0 csrow 3 channel 0               DIMM4_0              DIMM4_0             
mc0 csrow 3 channel 1               DIMM4_1              DIMM4_1    

The first time you do this the new labels may not be registered, e.g. you see the following:

$ ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 csrow 0 channel 0               DIMM1_0              mc#0csrow#0channel#0
mc0 csrow 0 channel 1               DIMM1_1              mc#0csrow#0channel#1
mc0 csrow 1 channel 0               DIMM2_0              mc#0csrow#1channel#0
mc0 csrow 1 channel 1               DIMM2_1              mc#0csrow#1channel#1
mc0 csrow 2 channel 0               DIMM3_0              mc#0csrow#2channel#0
mc0 csrow 2 channel 1               DIMM3_1              mc#0csrow#2channel#1
mc0 csrow 3 channel 0               DIMM4_0              mc#0csrow#3channel#0
mc0 csrow 3 channel 1               DIMM4_1              mc#0csrow#3channel#1

The “SYSFS CONTENTS” column comes from the dimm_label files in sysfs, which can be read directly:

$ for i in /sys/devices/system/edac/mc/mc0/rank*; \
  do echo "$i/dimm_label: $(cat "$i/dimm_label")"; \
done
/sys/devices/system/edac/mc/mc0/rank0/dimm_label: mc#0csrow#0channel#0
...
/sys/devices/system/edac/mc/mc0/rank7/dimm_label: mc#0csrow#3channel#1

So you can register the configured labels by running:

$ sudo ras-mc-ctl --register-labels
$ ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 csrow 0 channel 0               DIMM1_0              DIMM1_0
...
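
To confirm that registration took effect without comparing the two columns by eye, a small filter over the --print-labels output can help. This is a sketch: unregistered_labels is a hypothetical name, and it assumes the column layout shown in this post (five location words followed by two labels that contain no spaces).

```shell
#!/bin/sh
# Sketch: list locations whose sysfs label differs from the configured one.
# Reads `ras-mc-ctl --print-labels` output on stdin and skips the header line.
# Assumes labels contain no whitespace.
unregistered_labels() {
    awk 'NR > 1 && NF >= 7 && ($6 "") != ($7 "") {
             print $1, $2, $3, $4, $5 ": " $6 " != " $7
         }'
}

# Example:
#   ras-mc-ctl --print-labels | unregistered_labels
```

No output means every configured label matches what sysfs reports.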

In Ubuntu the rasdaemon service is defined in /etc/systemd/system/multi-user.target.wants/rasdaemon.service and contains the line:

ExecStart=/usr/sbin/rasdaemon -f -r

This instructs the daemon to run in the foreground (-f) and to record events to a SQLite3 database (-r).
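
If the database stays empty, it is worth checking that the unit file really passes the record flag. The following is a rough sketch with a hypothetical function name, records_events; it only recognises -r or --record as a separate argument and would miss combined short flags such as -fr.

```shell
#!/bin/sh
# Sketch: succeed if the unit file's ExecStart line passes -r or --record
# as a standalone argument. Combined flags (e.g. -fr) are not detected.
#   $1 = path to a rasdaemon unit file
records_events() {
    unit_file=$1
    grep '^ExecStart=' "$unit_file" \
        | grep -Eq -- '(^|[[:space:]])(-r|--record)([[:space:]]|$)'
}

# Example:
#   records_events /etc/systemd/system/multi-user.target.wants/rasdaemon.service \
#       && echo "recording enabled"
```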

The SQLite3 database is located at /var/lib/rasdaemon/ras-mc_event.db:

$ sqlite3 /var/lib/rasdaemon/ras-mc_event.db .tables
aer_event     extlog_event  mc_event      mce_record  

$ sqlite3 /var/lib/rasdaemon/ras-mc_event.db ".schema mc_event"
CREATE TABLE mc_event (
  id INTEGER PRIMARY KEY,
  timestamp TEXT,
  err_count INTEGER,
  err_type TEXT,
  err_msg TEXT,
  label TEXT,
  mc INTEGER,
  top_layer INTEGER,
  middle_layer INTEGER,
  lower_layer INTEGER,
  address INTEGER,
  grain INTEGER,
  syndrome INTEGER,
  driver_detail TEXT
);
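
With the labels registered, the label column makes it possible to summarise errors per DIMM. The query below is a sketch that uses only columns from the schema above; mc_errors_by_label is a hypothetical name, and reading the database may need root privileges depending on its file permissions.

```shell
#!/bin/sh
# Sketch: total recorded memory errors per DIMM label and error type.
#   $1 = path to the rasdaemon database (defaults to the path in this post)
mc_errors_by_label() {
    db=${1:-/var/lib/rasdaemon/ras-mc_event.db}
    sqlite3 "$db" \
        "SELECT label, err_type, SUM(err_count)
           FROM mc_event
          GROUP BY label, err_type
          ORDER BY label, err_type;"
}

# Example:
#   sudo sh -c '. ./mc-errors.sh; mc_errors_by_label'   # mc-errors.sh is a hypothetical file name
```

An empty result simply means no memory controller events have been recorded yet.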
