newspaint

Documenting Problems That Were Difficult To Find The Answer To

Category Archives: SysAdmin

Asterisk – Getting SIP NAT Working with Linux 5.4

I updated Ubuntu from 16.04 to 18.04 to 20.04. Somewhere along the line the netfilter SIP helpers stopped working. Because I run my Asterisk server in an LXC container this became a problem.

The solution for me was to assign the SIP helper explicitly by adding the following lines to my host server’s iptables rules:

# packets coming from the Asterisk server in the LXC container to port 5060
-t raw -A PREROUTING -i lxcbr+ -p udp --dport 5060 -j CT --helper sip

# packets coming from the Internet from port 5060
-t raw -A PREROUTING -i eth+ -p udp --sport 5060 -j CT --helper sip

Note that after reloading the iptables rules I also had to flush the connection tracking table (with sudo conntrack -F), and I restarted my VoIP phone to be sure.
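For reference, the two rule fragments above correspond to full iptables invocations along the lines of the sketch below. The function names run and apply_sip_helper and the APPLY switch are hypothetical, made up for this illustration. By default the sketch only prints the commands so they can be reviewed before anything is changed.

```shell
# Dry-run helper (hypothetical): print each command, or run it via sudo when APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Attach the SIP conntrack helper in the raw table, then flush stale conntrack entries.
# The interface patterns default to the ones used in this post.
apply_sip_helper() {
    lxc_if=${1:-lxcbr+}
    wan_if=${2:-eth+}
    run iptables -t raw -A PREROUTING -i "$lxc_if" -p udp --dport 5060 -j CT --helper sip
    run iptables -t raw -A PREROUTING -i "$wan_if" -p udp --sport 5060 -j CT --helper sip
    run conntrack -F
}
```

Calling apply_sip_helper with no arguments prints the three commands; APPLY=1 apply_sip_helper would execute them.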

Asterisk – Function CUT not registered

I had entries in my /var/log/syslog:

Jan 21 02:37:52 asterisk asterisk[132]: Function CUT not registered
Jan 21 02:39:58 asterisk asterisk[132]: Function CUT not registered
Jan 21 02:41:14 asterisk asterisk[132]: Function CUT not registered
Jan 21 02:42:35 asterisk asterisk[132]: Function CUT not registered
Jan 21 02:44:52 asterisk asterisk[132]: Function CUT not registered
Jan 21 02:56:17 asterisk asterisk[132]: Function CUT not registered
Jan 21 03:04:57 asterisk asterisk[132]: Function CUT not registered

The cause? I was using commas instead of pipe symbols to separate arguments in the function as per the note in this article:

As of Asterisk 1.2.8, use a pipe (“|”) character instead of commas as a parameter delimiter.

Selecting Fields to Display in TShark

Sometimes you want to process packet captures from the command line rather than from Wireshark’s GUI. In this case the TShark tool is very useful.

Just as you can configure what columns to display in the packet summary in Wireshark – you can tell TShark what fields to display from the command line.

Such an example command line might look like:

$ tshark.exe -r mycapture.pcap.gz -2 -R "ip.addr==10.2.3.5" -T fields -E separator=, -E quote=d -e frame.time -e ip.src -e ip.dst -e _ws.col.Info
"Jan 18, 2021 06:25:14.987845000","10.2.3.5","192.168.0.3","IN  s1/tmm1 : NTP Version 3, client"
"Jan 18, 2021 06:36:05.109737000","10.2.3.5","192.168.0.3","IN  s1/tmm1 : NTP Version 3, client"

Breaking down that command line we have:

Option              Description
-r filename         packet capture file to read from
-2                  two-pass process, required for -R option
-R display_filter   display filter to select what packets to show
-T fields           display selected fields
-E separator=,      commas in between fields
-E quote=d          put quote marks around each field
-e field            select field for display

But where does one find out the field name for the desired field?

There are two ways: the first is to look up the display field reference.

The other is to open a packet capture in Wireshark, select a desired packet from the summary list, then in the breakdown of the protocols below right-click on the desired field:

Right-click on desired protocol field in Wireshark

Then in the pop-up menu select Copy > Field Name.

Choose Copy then Field Name

Now the clipboard will contain the field name you can put after the -e option in TShark (in this example it was ssl.record.length).
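Once the field names are known, the command line shown earlier can be wrapped in a small helper so that only the capture file, the display filter and the field list vary. This is a minimal sketch: tshark_fields and the TSHARK override variable are hypothetical names, and it assumes a tshark binary on the PATH.

```shell
# Hypothetical wrapper: tshark_fields CAPTURE DISPLAY_FILTER FIELD...
# Prints the chosen fields as quoted, comma-separated values.
# Set TSHARK to override the binary (assumed to be "tshark" on the PATH).
tshark_fields() (
    capture=$1
    filter=$2
    shift 2
    # Rewrite the remaining arguments from "f1 f2 ..." to "-e f1 -e f2 ...".
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" -e "$1"
        shift
        n=$((n - 1))
    done
    "${TSHARK:-tshark}" -r "$capture" -2 -R "$filter" \
        -T fields -E separator=, -E quote=d "$@"
)
```

For example, tshark_fields mycapture.pcap.gz "ip.addr==10.2.3.5" frame.time ip.src ip.dst _ws.col.Info should be equivalent to the command line above.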

Xubuntu 20.04 on Lenovo ThinkPad E14 Gen 2 (AMD)

I recently bought a Lenovo ThinkPad E14 Gen 2 (AMD) with a Ryzen 4500U processor and 8GB of RAM soldered.

First thing I did was install another 32GB of 3200MHz DDR4 SO-DIMM RAM. Now I have 40GB of RAM.

Installing Xubuntu 20.04 has presented a number of issues. Some are solved, others are not.

Keyboard/Touchpad Stopped Working After Several Minutes

Soon after installing Xubuntu I discovered that the keyboard and touchpad would simply stop responding after some time, maybe 5 minutes, of inactivity.

Eventually I figured out I could press CTRL-ALT-F1 to switch to a text console and log in there. On examination of /var/log/Xorg.0.log I would see entries like:

[   310.542] (II) event3  - Power Button: device removed
[   310.565] (II) event5  - Video Bus: device removed
[   310.585] (II) event0  - Power Button: device removed
[   310.605] (II) event2  - Sleep Button: device removed
[   310.621] (II) event8  - Integrated Camera: Integrated C: device removed
[   310.662] (II) event4  - AT Translated Set 2 keyboard: device removed
[   310.710] (II) event6  - ETPS/2 Elantech TrackPoint: device removed
[   310.741] (II) event14 - ThinkPad Extra Buttons: device removed

I was able to restart my X session by typing sudo systemctl restart lightdm. This would restart the X manager and I could log in again with a working keyboard/touchpad.

By chance I tried the command xflock4 to test the screensaver, but every time I did, the same thing happened: the keyboard and touchpad stopped working and log messages like the above were produced.

I discovered that if I killed the xfce4-screensaver process then xflock4 would work: it would lock the screen (the screen would go blank/black), and I could log back in again after a keypress on the keyboard to activate the display.

In the end I ran apt-get remove xfce4-screensaver to remove the xfce4-screensaver package altogether.

No more failing keyboard/touchpad.

Suspend to RAM Not Waking Up

Every time I tried suspend to RAM the laptop would seemingly suspend okay: the power button LED (and the lid LED that forms the dot above the i in the word “ThinkPad”) would slowly fade in and out to indicate suspension. On wake-up, however, the laptop would backlight the keyboard but the screen would stay blank/black.

Later I discovered, from reading a forum post, that I could wake the laptop (with the blank/black screen), press CTRL-ALT-F1 to switch to console mode (still no display), type in my username and password, enter sudo shutdown -r now and re-enter my password (for sudo) and the laptop would reboot okay.

To test suspend I did the following:

$ cat /sys/power/pm_test # what test modes are available?
[none] core processors platform devices freezer

$ cat /sys/power/state # what power-down states are available?
freeze mem disk

$ sudo bash -c "echo core >/sys/power/pm_test" # test as much of the suspend process as possible
$ cat /sys/power/pm_test
none [core] processors platform devices freezer

$ sudo bash -c "echo mem >/sys/power/state" # initiate suspend for 5 seconds

This process worked with no problems. The suspend would occur, a delay of 5 seconds was then encountered, and the system then woke up, display working, keyboard working, everything working.
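Note that the pm_test setting stays in effect until it is changed back, so while it still reads [core] a suspend is only a test run. The sketch below checks for that; pm_test_mode, pm_test_warn and the PM_TEST_FILE override are hypothetical names, and the parsing assumes the bracketed format shown above.

```shell
# Path of the kernel's suspend test-mode control file (overridable for experiments).
PM_TEST_FILE=${PM_TEST_FILE:-/sys/power/pm_test}

# Print the currently selected test mode, i.e. the word inside [brackets].
pm_test_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$PM_TEST_FILE"
}

# Warn if a test mode is still active before relying on a real suspend.
# To reset it: sudo bash -c "echo none >/sys/power/pm_test"
pm_test_warn() {
    mode=$(pm_test_mode)
    if [ -n "$mode" ] && [ "$mode" != "none" ]; then
        echo "pm_test is still set to '$mode'; suspend will only be a test"
    fi
}
```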

So what fixed it? Turns out Xubuntu 20.04 (as of 2020-12-19) comes with Linux kernel version 5.4. But it is kernel version 5.8 that is needed for proper support of the AMD GPU.

I had to update my /etc/apt/sources.list and add focal-updates as a source.

Then I installed linux-generic-hwe-20.04-edge:

$ sudo apt-get install linux-generic-hwe-20.04-edge

This installed a version 5.8 kernel, specifically Linux version 5.8.0-33-generic (buildd@lgw01-amd64-010). After a reboot suspend-to-RAM worked, and so did waking up: the screen was reactivated (you might have to tap a key, e.g. SHIFT, to activate the display).
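To confirm after the reboot that the running kernel really is new enough, a version comparison like the following sketch can be used. The function name kernel_at_least is hypothetical, the 5.8 threshold is simply the version that worked for me, and the comparison relies on sort -V from GNU coreutils.

```shell
# Succeeds if kernel release $2 (default: the running kernel) is at least version $1.
# Relies on "sort -V" (version sort) from GNU coreutils.
kernel_at_least() {
    required=$1
    actual=${2:-$(uname -r)}
    actual=${actual%%-*}    # "5.8.0-33-generic" -> "5.8.0"
    lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

For example: kernel_at_least 5.8 && echo "kernel is 5.8 or newer".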

Firefox 83.0 and NS_BINDING_ABORTED

I upgraded my browser on Ubuntu and started seeing a new problem. Some of my tabs would not load. When I opened Developer Tools and viewed the Network tab I would see the string NS_BINDING_ABORTED in the “Transferred” column.

Something else changed in this version of Firefox, too: the HTTP auth window was no longer a pop-up on the page but a more formatted box at the top of the page.

The HTTP auth window seems related. Another tab (in another window, actually) was waiting for HTTP auth to be entered. Once I cancelled that HTTP auth window I could load the other tab that was saying NS_BINDING_ABORTED.

I haven’t tested but I suspect that this version of Firefox cannot handle more than one tab requesting HTTP auth at a time, either per-domain or per-process.

Upgrading ZFS Mirror With Bigger Drives

This can be tricky on Ubuntu 16.04. I was replacing my 8TB drives in a mirror with 12TB drives, one at a time, resilvering to the other side each time. But each drive was also encrypted with LUKS.

After physically replacing the drive in the mirror, reboot. Where you would normally be asked for the crypt password you will instead get a series of error messages. This goes on for up to 5 minutes; you just have to wait it out, and then you get plonked into a rescue mode from which you just have to type:

exit

.. and proceed with the boot – albeit with one half of your mirror missing. You might have to wait another 1m30s for another process to realise the missing disk isn’t actually there.

Then come the detach and attach commands.

For example, you initially see that the pool is in the DEGRADED state:

user@host:~$ sudo zpool status
  pool: mypool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 17h27m with 0 errors on Thu Sep  3 12:03:23 2020
config: 

        NAME                     STATE     READ WRITE CKSUM
        mypool                   DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            8236712398351102035  UNAVAIL      0     0     0  was /dev/mapper/crypt4
            crypt3               ONLINE       0     0     0
        logs
          crypt5                 ONLINE       0     0     0
        cache
          crypt6                 ONLINE       0     0     0

So then issue the detach command for the missing disk.

user@host:~$ sudo zpool detach mypool 8236712398351102035

Now, once you’ve done a cryptsetup luksFormat on the replacement disk and opened it (so that it appears as /dev/mapper/crypt4), you can attach it to the zpool:

user@host:~$ sudo zpool attach mypool /dev/mapper/crypt3 /dev/mapper/crypt4
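Put together, the swap of one mirror half looks roughly like the sequence printed by the sketch below. It only prints the commands and does not run them. The function name mirror_swap_plan and the /dev/sdX device are placeholders, and the luksOpen step is my assumption about how the new disk gets its /dev/mapper name. Bear in mind that luksFormat destroys whatever is on the disk it is given.

```shell
# Print (not run) the command sequence for swapping one half of an encrypted mirror.
# Arguments: POOL OLD_GUID NEW_DISK NEW_MAPPER_NAME GOOD_MAPPER_NAME
mirror_swap_plan() {
    pool=$1; old_guid=$2; new_disk=$3; new_name=$4; good_name=$5
    cat <<EOF
sudo zpool detach $pool $old_guid
sudo cryptsetup luksFormat $new_disk
sudo cryptsetup luksOpen $new_disk $new_name
sudo zpool attach $pool /dev/mapper/$good_name /dev/mapper/$new_name
EOF
}

# /dev/sdX is a placeholder for the replacement disk.
mirror_swap_plan mypool 8236712398351102035 /dev/sdX crypt4 crypt3
```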

The zpool should now be resilvering.

user@host:~$ sudo zpool status mypool
  pool: mypool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Sep 10 02:12:34 2020
    16.6M scanned out of 6.95T at 2.38M/s, (scan is slow, no estimated time)
    15.9M resilvered, 0.00% done

Fear not: the resilver will be slow at the start, but after a minute or two it will get back up to speed.

While the resilvering is taking place you can expand the pool. By default it won’t have expanded:

user@host:~$ sudo zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
mypool 7.25T  6.95T   303G     3.64T    53%    95%  1.00x  ONLINE  -

To expand, several steps are required on Ubuntu 16.04:

user@host:~$ sudo zpool set autoexpand=on mypool
user@host:~$ sudo zpool set autoexpand=off mypool
user@host:~$ sudo zpool online -e mypool /dev/mapper/crypt3

After a pause of several seconds the mirror will now be expanded:

user@host:~$ sudo zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
mypool 10.9T  6.95T  3.95T         -    35%    63%  1.00x  ONLINE  -
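A quick way to see whether any pool still has unclaimed space is to look at the EXPANDSZ column. The sketch below filters zpool list output for that; pools_with_expandsz is a hypothetical name, and it assumes EXPANDSZ is the fifth column, as in the output above (other ZFS versions may lay the columns out differently).

```shell
# Read "zpool list" output on stdin and print "pool expandsz" for every pool whose
# EXPANDSZ column (5th field in the layout shown in this post) is not "-".
pools_with_expandsz() {
    awk 'NR > 1 && $5 != "-" { print $1, $5 }'
}
```

Usage: sudo zpool list | pools_with_expandsz. It prints nothing once every pool has been expanded.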

Microsoft Teams – Linux Trap Invalid Opcode

I’ve been running Microsoft Teams inside an LXC container on Ubuntu 16.04 (with XFCE) lately and very recently I’ve had two occasions where my desktop stopped functioning normally.

On reviewing /var/log/syslog I’ve seen entries like the following:

Jul 30 08:00:59 localhost kernel: [5904607.094017] traps: Watchdog[5428] trap invalid opcode ip:55aba1d9d724 sp:7f3836ffcac0 error:0 in teams[55aba0273000+4bf6000]

.. and..

Aug  1 08:17:02 localhost kernel: [6078368.525104] traps: Watchdog[20595] trap invalid opcode ip:55eaeced5724 sp:7f42137fdac0 error:0 in teams[55eaeb3ab000+4bf6000]
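Subtracting the mapping base (the first number inside teams[...]) from the instruction pointer (ip) gives the location of the faulting instruction relative to the teams binary. A minimal sketch, where trap_offset is a hypothetical helper name:

```shell
# Offset of the faulting instruction within the mapped binary: ip minus mapping base.
# Both arguments are hex strings without the 0x prefix, as printed in the kernel log.
trap_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

trap_offset 55aba1d9d724 55aba0273000   # Jul 30 entry, prints 0x1b2a724
trap_offset 55eaeced5724 55eaeb3ab000   # Aug  1 entry, prints 0x1b2a724
```

Both entries work out to the same offset. That suggests, though it does not prove, that the same instruction in the binary was hit each time.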

This is on an Intel Xeon CPU:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Xeon(R) CPU E3-1226 v3 @ 3.30GHz
stepping	: 3
microcode	: 0x28
cpu MHz		: 3366.000
cache size	: 8192 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single ssbd ibrs ibpb stibp kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips	: 6584.57
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

I might have to stop using Microsoft Teams on Linux.

Strawberry Perl for Windows – Makefile Temporary Directory

I had a problem. My assigned company laptop would only allow me to execute (run) files from a single path (e.g. C:\Permitted\).

When I tried to use cpan to install a module, or I downloaded the zip of the module and tried to build the module myself, I would always get errors about an attempt to execute a file named C:\Users\username\AppData\Local\Temp\makennnnnn.bat.

Now, downloading a module from CPAN and unzipping it, then running perl -w Makefile.PL was no problem.

The issue occurred when running gmake. In debugging mode it would output something similar to the following:

Creating temporary batch file C:\Users\username\AppData\Local\Temp\make90123-1.bat
Batch file contents:
        @echo off
        rem
CreateProcess(C:\Users\username\AppData\Local\Temp\make90123-1.bat,C:\Users\username\AppData\Local\Temp\make90123-1.bat,...)

The solution was found in this linked thread. That linked to Microsoft documentation for the GetTempPathA function, which specifies that the temporary path returned is the first of the following that exists:

  1. the path specified by the TMP environment variable
  2. the path specified by the TEMP environment variable
  3. the path specified by the USERPROFILE environment variable
  4. the Windows directory

So it is a simple matter of setting an environment variable (TMP is the first one checked) to a directory under the permitted path, and GNU make (gmake) will create the temporary batch files there instead.

Copy Timestamp From One File to Another in Linux

Tested on Ubuntu 16.04.

# get last modified time in seconds past the epoch
export SECS=$(/usr/bin/stat -c "%Y" "${FILE_SRC}")

# set modified time to seconds past the epoch
touch --date="@${SECS}" "${FILE_DST}"
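touch can also take the timestamps straight from another file with its -r option, which avoids the round trip through stat. A short sketch follows (the function names are only illustrative). Note that plain -r copies the access time as well as the modification time; adding -m restricts it to the modification time.

```shell
# Copy the access and modification times of SRC ($1) onto DST ($2).
copy_timestamps() {
    touch -r "$1" "$2"
}

# Variant: copy only the modification time, leaving DST's access time alone.
copy_mtime() {
    touch -m -r "$1" "$2"
}
```

Usage: copy_mtime "${FILE_SRC}" "${FILE_DST}".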

Audacious Only Plays One Song Then Stops

If Audacious plays a single song and then stops, even though you have a large playlist, the “No Playlist Advance” option may be switched on. To check:

  • open the menu by clicking on the tiny icon to the left of the word “AUDACIOUS” at the top left of the player window
  • go to Playback and, if No Playlist Advance (Ctrl+N) is ticked, select it to untick it