Archive for the ‘work’ Category

Nice kdump hack: get dmesg only

Tuesday, March 18th, 2014

Last week during a kernel debugging training, a participant asked me whether it would be possible to get only the dmesg of the crashed kernel, without capturing the whole crash dump.
The possibility is clear, since both current RHEL/CentOS versions as well as SLES11SP3 already put a “dmesg.txt” next to the vmcore in the crash dump directory.
But how do you get only the dmesg?
And why would one want that?
Well, the second question is easily answered: deploying crash dump capturing in a large hardware pool needs quite some preparation. In my daily work, servers usually have more RAM than local disk storage, so the dumps need to be stored on the network. Then you need to make sure that a large number of servers crashing at once (a famous example was the leap second bug) does not fill up the storage and lead to further problems, like machines not coming up again due to full storage. All solvable, but to be considered before deployment. If you just capture the dmesg, you can almost certainly store that locally without creating problems. Another reason would be to get the servers up again as soon as possible while still capturing some useful information (dumping a few hundred gigabytes of RAM can take quite some time).

So how to do it?
SUSE’s kdump infrastructure (tested on SLES11SP3) has a configuration option KDUMP_PRESCRIPT, which names a custom script to be run before the crash dump is captured. This script needs to call vmcore-dmesg and save the output somewhere for later inspection, then unmount the rootfs and issue reboot -f. Since the script never returns, the regular core collector will not run. Problem solved.

The script is actually pretty trivial, so it can be pasted here:

#!/bin/sh
# small script which can be used as KDUMP_PRESCRIPT in SLES
# it *only* saves the dmesg of the crashed kernel and then
# reboots immediately, *no* crash dump is saved.
# benefits:
# * get the machine up ASAP, while still collecting
#   some useful information.
# * can be always enabled without worrying about storage etc
# License: WTFPL v2
NOW=`date +%Y-%m-%d-%H%M`
# in SUSE kdump initrd, real rootfs is mounted to /root
# output file ends up in /var/crash of the real rootfs
OUT=/root/var/crash/vmcore-dmesg-$NOW.txt
PRG=/root/usr/sbin/vmcore-dmesg                 # SLES12
test -x $PRG || PRG=/root/sbin/vmcore-dmesg     # SLES11SP3
$PRG /proc/vmcore > $OUT
umount /root
reboot -f       # do not continue the kdump initrd

It is slightly more complicated than absolutely necessary, but it should work in newer releases which now put the tools in /usr/sbin, too.
In my case, I saved it to /usr/local/sbin/ and then changed the following in /etc/sysconfig/kdump:
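
A minimal sketch of such a change, assuming the script was saved under the hypothetical name kdump-dmesg-only.sh. KDUMP_PRESCRIPT is the option described above; listing the script in KDUMP_REQUIRED_PROGRAMS as well is an assumption about how it gets copied into the kdump initrd, so check the comments in your own /etc/sysconfig/kdump before relying on it:

```shell
# /etc/sysconfig/kdump (fragment)
# hypothetical file name, adjust to wherever the script was saved
KDUMP_REQUIRED_PROGRAMS="/usr/local/sbin/kdump-dmesg-only.sh"
KDUMP_PRESCRIPT="/usr/local/sbin/kdump-dmesg-only.sh"
```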


After restarting kdump, the next crash gave me a nice:

sles11sp3:~ # ls -l /var/crash/
total 36
-rw-r--r-- 1 root root 33383 Mar 18 09:33 vmcore-dmesg-2014-03-18-0911.txt

and no crash dump, mission accomplished.

Fix coolstream neo tuner voltage problem

Wednesday, January 1st, 2014

After trying to use the EN50494 feature of neutrino-mp, I found that as soon as I attached the coolstream neo to my coax “bus”, all other receivers on the same bus could no longer tune to any frequency. After quite some investigation, I found out that the tuner does not lower the voltage to 13V if there is less than about 10mA of current load on the “LNB in” coax output.
I tried to work around the problem in software. However, that did not work too well, because if the neo was the only active receiver on the bus, then the SCR matrix would shut down if there was no voltage applied. So I checked if there is a way to simply fix the broken hardware.
There is. Next to the tuner “tin box” there is a transistor that has the 14/19V at one of its terminals. Just adding a 1.1kOhm resistor from there to ground made the voltage switch correctly even without any coax cable attached.

1.1kOhm resistor fixes tuner voltage problem

The picture is not very good but it shows where the LNB power can be tapped.
Note: use this at your own risk, soldering the resistor to the wrong terminal or shorting the wrong solder points might very well kill your box. You have been warned.
It is also quite possible that better soldering points exist, however this solution can be implemented without disassembling the box completely by just soldering on the top side of the PCB.

It would be interesting to know whether this modification also solves the huge number of DiSEqC switching problems that were reported last spring after some software update; it surely solved my EN50494 (aka unicable) bus blocking problem. The vendor has issued no statement to this day, even though this clearly looks like broken (by design) hardware…

“rdmsr” implemented in perl

Monday, December 23rd, 2013

Today I needed the “rdmsr” tool to determine whether machines are configured correctly for hypervisor usage (VT-x enabled). Just after learning how to detect this (by looking at the cpu-checker Ubuntu package), I found out that msr-tools is not available on SLES11. Instead of building the package, I just implemented a minimal version in Perl (it will be integrated into a Perl tool checking other aspects of the system anyway):

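#!/usr/bin/perl
# License: WTFPLv2
# minimal "rdmsr" implementation
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

my $msr = shift
        or die "need msr as parameter";
$msr = hex($msr) if ($msr =~ m/^0x/i);
# needs root and the msr device nodes below /dev/cpu
open(my $fd, '<', "/dev/cpu/0/msr")
        or die "open /dev/cpu/0/msr: $!";
binmode($fd);
# the MSR number is used as the file offset
sysseek($fd, $msr, SEEK_SET)
        or die "sysseek: $!";
my $reg;
(sysread($fd, $reg, 8) // 0) == 8
        or die "sysread: $!";
# the 8 bytes are little-endian, reverse them for printing
$reg = reverse($reg);
my $hex = unpack('H*', $reg);
$hex =~ s/^0+(?=.)//;   # strip leading zeros, keep a single "0"
print "$hex\n";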

Yes, I know, my Perl is horrible :-) but maybe this is useful for someone else anyway.
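
For the VT-x check itself, the printed value still has to be decoded. The following is a minimal sketch, assuming the register in question is IA32_FEATURE_CONTROL (MSR 0x3a) with the usual Intel layout (bit 0 is the lock bit, bit 2 enables VMX outside SMX). The function name vtx_state is made up for this example, and the MSR number and bit positions are not taken from the post above:

```shell
# vtx_state HEX: decode a value read from MSR 0x3a (hex digits, no "0x" prefix)
# assumption: bit 0 = lock bit, bit 2 = VMX enabled outside SMX
vtx_state() {
    val=$(( 0x$1 ))
    lock=$(( val & 1 ))
    vmx=$(( (val >> 2) & 1 ))
    if [ "$lock" -eq 0 ]; then
        echo "unlocked"     # firmware did not lock the register
    elif [ "$vmx" -eq 1 ]; then
        echo "enabled"      # locked with VMX switched on
    else
        echo "disabled"     # locked with VMX switched off
    fi
}
```

It would be fed with the output of the script above, for example vtx_state 5.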

openSUSE Kernel debuginfo weirdness

Friday, August 16th, 2013

Just because I fell into the same trap twice, once when trying kdump a year ago, now when working on systemtap, a short reminder for everyone:

If you want to do something that needs kernel-default-debuginfo installed (like, say, “kdump”/“crash” or “systemtap”), then make sure that you also have kernel-default-devel-debuginfo on your system.

The reason for this is that the kernel-default-debuginfo package has only the debuginfo for the kernel modules, but lacks the debuginfo for /boot/vmlinux. That debuginfo is in the kernel-default-devel-debuginfo package.

This is strange, to say the least, since the vmlinux binary is not in the devel package but in the main kernel package.
But in practice it does not matter if this is a bug or not: you need both debuginfo packages installed to make kdump analysis with “crash” or systemtap work.
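
A quick way to see whether both packages are present is to ask rpm for each of them. This is only a sketch: the function name missing_debuginfo and the QUERY variable are made up for this example, and QUERY exists just so the query command can be swapped out:

```shell
# print every debuginfo package from the list that is not installed
QUERY=${QUERY:-rpm -q}
missing_debuginfo() {
    for pkg in kernel-default-debuginfo kernel-default-devel-debuginfo; do
        # rpm -q exits non-zero when the package is not installed
        $QUERY "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
missing_debuginfo
```

No output means both packages are installed.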

ExpressCard hotplug with kernel 3.11

Friday, August 16th, 2013

(Executive summary: boot with acpiphp.disable=1)

Kernel 3.11rc has fixed the mei driver suspend problem for me. It brought another quirk, however. ExpressCard hotplug does not work anymore with my trusty old Thinkpad X200s. I need this to use my USB3 card. With 3.11 it only works if it is already plugged in during boot.
To be honest, this has never been 100% automatic before: I always needed to manually load the “pciehp” module to get the slot to work. The other possible driver, “acpiphp” refused to load.
With 3.11, both the acpiphp and the pciehp drivers can no longer be built as modules but are both built in statically.
In addition, the acpiphp driver now claims to support my slot, which it in fact does not, but it still claims the slot and prevents the pciehp driver (which is initialized later) from working. The kernel hackers are working to fix up this mess, but it is probably not going to “just” work in 3.11, so for now the way to get the ExpressCard slot working on a Thinkpad X200s is to pass the boot option

acpiphp.disable=1

to the kernel.
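
After the reboot, /proc/cmdline shows whether the option really reached the kernel. A minimal sketch (the helper name has_boot_option is made up; the optional second argument only exists so a file other than /proc/cmdline can be checked):

```shell
# has_boot_option OPTION [FILE]: succeed if OPTION is on the kernel command line
has_boot_option() {
    for word in $(cat "${2:-/proc/cmdline}"); do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}
```

Usage: has_boot_option acpiphp.disable=1 && echo "acpiphp is disabled"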

“mei” driver suspend regression in linux-3.10

Monday, July 1st, 2013

For all those running kernels from Kernel:HEAD, where 3.10-final has arrived today, be warned that a suspend / resume regression crept in (I spotted and reported that after -rc4 or such, but it has not been fixed).
This regression makes my Thinkpad X200s not resume from suspend to RAM (I have not tried suspend to disk, but it might be affected, too), the machine just freezes hard on resume.
Fortunately, it is relatively easy to work around.
The affected driver is the “mei_me” driver for Intel’s Management Engine, which AFAICT allows you to use management functions of the chipset like serial-over-LAN and similar. I don’t need this, but I need a suspend to RAM that works :-)

The workaround is to simply unbind the driver after boot; then the machine will suspend and resume fine (blacklisting the module does not work since the driver is built into the openSUSE kernel).

Find out which device is bound to the driver:

# find /sys/bus/pci/drivers/mei_me/ -type l|sed 's#^.*/##'

Now unbind (as root):

# echo 0000:00:03.0 > /sys/bus/pci/drivers/mei_me/unbind

…and your next resume will work better.
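
Both steps can be combined so that the PCI address does not need to be typed by hand, for example in a boot script. This is a sketch under the assumption that every symlink in the driver directory whose name looks like a PCI address is a bound device; the function name unbind_all is made up for this example:

```shell
# unbind_all DRIVER_DIR: unbind every PCI device bound to the given driver
unbind_all() {
    drv=$1
    for link in "$drv"/*; do
        [ -L "$link" ] || continue      # bound devices show up as symlinks
        dev=${link##*/}                 # same effect as the sed call above
        case $dev in
            # only names that look like a PCI address, e.g. 0000:00:03.0
            *:*:*.*) echo "$dev" > "$drv/unbind" ;;
        esac
    done
}
# as root: unbind_all /sys/bus/pci/drivers/mei_me
```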

Using virtio console with KVM

Tuesday, October 23rd, 2012

Yesterday I finally figured out how to use the virtio console on KVM, which is very easy once you know what to do.

A short explanation of what a “virtio console” is:
On paravirtualized environments like Xen, the VM (guest) has neither direct hardware access nor emulated hardware; instead it needs special drivers that know they are running on a hypervisor. The advantage of this approach is that you don’t need to emulate hardware and that the driver for the guest is often much simpler (it does not need to probe hardware, for example, as it talks to the hypervisor via a special interface). This makes for a very convenient default Xen setup where you can just run “xm console” with the name of the guest on the hypervisor and get a functional console without any fiddling.
On fully virtualized environments, usually an emulated serial port is used for that, which is harder to set up. It is not impossible, but definitely not that easy and there is quite some stuff to configure to get it working.
Nowadays even fully virtualized systems often have some paravirtualized drivers available, mostly for performance reasons. One of those paravirtualized drivers available with KVM is the virtio console driver.

So back on topic: how to set up the virtio console driver for a KVM guest
First, edit the configuration of your guest, in my case with “virsh edit factorytest” (factorytest is the name of my guest). Add the following snippet of configuration before the closing “</devices>” tag:

    <console type='pty'>
      <target type='virtio' port='0'/>
    </console>

Then a few things need to be done on the KVM guest. In my case of an openSUSE 12.2 guest, the virtio_console module was loaded automatically. However, it surely won’t hurt to put it into INITRD_MODULES in /etc/sysconfig/kernel.
The second thing is that “console=hvc0” needs to go onto the kernel command line.
The easiest way to accomplish this is to use the YaST bootloader module and add it in the “Boot Loader Options” tab. You actually need to make sure that “console=tty console=hvc0” is there, because without the “console=tty” the system did not boot up cleanly for me. I guess that the kernel is not happy with only a console that gets added later via a module, but I need to investigate that.

All the rest was handled by systemd and friends just fine: after rebooting the VM guest, I was immediately able to do “virsh console factorytest”, hit “Enter”, get the login prompt and log in.

Kdump Talk / Paper

Thursday, May 24th, 2012

This time in English :-)
Tomorrow I’m holding the kdump talk at LinuxTag in Berlin, and since I had submitted it in English, I had to translate my short paper anyway, so here it is. It is a rather short “beginner’s guide” but it should get you started. And it’s much better than the slides, as there are almost no technical details in them due to the short timeslot (30 minutes) I got.

Kdump Talk / Paper (German)

Sunday, March 18th, 2012

(Originally posted in German; the paper is also only available in German right now.)

I am currently giving talks about kdump, among others at the GUUG Frühjahrsfachgespräch 2012 and the Chemnitzer Linuxtage 2012.

Since the talks are usually limited to about 45 minutes, the slides contain only few technical details. For the FFG 2012 proceedings, however, I wrote an article that is better suited as an introduction to the topic. The article is available here.

Good luck!

If you don’t want to fight package maintainers…

Friday, December 23rd, 2011

…then work around the bug.

One example is bug 699400: hwclock is never set correctly when NTP is used (unless it was almost correct before…).

I reported the bug, I argued with the maintainer, it led nowhere, and the bug was “RESOLVED INVALID” (interesting notion: either it is resolved or it is invalid, then there is nothing to resolve…)

What’s my workaround? I exploit the fact that old SysV init uses bash scripts. The init script which sets the clock does source /etc/sysconfig/clock. Internally, it uses a variable ELEVENMIN_MODE.
So what did I do? Simply add

readonly ELEVENMIN_MODE=no

at the end of /etc/sysconfig/clock.
The script will throw some errors on boot and shutdown due to its inability to change a readonly shell variable, but I can live much better with a harmless warning than with a totally wrong system time at each reboot :-)
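
The mechanism can be tried in isolation. The snippet below is a stand-in, not the real init script: the first two lines play the role of /etc/sysconfig/clock (the initial value “yes” is hypothetical), and the subshell plays the role of the init script trying to change the variable after sourcing the file:

```shell
# stand-in for /etc/sysconfig/clock with the readonly line at the end
ELEVENMIN_MODE=yes      # hypothetical value set earlier in the file
readonly ELEVENMIN_MODE=no

# stand-in for the init script overriding the variable later on;
# run in a subshell because some shells abort on this error
( ELEVENMIN_MODE=yes ) 2>/dev/null || echo "assignment rejected"

echo "ELEVENMIN_MODE is still: $ELEVENMIN_MODE"
```

The rejected assignment is what shows up as an error message during boot and shutdown.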

Update: Werner has pointed out in the bug that this is totally wrong and will make a mess of your time settings, so use this at your own risk. OTOH I have more than 2000 production systems here running with this. We had lots of problems before and have no problems anymore after doing it as described here. So at least for me the wrong solution works well.