[linux-elitists] netconsole module - kernel dmesg-over-network logging
Karsten M. Self
karsten@linuxmafia.com
Thu Dec 11 14:38:22 PST 2008
This is an old-but-it's-new-to-me feature I've just discovered. The
existing docs are sufficiently not-quite-clear that I thought I'd take a
stab at it here.
The netconsole kernel module was introduced in 2001 and has since been
ported to the 2.6 series kernels. It allows the kernel to log dmesg
output directly to a network logging host, and was specifically aimed at
the case of kernel panics which leave no persistent trace on disk (such
as our current issue, in which a 2.6.22-3-amd64 kernel is crashing hard
on a 3w_9xxx driver / 3Ware 9000 storage controller).
What's particularly slick about netconsole is that it can log to
infrastructure (say, a network syslog server), or to an ad-hoc host
using netcat.
The arcane bit is specifying the options. I messed with these for a
bit before getting things fully sorted. I've written an assisting
script to simplify the process. We _aren't_ using the dynamic
configuration facility in which configurations are stored under
/sys/kernel/config/netconsole/, but instead specify parameters at module
load time (I think dynamic configuration is a newer feature).
I'm referring to the host being logged as "source" and the host
recording the log as "target".
There's relatively good documentation in
Documentation/networking/netconsole.txt, though explanation of terms is
still a bit unclear:
    netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]

where

    src-port      source for UDP packets (defaults to 6665)
    src-ip        source IP to use (interface address)
    dev           network interface (eth0)
    tgt-port      port for logging agent (6666)
    tgt-ip        IP address for logging agent
    tgt-macaddr   ethernet MAC address for logging agent (broadcast)
I've written a small helper script for this which clarifies the
configuration parameters slightly. It is (and they are):
------------------------------------------------------------------------
#!/bin/sh
# Example. Note that source and target are on (likely) different
# 192.168 subnets.
SRC_PORT=6666
SRC_IP=192.168.0.10
SRC_DEV=eth0
TGT_PORT=6666
TGT_IP=192.168.1.10
# MAC address is *EITHER* the target host if on the same LAN segment, *OR* the
# local gateway MAC. Get that from your route + arp table.
GW_MACADDR=00:00:FF:FF:FF:FF
set -x # Echo the final command
modprobe netconsole netconsole="${SRC_PORT}@${SRC_IP}/${SRC_DEV},${TGT_PORT}@${TGT_IP}/${GW_MACADDR}"
------------------------------------------------------------------------
Note that:
- You must specify the port, IP, and _device_ on the source host.
  When using bonded NICs, you must specify the underlying interface,
  not the bonded device itself (e.g.: eth0 rather than bond0).
- You must specify the target port and IP.
- Port number is arbitrary, though 6666 is a traditional default.
- The target MAC address is _either_ the MAC address of the target
  host, if on the same LAN segment, or of the gateway router, if the
  logging host is on another network. You may have to play with mtr
  or traceroute, route, and /proc/net/arp (or the arp command) to find
  this. Under most circumstances you should be able to specify your
  default gateway, I suspect. This tripped us up for a bit.
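
The MAC lookup described in the last point can be sketched in shell. This
is a rough sketch, not something we ran in production: it assumes
iproute2's "ip" command and a readable /proc/net/arp on the source host,
and the function names next_hop and arp_mac are made up for illustration.

```shell
# Print the next hop for a target IP: the "via" gateway if the route has
# one, otherwise the target itself (i.e. same LAN segment).
# Reads the output of `ip route get <target>` on stdin.
next_hop() {
    awk -v tgt="$1" '
        NR == 1 {
            hop = tgt
            for (i = 1; i < NF; i++)
                if ($i == "via") hop = $(i + 1)
            print hop
        }'
}

# Print the hardware address recorded for an IP in an ARP table given in
# /proc/net/arp format on stdin (header line, IP in column 1, MAC in
# column 4). Prints nothing if the IP has no entry.
arp_mac() {
    awk -v ip="$1" 'NR > 1 && $1 == ip { print $4; exit }'
}

# Usage on the source host (ping first so an ARP entry exists):
#   hop=$(ip route get 192.168.1.10 | next_hop 192.168.1.10)
#   ping -c 1 "$hop" >/dev/null
#   arp_mac "$hop" < /proc/net/arp
```

An entry that has not resolved yet shows up as 00:00:00:00:00:00 in
/proc/net/arp, so check the result before handing it to modprobe.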
On the target side, I've had a few issues with netcat hanging up the
connection. My current solution is to use the '-q <nn>' option. For
negative values, netcat *never* terminates on receipt of an EOF.
netcat -ul -q -1 -p 6666 | tee netconsole.log
That's: UDP mode, listen, never hang up, port 6666.
On the source side, bump up the kernel logger's verbosity with dmesg:
dmesg -n 8 # 8 appears to be the max level
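
To confirm the setting took, the console log level can be read back from
/proc/sys/kernel/printk, whose first field is the current console log
level. A minimal sketch; the function name console_loglevel is my own,
not a standard tool:

```shell
# Print the first field of /proc/sys/kernel/printk-format input (stdin):
# the current console log level. Messages whose priority number is lower
# than this value reach the console; priorities run 0-7, so 8 passes all.
console_loglevel() {
    awk 'NR == 1 { print $1 }'
}

# Usage on the source host:
#   console_loglevel < /proc/sys/kernel/printk
```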
For testing, you want to generate some kernel activity. This is where
I've scratched my head a bit, though umounting and remounting an ext3
filesystem will produce something like:
EXT3 FS on sdv1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
I'm open to other suggestions. Is there anything similar to the
logger(1) command which can be used to generate a kernel message?
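
One possible answer, offered as a sketch rather than something tested on
this kernel: writing a line to /dev/kmsg as root injects it into the
kernel log buffer, optionally with a "<N>" priority prefix. The helper
name kmsg_line is hypothetical, and whether the line actually reaches
netconsole still depends on the console log level.

```shell
# Build a kernel-log line with a syslog-style priority prefix,
# e.g. priority 4 (warning) and text "hello" gives "<4>hello".
kmsg_line() {
    printf '<%d>%s\n' "$1" "$2"
}

# Usage (as root on the source host; assumes /dev/kmsg accepts writes):
#   kmsg_line 4 "netconsole test $(date)" > /dev/kmsg
```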
The other primary shortcoming is that the messages aren't timestamped if
read via netcat (syslog would add its own stamps). I'm also not
entirely sure that we're eliminating all buffering on both sides of the
connection, which is key as we're trying to catch something that's
apparently happening pretty quickly.
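
A workaround for the missing timestamps, as a sketch: stamp each line on
the target as it arrives. The function name stamp is made up, and the
times are receipt times on the target, not the moment the kernel emitted
the message.

```shell
# Prefix every line read from stdin with the local date and time.
# Lines are handled one at a time, so this stage does not hold output
# back in a block buffer.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Usage on the target host:
#   netcat -ul -q -1 -p 6666 | stamp | tee netconsole.log
```

Forking date(1) once per line is slow, but kernel message rates are
normally low enough that it shouldn't matter here.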
So for the moment we're bakin' the source and waitin' on the target.
Hats off to Ingo Molnar & Matt Mackall.
For more reading on the subject:
[patch] netconsole - log kernel messages over the network. 2.4.10.
From: Ingo Molnar (mingo@elte.hu)
Date: Wed Sep 26 2001 - 15:04:33 EST
http://lkml.indiana.edu/hypermail/linux/net/0109.3/0009.html
Linux Configure Netconsole To Log Messages Over UDP Network
by Vivek Gite [Last updated: July 2, 2008]
http://www.cyberciti.biz/tips/linux-netconsole-log-management-tutorial.html
HOWTO: Log a kernel panic.. It can be done!
Posted in March 19th, 2007
by The Elite Geek in Linux
http://www.tocpcs.com/howto-log-a-kernel-panic-it-can-be-done/
Peace.
--
Karsten M. Self <karsten@linuxmafia.com> http://linuxmafia.com/~karsten
Ceterum censeo, Caldera delenda est.