Remote console setup
Latest revision as of 02:57, 27 November 2018
If you are experiencing a kernel crash (oops) and have already tested your hardware, you should report what the kernel prints to the console (i.e. the oops text) to the bug tracker. Sometimes the kernel crashes so badly that syslogd stops working and the kernel messages are never written to a file. In that case, you have to capture the kernel output some other way. There are several ways to do this.
KDump
On RHEL6-based servers, kdump is pre-configured. See http://kb.odin.com/en/10044 to check the configuration. Dumps are stored under the /var/crash/ directory.
Manual/Photo
If the kernel backtrace is short enough, it may fit on a single screen. In that case, you can just take a photo of the kernel crash screen and attach it to the bug report. If you do not have a camera, you can still carefully write down what you see on the screen (with pen and paper, that is) and later type it into the bug report.
Serial console
This section describes the usual steps needed to set up a serial console.
Hardware setup
First, make sure that your node has a serial port. If it does not, this method is unfortunately not for you.
Then find a second machine that also has a serial port. This machine will be used to collect logs from your primary machine. You also need a so-called null modem cable (a.k.a. serial cable) that is long enough to connect the two machines.
Software setup
Sending side
In your boot loader, add the following kernel parameters:
console=ttyS0,115200 console=tty0
Warning: make sure the kernel command line does not contain the word quiet, otherwise most of the kernel messages will not be printed to the console.
For example, in the GRUB boot loader configuration file /boot/grub/grub.conf it looks like this:
title Fedora Core (2.6.16-026test014.1-smp)
        root (hd0,0)
        kernel /vmlinuz-2.6.16-026test014.1-smp ro root=LABEL=/ console=ttyS0,115200 console=tty debug silencelevel=8
        initrd /initrd-2.6.16-026test014.1-smp.img
A kernel booted with these parameters sends all kernel messages to /dev/ttyS0 (the first serial port, a.k.a. COM1). If you have several ports, make sure that your null modem cable is connected to the appropriate one.
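After rebooting, it can be useful to confirm that the running kernel really received these parameters. The following is a minimal sketch rather than part of the original procedure: check_cmdline is a hypothetical helper name, and it assumes the serial console is one of the ttyS ports.

```shell
#!/bin/sh
# check_cmdline [CMDLINE]
# Inspect a kernel command line (default: the one of the running kernel,
# read from /proc/cmdline) and report whether a serial console is
# configured and whether the word "quiet" is present.
# Returns 0 only if console=ttyS... is present and quiet is absent.
check_cmdline() {
    cmdline=${1:-$(cat /proc/cmdline)}
    status=0
    case " $cmdline " in
        *" console=ttyS"*) echo "serial console: configured" ;;
        *) echo "serial console: missing"; status=1 ;;
    esac
    case " $cmdline " in
        *" quiet "*) echo "quiet: present (remove it)"; status=1 ;;
        *) echo "quiet: absent" ;;
    esac
    return $status
}
```

Calling check_cmdline without arguments on the sending node checks the currently booted kernel.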
Receiving side
On the second node, run any software that can log from /dev/ttyS0.
This can be a plain cat:
cat /dev/ttyS0 > /var/log/serial.log &
or something more sophisticated, such as syslogd or watchtty.
Port setup
One more important point: 115200 in the example above is the baud rate of the sending port, and the receiving port must work at the same rate. For example, to set the rate of ttyS0, use the stty program like this:
stty 115200 < /dev/ttyS0
Other serial port parameters, such as parity and the number of stop bits, must also be the same on both sides.
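To compare the two ends, one option is to reduce the output of stty -a to just the settings mentioned above and check that both machines print the same line. This is only a sketch: serial_summary is a hypothetical helper, and it assumes the GNU stty output format (a first line starting with "speed N baud;").

```shell
#!/bin/sh
# serial_summary: read `stty -a` output on stdin and print one line with
# the baud rate, parity flag (parenb), character size (csN) and
# stop-bit flag (cstopb), in the order they appear in the input.
serial_summary() {
    awk '
        $1 == "speed" { speed = $2 }
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^-?(parenb|cstopb)$/ || $i ~ /^cs[5-8]$/)
                    flags = flags " " $i
        }
        END { print speed flags }'
}

# Typical use, run on each machine:
#   stty -a < /dev/ttyS0 | serial_summary
```

If the two lines differ, adjust one side with stty until they match.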
Netconsole
Kernel recompilation
If you use a binary kernel from openvz.org, it already has the netconsole module compiled in, so you can skip to the next section.
If you build the kernel yourself, check whether netconsole is enabled. To do so, change to your kernel source directory and grep your kernel .config for NETCONSOLE:
# cd /usr/src/openvz/linux-2.6.16
# grep NETCONSOLE .config
If you see nothing or "# CONFIG_NETCONSOLE is not set", you need to recompile the kernel.
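The same check can be scripted. The sketch below is an illustration only; netconsole_state is a hypothetical helper name, and the default path .config assumes you run it from the kernel source directory.

```shell
#!/bin/sh
# netconsole_state [CONFIG_FILE]
# Print "builtin", "module" or "missing" depending on how
# CONFIG_NETCONSOLE is set in the given kernel config (default: .config).
netconsole_state() {
    config=${1:-.config}
    if grep -q '^CONFIG_NETCONSOLE=y' "$config"; then
        echo builtin
    elif grep -q '^CONFIG_NETCONSOLE=m' "$config"; then
        echo module
    else
        # Covers both an absent option and "# CONFIG_NETCONSOLE is not set"
        echo missing
    fi
}
```

A result of "missing" means the kernel has to be recompiled as described next.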
Edit your kernel configuration file .config with a text editor (nano .config or vi .config). Set CONFIG_NETCONSOLE to y or m (depending on whether you want it built into the kernel or as a module; I have compiled it as a module):
CONFIG_NETCONSOLE=m
Save the file, then recompile the kernel:
# make bzImage && make modules && make modules_install
Update your boot loader for the new kernel. I use LILO, so I just type lilo at the prompt.
Reboot into the new kernel.
Setting up OpenVZ side
Next, tell netconsole where to send its messages. Load the netconsole module, specifying the remote server parameters:
# modprobe netconsole netconsole=4444@10.0.2.1/eth0,6666@10.0.2.2/00:05:5D:34:11:AF
This loads the module with your settings. Here 4444 is the local (source) port. Replace 10.0.2.1 with your local IP address, eth0 with your network interface, 6666 with the remote netconsole port (UDP), and 10.0.2.2 with the IP address of your remote netconsole server. Also replace 00:05:5D:34:11:AF with the MAC address of your remote netconsole server. You can get the MAC address using the arp utility:
# ping -c 1 10.0.2.2
# /sbin/arp -n 10.0.2.2
Address                  HWtype  HWaddress           Flags Mask            Iface
10.0.2.2                 ether   00:05:5D:34:11:AF   C                     eth0
If the remote netconsole server is outside the local network, use the MAC address of the default gateway or router on the local network together with the IP address of the remote netconsole server (logging via WAN). You can get the MAC address of the gateway or router the same way: ping it and look up its MAC address with the arp command.
Netconsole documentation is available in the Documentation/networking/netconsole.txt file under your kernel source directory.
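Since the parameter string is easy to mistype, it can be assembled from its parts. The following sketch is not from the original article: arp_mac and netconsole_param are hypothetical helper names, and arp_mac assumes the column layout of arp -n shown above.

```shell
#!/bin/sh
# arp_mac IP
# Read `arp -n` output on stdin and print the hardware address
# listed for the given IP address (first matching "ether" entry).
arp_mac() {
    awk -v ip="$1" '$1 == ip && $2 == "ether" { print $3; exit }'
}

# netconsole_param SPORT SADDR DEV DPORT DADDR DMAC
# Print the netconsole= module parameter in the format
# <sport>@<saddr>/<dev>,<dport>@<daddr>/<dmac>
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example with the addresses used in this article:
#   mac=$(/sbin/arp -n 10.0.2.2 | arp_mac 10.0.2.2)
#   modprobe netconsole "$(netconsole_param 4444 10.0.2.1 eth0 6666 10.0.2.2 "$mac")"
```

For a server behind a router, pass the gateway's IP address to arp and arp_mac, but keep the server's IP address in netconsole_param.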
Setting from initrd
To log the boot process before the root filesystem is mounted, the network device driver and netconsole modules must be loaded from the initrd.
RedHat 5/CentOS 5:
echo 'MODULES+="<network-driver-module> netconsole "' > /etc/sysconfig/mkinitrd/netconsole
chmod +x /etc/sysconfig/mkinitrd/netconsole
echo 'options netconsole netconsole=<sport>@<saddr>/<dev>,<dport>@<daddr>/<dmac>' >> /etc/modprobe.conf
Debian/Ubuntu:
echo '<network-driver-module>' >> /etc/initramfs-tools/modules
echo 'netconsole netconsole=<sport>@<saddr>/<dev>,<dport>@<daddr>/<dmac>' >> /etc/initramfs-tools/modules
Then rebuild the initrd.
Setting up rsyslogd
Put the following into /etc/rsyslog.d/netconsole.conf:
$template NetconsoleFile,"/var/log/netconsole/%FROMHOST%-%$NOW%.log"
$template NetconsoleFormat,"%rawmsg%"

$EscapeControlCharactersOnReceive off
$DropTrailingLFOnReception off
$RepeatedMsgReduction off

$RuleSet NetconsoleRuleset
*.* ?NetconsoleFile;NetconsoleFormat
$RuleSet RSYSLOG_DefaultRuleset

$ModLoad imudp
$InputUDPServerBindRuleset NetconsoleRuleset
$UDPServerRun 6666
Setting up remote side
Set up netcat (nc on some Linux distributions) on your console server to listen on port 6666 UDP:
netcat -u -l -p6666
or
nc -lu 6666
or
socat udp-listen:6666,reuseaddr -
When your kernel prints something on the console, the text will also be captured on this netconsole server.
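Before relying on the listener, you may want to confirm that something is actually bound to the UDP port. This sketch assumes the ss utility from iproute2 is available on the console server (netstat -uln prints a similar listing); udp_listening is a hypothetical helper name.

```shell
#!/bin/sh
# udp_listening PORT
# Read a socket listing (e.g. from `ss -uln`) on stdin and succeed
# if some local address ends in :PORT.
udp_listening() {
    port=$1
    grep -Eq "[:.]${port}[[:space:]]"
}

# Typical use on the console server:
#   ss -uln | udp_listening 6666 && echo "listener present on UDP 6666"
```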
Adding to inittab
To have the console server capture messages automatically, you can use the respawn feature of init like this:
echo "n1:23:respawn:/bin/netcat -u -l -p 6666 >> /var/log/netconsole" >> /etc/inittab
telinit q
Adding date/time to messages
If you want each line of the log to carry a date/time stamp, you can use awk like this:
netcat -u -l -p6666 | awk '{print strftime("%d %b %Y %H:%M:%S"), $0; fflush(stdout);}' >> /var/log/netconsole
See man strftime for how to tailor the strftime() argument to your needs.
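The strftime() function is an awk extension (gawk provides it), so it may be missing from other awk implementations. As an alternative sketch, not taken from the original article, the same stamping can be done with a plain shell loop and date; stamp_lines is a hypothetical helper name.

```shell
#!/bin/sh
# stamp_lines: copy stdin to stdout, prefixing every line with the
# current date and time in the same format as the awk example.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%d %b %Y %H:%M:%S')" "$line"
    done
}

# Typical use on the console server:
#   netcat -u -l -p6666 | stamp_lines >> /var/log/netconsole
```

This starts one date process per line, which is acceptable for occasional kernel messages but slower than awk under heavy output.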
Note that if you want to add this to /etc/inittab, you should wrap it in a script like this (the inner double quotes must be escaped so that they end up in the script):
echo "netcat -u -l -p6666 | awk '{print \
 strftime(\"%d %b %Y %H:%M:%S\"), \$0; fflush(stdout);}' \
 >> /var/log/netconsole" > /usr/local/sbin/netconsole
chmod a+x /usr/local/sbin/netconsole
echo "n1:23:respawn:/usr/local/sbin/netconsole" >> /etc/inittab
telinit q
Configuring logrotate
For long-term capturing you will want to rotate the log. With logrotate you can do this by creating the config file /etc/logrotate.d/netconsole:
/var/log/netconsole {
    weekly
    rotate 8
    missingok
    compress
    copytruncate
    notifempty
    # Need to restart logger after log file move
    postrotate
        # Below line assumes netcat will be restarted by init
        killall -TERM netcat > /dev/null 2>&1 || true
    endscript
}
For more details, see man logrotate.
Testing netconsole
First, check the log level of console messages on the OpenVZ side:
cat /proc/sys/kernel/printk
The first number should be 7 for testing. You can set it with:
sysctl -w kernel.printk="7 4 1 7"
After testing, you can restore the previous setting the same way.
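To avoid forgetting the restore step, the previous values can be saved and put back automatically. This is a sketch under the assumption that it runs as root on the OpenVZ side; console_loglevel and with_verbose_console are hypothetical helper names.

```shell
#!/bin/sh
# console_loglevel [FILE]
# Print the first number (the console log level) from a printk
# setting file (default: /proc/sys/kernel/printk).
console_loglevel() {
    read -r level _rest < "${1:-/proc/sys/kernel/printk}"
    printf '%s\n' "$level"
}

# with_verbose_console COMMAND [ARGS...]
# Save the current kernel.printk values, raise the console log level
# to 7, run the command, then restore the saved values.
with_verbose_console() {
    saved=$(tr '\t' ' ' < /proc/sys/kernel/printk)
    sysctl -w kernel.printk="7 4 1 7" > /dev/null || return 1
    "$@"
    rc=$?
    sysctl -w kernel.printk="$saved" > /dev/null
    return $rc
}

# Example: with_verbose_console modprobe tun
```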
Load the netconsole module (see above) and run the netcat (nc) command on the console server. On the OpenVZ side, provoke any console message, for example by connecting any USB hardware or by trying the command:
modprobe tun
If you see a console message on the OpenVZ side, you should see the same message on the console server too. If not, something is wrong. When debugging a problem, do not use tcpdump on the OpenVZ side, because it is not able to show netconsole packets. Instead, use tcpdump on the console server. Firewalls are quite a common source of problems with netconsole.