Revision 1.17, Mon Feb 6 17:24:32 2017 UTC (7 years, 3 months ago) by tb
Branch: MAIN
Changes since 1.16: +24 -26 lines

kernels are now compiled with debugging symbols, so we can skip one step.
avoid a few parenthetical remarks and simplify a number of command lines.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>OpenBSD: Crash Reports</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="description" content="How to report an OpenBSD kernel crash">
<meta name="copyright" content="This document copyright 1998-2016 by OpenBSD.">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" type="text/css" href="openbsd.css">
<link rel="canonical" href="https://www.openbsd.org/ddb.html">
<style type="text/css">
  h3, h4 { color: #0000e0; }
</style>
</head>

<body bgcolor="#ffffff" text="#000000" link="#23238e">

<h2>
<a href="index.html">
<font color="#0000ff"><i>Open</i></font><font color="#000084">BSD</font></a>
<font color="#e00000">Crash Reports</font>
</h2>
<hr>
<p>

<h3>Minimum information for kernel problems</h3>

Familiarize yourself with
<a href="report.html">the general bug reporting procedures</a>
first.
All of them apply to kernel problems as well.
When reporting a kernel panic or crash, please remember:

<ul>
  <li><i>We need the console output on the screen</i>.
    Capture it and save it.
    Serial consoles are best, but if you are on a VGA console you can
    <a href="faq/faq7.html">scroll the console back</a>
    and take readable pictures with a phone or camera.<br>

  <li><i>If the kernel panicked we need the traceback.</i>
    It may be displayed on the screen.
    If you are at a
    <tt><a href="http://man.openbsd.org/ddb.4">ddb</a>&gt;</tt>
    prompt, type <tt>trace</tt>.
    If you are running SMP, switch to each processor in turn with
    <tt>machine ddbcpu N</tt>, where <tt>N</tt> is the processor number,
    and repeat the <tt>trace</tt> command on each one.<br>

  <li><i>We need the process list.</i>
    Use the command <tt>ps</tt> to get that.
</ul>

<i>
Reports without the above information are useless.
This is the minimum we need to be able to track down the issue.
</i>
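
<p>
Before mailing a saved console capture, it can be worth checking that the
commands listed above actually made it into the capture.
The following is only a sketch: the function name <tt>check_ddb_capture</tt>
is made up for illustration, and it assumes the capture still contains the
<tt>ddb&gt;</tt> prompts together with the commands as they were typed.

```shell
# Hypothetical helper: report which of the required ddb commands
# (trace, ps) do not appear after a ddb prompt in a saved console capture.
check_ddb_capture() {
	capture=$1
	missing=0
	for cmd in trace ps; do
		# Match prompts such as "ddb> trace" or "ddb{0}> trace".
		# Serial captures often carry carriage returns, so strip them first.
		if ! tr -d '\r' < "$capture" |
		    grep -Eq "^ddb(\{[0-9]+\})?> *${cmd}( |\$)"; then
			echo "missing: $cmd"
			missing=1
		fi
	done
	return $missing
}
```

<p>
Calling it with the path of the capture prints one line per missing command
and returns non-zero if anything is missing.
It only checks that the commands were typed, not that their output is complete
or readable.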

<h3>Additional information you can send</h3>

Some problems need more information than the minimum.
The additional steps below apply in specific situations:

<ul>
  <li><i>If your crash appears to involve filesystems.</i>
    The following additional information is helpful:
    <ul>
      <li>The output of the
        <tt><a href="http://man.openbsd.org/ddb.4">ddb</a>&gt;</tt> command
        <tt>show uvm</tt>
      <li>The output of the
        <tt><a href="http://man.openbsd.org/ddb.4">ddb</a>&gt;</tt>
        command <tt>show bcstats</tt>
      <li>The output of the <tt>mount</tt> command from your running machine, so
        we know what filesystems are mounted and how.
    </ul>
  <li> ... XXX boot crash? XXX
  <li> ... XXX show regs? XXX
</ul>

<h3>Lost the panic message?</h3>

Under some circumstances, you may lose the very first message of a panic,
which states the reason for the panic.
If that happens, run <tt>show panic</tt> at the <tt>ddb&gt;</tt> prompt to
display it again:

<blockquote><pre>
ddb> <b>show panic</b>
0:      kernel: page fault trap, code=0
ddb>
</pre></blockquote>

<h3>Note for SMP systems</h3>

You should get a trace from each processor as part of your report:

<blockquote><pre>
ddb{0}> <b>trace</b>
pool_get(d05e7c20,0,dab19ef8,d0169414,80) at pool_get+0x226
fxp_add_rfabuf(d0a62000,d3c12b00,dab19f10,dab19f10) at fxp_add_rfabuf+0xa5
fxp_intr(d0a62000) at fxp_intr+0x1e7
Xintr_ioapic0() at Xintr_ioapic0+0x6d
--- interrupt ---
idle_loop+0x21:
ddb{0}> <b>machine ddbcpu 1</b>
Stopped at      Debugger+0x4:   leave
ddb{1}> <b>trace</b>
Debugger(d0319e28,d05ff5a0,dab1bee8,d031cc6e,d0a61800) at Debugger+0x4
i386_ipi_db(d0a61800,d05ff5a0,dab1bef8,d01eb997) at i386_ipi_db+0xb
i386_ipi_handler(b0,d05f0058,dab10010,d01d0010,dab10010) at i386_ipi_handler+0x4a
Xintripi() at Xintripi+0x47
--- interrupt ---
i386_softintlock(0,58,dab10010,dab10010,d01e0010) at i386_softintlock+0x37
Xintrltimer() at Xintrltimer+0x47
--- interrupt ---
idle_loop+0x21:
ddb{1}>
</pre></blockquote>

Repeat the <tt>machine ddbcpu x</tt> command followed by <tt>trace</tt> for
each processor in your machine.
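
<p>
On machines with many processors it is easy to skip one.
As a rough aid, the sketch below prints the sequence of ddb commands to type
for a given processor count.
The function name <tt>ddb_trace_commands</tt> is made up for illustration, and
the numbering from 0 is taken from the <tt>ddb{0}&gt;</tt> and
<tt>ddb{1}&gt;</tt> prompts in the example above.

```shell
# Hypothetical helper: print the ddb commands to type for a machine with
# the given number of CPUs, numbering the CPUs from 0.
ddb_trace_commands() {
	ncpu=$1
	cpu=0
	while [ "$cpu" -lt "$ncpu" ]; do
		echo "machine ddbcpu $cpu"
		echo "trace"
		cpu=$((cpu + 1))
	done
}

# Checklist for a two-processor machine.
ddb_trace_commands 2
```

<p>
The processor count has to be known beforehand, for example from an earlier
dmesg of the same machine.
The <tt>machine ddbcpu</tt> line can be skipped for the processor whose number
is already shown in the current prompt.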

<h3>How do I gather further information from a kernel crash?</h3><p>

A typical kernel crash on OpenBSD might look like this:

<blockquote><pre>
kernel: page fault trap, code=0
Stopped at    <b>_pf_route+0x263</b>:        mov     0x40(%edi),%edx
ddb>
</pre></blockquote>

This crash happened at offset <tt>0x263</tt> in the function <tt>_pf_route</tt>.
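
<p>
If the console output has been saved to a file, the function name and offset
can be pulled out of the <tt>Stopped at</tt> line mechanically.
This is a minimal sketch; the function name <tt>parse_stopped_at</tt> is made
up for illustration, and it assumes the line has the same layout as in the
example above.

```shell
# Hypothetical helper: read console output on stdin and print
# "function offset" for every "Stopped at" line found.
parse_stopped_at() {
	sed -n 's/^Stopped at[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)+\(0x[0-9a-f]*\):.*/\1 \2/p'
}

echo 'Stopped at    _pf_route+0x263:        mov     0x40(%edi),%edx' |
    parse_stopped_at
# prints: _pf_route 0x263
```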

<p>
The first command to run from the
<a href="http://man.openbsd.org/ddb">ddb(4)</a> prompt is <tt>trace</tt>:

<blockquote><pre>
ddb> <b>trace</b>
<b>_pf_route</b>(e28cb7e4,e28bc978,2,1fad,d0b8b120) at <b>_pf_route+0x263</b>
_pf_test(2,1f4ad,e28cb7e4,b4c1) at _pf_test+0x706
_pf_route(e28cbb00,e28bc978,2,d0a65440,d0b8b120) at _pf_route+0x207
_pf_test(2,d0a65440,e28cbb00,d023c282) at _pf_test+0x706
_ip_output(d0b6a200,0,0,0,0) at _ip_output+0xb67
_icmp_send(d0b6a200,0,1,a012) at _icmp_send+0x57
_icmp_reflect(d0b6a200,0,1,0,3) at _icmp_reflect+0x26b
_icmp_input(d0b6a200,14,0,0,d0b6a200) at _icmp_input+0x42c
_ipv4_input(d0b6a200,e289f140,d0a489e0,e289f140) at _ipv4_input+0x6eb
_ipintr(10,10,e289f140,e289f140,e28cbd38) at _ipintr+0x8d
Bad frame pointer: 0xe28cbcac
ddb>
</pre></blockquote>

This tells us which function calls led to the crash.
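
<p>
For long traces, a compact list of the functions involved can make the call
chain easier to read.
The sketch below extracts the function names from a saved trace, innermost
frame first.
The function name <tt>trace_functions</tt> is made up for illustration, and
the pattern only covers frames of the form shown above.

```shell
# Hypothetical helper: read a saved ddb trace on stdin and print the name
# of each function that appears as "name(args) at name+0xoffset".
# Lines that do not match, such as "--- interrupt ---" or
# "Bad frame pointer", are skipped.
trace_functions() {
	sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)(.*) at [A-Za-z_][A-Za-z0-9_]*+0x[0-9a-f]*$/\1/p'
}
```

<p>
Feeding it the trace above prints <tt>_pf_route</tt>, <tt>_pf_test</tt> and so
on, one per line, in the order they appear.
Always send the full, unmodified trace with the report; this listing is only a
reading aid.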

<p>
To find out the particular line of C code that caused the crash, you can
do the following:

<p>
Find the source file where the crashing function is defined.
In this example, that would be <tt>pf_route()</tt> in <tt>/sys/net/pf.c</tt>.
Use <a href="http://man.openbsd.org/objdump">objdump(1)</a> to get the
disassembly:

<blockquote><pre>
$ <b>cd /sys/arch/$(uname -m)/compile/GENERIC</b>
$ <b>objdump -dlr obj/pf.o &gt;/tmp/pf.dis</b>
</pre></blockquote>

In the output, grep for the function name:

<blockquote><pre>
$ <b>grep "&lt;_pf_route&gt;:" /tmp/pf.dis</b>
0000<b>7d88</b> &lt;_pf_route&gt;:
</pre></blockquote>

Take this first hex number <tt>7d88</tt> and add the offset <tt>0x263</tt> from
the <tt>Stopped at</tt> line:

<blockquote><pre>
$ <b>printf '%x\n' $((0x7d88 + 0x263))</b>
7feb
</pre></blockquote>

Scroll down to that address in the disassembly (the assembler instruction
should match the one quoted in the <tt>Stopped at</tt> line), then scroll back
up to the nearest C line number:

<blockquote><pre>
$ <b>more /tmp/pf.dis</b>
/sys/net/pf.c:<b>3872</b>
    7fe7:       0f b7 43 02             movzwl 0x2(%ebx),%eax
    <b>7feb</b>:       8b 57 40                mov    0x40(%edi),%edx
    7fee:       39 d0                   cmp    %edx,%eax
    7ff0:       0f 87 92 00 00 00       ja     8088 &lt;_pf_route+0x300&gt;
</pre></blockquote>

So, it's precisely line <tt>3872</tt> of <tt>pf.c</tt> that crashes:

<blockquote><pre>
$ <b>nl -ba /sys/net/pf.c | sed -n 3872p</b>
  3872		if ((u_int16_t)ip-&gt;ip_len &lt;= ifp-&gt;if_mtu) {
</pre></blockquote>
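
<p>
The manual lookup above can also be scripted.
The following is a sketch under a few assumptions, not a definitive tool: the
function name <tt>find_crash_line</tt> is made up for illustration, the
disassembly is expected to come from <tt>objdump -dl</tt> as shown above, and
the source markers are expected to be absolute paths of the form
<tt>/sys/net/pf.c:3872</tt>.

```shell
# Hypothetical helper: given an "objdump -dl" listing, a function symbol and
# the offset from the "Stopped at" line, print the nearest preceding
# source-line marker followed by the instruction at that address.
find_crash_line() {
	dis=$1 func=$2 offset=$3
	# Start address of the function, e.g. "00007d88 <_pf_route>:".
	base=$(sed -n "s/^\([0-9a-f]*\) <$func>:\$/\1/p" "$dis" | head -n 1)
	[ -n "$base" ] || { echo "function $func not found" >&2; return 1; }
	# Address of the faulting instruction, in hex without leading zeros.
	target=$(printf '%x' $((0x$base + $offset)))
	awk -v target="$target:" '
		/^\/.*:[0-9]+$/ { src = $0 }  # remember the last file:line marker
		$1 == target    { print src; print; exit }
	' "$dis"
}
```

<p>
With the files from this example,
<tt>find_crash_line /tmp/pf.dis _pf_route 0x263</tt> would print the marker
line and the instruction at <tt>7feb</tt>.
Check that the instruction printed matches the one in the
<tt>Stopped at</tt> line before trusting the result.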

The kernel that produced the crash output and the object file given to objdump
must be compiled from exactly the same source; otherwise the offsets won't
match.

<p>
Providing both the ddb trace output and the relevant objdump section with your
report is very helpful.

<p>
</body>
</html>