Tuesday, March 6, 2012

So I found a bug with SMP, NMIs and KDB...

If two (or more) unknown NMIs arrive on different CPUs, there is a good chance that both CPUs will wind up inside panic(). This is fine, unless you want to enter KDB: KDB can no longer round up all the CPUs, because some of them are stuck inside panic_smp_self_stop with an NMI latched. This is easy to replicate with QEMU. Boot with -smp 4 and send an untargeted, broadcast NMI using the monitor.

The solution is simple: add a new call, try_panic, to be used where some special behavior is desired if another CPU is already panicking. The unknown-NMI handler now calls try_panic instead of panic(). If panic() is already active in the system, the handler just returns. This lets KDB round up the CPUs.
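
To make the idea concrete, here is a rough userspace sketch of the "first one in panics, everyone else backs out" logic. It is not the code from the patch: the panic_in_progress flag, the unknown_nmi helper, and the bool return values are all assumptions made for illustration, and a real panic() never returns to its caller.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for "some CPU is already inside panic()". */
static atomic_flag panic_in_progress = ATOMIC_FLAG_INIT;

/*
 * Illustrative try_panic: returns true if the caller claimed the panic,
 * false if another CPU got there first. In this sketch a "panic" is just
 * a message; the real thing would not return.
 */
static bool try_panic(int cpu, const char *msg)
{
    /* test_and_set returns the previous value: true means already claimed. */
    if (atomic_flag_test_and_set(&panic_in_progress))
        return false;

    printf("cpu %d: panic: %s\n", cpu, msg);
    return true;
}

/*
 * Illustrative unknown-NMI path: if someone else is already panicking,
 * return from the handler so this CPU stays available for a roundup
 * instead of parking itself with an NMI latched.
 */
static bool unknown_nmi(int cpu)
{
    if (!try_panic(cpu, "unknown NMI"))
        return false;   /* backed out of the handler */
    return true;        /* this CPU owns the panic */
}
```

Calling unknown_nmi(0) and then unknown_nmi(1) models the broadcast case: only the first caller claims the panic, and the second simply returns.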

This affects linux-next.

https://lkml.org/lkml/2012/3/1/50
https://github.com/andreiw/andreiw-wip/blob/master/linux/3.2/kgdb/0007-x86-NMI-Be-smarter-about-invoking-panic-inside-NMI-h.patch
