Lenny's Xen Kernel 2.6.26 Causes DomU Freezes

Comments

*Having the same problem, thanks for reporting this.

Jc
#1 jc on 2009-03-08 15:16
*Thanks for your comments. I tried this installation today and ran into the problems you have described. Thanks for the info ;-)
#2 Teleyinex on 2009-05-07 14:07
*Any known fix yet? Stock Linux 2.6.30 kernels freeze as well...
#3 Robert Lacroix on 2009-07-10 20:37
*Any update on this problem?

I don't know a fix, but I have a workaround - or at least in my case it works:
Limit the Dom0 to one CPU ("(dom0-cpus 1)") and also every DomU ("vcpus = 1"). Until yesterday's kernel update I had an uptime of approx. 50 days and no problems.
But unfortunately there's another bug to consider if you want to use this workaround:
http://old.nabble.com/Domain-status-after-shutdown-command:----s---td15565767.html
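
For reference, the DomU half of the workaround might look like this in a guest's configuration file. This is only a sketch: the file name is hypothetical, and it relies on xm-style guest config files being evaluated as Python. The Dom0 half, "(dom0-cpus 1)", is not shown because it belongs in xend's own configuration (typically /etc/xen/xend-config.sxp), which uses a different syntax.

```python
# Hypothetical guest config, e.g. /etc/xen/example-domu.cfg (file name is an assumption).
# xm guest config files are evaluated as Python, so this is a plain assignment.

# Give the guest a single virtual CPU, as in the workaround described above.
vcpus = 1
```

A changed vcpus value presumably only takes effect once the guest has been shut down and created again.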

I just wanted to write this because the Etch kernel won't get security updates much longer.
#4 Aaron on 2010-02-01 10:34
*Thank you for your information. Unfortunately, I need multiple cores inside my DomUs. Did you try limiting only the Dom0? Is that even possible?

I guess you have tried the latest Xen+Lenny kernels? I cannot believe they still haven't fixed this.

The second issue - problems with DomU shutdown - on the other hand isn't as bad; I never shut down DomUs. ;-)
#4.1 Moritz on 2010-02-01 13:08
*Yes, at first I tried limiting the Dom0 only - but that didn't solve it for me.
I got the idea from this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=524571

I have the latest (official) packages for Debian Lenny installed - but I have been using the workaround for about 50 days now and have had no problems with limiting each DomU to one CPU, so I didn't test whether the bug is still there.

And I agree, the second issue isn't really a problem ;-)
#4.1.1 Aaron on 2010-02-01 14:24
*Many people are seeing this bug, me included.

Do you guys have a reliable/fast way to reproduce this?

What should be done is to get a stack trace for each guest vcpu to see what's happening (going wrong), so it can be debugged and fixed.

In dom0, like this:
/usr/lib/xen/bin/xenctx -s System.map-2.6.26-2-xen-686

Repeat that command for each guest vcpu.
The first vcpu is number 0, next is number 1 etc.

If you have a 64bit dom0, xenctx might be under /usr/lib64/.

The System.map file should be the actual correct System.map for the guest kernel.
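
To avoid typing the command once per vcpu, the loop could be scripted along these lines. This is a rough sketch, not a tested tool: the helper names are made up, and it assumes xenctx accepts the domain ID and the vcpu number as positional arguments after the options, which the command above does not show. The domain ID, the number of vcpus and the System.map path have to be supplied by you.

```python
import subprocess
import sys

# Path taken from the comment above; on a 64bit dom0 it might be under /usr/lib64/.
XENCTX = "/usr/lib/xen/bin/xenctx"


def xenctx_commands(domid, vcpus, system_map, xenctx=XENCTX):
    """Return one xenctx command line per guest vcpu (vcpus are numbered from 0).

    Assumption: domain ID and vcpu number are positional arguments.
    """
    return [[xenctx, "-s", system_map, str(domid), str(vcpu)]
            for vcpu in range(vcpus)]


def dump_all_vcpus(domid, vcpus, system_map):
    """Run xenctx for every vcpu and print each command as a small header."""
    for cmd in xenctx_commands(domid, vcpus, system_map):
        print("### " + " ".join(cmd))
        sys.stdout.flush()
        subprocess.call(cmd)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python dump_vcpus.py <domid> <vcpus> <System.map>
    dump_all_vcpus(int(sys.argv[1]), int(sys.argv[2]), sys.argv[3])
```

Redirecting the script's output to a file gives one trace per vcpu in a form that is easy to attach to a bug report.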
#5 Pasi Kärkkäinen on 2010-03-05 15:57
*I have the same problem. I've updated the Dom0 kernel to 2.6.26-2-xen-amd64, and the guests run the 2.6.26-2-686-bigmem kernel.

Guests (mainly the webserver) crash randomly 3-8 times a day, usually when there's a bit more load.

The only option is to destroy the running webserver and restart it (xm destroy & create).

If I attach a console to a crashed guest, it shows CPU soft lockups; once I was lucky enough to get a full dump:


[ 4264.683334] BUG: soft lockup - CPU#3 stuck for 83s! [swapper:0]
[ 4264.683334] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4264.683334]
[ 4264.683334] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4264.683334] EIP: 0061:[] EFLAGS: 00000246 CPU: 3
[ 4264.683334] EIP is at _stext+0x3a7/0x1000
[ 4264.683334] EAX: 00000000 EBX: 00000001 ECX: 00000000 EDX: 00175d28
[ 4264.683334] ESI: 00000003 EDI: 00000000 EBP: 00000000 ESP: ed049fa0
[ 4264.683334] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4264.683334] CR0: 8005003b CR2: b771a000 CR3: 29186000 CR4: 00000660
[ 4264.683334] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4264.683334] DR6: ffff0ff0 DR7: 00000400
[ 4264.683334] [] xen_safe_halt+0xd/0x17
[ 4264.683334] [] xen_idle+0x0/0x3a
[ 4264.683334] [] xen_idle+0x2b/0x3a
[ 4264.683334] [] cpu_idle+0xb0/0xd0
[ 4264.683334] =======================
[ 4264.683337] BUG: soft lockup - CPU#4 stuck for 83s! [swapper:0]
[ 4264.683337] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4264.683337]
[ 4264.683337] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4264.683337] EIP: 0061:[] EFLAGS: 00000246 CPU: 4
[ 4264.683337] EIP is at _stext+0x227/0x1000
[ 4264.683337] EAX: 00030002 EBX: 00000000 ECX: 00000000 EDX: 00000201
[ 4264.683337] ESI: 00000004 EDI: 00000000 EBP: 00000000 ESP: ed04bf88
[ 4264.683337] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4264.683337] CR0: 8005003b CR2: b6c9d000 CR3: 259e3000 CR4: 00000660
[ 4264.683337] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4264.683337] DR6: ffff0ff0 DR7: 00000400
[ 4264.683337] [] force_evtchn_callback+0xa/0xc
[ 4264.683337] [] tick_nohz_restart_sched_tick+0x12d/0x134
[ 4264.683337] [] xen_idle+0x0/0x3a
[ 4264.683337] [] cpu_idle+0xc6/0xd0
[ 4264.683337] =======================
[ 4595.526344] BUG: soft lockup - CPU#2 stuck for 71s! [swapper:0]
[ 4595.526344] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4595.526344]
[ 4595.526344] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4595.526344] EIP: 0061:[] EFLAGS: 00000246 CPU: 2
[ 4595.526344] EIP is at _stext+0x3a7/0x1000
[ 4595.526344] EAX: 00000000 EBX: 00000001 ECX: 00000000 EDX: 00175d28
[ 4595.526344] ESI: 00000002 EDI: 00000000 EBP: 00000000 ESP: ed047fa0
[ 4595.526344] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4595.526344] CR0: 8005003b CR2: b6c9d000 CR3: 261a9000 CR4: 00000660
[ 4595.526344] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4595.526344] DR6: ffff0ff0 DR7: 00000400
[ 4595.526344] [] xen_safe_halt+0xd/0x17
[ 4595.526344] [] xen_idle+0x0/0x3a
[ 4595.526344] [] xen_idle+0x2b/0x3a
[ 4595.526344] [] cpu_idle+0xb0/0xd0
[ 4595.526344] =======================
[ 4595.526469] BUG: soft lockup - CPU#3 stuck for 71s! [rsyslogd:2636]
[ 4595.526469] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4595.526469]
[ 4595.526469] Pid: 2636, comm: rsyslogd Not tainted (2.6.26-2-686-bigmem #1)
[ 4595.526469] EIP: 0061:[] EFLAGS: 00000246 CPU: 3
[ 4595.526469] EIP is at do_get_write_access+0x5c/0x331 [jbd]
[ 4595.526469] EAX: 00000000 EBX: ecdb0b00 ECX: 00000000 EDX: e444efa8
[ 4595.526469] ESI: ec8042c0 EDI: ecdb0b00 EBP: e444efa8 ESP: ea115ca0
[ 4595.526469] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[ 4595.526469] CR0: 8005003b CR2: b5eb8000 CR3: 2c041000 CR4: 00000660
[ 4595.526469] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4595.526469] DR6: ffff0ff0 DR7: 00000400
[ 4595.526469] [] ? __ext3_get_inode_loc+0xcf/0x26c [ext3]
[ 4595.526469] [] ? journal_get_write_access+0x18/0x26 [jbd]
[ 4595.526469] [] ? __ext3_journal_get_write_access+0x13/0x32 [ext3]
[ 4595.526469] [] ? ext3_reserve_inode_write+0x2d/0x5d [ext3]
[ 4595.526469] [] ? ext3_mark_inode_dirty+0x11/0x27 [ext3]
[ 4595.526469] [] ? ext3_dirty_inode+0x50/0x63 [ext3]
[ 4595.526469] [] ? __mark_inode_dirty+0x21/0x12a
[ 4595.526469] [] ? ext3_generic_write_end+0x5d/0x64 [ext3]
[ 4595.526469] [] ? ext3_ordered_write_end+0xb2/0x103 [ext3]
[ 4595.526469] [] ? generic_file_buffered_write+0x13c/0x553
[ 4595.526469] [] ? cap_inode_need_killpriv+0x25/0x35
[ 4595.526469] [] ? security_inode_need_killpriv+0xc/0xd
[ 4595.526469] [] ? remove_suid+0x15/0x44
[ 4595.526469] [] ? __generic_file_aio_write_nolock+0x468/0x4cb
[ 4595.526469] [] ? generic_file_aio_write+0x52/0xa9
[ 4595.526469] [] ? ext3_file_write+0x19/0x83 [ext3]
[ 4595.526469] [] ? do_sync_write+0xbf/0x100
[ 4595.526469] [] ? get_runstate_snapshot+0x3d/0x4b
[ 4595.526469] [] ? autoremove_wake_function+0x0/0x2d
[ 4595.526469] [] ? _spin_unlock_irqrestore+0xd/0x10
[ 4595.526469] [] ? hrtick_set+0x7a/0xd8
[ 4595.526469] [] ? schedule+0x63b/0x66d
[ 4595.526469] [] ? security_file_permission+0xc/0xd
[ 4595.526469] [] ? do_sync_write+0x0/0x100
[ 4595.526469] [] ? vfs_write+0x83/0x120
[ 4595.526469] [] ? sys_write+0x3c/0x63
[ 4595.526469] [] ? syscall_call+0x7/0xb
[ 4595.526469] =======================
[ 4595.526740] BUG: soft lockup - CPU#4 stuck for 71s! [ksoftirqd/4:16]
[ 4595.526752] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4595.526752]
[ 4595.526752] Pid: 16, comm: ksoftirqd/4 Not tainted (2.6.26-2-686-bigmem #1)
[ 4595.526752] EIP: 0061:[] EFLAGS: 00000246 CPU: 4
[ 4595.526752] EIP is at _stext+0x227/0x1000
[ 4595.526752] EAX: 00030002 EBX: 00000000 ECX: 00000000 EDX: ed03e460
[ 4595.526752] ESI: ec5bbc80 EDI: ed03e460 EBP: 00000000 ESP: ed093f60
[ 4595.526752] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4595.526752] CR0: 8005003b CR2: b6c9d000 CR3: 2638d000 CR4: 00000660
[ 4595.526752] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4595.526752] DR6: ffff0ff0 DR7: 00000400
[ 4595.526752] [] force_evtchn_callback+0xa/0xc
[ 4595.526752] [] finish_task_switch+0x25/0x99
[ 4595.526752] [] schedule+0x60a/0x66d
[ 4595.526752] [] __rcu_process_callbacks+0xd7/0x154
[ 4595.526752] [] __do_softirq+0x8b/0xd3
[ 4595.526752] [] ksoftirqd+0x0/0xa6
[ 4595.526752] [] ksoftirqd+0x23/0xa6
[ 4595.526752] [] kthread+0x38/0x5d
[ 4595.526752] [] kthread+0x0/0x5d
[ 4595.526752] [] kernel_thread_helper+0x7/0x10
[ 4595.526752] =======================
[ 4763.846926] BUG: soft lockup - CPU#1 stuck for 113s! [swapper:0]
[ 4763.846929] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4763.846929]
[ 4763.846929] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4763.846929] EIP: 0061:[] EFLAGS: 00000246 CPU: 1
[ 4763.846929] EIP is at _stext+0x3a7/0x1000
[ 4763.846929] EAX: 00000000 EBX: 00000001 ECX: 00000000 EDX: 00175d28
[ 4763.846929] ESI: 00000001 EDI: 00000000 EBP: 00000000 ESP: ed045fa0
[ 4763.846929] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4763.846929] CR0: 8005003b CR2: b6c9d000 CR3: 266a0000 CR4: 00000660
[ 4763.846929] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4763.846929] DR6: ffff0ff0 DR7: 00000400
[ 4763.846929] [] xen_safe_halt+0xd/0x17
[ 4763.846929] [] xen_idle+0x0/0x3a
[ 4763.846929] [] xen_idle+0x2b/0x3a
[ 4763.846929] [] cpu_idle+0xb0/0xd0
[ 4763.846929] =======================
[ 4763.851004] BUG: soft lockup - CPU#3 stuck for 113s! [rsyslogd:2636]
[ 4763.851004] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4763.851004]
[ 4763.851004] Pid: 2636, comm: rsyslogd Not tainted (2.6.26-2-686-bigmem #1)
[ 4763.851004] EIP: 0073:[] EFLAGS: 00000206 CPU: 3
[ 4763.851004] EIP is at 0x80675a2
[ 4763.851004] EAX: 09f0d548 EBX: 09f0d548 ECX: 09ef72d8 EDX: b753d03a
[ 4763.851004] ESI: 00000000 EDI: b753d03a EBP: b753d008 ESP: b753cfa0
[ 4763.851004] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 4763.851004] CR0: 8005003b CR2: b6c9d000 CR3: 2c041000 CR4: 00000660
[ 4763.851004] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4763.851004] DR6: ffff0ff0 DR7: 00000400
[ 4763.851004] =======================
[ 4763.851005] BUG: soft lockup - CPU#4 stuck for 113s! [swapper:0]
[ 4763.851005] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4763.851005]
[ 4763.851005] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4763.851005] EIP: 0061:[] EFLAGS: 00000206 CPU: 4
[ 4763.851005] EIP is at xen_irq_disable+0x6/0xb
[ 4763.851005] EAX: f5612100 EBX: c326e020 ECX: 00000200 EDX: c326e020
[ 4763.851005] ESI: c326e020 EDI: e9c15740 EBP: ed03e460 ESP: ed04bf3c
[ 4763.851005] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4763.851005] CR0: 8005003b CR2: b6c9d000 CR3: 29c90000 CR4: 00000660
[ 4763.851005] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4763.851005] DR6: ffff0ff0 DR7: 00000400
[ 4763.851005] [] ? _spin_lock_irqsave+0x16/0x2f
[ 4763.851005] [] ? hrtick_set+0x40/0xd8
[ 4763.851005] [] ? schedule+0x63b/0x66d
[ 4763.851005] [] ? ktime_get+0xd/0x21
[ 4763.851005] [] ? tick_nohz_stop_idle+0x19/0x45
[ 4763.851005] [] ? tick_nohz_restart_sched_tick+0x12d/0x134
[ 4763.851006] [] ? xen_idle+0x0/0x3a
[ 4763.851006] [] ? cpu_idle+0xcb/0xd0
[ 4763.851006] =======================
[ 4996.739489] BUG: soft lockup - CPU#1 stuck for 63s! [sendmail:2640]
[ 4996.739506] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4996.739506]
[ 4996.739506] Pid: 2640, comm: sendmail Not tainted (2.6.26-2-686-bigmem #1)
[ 4996.739506] EIP: 0061:[] EFLAGS: 00000293 CPU: 1
[ 4996.739506] EIP is at prio_tree_insert+0x150/0x1e9
[ 4996.739506] EAX: 0000013a EBX: ec8b87c0 ECX: ec8b87c0 EDX: 00000139
[ 4996.739506] ESI: ec8b881c EDI: ecdd8570 EBP: e95ea53c ESP: ea115e60
[ 4996.739506] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4996.739506] CR0: 8005003b CR2: b77b52a0 CR3: 2a136000 CR4: 00000660
[ 4996.739506] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4996.739506] DR6: ffff0ff0 DR7: 00000400
[ 4996.739506] [] ? vma_prio_tree_insert+0x17/0x2a
[ 4996.739506] [] ? vma_adjust+0x1b8/0x3ab
[ 4996.739506] [] ? kmem_cache_alloc+0x53/0x87
[ 4996.739506] [] ? split_vma+0xc3/0xd3
[ 4996.739506] [] ? do_munmap+0xb4/0x1ba
[ 4996.739506] [] ? mmap_region+0x6f/0x392
[ 4996.739506] [] ? arch_get_unmapped_area_topdown+0x0/0x120
[ 4996.739506] [] ? do_mmap_pgoff+0x25d/0x2b0
[ 4996.739506] [] ? sys_mmap_pgoff+0x9b/0xc2
[ 4996.739506] [] ? syscall_call+0x7/0xb
[ 4996.739506] =======================
[ 4996.743606] BUG: soft lockup - CPU#2 stuck for 63s! [rsyslogd:2641]
[ 4996.743606] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4996.743606]
[ 4996.743606] Pid: 2641, comm: rsyslogd Not tainted (2.6.26-2-686-bigmem #1)
[ 4996.743606] EIP: 0061:[] EFLAGS: 00000202 CPU: 2
[ 4996.743606] EIP is at __brelse+0x4/0x25
[ 4996.743606] EAX: ecdc46a0 EBX: ecdb0ac8 ECX: 00000000 EDX: 02e80000
[ 4996.743606] ESI: ea139c64 EDI: c3255760 EBP: 00000008 ESP: ea139c34
[ 4996.743606] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[ 4996.743606] CR0: 8005003b CR2: 0809f020 CR3: 2c041000 CR4: 00000660
[ 4996.743606] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4996.743606] DR6: ffff0ff0 DR7: 00000400
[ 4996.743606] [] ? __find_get_block+0x16e/0x180
[ 4996.743606] [] ? __getblk+0x27/0x24e
[ 4996.743606] [] ? do_get_write_access+0x2f8/0x331 [jbd]
[ 4996.743606] [] ? __ext3_get_inode_loc+0xcf/0x26c [ext3]
[ 4996.743606] [] ? ext3_reserve_inode_write+0x19/0x5d [ext3]
[ 4996.743606] [] ? ext3_mark_inode_dirty+0x11/0x27 [ext3]
[ 4996.743606] [] ? ext3_dirty_inode+0x50/0x63 [ext3]
[ 4996.743606] [] ? __mark_inode_dirty+0x21/0x12a
[ 4996.743606] [] ? ext3_generic_write_end+0x5d/0x64 [ext3]
[ 4996.743606] [] ? ext3_ordered_write_end+0xb2/0x103 [ext3]
[ 4996.743606] [] ? generic_file_buffered_write+0x13c/0x553
[ 4996.743606] [] ? cap_inode_need_killpriv+0x25/0x35
[ 4996.743606] [] ? security_inode_need_killpriv+0xc/0xd
[ 4996.743606] [] ? remove_suid+0x15/0x44
[ 4996.743606] [] ? __generic_file_aio_write_nolock+0x468/0x4cb
[ 4996.743606] [] ? generic_file_aio_write+0x52/0xa9
[ 4996.743606] [] ? ext3_file_write+0x19/0x83 [ext3]
[ 4996.743606] [] ? do_sync_write+0xbf/0x100
[ 4996.743606] [] ? get_runstate_snapshot+0x3d/0x4b
[ 4996.743606] [] ? autoremove_wake_function+0x0/0x2d
[ 4996.743606] [] ? _spin_unlock_irqrestore+0xd/0x10
[ 4996.743606] [] ? hrtick_set+0x7a/0xd8
[ 4996.743606] [] ? schedule+0x63b/0x66d
[ 4996.743606] [] ? security_file_permission+0xc/0xd
[ 4996.743606] [] ? do_sync_write+0x0/0x100
[ 4996.743606] [] ? vfs_write+0x83/0x120
[ 4996.743606] [] ? sys_write+0x3c/0x63
[ 4996.743606] [] ? syscall_call+0x7/0xb
[ 4996.743606] =======================
[ 4996.747880] BUG: soft lockup - CPU#3 stuck for 64s! [apache2:2042]
[ 4996.747880] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4996.747880]
[ 4996.747880] Pid: 2042, comm: apache2 Not tainted (2.6.26-2-686-bigmem #1)
[ 4996.747880] EIP: 0073:[] EFLAGS: 00000202 CPU: 3
[ 4996.747880] EIP is at 0xb68abca3
[ 4996.747880] EAX: 000005e7 EBX: b6b22dcc ECX: 000005e7 EDX: 00000ec4
[ 4996.747880] ESI: 0000000a EDI: 0000000a EBP: bfb51108 ESP: bfb51090
[ 4996.747880] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 4996.747880] CR0: 80050033 CR2: b6c9d000 CR3: 27dc7000 CR4: 00000660
[ 4996.747880] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4996.747880] DR6: ffff0ff0 DR7: 00000400
[ 4996.747880] =======================
[ 4996.751947] BUG: soft lockup - CPU#4 stuck for 64s! [swapper:0]
[ 4996.751947] Modules linked in: ipv6 loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[ 4996.751947]
[ 4996.751947] Pid: 0, comm: swapper Not tainted (2.6.26-2-686-bigmem #1)
[ 4996.751947] EIP: 0061:[] EFLAGS: 00000246 CPU: 4
[ 4996.751947] EIP is at _stext+0x227/0x1000
[ 4996.751947] EAX: 00030002 EBX: 00000000 ECX: 00000000 EDX: ed0824e0
[ 4996.751947] ESI: ed14ee40 EDI: ed0824e0 EBP: 00000001 ESP: ed04bf3c
[ 4996.751947] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 4996.751947] CR0: 8005003b CR2: b6c9d000 CR3: 28982000 CR4: 00000660
[ 4996.751947] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 4996.751947] DR6: ffff0ff0 DR7: 00000400
[ 4996.751947] [] ? force_evtchn_callback+0xa/0xc
[ 4996.751947] [] ? finish_task_switch+0x25/0x99
[ 4996.751947] [] ? schedule+0x60a/0x66d
[ 4996.751947] [] ? ktime_get+0xd/0x21
[ 4996.751947] [] ? tick_nohz_stop_idle+0x19/0x45
[ 4996.751947] [] ? tick_nohz_restart_sched_tick+0x12d/0x134
[ 4996.751947] [] ? xen_idle+0x0/0x3a
[ 4996.751947] [] ? cpu_idle+0xcb/0xd0
[ 4996.751947] =======================

This is kinda annoying in a production environment; I had to write a script which monitors the state of the DomU and destroys & re-creates it automatically when a crash occurs.
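
A watchdog of that kind could be sketched roughly as follows. This is not the script mentioned above, just an illustration: it assumes that a frozen guest stops answering ping, and the guest name, config path and host name in the usage line are placeholders. Note that destroying the guest also throws away the state that would be needed to collect stack traces from it.

```python
import os
import subprocess
import time


def restart_commands(name, cfgfile):
    """The 'xm destroy & create' sequence for one guest."""
    return [["xm", "destroy", name], ["xm", "create", cfgfile]]


def answers_ping(host):
    """Assumption: a frozen guest no longer answers a single ping within 5 seconds."""
    with open(os.devnull, "w") as devnull:
        return subprocess.call(["ping", "-c", "1", "-W", "5", host],
                               stdout=devnull, stderr=devnull) == 0


def check_once(name, cfgfile, host, alive=answers_ping, run=subprocess.call):
    """Restart the guest if it looks dead; return True if a restart was issued."""
    if alive(host):
        return False
    for cmd in restart_commands(name, cfgfile):
        run(cmd)
    return True


def watch(name, cfgfile, host, interval=60):
    """Poll forever; sleep longer after a restart so the guest has time to boot."""
    while True:
        restarted = check_once(name, cfgfile, host)
        time.sleep(interval * 5 if restarted else interval)
```

It would be started in Dom0 with something like watch("webserver", "/etc/xen/webserver.cfg", "webserver.example.org").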
#6 Code78 on 2010-09-10 07:30
*So you're able to reproduce it easily. Good.

Make sure you have:

on_crash="preserve"

set in the domU's cfgfile under /etc/xen/.

Then when the domU crashes, run this command for each domU vcpu:

/usr/lib/xen/bin/xenctx -s System.map-domUkernelversion

If you're running 64bit dom0, then xenctx might be under "/usr/lib64/".

You need to use the System.map file for the exact kernel version running in the domU.

Please post those stack traces somewhere, for each vcpu.

That should help debugging the problem.

I can forward that stuff to the xen-devel mailing list if you don't want to do it yourself (it would be easier if you did, though).

Thanks!
#6.1 Pasi Karkkainen on 2010-09-10 15:49


Creative Commons License - Some Rights Reserved