ChangeSet 1.1187, 2003/05/20 12:09:54-07:00, david-b@pacbell.net [PATCH] USB: SMP ehci-q.c 1010 BUG() Stefano Barbato wrote: > Dual PIII > kernel 2.4.21-rc2 (w/ SMP) (2.5.69 below) > ... > > I put a few printk before the BUG() and I found that the offending if() is > this: > if(qh->qh_state != QH_STATE_LINKED > && qh->qh_state != QH_STATE_UNLINK_WAIT) > > because qh_state were QH_STATE_COMPLETING. I got a similar SMP report recently, but without info about which clause was failing -- which is a key clue, thanks!! The COMPLETING state is used only while a QH is being scanned for completed TDs. (Think CPU-0 irq handler.) Looking at the handful of places that call the routine reporting the BUG(), a couple seem like they could make trouble with multiple CPUs in the driver. diff -Nru a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c --- a/drivers/usb/host/ehci-q.c Tue May 20 17:24:56 2003 +++ b/drivers/usb/host/ehci-q.c Tue May 20 17:24:56 2003 @@ -918,11 +918,11 @@ qh->qh_state = QH_STATE_IDLE; qh->qh_next.qh = 0; qh_put (ehci, qh); // refcount from reclaim - ehci->reclaim = 0; - ehci->reclaim_ready = 0; /* other unlink(s) may be pending (in QH_STATE_UNLINK_WAIT) */ next = qh->reclaim; + ehci->reclaim = next; + ehci->reclaim_ready = 0; qh->reclaim = 0; qh_completions (ehci, qh, regs); @@ -941,8 +941,10 @@ timer_action (ehci, TIMER_ASYNC_OFF); } - if (next) + if (next) { + ehci->reclaim = 0; start_unlink_async (ehci, next); + } } /* makes sure the async qh will become idle */ @@ -1046,7 +1048,8 @@ if (list_empty (&qh->qtd_list)) { if (qh->stamp == ehci->stamp) action = TIMER_ASYNC_SHRINK; - else if (!ehci->reclaim) + else if (!ehci->reclaim + && qh->qh_state == QH_STATE_LINKED) start_unlink_async (ehci, qh); }