Reputation: 396
We are using a custom board running version 3.2 of the kernel, but we've come across some issues when testing the robustness of the MMC layer.
First of all, our MMC slot does not have a card detection pin. The issue in question consists of the following:
cat foo.txt
)Now, I've tracked the issue to the following code section, in drivers/mmc/card/queue.c
:
static int mmc_queue_thread(void *d)
{
struct mmc_queue *mq = d;
struct request_queue *q = mq->queue;
current->flags |= PF_MEMALLOC;
down(&mq->thread_sem);
do {
struct request *req = NULL;
struct mmc_queue_req *tmp;
spin_lock_irq(q->queue_lock);
set_current_state(TASK_INTERRUPTIBLE);
req = blk_fetch_request(q);
mq->mqrq_cur->req = req;
spin_unlock_irq(q->queue_lock);
if (req || mq->mqrq_prev->req) {
set_current_state(TASK_RUNNING);
mq->issue_fn(mq, req);
} else {
if (kthread_should_stop()) {
set_current_state(TASK_RUNNING);
break;
}
up(&mq->thread_sem);
schedule();
down(&mq->thread_sem);
}
/* Current request becomes previous request and vice versa. */
mq->mqrq_prev->brq.mrq.data = NULL;
mq->mqrq_prev->req = NULL;
tmp = mq->mqrq_prev;
mq->mqrq_prev = mq->mqrq_cur;
mq->mqrq_cur = tmp;
} while (1);
up(&mq->thread_sem);
return 0;
}
Investigating this code, I've found that it hangs inside the mq->issue_fn(mq, req)
call. This function prepares and issues the appropriate commands to fufill the request passed into it, and it is aware of any errors that happens when it can't communicate with the card. The error is handled in an awkward (in my opinion) manner, and does not 'bubble up' to the mmc_queue_thread
. However, I've checked my version's code against that of the most recent kernel release (4.9), and I did not see any difference in the treatment of such errors, apart from a better separation of each error case (the treatment is very similar).
I suspect that the issue is caused by the inability ofthe upper layers to stop making new requests to read from the MMC card.
What I've Tried:
ret = mq->issue_fn(mq, req)
.kthread_stop
, calling mmc_queue_suspend
, __blk_end_request
, and others. With these, the most I've been able to accomplish was keep the thread in a "harmless" state, where it still existed, but did not consume any resources. However, the user space program that triggered the call does not return, locking up in an uninterruptible state.My questions:
Update: I've found out that you can tell the driver to use polling to detect card insertion/removal. It can be done by adding the MMC_CAP_NEEDS_POLL
flag to the mmc's .caps
on driver initialization (on newer kernels, you can use the broken-cd
property on your DT). However, the problem still happens after this modification.
Upvotes: 1
Views: 402
Reputation: 396
Figured out the solution!
The issue was, as I suspected, that the block requests kept being issued even after the card was removed. The error was fixed in commit a8ad82cc1b22d04916d9cdb1dc75052e80ac803c
from Texas' kernel repository:
commit a8ad82cc1b22d04916d9cdb1dc75052e80ac803c
Author: Sujit Reddy Thumma <[email protected]>
Date: Thu Dec 8 14:05:50 2011 +0530
mmc: card: Kill block requests if card is removed
Kill block requests when the host realizes that the card is
removed from the slot and is sure that subsequent requests
are bound to fail. Do this silently so that the block
layer doesn't output unnecessary error messages.
Signed-off-by: Sujit Reddy Thumma <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Signed-off-by: Chris Ball <[email protected]>
The key in the commit is that a return BLKPREP_KILL
was added to the mmc_prep_request()
, besides some minor tweaks that added some 'granularity' to the error treatment, adding a specific code for the cases where the card was removed.
Upvotes: 1