dczulak

Reputation: 135

Why is mmap() slower than write() with copy_from_user()?

I need to transfer large blocks of data (~6 MB each) from user space to my driver. In the driver, I allocate two 3 MB chunks per block using pci_alloc_consistent(). I then map each block (i.e., both chunks) into a single VMA using vm_insert_page(), so user space can read and write the block directly after mmap()ing it. This works, but the performance is not acceptable.

I also implemented a second way of reading and writing the memory allocated by pci_alloc_consistent(). User space calls write(), and the driver uses copy_from_user() to copy the data into each chunk of the block. Reads go the opposite way.
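For reference, the user-space side of the two approaches can be sketched roughly as below. This is only an illustration, not the actual test code: the function names (send_block_mmap, send_block_write, time_send) are made up for this sketch, fd is assumed to be an already opened descriptor positioned at offset 0, and the mapping is assumed to start at offset 0 of the device.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Approach 1 (hypothetical helper): map the block and copy straight into it.
 * Returns 0 on success, -1 on failure. */
static int send_block_mmap(int fd, const void *src, size_t len)
{
    assert(src != NULL && len > 0);
    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    memcpy(dst, src, len);          /* user space writes the block directly */
    return munmap(dst, len);
}

/* Approach 2 (hypothetical helper): hand the data to the driver with write().
 * Loops because write() may accept fewer bytes than requested. */
static int send_block_write(int fd, const void *src, size_t len)
{
    assert(src != NULL && len > 0);
    const char *p = src;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (n <= 0) {
            perror("write");
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

typedef int (*send_fn)(int fd, const void *src, size_t len);

/* Times one transfer with a monotonic clock.
 * Returns elapsed seconds, or -1.0 if the transfer failed. */
static double time_send(send_fn fn, int fd, const void *src, size_t len)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = fn(fd, src, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (rc != 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_send(send_block_mmap, fd, buf, len) and time_send(send_block_write, fd, buf, len) on the same buffer gives a rough wall-clock comparison of the two paths; it does not measure CPU usage.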

I found that the first approach (mmap()) was at least 2-3 times slower and used ~40% more CPU. I expected the additional buffer copy in the second approach to make it the slower one, but that was not the case.

I ran these tests on x86 64-bit platforms with 2.6.* and 3.* kernels.

Do the above results make sense? If yes, can someone please provide some background on what is taking place?

Thanks.

Upvotes: 3

Views: 1147

Answers (1)

Raghu

Reputation: 501

Caching is probably disabled on the mapping. Did you ioremap_cache() the chunks that you allocated and inserted with vm_insert_page()? I've come across this kind of problem on x86/x86_64, and it has to do with the PAT (Page Attribute Table). You need to ioremap_cache() the physical pages to set the memory type to cacheable and then call vm_insert_page(). That should fix your performance issue.

Upvotes: 3
