Reputation: 379
I used the Arria V GX FPGA Starter Kit connected to a computer via PCI Express (PCIE). In the Kit, I implemented my Direct Memory Access (DMA) Read/Write using the pipeline transfer. The DMA read the data from the PC's memory then write to another region of the PC's memory via the PCIE.
The IP I used is Avalon-MM Arria V Hard IP for PCIE with the configuration: Gen1 x8, 32-bit Avalon-MM address width. The software on the PC is Visual Studio programming by C++ and using the 12.0.0 Jungo Windriver.
The project works fine but the transferring speed, especially the reading speed, is too slow. I had done a lot of projects with this DMA, so I don't think the problem is because of my DMA. I have checked the Altera SignalTap of the project, and find out that: + (Figure 1) There are always over 100 clocks since the DMA began to read (the first time 'read' signal is asserted) to the first returning data (the first time 'readdatavalid' signal is asserted) + (Figure 2) After that, there are always about 20 to 50 standby clocks between two returning data, which is too slow.
My design needs to read the data from PC: (1) very little data (about 5 to 10 data for each time); (2) random access (that's why I didn't use burst transfer). But every time a new transferring session started, over 100 clocks are wasted at the beginning and I don't know why. To conclude, Avalon memory-mapped read pipeline costs about 200 clocks just to read 5 data from the PC's memory via PCIE.
My questions are: (1) Why there are so many clocks being wasted in the read pipeline transfer via PCIE? (2) Is there anything else I can do to speed up the transfer rate?
Thank you very much
Upvotes: 0
Views: 907
Reputation: 2370
PCIE was really designed to maximize I/O bandwidth but not to minimize latency. In my designs, I have often seen such a long latency for the first read response from system memory and I do not have any suggestions on how to reduce it.
To get performance from PCIE you have to design around the latency, using burst transfers and pipelining requests so that you can keep many requests in flight in order to maximize bandwidth.
Upvotes: 1