Reputation: 325
The DirectX 9 application/game I inherited uses dynamic vertex buffers. Each frame, it:
My question is, is the part with the temporary buffer necessary? Is there a reason why I shouldn't write vertex data directly into the vertex buffer?
I haven't found any evidence of this practice in the official documentation, and I don't trust the previous programmer enough.
Upvotes: 4
Views: 1787
Reputation: 146910
There is no need for the temporary buffer. The pointer you get back from Lock is, in essence, already a temporary buffer. The driver can only realistically begin any meaningful work on it once you unlock the buffer.
If you use D3DLOCK_DISCARD, then the driver has no obligation to honour reads with any sensible data. So the implementation can perfectly well return malloc(size).
If you don't use D3DLOCK_DISCARD, then, well, that's a separate question, really.
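As a rough illustration of writing straight into the locked memory, here is a minimal sketch. The Vertex layout and the fillVertices and updateFrame names are made up for the example, and a std::vector stands in for the memory the driver hands out, so that the snippet compiles without the DirectX SDK. In the real application the destination pointer would come from IDirect3DVertexBuffer9::Lock called with D3DLOCK_DISCARD, followed by Unlock once the writes are done.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout; the real one depends on the game's FVF
// or vertex declaration.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Generates `count` vertices directly into `dst`, front to back.
// In the D3D9 application, `dst` would be the pointer returned by
// IDirect3DVertexBuffer9::Lock(..., D3DLOCK_DISCARD).
void fillVertices(Vertex* dst, std::size_t count) {
    assert(dst != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        Vertex v;
        v.x = static_cast<float>(i);
        v.y = 0.0f;
        v.z = 0.0f;
        v.u = static_cast<float>(i) / static_cast<float>(count);
        v.v = 0.0f;
        dst[i] = v;  // write only; nothing is read back from dst
    }
}

// Stand-in for the per-frame update. The vector plays the role of the
// locked region (an assumption made purely so the sketch is runnable).
std::vector<Vertex> updateFrame(std::size_t count) {
    std::vector<Vertex> locked(count);
    fillVertices(locked.data(), count);  // no intermediate copy
    return locked;
}
```

The generation code only ever sees a destination pointer, so it does not care whether that pointer refers to a temporary array or to the locked buffer.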
Upvotes: 1
Reputation: 342
The temporary buffer is not required, with a caveat.
DirectX dynamic vertex buffers are optimised for read access by the GPU and write access by the CPU. The write optimisation is called write combining, and it uses a different mechanism from normal memory caching: the CPU batches writes together, provided that you write to the memory in 4/8/16-byte chunks and in order.
Note that it's up to the driver to decide what kind of memory you get back from a lock on a dynamic buffer. It may not be write combined, but treating it as such is the best bet.
Write combined memory is not cached, so reading from it is a performance disaster.
That might explain why the game you inherited uses a temporary buffer: perhaps it reads from the buffer as well as writing to it, or makes no effort to write components in order (positions first, then texture coordinates, for example).
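To make the caveat concrete, here is a minimal sketch of a write pattern that should suit write-combined memory: each vertex is assembled in a local variable, stored once, in ascending address order, and nothing is ever read back through the destination pointer. The Vertex and Particle types and the writeSequential function are hypothetical, and the exact store instructions are still up to the compiler, so treat this as a pattern rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vertex layout for illustration only.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Hypothetical source data that lives in ordinary cached memory.
struct Particle {
    float x, y, z;
};

// Writes one vertex per particle into `dst`, strictly front to back.
// Returns the largest x coordinate written, tracked in a local so the
// function never has to read from `dst`.
float writeSequential(Vertex* dst, const Particle* src,
                      std::size_t count, float offsetX) {
    assert(dst != nullptr && src != nullptr);
    float maxX = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        Vertex v;                    // assemble in cached memory
        v.x = src[i].x + offsetX;
        v.y = src[i].y;
        v.z = src[i].z;
        v.u = 0.0f;
        v.v = 0.0f;
        dst[i] = v;                  // store the whole vertex, in order
        if (i == 0 || v.x > maxX) {  // uses the local copy, not dst[i].x
            maxX = v.x;
        }
    }
    return maxX;
}
```

The pattern to avoid is the opposite: something like `dst[i].x += offsetX`, which reads from the locked memory, or filling in all positions first and all texture coordinates in a second pass, which scatters the writes.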
Upvotes: 2
Reputation: 120711
Disclaimer: I don't know how DirectX vertex buffers work, so I might be wrong here.
It would probably be slower: a vertex buffer is allocated to optimize access from the GPU, i.e. preferably somewhere in the GPU's own memory. That means accessing it directly from the CPU is much slower than accessing ordinary RAM. Copying a whole array, on the other hand, can be done relatively fast, so it is better to prepare such an array in main memory and copy it to the vertex buffer in one go.
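For comparison, the staged approach described here might look roughly like the sketch below. The StagingBuffer class and the Vertex layout are invented for the example, and a plain pointer stands in for the locked vertex buffer memory, so this shows only the shape of the idea and not how any particular game implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical vertex layout for illustration only.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Reusable staging area in ordinary RAM. Keeping it alive between
// frames avoids reallocating the array every frame.
class StagingBuffer {
public:
    // Starts a new frame; clear() keeps the allocated capacity.
    void begin() { vertices_.clear(); }

    void push(const Vertex& v) { vertices_.push_back(v); }

    std::size_t sizeInBytes() const {
        return vertices_.size() * sizeof(Vertex);
    }

    // Copies all staged vertices to `dst` in a single sequential pass.
    // Returns the number of bytes copied, or 0 if there is nothing to
    // copy or the destination is too small.
    std::size_t flushTo(void* dst, std::size_t dstCapacityBytes) const {
        const std::size_t bytes = sizeInBytes();
        if (bytes == 0 || bytes > dstCapacityBytes) {
            return 0;
        }
        assert(dst != nullptr);
        std::memcpy(dst, vertices_.data(), bytes);
        return bytes;
    }

private:
    std::vector<Vertex> vertices_;
};
```

The cost is one extra pass over the data per frame; what it buys is freedom to build the vertices in any order, and to read them back, before the copy.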
Upvotes: 2