My application uses DPDK-stable-20.11.10. A segmentation fault happened this morning when I tried to run my application as usual. I then used gdb to trace the problem and found that the segmentation fault occurred due to an illegal memory access in DPDK. The problem happened in the function rxq_cq_decompress_v, defined in drivers\net\mlx5\mlx5_rxtx_vec_neon.h (this function is implemented differently across architectures; my server is aarch64). Here is the problematic code:
static inline uint16_t
rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
		    struct rte_mbuf **elts)
{
	......
	/*
	 * A. load mCQEs into a 128bit register.
	 * B. store rearm data to mbuf.
	 * C. combine data from mCQEs with rx_descriptor_fields1.
	 * D. store rx_descriptor_fields1.
	 * E. store flow tag (rte_flow mark).
	 */
	for (pos = 0; pos < mcqe_n; ) {
		uint8_t *p = (void *)&mcq[pos % 8];
		uint8_t *e0 = (void *)&elts[pos]->rearm_data;
		uint8_t *e1 = (void *)&elts[pos + 1]->rearm_data;
		uint8_t *e2 = (void *)&elts[pos + 2]->rearm_data;
		uint8_t *e3 = (void *)&elts[pos + 3]->rearm_data;
		uint16x4_t byte_cnt;
#ifdef MLX5_PMD_SOFT_COUNTERS
		uint16x4_t invalid_mask =
			vcreate_u16(mcqe_n - pos < MLX5_VPMD_DESCS_PER_LOOP ?
				    -1UL << ((mcqe_n - pos) *
					     sizeof(uint16_t) * 8) : 0);
#endif
		......
The overflow happened on the line uint8_t *e3 = (void *)&elts[pos + 3]->rearm_data; when DPDK tried to access elts[pos + 3].
In my subsequent attempts to reproduce the error, I discovered that it could occur at elts[pos + 3], at elts[pos + 2], or at some other access to the elts array around the lines above.
I read the related code and found that elts seems to be a ring buffer. In the rxq_burst_v function, which calls rxq_cq_decompress_v, a mask e_mask is used to access this array:
static inline uint16_t
rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts,
	    uint16_t pkts_n, uint64_t *err, bool *no_cq)
{
	......
	const uint16_t e_n = 1 << rxq->elts_n;
	const uint16_t e_mask = e_n - 1;
	......
	if (rcvd_pkt > 0) {
		......
		rxq_copy_mbuf_v(&(*rxq->elts)[rxq->rq_pi & e_mask],
				pkts, rcvd_pkt);
		......
	}
	elts_idx = rxq->rq_pi & e_mask;
	elts = &(*rxq->elts)[elts_idx];
I am sorry that I cannot provide any code or screenshots from my application because of confidentiality requirements, but the problem seems to happen inside DPDK. I am confused that:
e_mask?
I appreciate your help! :)