xsk: introduce pre-allocated memory per xsk CQ #6504
base: bpf-next_base
Upstream branch: f785a31
This is a preparatory patch. What it introduces will be used to store the addr(s) of descriptors so that each skb reaching the end of its life can publish the corresponding addr(s) to its completion queue, where userspace can read them.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Before commit 30f241f ("xsk: Fix immature cq descriptor production"), there was an issue[1] that caused descriptors to be published incorrectly under a race condition. That commit fixes the issue but adds more memory operations in the xmit hot path and in interrupt context, which can hurt performance.

Building on the existing infrastructure, this patch proposes a different fix: a pre-allocated memory area, the local completion queue, so that memory functions no longer have to be called frequently. The benefit comes from replacing xsk_tx_generic_cache with the local cq.

The core logic is as follows:
1. Allocate a new local completion queue when setting up the real queue.
2. Write the descriptors into the local cq in the xmit path, and record the prod as @start_pos, which marks where the skb starts in this queue, so that the skb can later copy its desc addr(s) from the local cq to the cq addrs in the destruction phase.
3. Initialize the upper 24 bits of destructor_arg to store @start_pos in xsk_skb_init_misc().
4. Initialize the lower 8 bits of destructor_arg to store how many descriptors the skb owns in xsk_inc_num_desc().
5. Write the desc addr(s), starting at @start_pos in the local cq, one by one into the real cq in xsk_destruct_skb(). Then sync the global state of the cq as before.

The format of destructor_arg is designed as:

     ------------------------ --------
    |       start_pos        |  num   |
     ------------------------ --------

The upper 24 bits are enough to locate the temporary descriptors, and the lower 8 bits are enough to hold the number of descriptors that one skb owns.

[1]: https://lore.kernel.org/all/20250530095957.43248-1-e.kubanski@partner.samsung.com/

Signed-off-by: Jason Xing <kernelxing@tencent.com>
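The steps above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the code from the patch: struct local_cq, pack_arg(), lcq_begin(), lcq_add() and lcq_publish() are hypothetical names, the packed value is held in a uint32_t rather than in the kernel's destructor_arg pointer, and @start_pos is assumed to be the producer index masked to the ring size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BITS 8u
#define NUM_MASK ((1u << NUM_BITS) - 1u)   /* lower 8 bits: descriptor count */
#define POS_MASK ((1u << 24) - 1u)         /* upper 24 bits: start position  */

/* Hypothetical model of the pre-allocated local completion queue. */
struct local_cq {
	uint32_t prod;       /* free-running producer index */
	uint32_t ring_mask;  /* number of entries - 1 (entries is a power of two) */
	uint64_t *desc;      /* pre-allocated array of descriptor addrs */
};

/* Layout: | start_pos (24 bits) | num (8 bits) | */
static inline uint32_t pack_arg(uint32_t start_pos, uint32_t num)
{
	return ((start_pos & POS_MASK) << NUM_BITS) | (num & NUM_MASK);
}

static inline uint32_t arg_start_pos(uint32_t arg) { return arg >> NUM_BITS; }
static inline uint32_t arg_num(uint32_t arg)       { return arg & NUM_MASK; }

/* Steps 2-3: remember where this skb starts in the local cq, with num = 0. */
static inline uint32_t lcq_begin(const struct local_cq *lcq)
{
	return pack_arg(lcq->prod & lcq->ring_mask, 0);
}

/* Steps 2 and 4: stash one desc addr in the local cq and bump the count. */
static inline uint32_t lcq_add(struct local_cq *lcq, uint32_t arg, uint64_t addr)
{
	assert(arg_num(arg) < NUM_MASK);  /* at most 255 descriptors per skb */
	lcq->desc[lcq->prod++ & lcq->ring_mask] = addr;
	return arg + 1;                   /* num lives in the low bits */
}

/*
 * Step 5: copy the skb's addrs, starting at start_pos and wrapping around
 * the ring, into 'out', which stands in for the real completion queue.
 * Returns the number of addrs copied, or 0 if 'out' is too small.
 */
static inline size_t lcq_publish(const struct local_cq *lcq, uint32_t arg,
				 uint64_t *out, size_t out_cap)
{
	uint32_t start = arg_start_pos(arg);
	uint32_t num = arg_num(arg);

	if (num > out_cap)
		return 0;
	for (uint32_t i = 0; i < num; i++)
		out[i] = lcq->desc[(start + i) & lcq->ring_mask];
	return num;
}
```

In this model the xmit path would call lcq_begin() once per skb and lcq_add() once per descriptor, and the destructor would call lcq_publish() with the packed value. Because both fields share one word, no per-skb allocation is needed to remember which addrs the skb owns.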
Pull request for series with
subject: xsk: introduce pre-allocated memory per xsk CQ
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1033607