forked from pytorch/FBGEMM
Pull requests: ROCm/FBGEMM
Pull requests list
Implement asynchronous LDS loads for MI350
enhancement
#138
opened Dec 19, 2025 by
avbokovoy
Optimizations for index_select_scalar_cumsum_kernel
#137
opened Dec 16, 2025 by
amd-wsung102
1 task
Optimize group_index_select_or_add_2d_kernel by adding a separate codepath for small embedding dimensions
#135
opened Dec 16, 2025 by
aryaman-gupta
Fixes bug in one specialized HIP instantiation of the warp-per-row kernel
#134
opened Dec 5, 2025 by
aryaman-gupta
tuned grid size by reducing num_warps_per_threadblock to 4
#117
opened Aug 26, 2025 by
kudomcho
1 task
refactored the host and kernel functions to perform condition on packedMode_L that does bag packing L when num_packed_bag_L>1
#98
opened Mar 17, 2025 by
kudomcho