Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][test] Add model-derived PyTorch attention backend test suite
#15536 opened Jun 23, 2026 by yuxianq Collaborator
[None][chore] Clean deprecated CppMambaCacheManager
#15533 opened Jun 23, 2026 by bo-nv Collaborator
1 task done
[#14874][feat] AutoDeploy : Perf optimization for gpt-oss-120b for low conc (labels: AutoDeploy <NV> AutoDeploy Backend)
#15531 opened Jun 23, 2026 by taylor-yb-lee Collaborator
1 task done
[None][chore] Autodeploy disable the pipeline cache by default
#15530 opened Jun 22, 2026 by nvchenghaoz Collaborator
1 task
[None][CI] Waive flaky test_vbench_dimension_score_wan (nvbugs/6357628)
#15529 opened Jun 22, 2026 by chang-l Collaborator
[None][feat] Support FP8 base weights for MoE LoRA
#15528 opened Jun 22, 2026 by brb-nv Collaborator Draft
1 task
[None][feat] Add prefix-aware scheduling config flag to support opt-out
#15526 opened Jun 22, 2026 by SimengLiu-nv Collaborator
1 task done
[TRTLLM-12557][feat] WideEP FT: add AlltoAll watchdog (1a.4)
#15524 opened Jun 22, 2026 by chienchunhung Collaborator
[None][fix] Preserve Kimi 2.5 tool call IDs
#15523 opened Jun 22, 2026 by hvagadia Contributor
[TRTLLM-11608][feat] Chunked KV cache transfer with early block release
#15519 opened Jun 22, 2026 by athena-nv Collaborator
1 task done
[None][feat] DSA indexer Top-K cross-layer reuse (IndexCache)
#15513 opened Jun 22, 2026 by murphymatt
4 tasks done