NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

825 Open 10,662 Closed

Pull requests list

[None][test] Add model-derived PyTorch attention backend test suite

#15536 opened Jun 23, 2026 by yuxianq Collaborator

[https://nvbugs/6150288][fix] Use persistent per-stream workspace in cublas_mm for CUDA-graph safety

#15534 opened Jun 23, 2026 by pamelap-nvidia Collaborator

2 of 4 tasks

[None][chore] Clean deprecated CppMambaCacheManager

#15533 opened Jun 23, 2026 by bo-nv Collaborator

1 task done

[None][feat] Qwen-Image: NVFP4 SVDQuant (NVFP4 residual + rank-r BF16 LoRA)

#15532 opened Jun 23, 2026 by jingyu-ml

[#14874][feat] AutoDeploy : Perf optimization for gpt-oss-120b for low conc AutoDeploy

<NV> AutoDeploy Backend

#15531 opened Jun 23, 2026 by taylor-yb-lee Collaborator

1 task done

[None][chore] Autodeploy disable the pipeline cache by default

#15530 opened Jun 22, 2026 by nvchenghaoz Collaborator

1 task

[None][CI] Waive flaky test_vbench_dimension_score_wan (nvbugs/6357628)

#15529 opened Jun 22, 2026 by chang-l Collaborator

[None][feat] Support FP8 base weights for MoE LoRA

#15528 opened Jun 22, 2026 by brb-nv Collaborator • Draft

1 task

[https://nvbugs/6276842][test] Loosen rtol/atol on encoder CUDA graph logits parity check

#15527 opened Jun 22, 2026 by tingyangk Collaborator

1 task done

[None][feat] Add prefix-aware scheduling config flag to support opt-out

#15526 opened Jun 22, 2026 by SimengLiu-nv Collaborator

1 task done

[TRTLLM-13543][feat] WideEP FT: add EPLB mask-only reconfigure (1b.1)

#15525 opened Jun 22, 2026 by chienchunhung Collaborator

[TRTLLM-12557][feat] WideEP FT: add AlltoAll watchdog (1a.4)

#15524 opened Jun 22, 2026 by chienchunhung Collaborator

[None][fix] Preserve Kimi 2.5 tool call IDs

#15523 opened Jun 22, 2026 by hvagadia Contributor

[#14882][fix] Make kv_cache_aware router robust to a missing KV-event stream

#15522 opened Jun 22, 2026 by GodlyDonuts

[doc] Clarify dtype='auto' resolution for LLM and KvCacheConfig

#15520 opened Jun 22, 2026 by ojas4414

[TRTLLM-11608][feat] Chunked KV cache transfer with early block release

#15519 opened Jun 22, 2026 by athena-nv Collaborator

1 task done

[TRTLLM-12714][feat] KV pool rebalance: gate for multi-GPU and coordinate attention-DP

#15518 opened Jun 22, 2026 by thorjohnsen Collaborator

[#15516][fix] Guard PRE_MLP NVFP4 fusion when dense MLP is unquantized

#15515 opened Jun 22, 2026 by muma378 • Draft

1 task done

[None][feat] DSA indexer Top-K cross-layer reuse (IndexCache)

#15513 opened Jun 22, 2026 by murphymatt

4 tasks done

[None][bugfix] Fix executor test response timeout

#15502 opened Jun 19, 2026 by fallintoplace

[None][bugfix] Fix Mamba preloaded HF model loading

#15501 opened Jun 19, 2026 by fallintoplace

[None][fix] Make NIXL port-lock path configurable via TRTLLM_NIXL_PORT_LOCK_PATH

#15500 opened Jun 19, 2026 by CodersAcademy006

4 tasks done

[https://nvbugs/6329227][fix] Use pkgutil.extend_path to merge the two flash_attn distributions before…

#15498 opened Jun 19, 2026 by tensorrt-cicd Collaborator

2 tasks done

[https://nvbugs/6337228][fix] In tests/unittest/tools/test_layer_wise_benchmarks.py, replace check_call with…

#15497 opened Jun 19, 2026 by tensorrt-cicd Collaborator

2 tasks done

[https://nvbugs/6316980][fix] Added a runtime guard in FlashInferTrtllmGenAttention.is_supported using the…

#15496 opened Jun 19, 2026 by tensorrt-cicd Collaborator

2 tasks done
