Update dependency Microsoft.ML.OnnxRuntime to 1.27.0 by renovate[bot] · Pull Request #140 · argon-chat/server

renovate · 2026-06-22T20:06:48Z

This PR contains the following updates:

| Package | Change |
| --- | --- |
| Microsoft.ML.OnnxRuntime | `1.26.0` → `1.27.0` |

Release Notes

Microsoft/onnxruntime (Microsoft.ML.OnnxRuntime)

`v1.27.0`: ONNX Runtime v1.27.0

n.b. This release is targeting ONNX 1.21. ONNX 1.22 will be supported in ORT 1.28.
n.b. This changelog was generated via LLM. Only the contributor list has been verified. As always, only trust the commit history.

Announcements & Breaking Changes

CUDA 12 package files are now explicitly named as such.
CUDA 12 packages are deprecated; please move to CUDA 13 as soon as possible.

Security Fixes

Fixed out-of-bounds read in SoftmaxCrossEntropyLoss via label bounds validation (#28004)
Hardened OneHot input validation and output-size computation (#28014)
Added SafeInt overflow protection in Expand and capped constant-folding output sizes (#28055)
Bounded total output allocation size in Tile kernel (#28070)
Added mask/input shape consistency checks in MaxpoolWithMask::Compute (#28223)
Fixed BitShift UB for shift amounts greater than or equal to bit width (#28272)
Validated sequence bounds in GQA (seqlens_k vs cos_cache) (#28277)
Validated conv bias shape in WordConvEmbedding to prevent OOB reads (#28279)
Fixed int32 overflow in CUDA Cast and UnaryElementWise kernels for very large tensors (#28386)
Fixed out-of-bounds read in CropBase scale handling (#28399)
Fixed rank-underflow bug in Inverse kernel trailing-dimension indexing (#28400)
Added sparse tensor external file path validation and additional external-path hardening (#28408, #28709, #28725)
Switched remaining torch.load() calls to weights_only=True (#28421)
Added CPU cache-indirection beam-index validation (#28486)
Added additional overflow/bounds checks and test coverage in runtime buffers (#28713, #28747)
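
Several of the fixes above (for example the Tile change in #28070) amount to validating a computed output size before allocating. The following is a minimal Python sketch of that idea, not ONNX Runtime code: the function name `tiled_output_shape` and the cap `MAX_OUTPUT_ELEMENTS` are hypothetical and chosen only for illustration. Python integers do not overflow, so the check here is a plain comparison; a C++ kernel would additionally need overflow-checked multiplication.

```python
import math

# Hypothetical cap on the number of output elements (an assumption for this
# sketch; the actual limit used by ONNX Runtime is not stated in the notes).
MAX_OUTPUT_ELEMENTS = 2**31


def tiled_output_shape(input_shape, repeats, max_elements=MAX_OUTPUT_ELEMENTS):
    """Return the Tile output shape, rejecting invalid or oversized requests."""
    if len(input_shape) != len(repeats):
        raise ValueError("repeats must have one entry per input dimension")
    if any(d < 0 for d in input_shape) or any(r < 0 for r in repeats):
        raise ValueError("dimensions and repeats must be non-negative")

    # Each output dimension is the input dimension times its repeat count.
    out_shape = [d * r for d, r in zip(input_shape, repeats)]

    # Bound the total allocation before any buffer would be created.
    total = math.prod(out_shape)
    if total > max_elements:
        raise OverflowError(f"output of {total} elements exceeds {max_elements}")
    return out_shape
```

Calling `tiled_output_shape([2, 3], [2, 2])` yields `[4, 6]`, while a request whose element count exceeds the cap raises `OverflowError` instead of attempting the allocation.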

New Features

Execution Provider Plugin API

Added zero-copy I/O for plugin EPs with HOST_ACCESSIBLE memory (#28037)
Added OrtEp::OnSessionInitializationEnd() callback (#28319)
Added plugin EP session-options getters (#28377)
Added CUDA Plugin EP provider options for streams and external allocators (#28603)

Core APIs & Runtime

Added support for ONNX overloaded functions (IR v10+) (#28275)
Added FLOAT8E8M0 datatype support in ONNX Runtime (#28381)
Added CPU Cast support for FLOAT8E8M0 (#28435)
Added kOrtEpDevice_EpMetadataKey_OSDriverVersion example and docs (#28282)
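
For readers unfamiliar with the FLOAT8E8M0 type mentioned above: it is an 8-bit, exponent-only format (no sign bit, no mantissa) used as a scale type in microscaling formats. The sketch below decodes one such byte in Python, assuming the usual convention of an exponent bias of 127 with `0xFF` reserved for NaN. The helper name `decode_e8m0` is hypothetical and this is not an ONNX Runtime API.

```python
import math


def decode_e8m0(byte):
    """Decode one E8M0 byte into a Python float (assumed bias 127, 0xFF = NaN)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("E8M0 values occupy exactly one byte")
    if byte == 0xFF:
        return math.nan
    # No mantissa bits: the value is a pure power of two.
    return math.ldexp(1.0, byte - 127)
```

Under these assumptions, `decode_e8m0(127)` is `1.0` and `decode_e8m0(128)` is `2.0`; the format cannot represent zero, negative numbers, or infinity.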

Quantization & Training Tooling

Added calibration cache support to quantize_static (#28221)
Added ActivationRestrictedAsymmetric quantization option (#28237)
Added opset-21 block_size attribute support to QDQ quantization (#28522)
Added CPU fallback for FusedAdam optimizer in ORT Training (#28233)
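
The `block_size` attribute mentioned above belongs to the opset-21 QuantizeLinear/DequantizeLinear operators, where consecutive elements along an axis share one scale per block. A rough one-dimensional sketch in plain Python follows. The function name `quantize_blockwise` is hypothetical, the int8 range is an assumption, and the sketch does not reflect how the ONNX Runtime quantization tool is implemented.

```python
import math


def quantize_blockwise(values, scales, block_size, zero_point=0, qmin=-128, qmax=127):
    """Quantize a 1-D sequence where each block of `block_size` values shares a scale."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    expected_blocks = math.ceil(len(values) / block_size)
    if len(scales) != expected_blocks:
        raise ValueError(f"expected {expected_blocks} scales, got {len(scales)}")

    quantized = []
    for i, x in enumerate(values):
        scale = scales[i // block_size]  # scale of the block this element falls in
        # Python's round() rounds halfway cases to the nearest even integer.
        q = round(x / scale) + zero_point
        quantized.append(max(qmin, min(qmax, q)))  # saturate to the target range
    return quantized
```

For example, `quantize_blockwise([1.0, 2.0, 10.0, 20.0], [1.0, 10.0], block_size=2)` gives `[1, 2, 1, 2]`: the second block uses a scale ten times larger, so its values map to the same integers as the first.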

Execution Provider Updates

NVIDIA CUDA EP

Added ConvTranspose-22 support (#27710)
Filled CUDA opset gaps for LSTM, RNN, Reshape, Cast, Round/Equal, ReduceMax/ReduceMin, Sin/Cos, and Random* ops (#27737, #27743, #27742, #27744, #27754, #27755, #27756, #27759)
Added LpNormalization support for CUDA EP (#28724)
Added chunked dequant+GEMM for MatMulNBits to reduce peak GPU memory (#28712)
Added QMoE tests for standard swiglu and improved decode-path routing/softmax kernels (#28741, #29026)
Fixed CUDA Attention dispatch mismatch for GQA head-size cases (#28358)
Fixed CUTLASS FMHA bias-loader alignment on unaligned kernel path (#28369)

WebGPU EP

Added LSTM support on WebGPU (#27881)
Added per-graph buffer manager for multi-graph capture (#28260)
Added QKV and MLP layer fusions for Qwen3-style models (#28280)
Added QKV bias support in FlashAttention for MultiHeadAttention (#28380)
Added shader dump-to-file environment variable and nightly validation checks (#28674)
Added opset-24 + KV-shared decoder support (Gemma 4) (#28501)
Performance improvements: FlashAttention M4 Max optimization and subgroup-based LinearAttention tuning (#27780, #28412, #28519, #28520)
Fixed numerical and correctness issues in QMoE, LayerNorm/SkipLayerNorm, and MatMul bias indexing (#28427, #28434, #28475)

CoreML EP

Added support for pre-opset-13 Split (split attribute path) and scalar Gather indices (#28270, #28278)
Added FusedConv, Identity, Ceil, Tile, Cast(bool), Sin, Cos, and GatherND support (#28289, #28293, #28595, #28596, #28598)

TensorRT / DML / QNN

Improved TensorRT RTX compatibility (multi-GPU tests, API guards, and subgraph fixes) (#27837, #28361, #28611, #28503)
Added diagnostics for DML failure paths (#28495)
Updated QNN ETW log level rule handling (#27593)

Web & JavaScript

JavaScript / Node.js

Updated JavaScript dependencies (next, postcss, tmp, qs, body-parser, and other npm packages) (#27705, #27894, #28304, #28547, #28644, #28683, #28694)

CPU & Core Optimizations

MLAS / Quantization / Attention

Added NHWC convolution path in MLAS to reduce transpose overhead (#26834)
Added CPU QMoE 2-bit support and LUT GEMM fast path (#28185)
Added quantized KV-cache support for CPU GroupQueryAttention with SIMD optimizations and tiled compute (#28576, #28578, #28606, #28695)
Added RVV-optimized NCHWc convolution/pooling and LLM operators for RISC-V (#28411, #28518)
Parallelized CPU ScatterElements and optimized MatMulNBits 2-bit float-zero-point path (#28588, #28589)
Added optional bias support in MatMulNBits CPU LUT GEMM path (#28742)

Graph, Optimizer, and Fusion

Added DiT attention fusion for F5-TTS and diffusion transformer models (#27999)
Improved ONNX Attention dispatch path and removed legacy unfused MHA path (#27992)
Added support for optional present_key/present_value outputs in GQA and Gemma4 support (#28242)
Added implicit-input handling for partitioning/fusion around control-flow nodes (#28608, #28690)

Language Bindings

Python

Fixed runtime-unresolvable type annotations in Session and InferenceSession (#27802)
Made sympy an optional runtime dependency (#28141)
Added PEP 561 py.typed marker to the onnxruntime package (#28438)

C#

Added EP tests for CUDA Plugin EP (#28375)

Bug Fixes

Critical & Correctness Fixes

Fixed session use-after-free when UserLoggingFunction is used (#28314)
Prevented double-free in OrtModelEditorApi ownership transfer (#28123)
Fixed plugin EP provider-library load refcount leak and added regression test (#28396, #28430)
Fixed dangling pointer from temporary return value (#28419)
Fixed PRelu returning NaN for infinite inputs on CPU EP (#28750)
Fixed CPU Attention softcap/attn_mask ordering and added CUDA spec coverage consolidation (#28379)
Fixed Reshape allowzero=1 handling for chained zero-size tensors (#28455)
Fixed CPU QLinearConv per-channel weight zero-point handling for distinct values (#28456)
Fixed Unicode-path handling issues on Windows and AppContainer path canonicalization (#28390, #28509)
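
As background for the Reshape fix above (#28455): in the ONNX operator spec, a `0` in the target shape copies the corresponding input dimension when `allowzero=0`, but denotes a genuinely empty dimension when `allowzero=1`. The sketch below illustrates only that rule in Python. The helper `resolve_reshape` is hypothetical and is not the ONNX Runtime implementation; the release note does not describe the root cause of the bug.

```python
import math


def resolve_reshape(input_shape, target_shape, allowzero=0):
    """Resolve a Reshape target shape following the ONNX allowzero rule."""
    if allowzero and 0 in target_shape and -1 in target_shape:
        raise ValueError("0 and -1 cannot be combined when allowzero=1")
    if list(target_shape).count(-1) > 1:
        raise ValueError("at most one dimension may be -1")

    out = []
    for i, dim in enumerate(target_shape):
        if dim == 0 and not allowzero:
            # allowzero=0: a zero copies the input dimension at the same index.
            if i >= len(input_shape):
                raise ValueError("no input dimension to copy for 0")
            out.append(input_shape[i])
        else:
            # allowzero=1: a zero is kept as an explicit empty dimension.
            out.append(dim)

    total = math.prod(input_shape)
    if -1 in out:
        known = math.prod(d for d in out if d != -1)
        # This sketch refuses to infer -1 when the known dimensions multiply to zero.
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    if math.prod(out) != total:
        raise ValueError("element count mismatch")
    return out
```

With `allowzero=1`, zero-size tensors can be reshaped repeatedly, e.g. `[0, 3]` to `[0, 0]` and then to `[5, 0]`, because every shape in the chain has zero elements.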

Portability Fixes

Build and portability fixes across AIX, GCC-15, S390x, libc++/Clang, FreeBSD, and cross-compilation flows (#26704, #27191, #28016, #28049, #28074, #28362, #28567, #28507)
Fixed CPUIDInfo bounds handling for unknown ARM vendors (#28344)
Fixed Resize nearest-mode rounding bug for negative halfway values (#28345)
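
To illustrate why negative halfway values are a common source of rounding bugs such as the Resize one above (#28345): the ONNX Resize spec describes `round_prefer_floor` as round-half-down and `round_prefer_ceil` as round-half-up, and both must hold for negative coordinates too. The Python sketch below shows one way to express those two modes; it is an illustration only, and the release note does not say which mode was affected or how ONNX Runtime implements it.

```python
import math


def round_prefer_floor(x):
    """Round to nearest; exact halfway cases go toward negative infinity."""
    return math.ceil(x - 0.5)


def round_prefer_ceil(x):
    """Round to nearest; exact halfway cases go toward positive infinity."""
    return math.floor(x + 0.5)
```

For `x = -0.5`, `round_prefer_floor` gives `-1` and `round_prefer_ceil` gives `0`; for `x = 2.5` they give `2` and `3`. Formulations that truncate toward zero or round away from zero agree with these on one sign but not the other.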

Contributors

Thanks to our 106 contributors for this release!

@adrastogi, @adrianlizarraga, @AIFrameworksIntegration, @AlekseiNikiforovIBM, @angelser, @angelserMS, @ankitm3k, @anzzraju1997-glitch, @apsonawane, @arajendra, @ayappanec, @bachelor-dou, @badranX, @baijumeswani, @BoarQing, @bopeng1234, @bsosnader, @cbourjau, @chilo-ms, @chunghow-qti, @chwarr, @Craigacp, @daijh, @derdeljan-msft, @dparikh79, @edgchen1, @elwhyjay, @ericcraw, @eserscor, @feich-ms, @fs-eire, @gaugarg-nv, @gblong1, @GopalakrishnanN, @gramalingam, @guschmue, @hariharans29, @HectorSVC, @intbf, @ishwar-raut1, @Jaswanth51, @jatinwadhwa921, @javier-intel, @jchen10, @jiafatom, @Jiawei-Shao, @jnagi-intel, @jocelyn-stericker, @JonathanC-ARM, @justinchuby, @jwludzik, @kevinch-nv, @Kotomi-Du, @kpkbandi, @KV2773, @Laan33, @lhrios, @maxwbuckley, @MayureshV1, @mdvoretc-intel, @mingyueliuh, @mklimenk, @mustjab, @n1harika, @nazanin-beheshti, @orlmon01, @prathikr, @preetha-intel, @psakhamoori, @qiurui144, @qjia7, @qti-ashwshan, @qti-hungjuiw, @qti-yuduo, @rajatmonga, @RajeevSekar, @Rishi-Dave, @rvandermeulen, @RyanMetcalfeInt8, @SamuelLess, @sanaa-hamel-microsoft, @sfatimar, @sgbihu, @shiyi9801, @simonbyrne, @skottmckay, @susbhere, @sushraja-msft, @tairenpiao, @TejalKhade28, @theHamsta, @tianleiwu, @titaiwangms, @umangb-09, @velonica0, @vraspar, @vthaniel, @wenqinI, @xadupre, @xenova, @xhan65, @xiaofeihan1, @xieofxie, @yuslepukhin, @ZackyLake, @zejianzhang1982, @zz002

Full Changelog: microsoft/onnxruntime@v0.1.4...v1.27.0

Configuration

📅 Schedule: (UTC)

Branch creation
- At any time (no schedule defined)
Automerge
- At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

Update dependency Microsoft.ML.OnnxRuntime to 1.27.0

f395b6a

renovate[bot] wants to merge 1 commit into master from renovate/microsoft.ml.onnxruntime-1.x