[mono][llvm] Use llvm.minimum/maximum for scalar Math.Min/Max float ops#129593
Open
lewing wants to merge 2 commits into
Open
[mono][llvm] Use llvm.minimum/maximum for scalar Math.Min/Max float ops#129593lewing wants to merge 2 commits into
lewing wants to merge 2 commits into
Conversation
The scalar OP_FMIN/OP_FMAX/OP_RMIN/OP_RMAX lowering in the Mono LLVM
backend used `fcmp ULE/UGE + select`. That sequence has two problems:
1. **Wrong NaN semantics.** With LLVMRealULE, ULE(NaN, x) is true so
select returns lhs=NaN, but ULE(x, NaN) is also true so select
returns lhs=x. That makes `Math.Min(x, NaN)` return `x` on every
Mono+LLVM target, violating Math.Min/MathF.Min/IEEE 754-2019
`minimum` semantics ("if either input is NaN, NaN is returned").
2. **AArch64 backend miscompile.** Under LLVM 23 (the toolchain that
arrives via emsdk 5.0.6, see dotnet#129299), the AArch64 ISel matches
`select(fcmp ogt/ult/... a, b), b, a` and lowers it to
`fminnm`/`fmaxnm` (IEEE 754-2008 minNum/maxNum, NaN-suppressing),
which returns the non-NaN operand for any NaN input. The
resulting NaN-suppression silently miscompiles the System.Half
software conversion path (`Half.op_Explicit(float)`), turning
`Half.AcosPi(NaN)`, `Half.Lerp(+inf, -inf, t)`,
`Math.Min(NaN, x) -> Half`, etc. into +/-Infinity instead of NaN
and causing 30 HalfTests failures on iossimulator-arm64 with
LLVM 23 (issue dotnet#129507).
Fix:
* Add llvm.minimum / llvm.maximum intrinsics for f32/f64 to
llvm-intrinsics.h. Both have existed in LLVM since v12, so all
toolchains currently in use have them.
* Split the scalar OP_FMIN/FMAX/RMIN/RMAX cases out of the shared
integer min/max group in mini-llvm.c and emit them via
llvm.minimum.f32 / llvm.maximum.f32 (and f64) instead of fcmp+select.
These intrinsics are documented as NaN-propagating (IEEE 754-2019),
which matches the .NET BCL spec, and on AArch64 they lower to
`fmin`/`fmax` (also NaN-propagating, not `fminnm`/`fmaxnm`).
* In intrinsics.c, remove the `mono_use_fast_math` gate on the
Math.Min/Max(float|double) recognition. The gate existed because
the old lowering had wrong NaN behavior; with llvm.minimum the
lowering is NaN-correct, so the gate is no longer needed.
Removing it also makes the scalar intrinsic fire for the common
case (which is what fixes the Half miscompile: the C# `Math.Min`
body's `IsNaN(val1)` guard is otherwise stripped by LLVM after
inlining into call sites with a constant `val1`, exposing the
buggy `select` pattern to the AArch64 backend).
Validation: with this patch applied on PR dotnet#129299's branch
(emsdk 5.0.6 / LLVM 23), `dotnet build /t:Test System.Runtime.Tests`
for `System.Tests.HalfTests` and `System.Tests.HalfTests_GenericMath`
on iossimulator-arm64 with full AOT + LLVM goes from 30 failures
(1424/1442 + 344/356) to **0 failures** (1442/1442 + 356/356).
The semantic fix (asymmetric NaN handling) also benefits non-AArch64
targets and non-Half code paths.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
Tagging subscribers to this area: @steveisok, @vitek-karas |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates Mono’s LLVM backend lowering for scalar floating-point min/max so it uses LLVM’s llvm.minimum/llvm.maximum intrinsics (IEEE 754-2019, NaN-propagating), instead of the existing fcmp + select sequence.
Changes:
- Add scalar
llvm.minimum/llvm.maximumintrinsic entries forfloat/doubleand route scalarOP_FMIN/OP_FMAX/OP_RMIN/OP_RMAXthrough them. - Remove the
mono_use_fast_mathgate for recognizingMath.Min/Max(float|double)as intrinsics so the improved lowering applies in the default configuration.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/mono/mono/mini/mini-llvm.c | Switch scalar float/double min/max lowering from fcmp+select to llvm.minimum/maximum intrinsics. |
| src/mono/mono/mini/llvm-intrinsics.h | Register llvm.minimum/llvm.maximum overloads for scalar f32/f64. |
| src/mono/mono/mini/intrinsics.c | Recognize `Math.Min/Max(float |
Copilot reviewer noted the source comments only mentioned MathF.Min/Max, but these intrinsic recognition + lowering paths handle Math.Min/Max (double,double) and (float,float) — the MathF forwarders just call into the Math overloads. Comment-only change; no codegen impact. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The scalar
OP_FMIN/OP_FMAX/OP_RMIN/OP_RMAXlowering in the Mono LLVM backend usesfcmp ULE/UGE + select. That sequence has two problems:1. Wrong NaN semantics (all targets)
With
LLVMRealULE:ULE(NaN, x)is true →selectreturnslhs = NaN✓ULE(x, NaN)is also true →selectreturnslhs = x✗So today
Math.Min(x, NaN)returnsxon every Mono+LLVM target. That violatesMath.Min/MathF.Min("if either input is NaN, NaN is returned") and IEEE 754‑2019minimumsemantics.2. AArch64 backend miscompile (LLVM 23)
Under LLVM 23 (the toolchain that arrives via the emsdk 5.0.6 upgrade in #129299), the AArch64 ISel matches
select(fcmp ogt/ult/… a, b), b, aand lowers it tofminnm/fmaxnm— IEEE 754‑2008minNum/maxNum, which is NaN‑suppressing. Result: for any NaN input the non‑NaN operand is returned, silently dropping NaN through theSystem.Halfsoftware conversion path (Half.op_Explicit(float)clamps withfloat.Min(MaxHalfValueBelowInfinity, value)).Concretely on
iossimulator-arm64with full AOT + LLVM (the failing CI leg in #129299):Tracked as #129507.
Fix
src/mono/mono/mini/llvm-intrinsics.h: addllvm.minimum.f32/.f64andllvm.maximum.f32/.f64. Both intrinsics have been in LLVM since v12, so every toolchain in use here has them.src/mono/mono/mini/mini-llvm.c: split the scalarOP_FMIN/OP_FMAX/OP_RMIN/OP_RMAXcases out of the shared integer min/max group and emit them viallvm.minimum/llvm.maximuminstead offcmp + select. These intrinsics are NaN‑propagating (IEEE 754‑2019), matching the BCL spec; on AArch64 they lower tofmin/fmax(also NaN‑propagating) rather thanfminnm/fmaxnm.src/mono/mono/mini/intrinsics.c: remove themono_use_fast_mathgate on theMath.Min/Max(float|double)intrinsic recognition. The gate existed because the old lowering had wrong NaN behavior; withllvm.minimumit is NaN‑correct, so the gate is no longer needed. Removing it also makes the intrinsic actually fire in the common case (which is what closes the Half miscompile — the C#Math.Minbody'sIsNaN(val1)guard is otherwise stripped by LLVM after inlining into call sites with a constantval1, exposing the buggyselectpattern to the AArch64 backend).Validation
On a worktree of PR #129299 (emsdk 5.0.6 / LLVM 23
23.1.0-alpha.1.26314.2), iPhone 11 Pro iOS 26.5 simulator on M5 Max / macOS 26.5.1, full AOT + LLVM (MonoForceInterpreter=false /p:MonoEnableLLVM=true):System.Tests.HalfTestsSystem.Tests.HalfTests_GenericMathThis closes #129507 once the emsdk upgrade in #129299 lands.
Risk / scope
Math.Min/Max(float|double)semantics change on all Mono+LLVM targets: today's behavior is asymmetric and spec‑violating; new behavior matches the BCL docs. Existing tests that depended on the wrong asymmetric behavior would break, but none are expected.llvm.minimum/maximumlower to a single instruction on AArch64 (fmin/fmax) and to a short compare‑and‑select sequence on x86 without AVX‑10 (similar shape to today'sfcmp + select).aarch64_neon_fmin/fmax); no codegen difference for the SIMD path.cc @vargaz @kotlarmilos @lambdageek @tannergooding
Note
This pull request was produced by GitHub Copilot during an AI-assisted investigation. See https://gist.github.com/davidnguyen-tech/a8e373243f9cb0b0a5bde847a08323a1 for the original repro write-up and #129507 for the tracking issue. The miscompile was bisected to the AArch64 SDAG combine introducing
fminnmafter inlining; the fix uses LLVM's NaN-propagating min/max intrinsics, which also corrects long-standing asymmetric NaN behavior in Mono'sMath.Min/MathF.Minlowering on every target.