Sort enum-case unions by class and case name instead of describe() #5929
UnionTypeHelper::sortTypes runs on every UnionType construction and sorts enum-case members through describe(VerbosityLevel), the documented "never sort via describe()" anti-pattern. It is the dominant cost of sorting a large-enum union, which a match/switch over the enum re-sorts once per arm.

Compare enum cases by className.'::'.caseName instead - the same key IntersectionType::getFiniteTypes() already uses. That is the describe(typeOnly) string for an enum case (enums are never generic), so the sort order is identical, without the per-comparison VerbosityLevel dispatch and sprintf.

The instanceof EnumCaseObjectType gets a baseline entry alongside the two existing ones for that rule (the same internal type-system usage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Please show the code of the reproducer for this perf improvement. In which real world file was the bottleneck observed? |
|
Here's a self-contained reproducer. The only thing that builds (and repeatedly re-sorts) a large enum-case union is an exhaustive

php -r '$N=800; $s="<?php\n\nenum Status: int\n{\n";
for($i=0;$i<$N;$i++)$s.=" case C$i = $i;\n"; $s.="}\n\nfunction handle(Status \$x): string\n{\n return match (\$x) {\n";
for($i=0;$i<$N;$i++)$s.=" Status::C$i => \"v$i\",\n"; $s.=" };\n}\n"; file_put_contents("repro.php",$s);'
vendor/bin/phpstan analyse -l 8 repro.php

A/B swapping only
On the real-world file: honestly, I don't have one. Large enums do exist in the wild (Tempest's One bound, in case it comes up: |
|
so how did you come to this optimization? does it fix a real world problem in one of the projects you analyzed? |
I've been running https://github.com/SanderMuller/boost-skills/blob/main/resources/boost/skills/autoresearch/SKILL.md for almost all of yesterday, with a second Claude instance on |
|
I welcome this 👍 it's okay that AI comes up with this if we find real improvements |
|
I am thinking whether the perf fix should be in |
|
Good idea, I tried it. I put a small per-verbosity memo on
So the memo matches the fast-path on this workload, and you're right that it's broader. Instrumenting it on the same run showed 4.8k computes against 6.35M cache hits (99.9%): the same case instances recur across the per-arm narrowing, so one memo serves the whole sort plus every other The trade-off is just where the cost lands. The memo adds a per-instance Happy to switch the PR to the memo if you'd prefer that home for it. It's the cleaner and broader location; the only thing to weigh against it is the always-paid per-instance cache. Your call. |
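For illustration, a per-instance, per-verbosity memo of the kind discussed above could look roughly like the sketch below. The class name, the string-valued verbosity level, and the compute counter are all hypothetical and only stand in for the real type and VerbosityLevel; this is not the code that was benchmarked.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an enum-case type; not PHPStan's real class.
final class MemoizedCaseSketch
{
    /** @var array<string, string> descriptions cached per verbosity level */
    private array $descriptions = [];

    /** Number of times a description was actually built (cache misses). */
    public int $computes = 0;

    public function __construct(
        private readonly string $className,
        private readonly string $caseName,
    ) {
    }

    // $level is a plain string here; the real API takes a VerbosityLevel object.
    public function describe(string $level): string
    {
        // Repeat calls for the same level are served from the instance cache.
        if (isset($this->descriptions[$level])) {
            return $this->descriptions[$level];
        }

        $this->computes++;

        return $this->descriptions[$level] = sprintf('%s::%s', $this->className, $this->caseName);
    }
}

$case = new MemoizedCaseSketch('App\Status', 'Active');
for ($i = 0; $i < 1000; $i++) {
    $case->describe('typeOnly');
}
$case->describe('precise');
```

In this sketch 1,001 calls trigger only two computes, one per level. The cost is the extra array each instance carries, whether or not it is ever sorted.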
|
thank you! |
UnionTypeHelper::sortTypes() runs on every UnionType construction. Object and enum-case members fall through to the final branch, which sorts them via describe():

For enum cases that is the documented "never sort or compare types via describe()" anti-pattern, and it is the dominant cost of sorting a large-enum union: a match/switch over the enum reconstructs the subject union (the enum minus the matched cases) once per arm, so each arm re-sorts the shrinking union and re-runs the describe() machinery (VerbosityLevel::handle dispatch plus sprintf) over every member.

This adds an enum-case fast-path that compares className.'::'.caseName, the same key the maintainer already uses for enum cases in IntersectionType::getFiniteTypes() (enum cases use this key there, other finite types fall back to describe(typeOnly)). The sort order is identical because the comparator's strcasecmp primary is case-insensitive, so it is unaffected by the only difference from describe(): the raw stored class name can differ in case from the canonical reflection name describe() resolves. The fast-path skips the VerbosityLevel dispatch and the sprintf per comparison.

The numbers come from a stress test, a match over a 1,500-case enum: CPU 7.50s to 3.67s (−51%), output byte-identical. That size is a scaling demonstration rather than typical code; real enums rarely get that large, so the win is per-file on the rare file with a large-enum match/switch (e.g. a Locale-style enum) and does not move whole-project time. It removes the per-comparison constant factor, not the O(N²) per-arm re-sort itself.

instanceof EnumCaseObjectType gets a baseline entry next to the two existing ones for that rule (phpstanApi.instanceofType); IntersectionType and EnumCaseObjectType already use the same instanceof for internal type-system code.

Verified: Type 2941, Analyser 2851, and the match-exhaustiveness plus enum rule suites 432 all pass with output unchanged.
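As a rough illustration of the fast-path idea, the sketch below sorts the same enum-case stand-ins twice: once through a describe()-style method and once through a key built directly from the stored names. EnumCaseSketch, compareViaDescribe and compareViaKey are hypothetical names, and the comparator is reduced to the strcasecmp step only; the actual UnionTypeHelper code is not reproduced here.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an enum-case type; not PHPStan's real class.
final class EnumCaseSketch
{
    /** Counts describe() calls so the saved work is visible. */
    public static int $describeCalls = 0;

    public function __construct(
        public readonly string $className,
        public readonly string $caseName,
    ) {
    }

    // Mimics a describe() that formats a string on every call.
    public function describe(): string
    {
        self::$describeCalls++;

        return sprintf('%s::%s', $this->className, $this->caseName);
    }
}

// Slow path: two describe() calls for every comparison.
function compareViaDescribe(EnumCaseSketch $a, EnumCaseSketch $b): int
{
    return strcasecmp($a->describe(), $b->describe());
}

// Fast path: build the sort key from the stored names directly.
function compareViaKey(EnumCaseSketch $a, EnumCaseSketch $b): int
{
    return strcasecmp(
        $a->className . '::' . $a->caseName,
        $b->className . '::' . $b->caseName,
    );
}

$cases = [];
foreach (['Pending', 'active', 'Closed', 'blocked'] as $name) {
    $cases[] = new EnumCaseSketch('App\Status', $name);
}

$slow = $cases;
usort($slow, compareViaDescribe(...));

$fast = $cases;
usort($fast, compareViaKey(...));

$toNames = static fn (EnumCaseSketch $c): string => $c->caseName;
$slowOrder = array_map($toNames, $slow);
$fastOrder = array_map($toNames, $fast);
```

Both sorts yield the same order in this sketch, while only the describe()-based comparator increments the call counter.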