Add a ConstraintAnalysis pass by kripken · Pull Request #8853 · WebAssembly/binaryen

kripken · 2026-06-17T17:55:25Z

This simply parses IR into constraints, flows them around, and sees
if we can prove things.

This is a minimal version, without conditional propagation etc.

…raint

Co-authored-by: Thomas Lively <tlively123@gmail.com>

…raint

tlively

Comments on code so far. Will look at tests next.

tlively · 2026-06-17T18:18:39Z

+      // We now know the values at the end of the block. If something changed,
+      // flow it onward.
+      if (constraints != block->contents.endConstraints) {
+        block->contents.endConstraints = std::move(constraints);


Can we prove this will actually converge? Can we create a pathological case where the analysis alternates between two different constraint sets forever?

It must converge now because we simply drop extra things in approximateAnd. If we did something more complex, we'd need to be careful and define a total order, I think.

I'm worried that a sequence of ORs (control flow merges) and ANDs (from the contents of blocks) could change the order of constraints so that the one that is dropped is not stable.

tlively · 2026-06-17T18:27:07Z

+  // For each local index, we track the constraints we know about it. We only do
+  // so at the end of each block, which is enough for the analysis below.
+  LocalConstraintMap endConstraints;


Why not keep the beginning constraints instead? Then when we can get to the end of a block, we can merge the current constraints (and eventually any additional constraints due to the specific control flow edge) into the beginning constraints of each successor block and only process that successor block again if its starting constraints are different.

In contrast, the current approach may reprocess successor blocks even if they don't learn anything new from the single predecessor that was updated.

Storing the beginning constraints would also let us avoid re-merging all the predecessors for each block in the optimization phase.

This would also help prove convergence. If we can show that the merge operation on constraint sets is monotonic on some partial order on constraint sets and converges after a bounded number of steps, then we will know the analysis will converge. In contrast, it's hard to say anything about how the end constraints will change over time because they are the result of non-monotonic meet operations over the course of the block.

Interesting, yeah, storing the start might have benefits. It does mean adding a bottom element though, so we can merge incrementally like that.

However, about the very last point: storing the beginning or the end is NFC, so I don't see how it helps with convergence?

Instead of a bottom element it might be cleaner to use std::optional in the pass. That is, there is no logical (as in mathematical logic) meaning to the bottom element - it is not a logical constraint - so a "null" can be handled by the users, rather than inside constraint.h.

However - thinking more on this, I'm not sure it's right. We can't simply keep merging in content as it flows around. The input to a block is, effectively, X || Y || Z, and it matters which of those we update. If we update X three times with a == 10 that is very different than if we update X, Y, Z to that constraint - only then we can apply something.

So I think it is best to do this as it is written: merge the inputs in a loop, seeing them all at once.

Instead of a bottom element it might be cleaner to use std::optional in the pass. That is, there is no logical (as in mathematical logic) meaning to the bottom element - it is not a logical constraint - so a "null" can be handled by the users, rather than inside constraint.h.

Sure, bottom is not literally a bounded set of constraints like other values would be, but it is certainly a meaningful point in the space of possible known constraints. Representing it in constraint.h keeps the analysis code simpler while ensuring that we can never "forget" that we have observed a contradiction. It also lets us write unit tests for it.

If we update X three times with a == 10 that is very different than if we update X, Y, Z to that constraint - only then we can apply something.

I'm not sure what you mean here.

If we receive data from blocks X, Y, and Z, then we only have a valid state after seeing all of X, Y, and Z. That is, if our state starts at some null/bottom, and X arrives, we cannot flow X onward.

Concretely, if X has a == 10 but we haven't seen Y or Z yet, then we don't know if a == 10 is true in this block, and it would be invalid to apply a == 10 in the block and/or to flow it onward.

We only find the valid state of inputs to the block after merging X, Y, and Z. Doing so at once is the simplest way to get the valid state.

When an analysis uses a proper join-semilattice, it's fine to start out with incomplete information because we know that eventually we will reach a sound fixed point where we know everything we need to know.

If you're right that it's not sound to start with incomplete information for one block, then it seems that worklist algorithm for flowing information around would not be sound, since it necessarily starts with incomplete information at loop heads.

But I think it should be fine to start with incomplete information. We don't have a join-semilattice because the OR operation does not go to a unique least upper bound, but it is at least monotonic. We know that the analysis will only ever learn more possibilities (i.e. loosen or drop constraints) as information flows around. So if we start out knowing X has a == 10 and propagate some constraints based on that information, but then later we process Y or Z and learn that a might have some other value, then we will recalculate what we can infer (which will be fewer/looser constraints in general) and then propagate those looser constraints to the successors.

When an analysis uses a proper join-semilattice, it's fine to start out with incomplete information because we know that eventually we will reach a sound fixed point where we know everything we need to know.

Let me put it this way: if we see this as an abstract interpretation situation, then we must calculate the entire transfer function at each node. And the entire transfer function is f(X, Y, Z) = X || Y || Z in this example: we cannot receive only X and propagate that onwards.

Or, here is the actual problem: again, if one predecessor X supplies a == 10, and we are in a block with if (a == 10), then if we just use a == 10 at the start of the block and start to apply it, we'd optimize that if condition. But we can't. Again, we must calculate the entire transfer function at the start of the block before doing anything with that data, so that we have something valid.

Now, we could define the transfer function so that f(X, null, null) == null - effectively blocking propagation until we get info from all sides - but I think it is far more natural to just do the expected thing from the point of view of mathematical logic: we are computing X || Y || Z, so just compute that at the top of each block.

That is, what does null mean from the point of view of logic? I am trying to keep the math pure here 😄

tlively · 2026-06-17T18:32:01Z

+      if (pred == *block->in.begin()) {
+        // This is the first. Just copy.
+        constraints = predConstraints;
+      } else {
+        // Merge in subsequent ones.
+        constraints.approximateOr(predConstraints);
+      }


If we had a bottom value to initialize constraints to, then we wouldn't have to distinguish the first constraints like this.

True, yeah. Maybe worth adding, though this might be the only place it helps?

Sign in to view

+      auto value = Literal::makeZero(get->type);
+      return LocalConstraint{get->index, Constraint{Abstract::Eq, {value}}};
+    }
+    return {};


kripken added 30 commits May 7, 2026 15:49

go

fdd5024

go

aabd860

Merge remote-tracking branch 'origin/main' into range.analysis

3332b8a

work

8b18012

work

002d237

work

229ae81

work

5598aec

work

ed255d4

work

e5d8299

work

fa4805f

work

ae146cf

work

4748cc5

work

f7e33cc

work

80a843e

work

2b10d87

work

c9f17bc

form

18a691b

work

5e8c7c3

work

3b22e48

work

08e419d

work

70add1c

work

15c806a

work

c30dcda

work

c74abed

work

da40546

work

ddc615e

work

2707bb1

format

6390870

work

3cedbcf

work

f54afaa

kripken and others added 20 commits June 16, 2026 12:28

remove Invalid

c924041

Merge remote-tracking branch 'myself/constraint.by.itself' into const…

8e5e075

…raint

merg

90aabfa

merg

81714b2

work

edf8059

go

b1522c2

fix

97ec6b5

go

966676d

Update src/ir/constraint.h

9645e71

Co-authored-by: Thomas Lively <tlively123@gmail.com>

Update src/ir/constraint.h

64e0a86

Co-authored-by: Thomas Lively <tlively123@gmail.com>

Update src/ir/constraint.h

0e0bfb7

Co-authored-by: Thomas Lively <tlively123@gmail.com>

Update src/ir/constraint.h

6947fc3

Co-authored-by: Thomas Lively <tlively123@gmail.com>

format

2b30208

Merge remote-tracking branch 'myself/constraint.by.itself' into const…

464aba5

…raint

Merge remote-tracking branch 'origin/main' into constraint

82aad7d

merg

62e5ee9

clean

835d863

fix

6aec092

nice

c014633

form

81f2421

kripken requested a review from tlively June 17, 2026 17:55

kripken requested a review from a team as a code owner June 17, 2026 17:55

kripken added 4 commits June 17, 2026 10:55

help

4f9fb9d

avoid warning

dab23fb

avoid warning

26b1b62

avoid warning

bda6665

tlively reviewed Jun 17, 2026

View reviewed changes

kripken added 3 commits June 17, 2026 13:09

add todo

9dbf1d9

rename

1981731

format

ec2c0d5

Conversation

kripken commented Jun 17, 2026

Uh oh!

tlively left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants