Add plan_any/plan_all to ExprPlanner to decouple functions from sql crate#22967

Open
Jefffrey wants to merge 1 commit into
apache:mainfrom
Jefffrey:decouple-sql-nested-fn

@Jefffrey Jefffrey commented Jun 16, 2026

Contributor

Which issue does this PR close?

  • N/A

Rationale for this change

The sql crate's Cargo.toml states that it should not depend on the functions crates:

# Note the sql planner should not depend directly on the datafusion-function packages
# so that it can be used in a standalone manner with other function implementations.
#
# They are used for testing purposes only, so they are in the dev-dependencies section.
[dependencies]
arrow = { workspace = true }
bigdecimal = { workspace = true }
chrono = { workspace = true }
datafusion-common = { workspace = true, features = ["sql"] }
datafusion-expr = { workspace = true, features = ["sql"] }
datafusion-functions-nested = { workspace = true, features = ["sql"] }

However, on line 59 it still depends on the nested functions crate, due to any/all planning support:

/// Plans a `<left> <op> ANY(<right>)` expression for non-subquery operands.
fn plan_any_op(
    left_expr: Expr,
    right_expr: Expr,
    compare_op: &BinaryOperator,
) -> Result<Expr> {
    match compare_op {
        BinaryOperator::Eq => Ok(array_has(right_expr, left_expr)),
        BinaryOperator::NotEq => {
            let min = array_min(right_expr.clone());

  • This is because the planner needs access to the array_has function. The aim is to decouple these crates by using the ExprPlanner trait.

What changes are included in this PR?

Introduce plan_all and plan_any to ExprPlanner, and move the implementation code into datafusion-functions-nested, removing the dependency of datafusion-sql on datafusion-functions-nested.
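
A rough sketch of the shape of this change, for readers skimming the description. The types below are simplified stand-ins, not DataFusion's real definitions: Expr, CompareOp, the fields of RawAnyAllExpr, the String error type, and NoopPlanner are all assumptions made for illustration. Only the plan_any signature and the array_has(haystack, needle) rewrite for the equality case are taken from this PR.

```rust
// Simplified, hypothetical stand-ins for DataFusion types.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    // A call to a named function, e.g. array_has(haystack, needle).
    Call(String, Vec<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompareOp {
    Eq,
    Lt,
}

// Assumed shape: the PR excerpt does not show the fields of RawAnyAllExpr.
#[derive(Debug, Clone, PartialEq)]
pub struct RawAnyAllExpr {
    pub needle: Expr,
    pub haystack: Expr,
    pub op: CompareOp,
}

#[derive(Debug, PartialEq)]
pub enum PlannerResult<T> {
    Planned(Expr),
    Original(T),
}

pub trait ExprPlanner {
    // Default: decline to plan and hand the arguments back unchanged,
    // so existing ExprPlanner implementations keep compiling.
    fn plan_any(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>, String> {
        Ok(PlannerResult::Original(args))
    }

    fn plan_all(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>, String> {
        Ok(PlannerResult::Original(args))
    }
}

// A planner that relies entirely on the defaults (hypothetical).
pub struct NoopPlanner;
impl ExprPlanner for NoopPlanner {}

pub struct NestedFunctionPlanner;

impl ExprPlanner for NestedFunctionPlanner {
    fn plan_any(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>, String> {
        match args.op {
            // `needle = ANY(haystack)` becomes array_has(haystack, needle).
            CompareOp::Eq => Ok(PlannerResult::Planned(Expr::Call(
                "array_has".to_string(),
                vec![args.haystack, args.needle],
            ))),
            // Other operators are left unplanned in this sketch only.
            _ => Ok(PlannerResult::Original(args)),
        }
    }
}
```

With this layout the sql side only needs the trait from datafusion-expr; the array_has call lives next to the function it refers to.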

Are these changes tested?

Existing tests.

Are there any user-facing changes?

Hopefully not? There are new methods on ExprPlanner, but they have default implementations, and if using NestedFunctionPlanner (which should be there by default, I believe) it will still plan any/all correctly.
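
To illustrate why the default implementations should keep this non-breaking, here is a minimal sketch of a planner chain. It assumes the sql planner tries each registered ExprPlanner in turn and errors only if none of them plans the expression; that wiring, the String-based expression and error types, the fields of RawAnyAllExpr, and the names UnrelatedPlanner and plan_any_with are hypothetical and not taken from this PR.

```rust
// Hypothetical, simplified stand-ins; expressions are plain strings here.
#[derive(Debug, Clone, PartialEq)]
pub struct RawAnyAllExpr {
    pub needle: String,
    pub haystack: String,
}

pub enum PlannerResult<T> {
    Planned(String),
    Original(T),
}

pub trait ExprPlanner {
    // Default: decline and return the arguments untouched.
    fn plan_any(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>, String> {
        Ok(PlannerResult::Original(args))
    }
}

// A user planner written before plan_any existed: it inherits the default.
pub struct UnrelatedPlanner;
impl ExprPlanner for UnrelatedPlanner {}

pub struct NestedFunctionPlanner;
impl ExprPlanner for NestedFunctionPlanner {
    fn plan_any(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>, String> {
        Ok(PlannerResult::Planned(format!(
            "array_has({}, {})",
            args.haystack, args.needle
        )))
    }
}

// Try each planner in order; the first one that plans the expression wins.
pub fn plan_any_with(
    planners: &[Box<dyn ExprPlanner>],
    mut args: RawAnyAllExpr,
) -> Result<String, String> {
    for planner in planners {
        match planner.plan_any(args)? {
            PlannerResult::Planned(expr) => return Ok(expr),
            // Pass the untouched arguments on to the next planner.
            PlannerResult::Original(returned) => args = returned,
        }
    }
    Err("no registered ExprPlanner could plan ANY".to_string())
}
```

Under these assumptions, a session that registers NestedFunctionPlanner still gets an array_has expression, while one that does not would see a planning error instead of a compile-time dependency.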

github-actions bot added the sql (SQL Planner), logical-expr (Logical plan and expressions), and functions (Changes to functions implementation) labels on Jun 16, 2026
)))
}

fn plan_any(&self, args: RawAnyAllExpr) -> Result<PlannerResult<RawAnyAllExpr>> {

Contributor Author

This implementation code is copied essentially verbatim from sql/src/expr/mod.rs; the only modifications are some renaming of variables (for any, they were called left/right and are now aligned to needle/haystack) and some plumbing to make it work with the signatures of plan_any/plan_all.

Comment thread datafusion/sql/Cargo.toml
chrono = { workspace = true }
datafusion-common = { workspace = true, features = ["sql"] }
datafusion-expr = { workspace = true, features = ["sql"] }
datafusion-functions-nested = { workspace = true, features = ["sql"] }

Contributor Author

The main benefit is here: datafusion-sql no longer depends on datafusion-functions-nested.

array_position(haystack.clone(), lit(ScalarValue::Null), lit(1i64))
.is_not_null();

let decisive_condition = match op {

Contributor Author

Also, moving this planning code into datafusion-functions-nested should allow us to more easily create dedicated UDFs for any/all without needing to expose these UDFs beyond the crate.

  • e.g. instead of needing to create new UDFs array_has_all_eq/array_has_all_gt that must be public for the planning code to reach them (from sql/src/expr/mod.rs), we can keep them private within this crate.
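
A small sketch of the visibility point, using a module in place of a crate. Everything here is hypothetical: array_has_all_eq is only the example name from the comment above, it is modelled as a plain function over a slice rather than a UDF, and all_eq stands in for the public planning entry point.

```rust
pub mod functions_nested {
    // Private helper: callers outside this module cannot name it.
    // Hypothetical stand-in for a dedicated `= ALL(...)` UDF.
    fn array_has_all_eq(haystack: &[i64], needle: i64) -> bool {
        haystack.iter().all(|value| *value == needle)
    }

    // Public entry point, standing in for the planner hook that would
    // emit a call to the private helper.
    pub fn all_eq(needle: i64, haystack: &[i64]) -> bool {
        array_has_all_eq(haystack, needle)
    }
}
```

Code outside functions_nested can call all_eq but gets a compile error if it tries to reach array_has_all_eq directly, which is the encapsulation the comment is after.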

related:

let right_expr = self.sql_to_expr(*right, schema, planner_context)?;
plan_any_op(left_expr, right_expr, &compare_op)
let needle = self.sql_to_expr(*left, schema, planner_context)?;
let haystack = self.sql_to_expr(*right, schema, planner_context)?;

Contributor Author

One part I'm not sure about is whether to include the subquery case as part of this plan_any/plan_all call; maybe it's better to name these plan_any_nested/plan_all_nested to differentiate them from the subquery planning, which we'll keep in datafusion-sql?

@Jefffrey Jefffrey marked this pull request as ready for review June 16, 2026 06:00