1. [SPARK-26572][SQL] fix aggregate codegen result evaluation (commit: 0d0c9ffef9fbb2e5c1d6caa3f3802a5a099a9eb3)
Commit 0d0c9ffef9fbb2e5c1d6caa3f3802a5a099a9eb3 by wenchen
[SPARK-26572][SQL] fix aggregate codegen result evaluation
This PR is a correctness fix in `HashAggregateExec` code generation. It
forces evaluation of result expressions before calling `consume()` to
avoid multiple executions.
This PR fixes a use case where an aggregate is nested into a broadcast
join and appears on the "stream" side. The issue is that the broadcast
join generates its own loop. Without forcing evaluation of the
`resultExpressions` of `HashAggregateExec` before the join's loop, these
expressions can be executed multiple times, giving incorrect results.
A new unit test was added.
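The double evaluation described above can be illustrated with a small standalone sketch. This is not Spark's generated code: the class `ResultEvalSketch`, the `IdCounter` stand-in for a non-deterministic result expression, and both method names are hypothetical, and the `for` loop only stands in for the loop that the broadcast join generates over matching build-side rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class ResultEvalSketch {
    // Hypothetical stand-in for a non-deterministic result expression:
    // every evaluation returns the next counter value.
    static final class IdCounter implements LongSupplier {
        private long next = 0;

        @Override
        public long getAsLong() {
            return next++;
        }
    }

    // Problematic shape: the result expression is evaluated lazily,
    // inside the consumer's loop, so it runs once per matching row.
    static List<Long> evaluateInsideLoop(int matches) {
        LongSupplier resultExpr = new IdCounter();
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < matches; i++) {
            out.add(resultExpr.getAsLong());
        }
        return out;
    }

    // Fixed shape: the result expression is evaluated once, before the
    // consumer's loop starts, and the stored value is reused.
    static List<Long> evaluateBeforeLoop(int matches) {
        LongSupplier resultExpr = new IdCounter();
        long id = resultExpr.getAsLong();
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < matches; i++) {
            out.add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(evaluateInsideLoop(2)); // prints [0, 1]
        System.out.println(evaluateBeforeLoop(2)); // prints [0, 0]
    }
}
```

With two matching rows, the lazy variant yields two different values for what should be a single aggregate output row, while the eager variant yields the same value twice.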
Closes #23731 from peter-toth/SPARK-26572.
Authored-by: Peter Toth <>
Signed-off-by: Wenchen Fan <>
(cherry picked from commit 2228ee51ce3550d7e6740a1833aae21ab8596764)
Signed-off-by: Wenchen Fan <>
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala