1. [SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter should consider (commit: acf20d235b6ffe2ca5d8ba433f97f83367fcc160)
Commit acf20d235b6ffe2ca5d8ba433f97f83367fcc160 by dongjoon
[SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter should consider NULL as False
## What changes were proposed in this pull request?
In `ReplaceExceptWithFilter` we do not properly handle the case in
which the condition returns NULL. Since negating NULL still returns
NULL, the assumption that negating the condition returns all the rows
which didn't satisfy it does not hold: rows for which the condition
evaluates to NULL may not be returned. This happens when the constraints
inferred by `InferFiltersFromConstraints` are not enough, as is the case
with `OR` conditions.
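
The NULL case described above can be reproduced outside Spark with a small model of SQL three-valued logic. This is only an illustrative sketch, not the optimizer code: NULL is modelled as `None`, and `sql_not`, `sql_filter`, and the condition `a > 1` are hypothetical names chosen for the example.

```python
from typing import Optional


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """SQL three-valued NOT: NOT NULL is NULL (modelled here as None)."""
    return None if value is None else not value


def sql_filter(rows, predicate):
    """Keep only rows whose predicate is exactly TRUE, as a SQL WHERE does."""
    return [r for r in rows if predicate(r) is True]


def cond(a):
    # Hypothetical condition `a > 1`; it evaluates to NULL when `a` is NULL.
    return None if a is None else a > 1


rows = [0, 2, None]

# Rows that satisfy the condition.
matching = sql_filter(rows, cond)

# What `rows EXCEPT matching` should contain: everything not in `matching`.
expected_except = [r for r in rows if r not in matching]

# Naive rewrite of EXCEPT as a filter on the negated condition.
naive_rewrite = sql_filter(rows, lambda a: sql_not(cond(a)))
```

In this model `expected_except` keeps the NULL row while `naive_rewrite` drops it, because `NOT NULL` is NULL and a filter only keeps rows whose predicate is TRUE.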
The rule also had problems with non-deterministic conditions: in such a
scenario, this rule would change the probability of the output.
The PR fixes these problems by:
- returning False for the condition when it is NULL (in this way we do
return all the rows which didn't satisfy it);
- avoiding any transformation when the condition is non-deterministic.
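
The two points of the fix can be sketched in a Spark-independent model. This is a minimal illustration under stated assumptions, not the actual rule: NULL is modelled as `None`, and `coalesce_false`, `rewrite_except_condition`, and the `deterministic` flag are hypothetical names introduced for the example.

```python
def coalesce_false(value):
    """Treat a NULL (None) condition result as False."""
    return False if value is None else value


def rewrite_except_condition(cond, deterministic):
    """Build the predicate for the filter that replaces EXCEPT.

    Returns None when the condition is non-deterministic, meaning the
    rewrite is skipped and the EXCEPT is left untouched.
    """
    if not deterministic:
        return None
    # Negate only after NULL has been mapped to False, so rows whose
    # condition is NULL are kept by the resulting filter.
    return lambda row: not coalesce_false(cond(row))


def cond(a):
    # Hypothetical condition `a > 1`; it evaluates to NULL when `a` is NULL.
    return None if a is None else a > 1


rows = [0, 2, None]
predicate = rewrite_except_condition(cond, deterministic=True)
kept = [r for r in rows if predicate(r)]
skipped = rewrite_except_condition(cond, deterministic=False)
```

Here `kept` contains both the row that fails the condition and the row for which it is NULL, and `skipped` is `None` because no rewrite is attempted for a non-deterministic condition.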
## How was this patch tested?
Added UTs.
Closes #23372 from mgaido91/SPARK-26366_2.3_2.
Authored-by: Marco Gaido <>
Signed-off-by: Dongjoon Hyun <>
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceExceptWithFilter.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceOperatorSuite.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala