1. [SPARK-24781][SQL] Using a reference from Dataset in Filter/Sort might not work
Commit 9cf375f5be3d359912bde9b6ba5766425e8cf3bb by gatorsmile
[SPARK-24781][SQL] Using a reference from Dataset in Filter/Sort might
not work
## What changes were proposed in this pull request?
When we use a reference from Dataset in filter or sort, which was not
used in the prior select, an AnalysisException occurs, e.g.,
```scala
val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "id")
df.select(df("name")).filter(df("id") === 0).show()
```
```scala
org.apache.spark.sql.AnalysisException: Resolved attribute(s)
id#6 missing from name#5 in operator !Filter (id#6 = 0).;;
!Filter (id#6 = 0)
  +- AnalysisBarrier
     +- Project [name#5]
        +- Project [_1#2 AS name#5, _2#3 AS id#6]
           +- LocalRelation [_1#2, _2#3]
```
This change updates the rule `ResolveMissingReferences` so `Filter`
and `Sort` with non-empty `missingInputs` will also be transformed.
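To make the idea concrete, here is a toy model in plain Scala of what resolving a missing input can look like: the `Project` under the `Filter` is widened to carry the missing column, and a new `Project` on top restores the original output. The classes `Plan`, `Relation`, `Project`, `Filter` and the object `MissingRefSketch` are hypothetical stand-ins written for this illustration. They are not Catalyst's classes, and this is not the code in `Analyzer.scala`.

```scala
// Toy logical plan: hypothetical stand-ins, NOT Catalyst's LogicalPlan classes.
sealed trait Plan { def output: Seq[String] }

case class Relation(output: Seq[String]) extends Plan

case class Project(columns: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = columns
}

case class Filter(references: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = child.output
  // Columns the condition refers to that the child does not produce.
  def missingInputs: Seq[String] = references.filterNot(child.output.contains)
}

object MissingRefSketch {
  // Widen every Project at the top of `plan` so that the `needed`
  // columns flow through, provided something deeper produces them.
  def addMissing(plan: Plan, needed: Seq[String]): Plan = plan match {
    case Project(cols, child) =>
      val widenedChild = addMissing(child, needed)
      val extra = needed.filter { c =>
        !cols.contains(c) && widenedChild.output.contains(c)
      }
      Project(cols ++ extra, widenedChild)
    case other => other
  }

  // Rewrite a Filter with non-empty missingInputs. If the missing columns
  // cannot be found below, the plan is returned unchanged.
  def resolve(plan: Plan): Plan = plan match {
    case f @ Filter(refs, child) if f.missingInputs.nonEmpty =>
      val fixed = Filter(refs, addMissing(child, f.missingInputs))
      if (fixed.missingInputs.isEmpty) Project(child.output, fixed) else f
    case other => other
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the failing example: select name, then filter on id.
    val relation = Relation(Seq("name", "id"))
    val broken = Filter(Seq("id"), Project(Seq("name"), relation))
    println(resolve(broken))
  }
}
```

In this sketch the rewritten plan still outputs only `name`, which matches what the user asked for in the `select`; the extra `id` column exists only between the inner `Project` and the `Filter`.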
## How was this patch tested?
Added tests.
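As a rough sketch of how the fixed behavior could be checked (this is not necessarily the test added to `DataFrameSuite.scala`), the snippet below compares queries that use a Dataset reference such as `df("id")` against equivalent queries that use the name-based `col("id")`. It assumes a Spark build that includes this fix is on the classpath; the object name `Spark24781Check` and the local-mode session settings are made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical standalone check, assuming a Spark version that contains this fix.
object Spark24781Check {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-24781 check")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "id")

    // Filter on a column that the prior select dropped.
    val filterByRef = df.select(df("name")).filter(df("id") === 0)
    val filterByName = df.select(col("name")).filter(col("id") === 0)
    assert(filterByRef.collect().toSeq == filterByName.collect().toSeq)

    // Sort on a column that the prior select dropped.
    val sortByRef = df.select(df("name")).orderBy(df("id"))
    val sortByName = df.select(col("name")).orderBy(col("id"))
    assert(sortByRef.collect().toSeq == sortByName.collect().toSeq)

    spark.stop()
  }
}
```

On a build without the fix, the `df("id")` variants are the ones expected to fail with the `AnalysisException` shown above.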
Author: Liang-Chi Hsieh <>
Closes #21745 from viirya/SPARK-24781.
(cherry picked from commit dfd7ac9887f89b9b51b7b143ab54d01f11cfcdb5)
Signed-off-by: Xiao Li <>
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala