Success

Changes

Summary

  1. [SPARK-24085][SQL] Query returns UnsupportedOperationException when (commit: 4a10df0f66f74ec2c995f9832d1ab74112bdeb16) (details)
  2. [SPARK-24104] SQLAppStatusListener overwrites metrics (commit: df45ddb9dea9bf42d18c1164cf35067c7cac5d6f) (details)
Commit 4a10df0f66f74ec2c995f9832d1ab74112bdeb16 by gatorsmile
[SPARK-24085][SQL] Query returns UnsupportedOperationException when
scalar subquery is present in partitioning expression
## What changes were proposed in this pull request?
In this case, partition pruning happens before scalar subquery
expressions are planned. Scalar subquery expressions are planned late
in the cycle (after physical planning), in "PlanSubqueries", just
before execution. Currently we try to execute the scalar subquery
expression as part of partition pruning and fail, because it implements
Unevaluable.
The fix ignores subquery expressions in the partition pruning
computation. Another option would be to somehow plan the subqueries
before partition pruning. Since this may not be a commonly occurring
expression, I am opting for the simpler fix.
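The idea behind the fix can be sketched with a toy model. This is not Spark's code: `Predicate`, `has_subquery`, and `prune_partitions` are hypothetical names invented for illustration, and partitions are modeled as plain strings. The sketch only shows that predicates containing an unplanned subquery are left out of pruning, so pruning never tries to evaluate them.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Predicate:
    """Toy stand-in for a filter expression (hypothetical, not Spark's API)."""
    column: str
    # None models an unplanned scalar subquery: it cannot be evaluated yet.
    evaluate: Optional[Callable[[str], bool]]

    @property
    def has_subquery(self) -> bool:
        return self.evaluate is None


def prune_partitions(partitions: List[str], predicates: List[Predicate]) -> List[str]:
    # Only predicates without subqueries take part in pruning. The others
    # are skipped here and must still be applied later as ordinary filters.
    usable = [p for p in predicates if not p.has_subquery]
    return [part for part in partitions if all(p.evaluate(part) for p in usable)]


partitions = ["a", "b"]  # values of the partition column id_type
subquery_filter = Predicate("id_type", None)  # models id_type = (select 'b')
literal_filter = Predicate("id_type", lambda v: v == "b")  # models id_type = 'b'

kept_with_subquery = prune_partitions(partitions, [subquery_filter])
kept_with_literal = prune_partitions(partitions, [literal_filter])
```

In this model the subquery predicate prunes nothing (both partitions are scanned), while the literal predicate still prunes down to partition `b`. The trade-off matches the one described above: the simpler fix gives up pruning for this kind of predicate instead of failing.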
Repro
```SQL
CREATE TABLE test_prc_bug (
  id_value string
)
partitioned by (id_type string)
location '/tmp/test_prc_bug'
stored as parquet;

insert into test_prc_bug values ('1','a');
insert into test_prc_bug values ('2','a');
insert into test_prc_bug values ('3','b');
insert into test_prc_bug values ('4','b');

select * from test_prc_bug where id_type = (select 'b');
```
## How was this patch tested?
Added test in SubquerySuite and hive/SQLQuerySuite.
Author: Dilip Biswal <dbiswal@us.ibm.com>
Closes #21174 from dilipbiswal/spark-24085.
(cherry picked from commit 3fd297af6dc568357c97abf86760c570309d6597)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
(commit: 4a10df0f66f74ec2c995f9832d1ab74112bdeb16)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
The file was modified: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
Commit df45ddb9dea9bf42d18c1164cf35067c7cac5d6f by vanzin
[SPARK-24104] SQLAppStatusListener overwrites metrics
onDriverAccumUpdates instead of updating them
## What changes were proposed in this pull request?
Event `SparkListenerDriverAccumUpdates` may happen multiple times in a
query - e.g. every `FileSourceScanExec` and `BroadcastExchangeExec` calls
`postDriverMetricUpdates`. In Spark 2.2, `SQLListener` updated the map
with the new values, whereas `SQLAppStatusListener` overwrites it. Unless
`update` preserved it in the KV store (dependent on `exec.lastWriteTime`),
only the metrics from the last operator that calls
`postDriverMetricUpdates` are preserved.
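The difference between overwriting and updating can be shown with a toy model. This is not the listener's code: the function names, the accumulator ids, and the metric values below are all made up for illustration, and each driver accumulator update is modeled as a plain dict from accumulator id to value.

```python
from typing import Dict

Metrics = Dict[int, int]  # accumulator id -> value (illustrative model only)


def overwrite_metrics(stored: Metrics, update: Metrics) -> Metrics:
    # Models the behavior described above: the stored map is replaced wholesale.
    return dict(update)


def merge_metrics(stored: Metrics, update: Metrics) -> Metrics:
    # Models updating the map with new values: earlier entries are kept.
    merged = dict(stored)
    merged.update(update)
    return merged


# Two operators in the same query each post their own driver metrics.
scan_update = {1: 10}         # hypothetical metric from a file scan
broadcast_update = {2: 2048}  # hypothetical metric from a broadcast exchange

overwritten = overwrite_metrics(overwrite_metrics({}, scan_update), broadcast_update)
merged = merge_metrics(merge_metrics({}, scan_update), broadcast_update)
```

With overwriting, only the broadcast metric from the last update remains. With merging, the metrics of both operators are kept, which is the behavior the commit title asks for.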
## How was this patch tested?
Unit test added.
Author: Juliusz Sompolski <julek@databricks.com>
Closes #21171 from juliuszsompolski/SPARK-24104.
(cherry picked from commit 8614edd445264007144caa6743a8c2ca2b5082e0)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
(commit: df45ddb9dea9bf42d18c1164cf35067c7cac5d6f)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListenerSuite.scala
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala