Success

Changes

Summary

  1. [SPARK-27080][SQL] bug fix: mergeWithMetastoreSchema with uniform lower case comparison (commit: b6d5b0a6347faf4dd95321c9646e78d8bb6bb00d)
  2. [SPARK-27111][SS] Fix a race that a continuous query may fail with InterruptedException (commit: 4d1d0a41a862c234acb9b8b68e96da7bf079eb8d)
Commit b6d5b0a6347faf4dd95321c9646e78d8bb6bb00d by wenchen
[SPARK-27080][SQL] bug fix: mergeWithMetastoreSchema with uniform lower
case comparison
When reading a Parquet file while merging the metastore schema with the
file schema, field names should be compared using a uniform case. The
current implementation lowercases the names but misses one place. This
patch fixes that omission.
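The case-insensitive comparison described above can be sketched as follows. This is a minimal illustration, not the code from HiveMetastoreCatalog.scala: the class `SchemaMergeSketch`, the method `mergeFieldNames`, and the reduction of a schema to a list of field names are all assumptions made for the example, as is the fallback to the metastore name when the file has no matching field. The point is that both the map keys and the lookup key are lowercased.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: schemas are reduced to plain lists of field names.
public class SchemaMergeSketch {

    // Keeps the metastore field order, but spells each name as it appears
    // in the file schema. Matching ignores case.
    public static List<String> mergeFieldNames(List<String> metastoreNames,
                                               List<String> fileNames) {
        // Index the file schema by lowercased name.
        Map<String, String> fileByLowerName = new HashMap<>();
        for (String name : fileNames) {
            fileByLowerName.put(name.toLowerCase(Locale.ROOT), name);
        }

        List<String> merged = new ArrayList<>();
        for (String name : metastoreNames) {
            // The lookup key must be lowercased as well. Looking up the raw
            // name here would miss "userId" against the key "userid".
            String key = name.toLowerCase(Locale.ROOT);
            // Assumption: fall back to the metastore name if the file lacks the field.
            merged.add(fileByLowerName.getOrDefault(key, name));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> metastore = Arrays.asList("userId", "ts");
        List<String> file = Arrays.asList("UserID", "TS");
        System.out.println(mergeFieldNames(metastore, file)); // prints [UserID, TS]
    }
}
```

If the lookup skipped the lowercasing step, a mixed-case metastore name such as `userId` would not find its file-schema counterpart even though the map itself was built with lowercase keys.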
Testing: unit test.
Closes #24001 from codeborui/mergeSchemaBugFix.
Authored-by: CodeGod <>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit a29df5fa02111f57965be2ab5e208f5c815265fe)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: b6d5b0a6347faf4dd95321c9646e78d8bb6bb00d)
The file was modified: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala
The file was modified: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
Commit 4d1d0a41a862c234acb9b8b68e96da7bf079eb8d by zsxwing
[SPARK-27111][SS] Fix a race that a continuous query may fail with
InterruptedException
Before a Kafka consumer is assigned partitions, its offset will contain
0 partitions. However, runContinuous will still run and launch a Spark
job with 0 partitions. In this case, there is a race in which the epoch
may interrupt the query execution thread after `lastExecution.toRdd`,
so that either `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)`
or the next `runContinuous` gets interrupted unintentionally.
To handle this case, this PR has the following changes:
- Clean up the resources in `queryExecutionThread.runUninterruptibly`.
This may increase the waiting time of `stop` but should be minor because
the operations here are very fast (just sending an RPC message in the
same process and stopping a very simple thread).
- Clear the interrupted status at the end so that it won't impact the
`runContinuous` call. We may clear the interrupted status set by `stop`,
but it doesn't affect the query termination because `runActivatedStream`
will check `state` and exit accordingly.
I also updated the cleanup code to make sure exceptions thrown from
`epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` won't stop
the cleanup.
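The two ideas above (cleanup steps that cannot abort each other, followed by clearing the interrupted status) can be sketched in plain Java. This is only an illustration under stated assumptions: `CleanupSketch` and `cleanUp` are hypothetical names, the steps are generic `Runnable`s rather than the RPC call and thread stop from ContinuousExecution.scala, and Spark's `runUninterruptibly` is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "run all cleanup steps, then clear the interrupt flag".
public class CleanupSketch {

    // Runs every step even if an earlier one throws, then clears the calling
    // thread's interrupted status. Returns the exceptions that were caught.
    public static List<Throwable> cleanUp(List<Runnable> steps) {
        List<Throwable> failures = new ArrayList<>();
        try {
            for (Runnable step : steps) {
                try {
                    step.run();
                } catch (RuntimeException e) {
                    // Record the failure and continue with the remaining steps.
                    failures.add(e);
                }
            }
        } finally {
            // Thread.interrupted() returns the current flag and resets it to
            // false, so a late interrupt cannot leak into the next run.
            Thread.interrupted();
        }
        return failures;
    }

    public static void main(String[] args) {
        // Simulate an interrupt that arrives just before cleanup.
        Thread.currentThread().interrupt();

        List<Runnable> steps = new ArrayList<>();
        steps.add(() -> { throw new IllegalStateException("RPC failed"); });
        steps.add(() -> System.out.println("stopped helper thread"));

        List<Throwable> failures = cleanUp(steps);
        System.out.println(failures.size() + " failure(s), interrupted="
                + Thread.currentThread().isInterrupted());
        // prints:
        // stopped helper thread
        // 1 failure(s), interrupted=false
    }
}
```

Clearing the flag unconditionally can also swallow a deliberate interrupt, which is why the commit message notes that termination relies on checking `state` rather than on the interrupted status.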
Testing: Jenkins.
Closes #24034 from zsxwing/SPARK-27111.
Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
(cherry picked from commit 6e1c0827ece1cdc615196e60cb11c76b917b8eeb)
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
(commit: 4d1d0a41a862c234acb9b8b68e96da7bf079eb8d)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousExecution.scala