1. [SPARK-23778][CORE] Avoid unneeded shuffle when union gets an empty RDD (commit: bc111463a766a5619966a282fbe0fec991088ceb)
2. address review comment (commit: cdaaf99cbf65c6766bf5dd3b769211882434084c)
Commit bc111463a766a5619966a282fbe0fec991088ceb by wenchen
[SPARK-23778][CORE] Avoid unneeded shuffle when union gets an empty RDD
## What changes were proposed in this pull request?
When `union` is invoked on several RDDs, one of which is an empty RDD,
the result of the operation is a `UnionRDD`. This causes an unneeded
extra shuffle when all the other RDDs have the same partitioning.
This PR ignores incoming empty RDDs in the `union` method.
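
The selection logic described above can be sketched in plain Scala without a Spark dependency. This is a simplified model, not the code from the patch: `FakeRdd`, `chooseUnion`, and the `UnionKind` labels are hypothetical names, and the sketch assumes that an empty RDD has no partitions and no partitioner, and that a union keeps its partitioning only when every input has the same partitioner defined.

```scala
object UnionSketch {
  // Hypothetical stand-in for an RDD: only the two properties the sketch inspects.
  final case class FakeRdd(name: String, numPartitions: Int, partitioner: Option[String])

  sealed trait UnionKind
  case object PartitionerAware extends UnionKind // partitioning kept, no extra shuffle needed later
  case object PlainUnion extends UnionKind       // partitioning lost

  // Returns the inputs that take part in the union and the kind of union chosen.
  def chooseUnion(rdds: Seq[FakeRdd]): (Seq[FakeRdd], UnionKind) = {
    // Ignore inputs without partitions: they contribute no data to the result.
    val nonEmpty = rdds.filter(_.numPartitions > 0)
    // Distinct partitioners among the remaining inputs.
    val partitioners = nonEmpty.flatMap(_.partitioner).toSet
    val kind: UnionKind =
      if (nonEmpty.forall(_.partitioner.isDefined) && partitioners.size == 1) PartitionerAware
      else PlainUnion
    (nonEmpty, kind)
  }
}
```

In this model, two inputs that share a partitioner plus one empty input yield `PartitionerAware`. Without the filtering step, the empty input (which has no partitioner) would fail the `forall` check and force `PlainUnion`.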
## How was this patch tested?
Added a unit test.
Author: Marco Gaido
Closes #21333 from mgaido91/SPARK-23778.
The file was modified: core/src/main/scala/org/apache/spark/SparkContext.scala
The file was modified: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
Commit cdaaf99cbf65c6766bf5dd3b769211882434084c by ishizaki
address review comment
Refactoring to reduce the number of lines.
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
The file was modified: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/