Failed
Changes

Summary

  1. [SPARK-18409][ML][FOLLOWUP] LSH approxNearestNeighbors optimization 2 (details)
  2. [SPARK-32908][SQL] Fix target error calculation in `percentile_approx()` (details)
  3. [SPARK-32906][SQL] Struct field names should not change after normalizing floats (details)
  4. [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function (details)
  5. [SPARK-32905][CORE][YARN] ApplicationMaster fails to receive UpdateDelegationTokens message (details)
  6. [SPARK-32930][CORE] Replace deprecated isFile/isDirectory methods (details)
  7. [SPARK-32911][CORE] Free memory in UnsafeExternalSorter.SpillableIterator.spill() when all records have been read (details)
  8. [SPARK-32874][SQL][FOLLOWUP][TEST-HIVE1.2][TEST-HADOOP2.7] Fix spark-master-test-sbt-hadoop-2.7-hive-1.2 (details)
  9. [SPARK-32936][SQL] Pass all `external/avro` module UTs in Scala 2.13 (details)
  10. [SPARK-32808][SQL] Pass all test of sql/core module in Scala 2.13 (details)
Commit 9d6221b9368ab3d23c63a9f24a2ba42a6f709d54 by ruifengz
[SPARK-18409][ML][FOLLOWUP] LSH approxNearestNeighbors optimization 2
### What changes were proposed in this pull request?
1. Simplify the aggregation by getting `count` via `summary.count`.
2. Ignore NaN values, as the old implementation did:
```scala
val relativeError = 0.05
val approxQuantile = numNearestNeighbors.toDouble / count + relativeError
val modelDatasetWithDist = modelDataset.withColumn(distCol, hashDistCol)
if (approxQuantile >= 1) {
  modelDatasetWithDist
} else {
  val hashThreshold = modelDatasetWithDist.stat
    .approxQuantile(distCol, Array(approxQuantile), relativeError)
  // Filter the dataset where the hash value is less than the threshold.
  modelDatasetWithDist.filter(hashDistCol <= hashThreshold(0))
}
```
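For context, a minimal runnable sketch of the `approxQuantile` call this logic relies on (the data, column name, and quantile value are illustrative, not from the PR):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("approxQuantile-sketch").getOrCreate()
import spark.implicits._

// Illustrative distances standing in for hashDistCol values.
val modelDatasetWithDist = Seq(0.1, 0.4, 0.2, 0.9, 0.3, 0.7).toDF("distCol")

// Same shape as the snippet above: estimate a distance threshold at an
// approximate quantile, then keep only rows at or below it.
val relativeError = 0.05
val approxQuantile = 0.5 + relativeError  // stands in for numNearestNeighbors.toDouble / count + relativeError
val hashThreshold = modelDatasetWithDist.stat
  .approxQuantile("distCol", Array(approxQuantile), relativeError)
modelDatasetWithDist.filter($"distCol" <= hashThreshold(0)).show()
```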
### Why are the changes needed? Simplify the aggregation.
### Does this PR introduce _any_ user-facing change? No
### How was this patch tested? Existing test suites.
Closes #29778 from zhengruifeng/lsh_nit.
Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by:
zhengruifeng <ruifengz@foxmail.com>
The file was modified mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala (diff)
Commit 75dd86400c3c2348a4139586fbbead840512b909 by gurwls223
[SPARK-32908][SQL] Fix target error calculation in `percentile_approx()`
### What changes were proposed in this pull request?
1. Change the target error calculation according to the paper [Space-Efficient Online Computation of Quantile Summaries](http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf), which states that the error is `e = max(g_i, delta_i) / 2` (see page 59). This is also clearly explained in [ε-approximate quantiles](http://www.mathcs.emory.edu/~cheung/Courses/584/Syllabus/08-Quantile/Greenwald.html#proofprop1).
2. Added a test to check different accuracies.
3. Added an input CSV file `percentile_approx-input.csv.bz2` to the resource folder `sql/catalyst/src/main/resources` for the test.
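As a quick illustration of the accuracy knob involved here, a hedged sketch (the table and accuracy values are illustrative, and it assumes the `percentile_approx` SQL name from this PR's title is available): `1.0 / accuracy` is the relative error, so with `N` rows the returned rank may be off by roughly `N / accuracy`.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// 1000 distinct values, so the exact median rank is ~500.
(1 to 1000).toDF("v").createOrReplaceTempView("t")

// With accuracy = 100 the median estimate may be off by up to ~10 ranks;
// accuracy = 10000 gives a much tighter bound.
spark.sql(
  "SELECT percentile_approx(v, 0.5, 100), percentile_approx(v, 0.5, 10000) FROM t"
).show()
```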
### Why are the changes needed? To fix incorrect percentile calculation,
see an example in SPARK-32908.
### Does this PR introduce _any_ user-facing change? Yes
### How was this patch tested?
- By running existing tests in `QuantileSummariesSuite` and in
`ApproximatePercentileQuerySuite`.
- Added new test `SPARK-32908: maximum target error in
percentile_approx` to `ApproximatePercentileQuerySuite`.
Closes #29784 from MaxGekk/fix-percentile_approx-2.
Authored-by: Max Gekk <max.gekk@gmail.com> Signed-off-by: HyukjinKwon
<gurwls223@apache.org>
The file was modified sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala (diff)
The file was added sql/core/src/test/resources/test-data/percentile_approx-input.csv.bz2
Commit b49aaa33e13814a448be51a7e65a29cb515b8248 by viirya
[SPARK-32906][SQL] Struct field names should not change after
normalizing floats
### What changes were proposed in this pull request?
This PR intends to fix a minor bug when normalizing floats for struct types:
```
scala> import org.apache.spark.sql.execution.aggregate.HashAggregateExec
scala> val df = Seq(Tuple1(Tuple1(-0.0d)), Tuple1(Tuple1(0.0d))).toDF("k")
scala> val agg = df.distinct()
scala> agg.explain()
== Physical Plan ==
*(2) HashAggregate(keys=[k#40], functions=[])
+- Exchange hashpartitioning(k#40, 200), true, [id=#62]
   +- *(1) HashAggregate(keys=[knownfloatingpointnormalized(if (isnull(k#40)) null else named_struct(col1, knownfloatingpointnormalized(normalizenanandzero(k#40._1)))) AS k#40], functions=[])
      +- *(1) LocalTableScan [k#40]

scala> val aggOutput = agg.queryExecution.sparkPlan.collect { case a: HashAggregateExec => a.output.head }
scala> aggOutput.foreach { attr => println(attr.prettyJson) }

### Final Aggregate ###
[ {
  "class" : "org.apache.spark.sql.catalyst.expressions.AttributeReference",
  "num-children" : 0,
  "name" : "k",
  "dataType" : {
    "type" : "struct",
    "fields" : [ {
      "name" : "_1",
                ^^^
      "type" : "double",
      "nullable" : false,
      "metadata" : { }
    } ]
  },
  "nullable" : true,
  "metadata" : { },
  "exprId" : {
    "product-class" : "org.apache.spark.sql.catalyst.expressions.ExprId",
    "id" : 40,
    "jvmId" : "a824e83f-933e-4b85-a1ff-577b5a0e2366"
  },
  "qualifier" : [ ]
} ]

### Partial Aggregate ###
[ {
  "class" : "org.apache.spark.sql.catalyst.expressions.AttributeReference",
  "num-children" : 0,
  "name" : "k",
  "dataType" : {
    "type" : "struct",
    "fields" : [ {
      "name" : "col1",
                ^^^^
      "type" : "double",
      "nullable" : true,
      "metadata" : { }
    } ]
  },
  "nullable" : true,
  "metadata" : { },
  "exprId" : {
    "product-class" : "org.apache.spark.sql.catalyst.expressions.ExprId",
    "id" : 40,
    "jvmId" : "a824e83f-933e-4b85-a1ff-577b5a0e2366"
  },
  "qualifier" : [ ]
} ]
```
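For readers who want to check the fix, a minimal hedged sketch for `spark-shell` (the expected output is based on the description above):

```scala
// Reproduce the report above: distinct() over a struct key containing
// -0.0 and 0.0 triggers float normalization of the struct.
val df = Seq(Tuple1(Tuple1(-0.0d)), Tuple1(Tuple1(0.0d))).toDF("k")
val agg = df.distinct()

// Before the fix the partial aggregate renamed the nested field to `col1`;
// with the fix every plan should keep the original name `_1`.
agg.queryExecution.sparkPlan.collect {
  case a: org.apache.spark.sql.execution.aggregate.HashAggregateExec =>
    println(a.output.head.dataType.simpleString)  // expect struct<_1:double>
}
```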
### Why are the changes needed?
Bug fix.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added tests.
Closes #29780 from maropu/FixBugInNormalizedFloatingNumbers.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org> Signed-off-by:
Liang-Chi Hsieh <viirya@gmail.com>
The file was modified sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala (diff)
Commit 8b09536cdf5c5477114cc11601c8b68c70408279 by wenchen
[SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function
### What changes were proposed in this pull request? The `NTH_VALUE` function is part of the ANSI SQL standard. For example:
```sql
CREATE TEMPORARY TABLE empsalary (
   depname varchar,
   empno bigint,
   salary int,
   enroll_date date
);
INSERT INTO empsalary VALUES
('develop', 10, 5200, '2007-08-01'),
('sales', 1, 5000, '2006-10-01'),
('personnel', 5, 3500, '2007-12-10'),
('sales', 4, 4800, '2007-08-08'),
('personnel', 2, 3900, '2006-12-23'),
('develop', 7, 4200, '2008-01-01'),
('develop', 9, 4500, '2008-01-01'),
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
select first_value(salary) over(order by salary range between 1000 preceding and 1000 following),
lead(salary) over(order by salary range between 1000 preceding and 1000 following),
nth_value(salary, 1) over(order by salary range between 1000 preceding and 1000 following),
salary from empsalary;
first_value | lead | nth_value | salary
-------------+------+-----------+--------
       3500 | 3900 |      3500 |   3500
       3500 | 4200 |      3500 |   3900
       3500 | 4500 |      3500 |   4200
       3500 | 4800 |      3500 |   4500
       3900 | 4800 |      3900 |   4800
       3900 | 5000 |      3900 |   4800
       4200 | 5200 |      4200 |   5000
       4200 | 5200 |      4200 |   5200
       4200 | 6000 |      4200 |   5200
       5000 |      |      5000 |   6000
(10 rows)
```
Several mainstream databases support this syntax:
**PostgreSQL:**
https://www.postgresql.org/docs/8.4/functions-window.html
**Vertica:**
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Analytic/NTH_VALUEAnalytic.htm?tocpath=SQL%20Reference%20Manual%7CSQL%20Functions%7CAnalytic%20Functions%7C_____23
**Oracle:**
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/NTH_VALUE.html#GUID-F8A0E88C-67E5-4AA6-9515-95D03A7F9EA0
**Redshift**
https://docs.aws.amazon.com/redshift/latest/dg/r_WF_NTH.html
**Presto** https://prestodb.io/docs/current/functions/window.html
**MySQL**
https://www.mysqltutorial.org/mysql-window-functions/mysql-nth_value-function/
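Since the PR also touches `sql/core/src/main/scala/org/apache/spark/sql/functions.scala`, here is a hedged sketch of how the function might be used from the DataFrame API (the `empsalary` DataFrame and window spec are illustrative, assuming an `nth_value` helper is exposed in `functions`):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, nth_value}

// Illustrative: second-lowest salary within each department.
val w = Window.partitionBy("depname").orderBy("salary")
empsalary
  .select(col("depname"), col("salary"), nth_value(col("salary"), 2).over(w).as("second_lowest"))
  .show()
```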
### Why are the changes needed? The `NTH_VALUE` function is part of the ANSI SQL standard and is very useful.
### Does this PR introduce _any_ user-facing change? No
### How was this patch tested? Existing and new unit tests.
Closes #29604 from beliefer/support-nth_value.
Lead-authored-by: gengjiaan <gengjiaan@360.cn> Co-authored-by: beliefer
<beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
The file was modified sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part2.sql (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala (diff)
The file was modified sql/core/src/test/resources/sql-functions/sql-expression-schema.md (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part1.sql (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/window.sql (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part3.sql (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/functions.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/postgreSQL/window_part3.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/window.sql.out (diff)
Commit 9e9d4b6994a29fb139fd50d24b5418a900c7f072 by wenchen
[SPARK-32905][CORE][YARN] ApplicationMaster fails to receive
UpdateDelegationTokens message
### What changes were proposed in this pull request?
For a long-running application in kerberized mode, the `AMEndpoint` handles the `UpdateDelegationTokens` message incorrectly: it is a `OneWayMessage` and should therefore be handled in the `receive` function.
```
20-09-15 18:53:01 INFO yarn.YarnAllocator: Received 22 containers from YARN, launching executors on 0 of them.
20-09-16 12:52:28 ERROR netty.Inbox: Ignoring error
org.apache.spark.SparkException: NettyRpcEndpointRef(spark-client://YarnAM) does not implement 'receive'
    at org.apache.spark.rpc.RpcEndpoint$$anonfun$receive$1.applyOrElse(RpcEndpoint.scala:70)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:203)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
    at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
20-09-17 06:52:28 ERROR netty.Inbox: Ignoring error
org.apache.spark.SparkException: NettyRpcEndpointRef(spark-client://YarnAM) does not implement 'receive'
    at org.apache.spark.rpc.RpcEndpoint$$anonfun$receive$1.applyOrElse(RpcEndpoint.scala:70)
    at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:203)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
    at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
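For context, a rough sketch of the dispatch contract in Spark's internal `RpcEndpoint` (not a public API, so this is illustrative rather than compilable user code; the `UpdateTokens` message type and the bodies are made up): messages sent with `send()` are one-way and dispatched to `receive`, while `ask()` RPCs go to `receiveAndReply`, which is why a one-way message matched only in `receiveAndReply` produces the "does not implement 'receive'" error above.

```scala
import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEnv}

// Illustrative one-way message standing in for UpdateDelegationTokens.
case class UpdateTokens(tokens: Array[Byte])

class AMEndpointSketch(override val rpcEnv: RpcEnv) extends RpcEndpoint {
  // One-way messages (RpcEndpointRef.send) must be matched here; an
  // unmatched one-way message produces the SparkException shown above.
  override def receive: PartialFunction[Any, Unit] = {
    case UpdateTokens(t) => // update the current user's credentials
  }

  // Two-way RPCs (RpcEndpointRef.ask) are matched here instead.
  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case msg => context.reply(msg)
  }
}
```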
### Why are the changes needed?
Bug fix: without a proper token refresher, long-running applications may eventually fail in a kerberized cluster.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passed Jenkins and verified manually.
I am running the `kyuubi-spark-sql-engine` sub-module of https://github.com/yaooqinn/kyuubi. The simplest way to reproduce the bug and verify this fix is to follow these steps:
#### 1. Build the `kyuubi-spark-sql-engine` module
```
mvn clean package -pl :kyuubi-spark-sql-engine
```
#### 2. Configure Spark with Kerberos settings for your secured cluster
#### 3. Start it in the background
```
nohup bin/spark-submit --class org.apache.kyuubi.engine.spark.SparkSQLEngine \
  ../kyuubi-spark-sql-engine-1.0.0-SNAPSHOT.jar > kyuubi.log &
```
#### 4. Check the AM log
"Updating delegation tokens ..." indicates SUCCESS.
"Inbox: Ignoring error ...... does not implement 'receive'" indicates FAILURE.
Closes #29777 from yaooqinn/SPARK-32905.
Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: Wenchen Fan
<wenchen@databricks.com>
The file was modified resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala (diff)
Commit 78928879810a2e96dbb6ec4608b548a0072a040f by gurwls223
[SPARK-32930][CORE] Replace deprecated isFile/isDirectory methods
### What changes were proposed in this pull request?
This PR aims to replace deprecated `isFile` and `isDirectory` methods.
```diff
- fs.isDirectory(hadoopPath)
+ fs.getFileStatus(hadoopPath).isDirectory
```
```diff
- fs.isFile(new Path(inProgressLog))
+ fs.getFileStatus(new Path(inProgressLog)).isFile
```
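One behavioral note worth keeping in mind with this replacement (a hedged sketch, not code from the PR): the deprecated `isFile`/`isDirectory` helpers swallow `FileNotFoundException` and return `false`, whereas `getFileStatus` throws for a missing path, so call sites where the path may not exist need a guard:

```scala
import java.io.FileNotFoundException
import org.apache.hadoop.fs.{FileSystem, Path}

// Equivalent of the deprecated fs.isFile, with the missing-path case
// handled explicitly instead of silently inside the helper.
def isFileSafely(fs: FileSystem, path: Path): Boolean =
  try fs.getFileStatus(path).isFile
  catch { case _: FileNotFoundException => false }
```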
### Why are the changes needed?
It shows deprecation warnings.
-
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2-hive-2.3/1244/consoleFull
```
[warn] /home/jenkins/workspace/spark-master-test-sbt-hadoop-3.2-hive-2.3/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:815: method isFile in class FileSystem is deprecated: see corresponding Javadoc for more information.
[warn]             if (!fs.isFile(new Path(inProgressLog))) {
```
```
[warn] /home/jenkins/workspace/spark-master-test-sbt-hadoop-3.2-hive-2.3/core/src/main/scala/org/apache/spark/SparkContext.scala:1884: method isDirectory in class FileSystem is deprecated: see corresponding Javadoc for more information.
[warn]           if (fs.isDirectory(hadoopPath)) {
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the Jenkins.
Closes #29796 from williamhyun/filesystem.
Authored-by: William Hyun <williamhyun3@gmail.com> Signed-off-by:
HyukjinKwon <gurwls223@apache.org>
The file was modified core/src/test/scala/org/apache/spark/deploy/history/EventLogFileWritersSuite.scala (diff)
The file was modified sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (diff)
The file was modified streaming/src/main/scala/org/apache/spark/streaming/util/HdfsUtils.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/SparkContext.scala (diff)
Commit 105225ddbc4574a8b79e4a483124a6f998a03bc1 by wenchen
[SPARK-32911][CORE] Free memory in
UnsafeExternalSorter.SpillableIterator.spill() when all records have
been read
### What changes were proposed in this pull request?
This PR changes `UnsafeExternalSorter.SpillableIterator` to free its
memory (except for the page holding the last record) if it is forced to
spill after all of its records have been read. It also makes sure that
`lastPage` is freed if `loadNext` is never called again. The latter
was necessary to get my test case to succeed (otherwise it would
complain about a leak).
### Why are the changes needed?
No memory is freed after calling
`UnsafeExternalSorter.SpillableIterator.spill()` when all records have
been read, even though it is still holding onto some memory. This may
cause a `SparkOutOfMemoryError` to be thrown, even though we could have
just freed the memory instead.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
A test was added to `UnsafeExternalSorterSuite`.
Closes #29787 from tomvanbussel/SPARK-32911.
Authored-by: Tom van Bussel <tom.vanbussel@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
The file was modified core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java (diff)
The file was modified core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java (diff)
Commit e2a740147c04a15e4f94c20c6039ed4f6888e0ed by wenchen
[SPARK-32874][SQL][FOLLOWUP][TEST-HIVE1.2][TEST-HADOOP2.7] Fix
spark-master-test-sbt-hadoop-2.7-hive-1.2
### What changes were proposed in this pull request?
Found via the discussion at https://github.com/apache/spark/pull/29746#issuecomment-694726504. The root cause is that Hive 1.2 does not recognize the NULL column type:
```
sbt.ForkMain$ForkError: java.sql.SQLException: Unrecognized column type: NULL
    at org.apache.hive.jdbc.JdbcColumn.typeStringToHiveType(JdbcColumn.java:160)
    at org.apache.hive.jdbc.HiveResultSetMetaData.getHiveType(HiveResultSetMetaData.java:48)
    at org.apache.hive.jdbc.HiveResultSetMetaData.getPrecision(HiveResultSetMetaData.java:86)
    at org.apache.spark.sql.hive.thriftserver.SparkThriftServerProtocolVersionsSuite.$anonfun$new$35(SparkThriftServerProtocolVersionsSuite.scala:358)
    at org.apache.spark.sql.hive.thriftserver.SparkThriftServerProtocolVersionsSuite.$anonfun$new$35$adapted(SparkThriftServerProtocolVersionsSuite.scala:351)
    at org.apache.spark.sql.hive.thriftserver.SparkThriftServerProtocolVersionsSuite.testExecuteStatementWithProtocolVersion(SparkThriftServerProtocolVersionsSuite.scala:66)
    at org.apache.spark.sql.hive.thriftserver.SparkThriftServerProtocolVersionsSuite.$anonfun$new$34(SparkThriftServerProtocolVersionsSuite.scala:351)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
    at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
    at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
    at org.scalatest.Transformer.apply(Transformer.scala:22)
    at org.scalatest.Transformer.apply(Transformer.scala:20)
    at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:189)
    at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:176)
    at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:187)
    at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:199)
    at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
    at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:199)
    at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:181)
    at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:61)
    at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
    at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
    at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:61)
    at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:232)
    at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
    at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
    at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
    at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:232)
    at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:231)
    at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1562)
    at org.scalatest.Suite.run(Suite.scala:1112)
    at org.scalatest.Suite.run$(Suite.scala:1094)
    at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1562)
    at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:236)
    at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
    at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:236)
    at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:235)
    at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:61)
    at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
    at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
    at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
    at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:61)
    at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:318)
    at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:513)
    at sbt.ForkMain$Run$2.call(ForkMain.java:296)
    at sbt.ForkMain$Run$2.call(ForkMain.java:286)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
In this PR, we simply ignore these checks for Hive 1.2.
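To make the failing call path concrete, a hedged sketch of the JDBC interaction (the connection URL is a placeholder; the driver internals are as reported in the stack trace above):

```scala
import java.sql.DriverManager

// Against a Thrift server using the Hive 1.2 JDBC driver, a NULL-typed
// result column breaks the metadata call, because Hive 1.2's
// JdbcColumn.typeStringToHiveType has no mapping for "NULL".
val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default")
val rs = conn.createStatement().executeQuery("SELECT NULL")
rs.getMetaData.getPrecision(1)  // throws: Unrecognized column type: NULL
```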
### Why are the changes needed?
Fix Jenkins.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
The test itself.
Closes #29803 from yaooqinn/SPARK-32874-F.
Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: Wenchen Fan
<wenchen@databricks.com>
The file was modified sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkThriftServerProtocolVersionsSuite.scala (diff)
Commit 664a1719de2855d913c3bb1d2a94bd8681bc1a0d by gurwls223
[SPARK-32936][SQL] Pass all `external/avro` module UTs in Scala 2.13
### What changes were proposed in this pull request? This PR fixes all 14 failed cases in the `external/avro` module in Scala 2.13. The main changes are as follows:
- Manually call `toSeq` in the `AvroDeserializer#newWriter` and `SchemaConverters#toSqlTypeHelper` methods, because the matched object type is `ArrayBuffer`, not `Seq`, in Scala 2.13.
- Change `Seq` to `s.c.Seq` where we call `Row.get(i).asInstanceOf[Seq]`, because the data may be a `mutable.ArraySeq` while `Seq` aliases `immutable.Seq` in Scala 2.13 (see the sketch after this list).
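A standalone sketch of the underlying Scala 2.13 change (plain Scala, no Spark involved):

```scala
import scala.collection.mutable.ArrayBuffer

// In Scala 2.13, scala.Seq aliases scala.collection.immutable.Seq, so a
// mutable ArrayBuffer is no longer a `Seq` for pattern matches or ascriptions.
val buf = ArrayBuffer(1, 2, 3)

val asImmutable: Seq[Int] = buf.toSeq             // explicit toSeq needed in 2.13
val asCollection: scala.collection.Seq[Int] = buf // or widen to s.c.Seq, as in the fix
```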
### Why are the changes needed? We need to support a Scala 2.13 build.
### Does this PR introduce _any_ user-facing change? No
### How was this patch tested?
- Scala 2.12: Pass the Jenkins or GitHub Action
- Scala 2.13: Pass 2.13 Build GitHub Action and do the following:
```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests -pl external/avro -Pscala-2.13 -am
mvn clean test -pl external/avro -Pscala-2.13
```
**Before**
```
Tests: succeeded 197, failed 14, canceled 0, ignored 2, pending 0
*** 14 TESTS FAILED ***
```
**After**
```
Tests: succeeded 211, failed 0, canceled 0, ignored 2, pending 0
All tests passed.
```
Closes #29801 from LuciferYang/fix-external-avro-213.
Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: HyukjinKwon
<gurwls223@apache.org>
The file was modified external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala (diff)
The file was modified external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala (diff)
The file was modified external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala (diff)
Commit 2128c4f14b498e3bc98e79f0dd42d9023e718112 by srowen
[SPARK-32808][SQL] Pass all test of sql/core module in Scala 2.13
### What changes were proposed in this pull request?
After https://github.com/apache/spark/pull/29660 and https://github.com/apache/spark/pull/29689, there are 13 remaining failed cases in the `sql/core` module with Scala 2.13.
The remaining failures occur because `CostBasedJoinReorder` may produce a different optimization result for the same input in Scala 2.12 and Scala 2.13 when there is more than one candidate plan with the same cost.
This PR makes the optimization result as deterministic as possible, in order to pass all remaining failed cases of the `sql/core` module in Scala 2.13. The main changes are as follows:
- Use a `LinkedHashMap` instead of a `Map` to store `foundPlans` in the `JoinReorderDP.search` method, to ensure the same iteration order for the same insertion order, because the iteration order of `Map` behaves differently under Scala 2.12 and 2.13 (see the sketch after this list).
- Fix `StarJoinCostBasedReorderSuite`, which is affected by the above change.
- Regenerate golden files affected by the above change.
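A minimal sketch of why `LinkedHashMap` restores determinism (the keys are illustrative):

```scala
import scala.collection.mutable

// An immutable Map with more than four entries is backed by a HashMap, whose
// iteration order depends on hashing internals that differ between Scala
// 2.12 and 2.13; LinkedHashMap always iterates in insertion order.
val hashed = Map("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)
println(hashed.keys.mkString(", "))  // order is not guaranteed

val linked = mutable.LinkedHashMap("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)
println(linked.keys.mkString(", "))  // always: a, b, c, d, e
```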
### Why are the changes needed? We need to support a Scala 2.13 build.
### Does this PR introduce _any_ user-facing change? No
### How was this patch tested?
- Scala 2.12: Pass the Jenkins or GitHub Action
- Scala 2.13: All tests passed.
Do the following:
```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```
**Before**
```
Tests: succeeded 8485, failed 13, canceled 1, ignored 52, pending 0
*** 13 TESTS FAILED ***
```
**After**
```
Tests: succeeded 8498, failed 0, canceled 1, ignored 52, pending 0
All tests passed.
```
Closes #29711 from LuciferYang/SPARK-32808-3.
Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Sean Owen
<srowen@gmail.com>
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q31.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q24a.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q17.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q62.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q84.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q61.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q24b.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q80.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q85.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-modified/q27.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q66.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q80.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q91.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q13.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q6.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q6.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-modified/q27.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q50.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-modified/q7.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q80a.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q25.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q72.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q6.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q85.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q25.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q24b.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q17.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q62.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-modified/q7.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q72.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q72.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q84.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q13.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q45.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q31.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q61.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q72.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q6.sf100/explain.txt (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/joinReorder/StarJoinCostBasedReorderSuite.scala (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q91.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q29.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q50.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q99.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q45.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q24a.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q29.sf100/explain.txt (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q80a.sf100/explain.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q99.sf100/simplified.txt (diff)
The file was modified sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q66.sf100/simplified.txt (diff)