Failed

Changes

Summary

  1. [SPARK-30083][SQL] visitArithmeticUnary should wrap PLUS case with UnaryPositive for type checking (commit: 65552a81d1c12c7133e85695f6681799954ff6b1)
  2. [SPARK-30082][SQL] Do not replace Zeros when replacing NaNs (commit: 8c2849a6954a9f4d2a7e6dbf5ac34bb5e5c63271)
  3. [SPARK-29477] Improve tooltip for Streaming tab (commit: a3394e49a7bcc66dad551458376aa33c55ca9861)
  4. [SPARK-30012][CORE][SQL] Change classes extending scala collection classes to work with 2.13 (commit: 4193d2f4cc2bb100625b073e3a2e8599c3b4cb7c)
  5. [SPARK-30106][SQL][TEST] Fix the test of DynamicPartitionPruningSuite (commit: 196ea936c39d621a605740ea026cd18974da1112)
  6. [SPARK-30060][CORE] Rename metrics enable/disable configs (commit: 60f20e5ea2000ab8f4a593b5e4217fd5637c5e22)
  7. [SPARK-30051][BUILD] Clean up hadoop-3.2 dependency (commit: f3abee377d1b86826498a1be329a1c82203162f5)
Commit 65552a81d1c12c7133e85695f6681799954ff6b1 by wenchen
[SPARK-30083][SQL] visitArithmeticUnary should wrap PLUS case with
UnaryPositive for type checking
### What changes were proposed in this pull request?
`UnaryPositive` accepts only numeric and interval types by definition,
but `AstBuilder.visitArithmeticUnary` simply bypasses it for unary `+`,
skipping the type check. The wrapper should not be omitted, since it is
what enforces the type-checking requirement.
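A minimal sketch of the idea, using hypothetical stand-in classes rather
than Spark's actual expression hierarchy: the `UnaryPositive`-style
wrapper is what carries the (numeric or interval) constraint, so
returning the child unchanged for unary `+` drops the check entirely.
```
object UnaryPlusSketch {
  sealed trait Expr { def dataType: String }
  case class Literal(value: Any, dataType: String) extends Expr

  // The wrapper enforces the (numeric or interval) constraint on
  // construction; without it, any operand slips through unchecked.
  case class UnaryPositive(child: Expr) extends Expr {
    require(Set("numeric", "interval").contains(child.dataType),
      s"argument 1 requires (numeric or interval) type, got ${child.dataType}")
    def dataType: String = child.dataType
  }

  // Before the fix, the '+' case returned `child` directly; wrapping it
  // makes unary '+' type-check the same way unary '-' already does.
  def visitArithmeticUnary(op: Char, child: Expr): Expr = op match {
    case '+' => UnaryPositive(child)
    case _   => child
  }

  def main(args: Array[String]): Unit = {
    println(visitArithmeticUnary('+', Literal(1, "numeric")))   // ok
    println(visitArithmeticUnary('+', Literal("{1:2}", "map"))) // throws
  }
}
```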
### Why are the changes needed?
Bug fix; see the pre-discussion at
https://github.com/apache/spark/pull/26578#discussion_r347350398
### Does this PR introduce any user-facing change?
Yes, unary `+` on a non-numeric-or-interval operand is now invalid:
```
-- !query 14
select +date '1900-01-01'
-- !query 14 schema
struct<DATE '1900-01-01':date>
-- !query 14 output
1900-01-01

-- !query 15
select +timestamp '1900-01-01'
-- !query 15 schema
struct<TIMESTAMP '1900-01-01 00:00:00':timestamp>
-- !query 15 output
1900-01-01 00:00:00

-- !query 16
select +map(1, 2)
-- !query 16 schema
struct<map(1, 2):map<int,int>>
-- !query 16 output
{1:2}

-- !query 17
select +array(1,2)
-- !query 17 schema
struct<array(1, 2):array<int>>
-- !query 17 output
[1,2]

-- !query 18
select -'1'
-- !query 18 schema
struct<(- CAST(1 AS DOUBLE)):double>
-- !query 18 output
-1.0

-- !query 19
select -X'1'
-- !query 19 schema
struct<>
-- !query 19 output
org.apache.spark.sql.AnalysisException
cannot resolve '(- X'01')' due to data type mismatch: argument 1 requires (numeric or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7

-- !query 20
select +X'1'
-- !query 20 schema
struct<X'01':binary>
-- !query 20 output
```
### How was this patch tested?
Added a unit test check.
Closes #26716 from yaooqinn/SPARK-30083.
Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 65552a81d1c12c7133e85695f6681799954ff6b1)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/literals.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/literals.sql (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/interval.sql.out (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/typeCoercion/native/promoteStrings.sql.out (diff)
The file was modified docs/sql-migration-guide.md (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/ansi/literals.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/operators.sql.out (diff)
Commit 8c2849a6954a9f4d2a7e6dbf5ac34bb5e5c63271 by wenchen
[SPARK-30082][SQL] Do not replace Zeros when replacing NaNs
### What changes were proposed in this pull request?
Do not cast `NaN` to an `Integer`, `Long`, `Short`, or `Byte`. Casting
`NaN` to those types yields `0`, which erroneously replaces `0`s when
only `NaN`s should be replaced.
### Why are the changes needed?
This Scala snippet:
```
println(Double.NaN.toLong)
```
returns `0`, which is problematic: if you run the following Spark code,
`0`s get replaced as well:
```
>>> df = spark.createDataFrame([(1.0, 0), (0.0, 3), (float('nan'), 0)],
("index", "value"))
>>> df.show()
+-----+-----+
|index|value|
+-----+-----+
|  1.0|    0|
|  0.0|    3|
|  NaN|    0|
+-----+-----+
>>> df.replace(float('nan'), 2).show()
+-----+-----+
|index|value|
+-----+-----+
|  1.0|    2|
|  0.0|    3|
|  2.0|    2|
+-----+-----+
```
### Does this PR introduce any user-facing change?
Yes. After the PR, the same code snippet returns the expected results:
```
>>> df = spark.createDataFrame([(1.0, 0), (0.0, 3), (float('nan'), 0)],
("index", "value"))
>>> df.show()
+-----+-----+
|index|value|
+-----+-----+
|  1.0|    0|
|  0.0|    3|
|  NaN|    0|
+-----+-----+
>>> df.replace(float('nan'), 2).show()
+-----+-----+
|index|value|
+-----+-----+
|  1.0|    0|
|  0.0|    3|
|  2.0|    0|
+-----+-----+
```
### How was this patch tested?
Added unit tests verifying that replacing `NaN` only affects columns of
type `Float` and `Double`.
Closes #26738 from johnhany97/SPARK-30082.
Lead-authored-by: John Ayad <johnhany97@gmail.com>
Co-authored-by: John Ayad <jayad@palantir.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 8c2849a6954a9f4d2a7e6dbf5ac34bb5e5c63271)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/DataFrameNaFunctionsSuite.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala (diff)
Commit a3394e49a7bcc66dad551458376aa33c55ca9861 by sean.owen
[SPARK-29477] Improve tooltip for Streaming tab
### What changes were proposed in this pull request?
Added tooltips for the duration columns in the batch table of the
Streaming tab of the Web UI.
### Why are the changes needed?
The tooltips will help users understand the columns of the batch table
in the Streaming tab.
### Does this PR introduce any user-facing change?
Yes.
### How was this patch tested?
Manually tested.
Closes #26467 from iRakson/streaming_tab_tooltip.
Authored-by: root1 <raksonrakesh@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(commit: a3394e49a7bcc66dad551458376aa33c55ca9861)
The file was modified streaming/src/main/scala/org/apache/spark/streaming/ui/BatchPage.scala (diff)
The file was modified streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala (diff)
Commit 4193d2f4cc2bb100625b073e3a2e8599c3b4cb7c by dhyun
[SPARK-30012][CORE][SQL] Change classes extending scala collection
classes to work with 2.13
### What changes were proposed in this pull request?
Modify some classes extending Scala collections to work with 2.13 as
well as 2.12. In many cases this means introducing parallel source
trees, since the type hierarchy changed in ways that prevent one class
from supporting both versions. The PR also includes other minor
collection-related modifications.
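As a rough illustration of why one class body can't span both versions,
here is a hypothetical 2.13-only `Map` wrapper (not Spark's actual
`CaseInsensitiveMap`): under 2.13 a `Map` subclass implements `updated`
and `removed`, while its 2.12 twin would override `+` and `-` instead.
```
import scala.collection.immutable.Map

// Hypothetical wrapper compiling only under Scala 2.13; the 2.12
// version of the same class would live in src/main/scala-2.12 and
// override `+` and `-` instead of `updated` and `removed`.
class LowerKeyMap[V](underlying: Map[String, V]) extends Map[String, V] {
  override def get(k: String): Option[V] = underlying.get(k.toLowerCase)
  override def iterator: Iterator[(String, V)] = underlying.iterator
  override def updated[V1 >: V](k: String, v: V1): Map[String, V1] =
    new LowerKeyMap(underlying.updated(k.toLowerCase, v))
  override def removed(k: String): Map[String, V] =
    new LowerKeyMap(underlying.removed(k.toLowerCase))
}
```
The pom and `dev/change-scala-version.sh` changes listed below
presumably wire the version-specific `scala-2.12`/`scala-2.13`
directories into the build.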
### Why are the changes needed?
To support building for Scala 2.13 in the future.
### Does this PR introduce any user-facing change?
There should be no behavior change.
### How was this patch tested?
Existing tests. Note that the 2.13 changes are not tested by the PR
builder, of course. They compile in 2.13 but can't even be tested
locally. Later, once the project can be compiled for 2.13, thus tested,
it's possible the 2.13 implementations will need updates.
Closes #26728 from srowen/SPARK-30012.
Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by:
Dongjoon Hyun <dhyun@apple.com>
(commit: 4193d2f4cc2bb100625b073e3a2e8599c3b4cb7c)
The file was removed core/src/main/scala/org/apache/spark/util/TimeStampedHashMap.scala
The file was removed sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamProgress.scala
The file was modified sql/core/pom.xml (diff)
The file was modified core/pom.xml (diff)
The file was added sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
The file was modified core/src/main/scala/org/apache/spark/util/collection/CompactBuffer.scala (diff)
The file was added core/src/main/scala-2.12/org/apache/spark/util/TimeStampedHashMap.scala
The file was added core/src/main/scala-2.13/org/apache/spark/util/TimeStampedHashMap.scala
The file was removed core/src/main/scala/org/apache/spark/util/BoundedPriorityQueue.scala
The file was removed sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala
The file was added sql/core/src/main/scala-2.13/org/apache/spark/sql/execution/streaming/StreamProgress.scala
The file was removed sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
The file was added sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala
The file was added sql/core/src/main/scala-2.12/org/apache/spark/sql/execution/streaming/StreamProgress.scala
The file was added sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
The file was modified repl/pom.xml (diff)
The file was modified core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala (diff)
The file was added sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
The file was modified dev/change-scala-version.sh (diff)
The file was added core/src/main/scala-2.12/org/apache/spark/util/BoundedPriorityQueue.scala
The file was removed sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
The file was modified sql/catalyst/pom.xml (diff)
The file was added sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
The file was added core/src/main/scala-2.13/org/apache/spark/util/BoundedPriorityQueue.scala
The file was added sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala
Commit 196ea936c39d621a605740ea026cd18974da1112 by dhyun
[SPARK-30106][SQL][TEST] Fix the test of DynamicPartitionPruningSuite
### What changes were proposed in this pull request?
Changed the test **DPP triggers only for certain types of query** in
**DynamicPartitionPruningSuite**.
### Why are the changes needed?
The SQL below joins on a column that is not a partition key, so the
description "no predicate on the dimension table" is not right. This PR
fixes it.
```
     Given("no predicate on the dimension table")
     withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key ->
"true") {
       val df = sql(
         """
           |SELECT * FROM fact_sk f
           |JOIN dim_store s
           |ON f.date_id = s.store_id
         """.stripMargin)
```
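For contrast, a hypothetical pair of queries (not necessarily the
suite's exact fix; the `country` column and the `store_id` partitioning
are assumptions here) showing when DPP can and cannot trigger inside
the suite's test context:
```
// Joining on a non-partition column with no dimension-side filter, as
// above, gives DPP nothing to prune with.
val noDpp = sql(
  """
    |SELECT * FROM fact_sk f
    |JOIN dim_store s
    |ON f.date_id = s.store_id
  """.stripMargin)

// A dimension-side filter plus a join on the partition key is the
// shape DPP is designed for (assuming fact_sk is partitioned by
// store_id and dim_store has a country column).
val dppCandidate = sql(
  """
    |SELECT * FROM fact_sk f
    |JOIN dim_store s
    |ON f.store_id = s.store_id
    |WHERE s.country = 'US'
  """.stripMargin)
```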
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Updated the unit test.
Closes #26744 from deshanxiao/30106.
Authored-by: xiaodeshan <xiaodeshan@xiaomi.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 196ea936c39d621a605740ea026cd18974da1112)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala (diff)
Commit 60f20e5ea2000ab8f4a593b5e4217fd5637c5e22 by dhyun
[SPARK-30060][CORE] Rename metrics enable/disable configs
### What changes were proposed in this pull request?
This proposes a naming convention for the Spark metrics configuration
parameters that enable/disable metrics source reporting via the
Dropwizard metrics library, `spark.metrics.sourceNameCamelCase.enabled`,
and updates two parameters to follow it.
### Why are the changes needed?
Spark currently has a few parameters for enabling/disabling metrics
reporting, and their naming pattern is not uniform, which can create
confusion. Currently we have:
`spark.metrics.static.sources.enabled`
`spark.app.status.metrics.enabled`
`spark.sql.streaming.metricsEnabled`
### Does this PR introduce any user-facing change?
Yes. Two parameters for enabling/disabling metrics reporting, both new
in Spark 3.0, are renamed:
`spark.metrics.static.sources.enabled` ->
`spark.metrics.staticSources.enabled`
`spark.app.status.metrics.enabled` ->
`spark.metrics.appStatusSource.enabled`
Note: `spark.sql.streaming.metricsEnabled` is left unchanged, as it is
already in use in Spark 2.x.
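For example, the renamed parameters can be set like any other Spark
configuration (values here are illustrative), e.g. in spark-shell:
```
import org.apache.spark.SparkConf

// New names introduced by this commit; old names in trailing comments.
val conf = new SparkConf()
  .set("spark.metrics.staticSources.enabled", "false")  // was spark.metrics.static.sources.enabled
  .set("spark.metrics.appStatusSource.enabled", "true") // was spark.app.status.metrics.enabled
```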
### How was this patch tested?
Manually tested.
Closes #26692 from LucaCanali/uniformNamingMetricsEnableParameters.
Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 60f20e5ea2000ab8f4a593b5e4217fd5637c5e22)
The file was modified docs/monitoring.md (diff)
The file was modified core/src/main/scala/org/apache/spark/status/AppStatusSource.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/internal/config/package.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/internal/config/Status.scala (diff)
Commit f3abee377d1b86826498a1be329a1c82203162f5 by dhyun
[SPARK-30051][BUILD] Clean up hadoop-3.2 dependency
### What changes were proposed in this pull request?
This PR aims to cut the `org.eclipse.jetty:jetty-webapp` and
`org.eclipse.jetty:jetty-xml` transitive dependencies from
`hadoop-common`.
### Why are the changes needed?
This simplifies our dependency management by removing unused
dependencies.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Pass the GitHub Action with all combinations and the Jenkins UT with
(Hadoop-3.2).
Closes #26742 from dongjoon-hyun/SPARK-30051.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: f3abee377d1b86826498a1be329a1c82203162f5)
The file was modified dev/deps/spark-deps-hadoop-3.2-hive-2.3 (diff)
The file was modified pom.xml (diff)