SuccessChanges

Summary

  1. [SPARK-23660] Fix exception in yarn cluster mode when application ended fast (commit: 5c1c03d080d58611f7ac6e265a7432b2ee76e880) (details)
  2. [SPARK-23644][CORE][UI][BACKPORT-2.3] Use absolute path for REST call in SHS (commit: 2f82c037d90114705c0d0bd0bd7f82215aecfe3b) (details)
  3. [SPARK-23691][PYTHON][BRANCH-2.3] Use sql_conf util in PySpark tests where possible (commit: c854b6ca7ba4dc33138c12ba4606ff8fbe82aef2) (details)
  4. [SPARK-23649][SQL] Skipping chars disallowed in UTF-8 (commit: 0b880db65b647e549b78721859b1712dff733ec9) (details)
Commit 5c1c03d080d58611f7ac6e265a7432b2ee76e880 by vanzin
[SPARK-23660] Fix exception in yarn cluster mode when application ended fast
## What changes were proposed in this pull request?
Yarn throws the following exception in cluster mode when the application
is really small:
```
18/03/07 23:34:22 WARN netty.NettyRpcEnv: Ignored failure: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7c974942 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@1eea9d2d[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
18/03/07 23:34:22 ERROR yarn.ApplicationMaster: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
    at org.apache.spark.deploy.yarn.YarnAllocator.<init>(YarnAllocator.scala:102)
    at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:77)
    at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:450)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:493)
    at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:345)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:810)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:809)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:834)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.rpc.RpcEnvStoppedException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:158)
    at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:135)
    at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:229)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:523)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:91)
    ... 17 more
18/03/07 23:34:22 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult: )
```
Example application:
```scala
import org.apache.spark.{SparkConf, SparkContext}

object ExampleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ExampleApp")
    val sc = new SparkContext(conf)
    try {
      // Do nothing
    } finally {
      sc.stop()
    }
  }
}
```
This PR pauses the user class thread after the `SparkContext` has been created and keeps it paused until the application master has initialised properly.
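For illustration, a minimal sketch of the kind of wait/notify handshake such a pause relies on; the object and method names below are hypothetical, not the actual `ApplicationMaster` internals:
```scala
// Hypothetical sketch: the user class thread parks after SparkContext
// creation and resumes only once the AM signals that it is initialised.
object AmHandshake {
  private val lock = new Object
  private var amReady = false

  // Called from the user class thread right after the SparkContext is created.
  def awaitAmReady(): Unit = lock.synchronized {
    while (!amReady) lock.wait()
  }

  // Called from the AM thread once registration with YARN has succeeded.
  def signalAmReady(): Unit = lock.synchronized {
    amReady = true
    lock.notifyAll()
  }
}
```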
## How was this patch tested?
Automated: existing unit tests. Manual: an application submitted to a small cluster.
Author: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Closes #20807 from gaborgsomogyi/SPARK-23660.
(cherry picked from commit 5f4deff19511b6870f056eba5489104b9cac05a9)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
(commit: 5c1c03d080d58611f7ac6e265a7432b2ee76e880)
The file was modified: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala (diff)
Commit 2f82c037d90114705c0d0bd0bd7f82215aecfe3b by sshao
[SPARK-23644][CORE][UI][BACKPORT-2.3] Use absolute path for REST call in SHS
## What changes were proposed in this pull request?
The SHS uses a relative path for the REST API call that fetches the list of applications. When the SHS is consumed through a proxy, this can be an issue if the path doesn't end with a "/".
Therefore, we should use an absolute path for the REST call, as is done for all the other resources.
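For illustration, standard `java.net.URI` resolution shows why the trailing slash matters behind a proxy; this is a hedged sketch with a made-up proxy URL, not the SHS code itself:
```scala
import java.net.URI

object RelativePathDemo {
  def main(args: Array[String]): Unit = {
    val withSlash    = new URI("https://proxy.example.com/history/")
    val withoutSlash = new URI("https://proxy.example.com/history")
    // With the trailing slash, the relative REST path stays under /history/.
    println(withSlash.resolve("api/v1/applications"))
    // -> https://proxy.example.com/history/api/v1/applications
    // Without it, the last path segment is replaced and the prefix is lost.
    println(withoutSlash.resolve("api/v1/applications"))
    // -> https://proxy.example.com/api/v1/applications
  }
}
```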
## How was this patch tested?
Manual tests.
Before the change:
![screen shot 2018-03-10 at 4 22 02 pm](https://user-images.githubusercontent.com/8821783/37244190-8ccf9d40-2485-11e8-8fa9-345bc81472fc.png)
After the change:
![screen shot 2018-03-10 at 4 36 34 pm 1](https://user-images.githubusercontent.com/8821783/37244201-a1922810-2485-11e8-8856-eeab2bf5e180.png)
Author: Marco Gaido <marcogaido91@gmail.com>
Closes #20847 from mgaido91/SPARK-23644_2.3.
(commit: 2f82c037d90114705c0d0bd0bd7f82215aecfe3b)
The file was modified: core/src/main/resources/org/apache/spark/ui/static/historypage.js (diff)
Commit c854b6ca7ba4dc33138c12ba4606ff8fbe82aef2 by hyukjinkwon
[SPARK-23691][PYTHON][BRANCH-2.3] Use sql_conf util in PySpark tests where possible
## What changes were proposed in this pull request?
This PR backports https://github.com/apache/spark/pull/20830 to reduce
the diff against master and restore the default values in PySpark
tests.
https://github.com/apache/spark/commit/d6632d185e147fcbe6724545488ad80dce20277e
added a useful util. This backport extracts and brings in that util:
```python
@contextmanager
def sql_conf(self, pairs):
    ...
```
to allow configuration set/unset within a block:
```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    # test codes
```
This PR proposes to use this util where possible in PySpark tests. Note that there already appear to be a few places that change configurations for tests without restoring the original value in the unittest classes.
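As a side note, here is a hedged Scala sketch of the same set-then-restore idea, written as a loan pattern; a plain mutable map stands in for the SQL conf, and `withConf` is illustrative, not a Spark or PySpark API:
```scala
import scala.collection.mutable

object ConfUtil {
  // Set the given pairs, run the body, then restore whatever was there before.
  def withConf[T](conf: mutable.Map[String, String],
                  pairs: (String, String)*)(body: => T): T = {
    val saved = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally saved.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None)      => conf.remove(k)
    }
  }
}
```
Usage then mirrors the Python context manager: `ConfUtil.withConf(conf, "spark.blah.blah.blah" -> "blah") { /* test code */ }`.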
## How was this patch tested?
Likewise, manually tested via:
```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #20863 from HyukjinKwon/backport-20830.
(commit: c854b6ca7ba4dc33138c12ba4606ff8fbe82aef2)
The file was modified: python/pyspark/sql/tests.py (diff)
Commit 0b880db65b647e549b78721859b1712dff733ec9 by wenchen
[SPARK-23649][SQL] Skipping chars disallowed in UTF-8
## What changes were proposed in this pull request?
The mapping of a UTF-8 char's first byte to the char's size doesn't cover the whole range 0-255; it is defined only for 0-253:
https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java#L60-L65
https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java#L190
If the first byte of a char is in 253-255, an IndexOutOfBoundsException is thrown. Besides that, the values for 244-252 are not correct according to the recent Unicode standard for UTF-8:
http://www.unicode.org/versions/Unicode10.0.0/UnicodeStandard-10.0.pdf
As a consequence of the exception above, the length of an input string in UTF-8 encoding cannot be calculated if the string contains chars whose first byte is 253 or higher. On the user's side this is visible, for example, as a crash while inferring the schema of a CSV file that contains such chars, even though the file can be read if the schema is specified explicitly or if the mode is set to multiline.
The proposed changes build a correct mapping from the first byte of a UTF-8 char to its size (it now covers all cases) and skip disallowed chars (counting each as one octet).
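For intuition, a hedged sketch of that mapping, illustrative only and not the actual `UTF8String` lookup table:
```scala
object Utf8SizeSketch {
  // Map a char's first byte to the char's size in bytes; anything that is not
  // a valid leading byte (continuation bytes, 0xF8-0xFF) counts as one octet.
  def charSize(firstByte: Int): Int = (firstByte & 0xFF) match {
    case b if b < 0x80           => 1 // 0xxxxxxx: ASCII
    case b if (b & 0xE0) == 0xC0 => 2 // 110xxxxx
    case b if (b & 0xF0) == 0xE0 => 3 // 1110xxxx
    case b if (b & 0xF8) == 0xF0 => 4 // 11110xxx (the standard further restricts 0xF5-0xF7; elided here)
    case _                       => 1 // disallowed: skip a single octet instead of throwing
  }

  def main(args: Array[String]): Unit = {
    val bytes = Array(0x41, 0xC3, 0xA9, 0xFF) // 'A', 'é' (two bytes), disallowed 0xFF
    var i = 0
    var chars = 0
    while (i < bytes.length) { i += charSize(bytes(i)); chars += 1 }
    println(s"counted $chars chars") // counted 3 chars, no exception on 0xFF
  }
}
```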
## How was this patch tested?
Added a test and a file with a char that is disallowed in UTF-8: 0xFF.
Author: Maxim Gekk <maxim.gekk@databricks.com>
Closes #20796 from MaxGekk/skip-wrong-utf8-chars.
(cherry picked from commit 5e7bc2acef4a1e11d0d8056ef5c12cd5c8f220da)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 0b880db65b647e549b78721859b1712dff733ec9)
The file was modified: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java (diff)
The file was modified: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java (diff)