SuccessChanges

Summary

  1. [SPARK-27672][SQL] Add `since` info to string expressions (commit: 0aa1e970d68f17b84db28f8f36c9749e3b116c74) (details)
  2. [SPARK-27673][SQL] Add `since` info to random, regex, null expressions (commit: 0d5533c3543cb2c75fce9248dfd55590497a271b) (details)
  3. [SPARK-27347][MESOS] Fix supervised driver retry logic for outdated tasks (commit: fd177267e185c493a076f8e245bd6d576f7bebf2) (details)
Commit 0aa1e970d68f17b84db28f8f36c9749e3b116c74 by dhyun
[SPARK-27672][SQL] Add `since` info to string expressions
This PR adds `since` information to all the string expressions below:
SPARK-8241 ConcatWs
SPARK-16276 Elt
SPARK-1995 Upper / Lower
SPARK-20750 StringReplace
SPARK-8266 StringTranslate
SPARK-8244 FindInSet
SPARK-8253 StringTrimLeft
SPARK-8260 StringTrimRight
SPARK-8267 StringTrim
SPARK-8247 StringInstr
SPARK-8264 SubstringIndex
SPARK-8249 StringLocate
SPARK-8252 StringLPad
SPARK-8259 StringRPad
SPARK-16281 ParseUrl
SPARK-9154 FormatString
SPARK-8269 Initcap
SPARK-8257 StringRepeat
SPARK-8261 StringSpace
SPARK-8263 Substring
SPARK-21007 Right
SPARK-21007 Left
SPARK-8248 Length
SPARK-20749 BitLength
SPARK-20749 OctetLength
SPARK-8270 Levenshtein
SPARK-8271 SoundEx
SPARK-8238 Ascii
SPARK-20748 Chr
SPARK-8239 Base64
SPARK-8268 UnBase64
SPARK-8242 Decode
SPARK-8243 Encode
SPARK-8245 format_number
SPARK-16285 Sentences
N/A
Closes #24578 from HyukjinKwon/SPARK-27672.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 3442fcaa9bbe2e9306ef33a655fb6d1fe75ceb47)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 0aa1e970d68f17b84db28f8f36c9749e3b116c74)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala (diff)
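For context, the change itself is mechanical: each expression's `ExpressionDescription` annotation gains a `since` value recording the Spark version in which the expression first appeared. A minimal sketch of the resulting shape, using `Upper` with illustrative field values rather than the exact diff from this commit:
```scala
// Sketch only: illustrative annotation values, not the literal patch.
@ExpressionDescription(
  usage = "_FUNC_(str) - Returns `str` with all characters changed to uppercase.",
  since = "1.0.1")
case class Upper(child: Expression)
  extends UnaryExpression with String2StringExpression {
  // ... expression implementation unchanged; only the annotation gains `since` ...
}
```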
Commit 0d5533c3543cb2c75fce9248dfd55590497a271b by dhyun
[SPARK-27673][SQL] Add `since` info to random, regex, null expressions
We should add `since` info to all expressions.
SPARK-7886 Rand / Randn
https://github.com/apache/spark/commit/af3746ce0d724dc624658a2187bde188ab26d084 RLike, Like (I manually checked that it exists from 1.0.0)
SPARK-8262 Split
SPARK-8256 RegExpReplace
SPARK-8255 RegExpExtract
https://github.com/apache/spark/commit/9aadcffabd226557174f3ff566927f873c71672e Coalesce / IsNull / IsNotNull (I manually checked that it exists from 1.0.0)
SPARK-14541 IfNull / NullIf / Nvl / Nvl2
SPARK-9080 IsNaN
SPARK-9168 NaNvl
N/A
Closes #24579 from HyukjinKwon/SPARK-27673.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit c71f217de1e0b2265f585369aa556ed26db98589)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 0d5533c3543cb2c75fce9248dfd55590497a271b)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala (diff)
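One way to spot-check the result of both commits is to look at the extended function description from a small driver program. This is a sketch that assumes a local SparkSession and a release in which the `since` value is surfaced in the extended description:
```scala
import org.apache.spark.sql.SparkSession

object CheckSinceInfo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("since-check").getOrCreate()
    // The extended description lists usage and examples and, when `since` is set,
    // a "Since:" line with the version the expression was introduced in.
    spark.sql("DESCRIBE FUNCTION EXTENDED isnull").show(truncate = false)
    spark.sql("DESCRIBE FUNCTION EXTENDED regexp_replace").show(truncate = false)
    spark.stop()
  }
}
```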
Commit fd177267e185c493a076f8e245bd6d576f7bebf2 by dhyun
[SPARK-27347][MESOS] Fix supervised driver retry logic for outdated tasks
## What changes were proposed in this pull request?
This patch fixes a bug where `--supervised` Spark jobs would retry
multiple times whenever an agent crashed, came back, and re-registered,
even when those jobs had already been relaunched on a different agent.
That is:
```
- supervised driver is running on agent1
- agent1 crashes
- driver is relaunched on another agent as `<task-id>-retry-1`
- agent1 comes back online and re-registers with scheduler
- spark relaunches the same job as `<task-id>-retry-2`
- now there are two jobs running simultaneously
```
This happened because when an agent came back and re-registered, it would
send a `TASK_FAILED` status update for its old driver task. The previous
logic would indiscriminately remove the `submissionId` from ZooKeeper's
`launchedDrivers` node and add it to the `retryList` node. Then, when a new
offer came in, it would launch another `-retry-` task even though one
was already running.
For example logs, see the snippets at the bottom.
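In essence, the fix is to act on a terminal status update only when it refers to the task currently tracked for that submission, so a late `TASK_FAILED` from a re-registered agent cannot queue a second retry. The following self-contained sketch uses simplified data structures and a hypothetical `onTaskFailed` handler to illustrate that guard; it is not the actual `MesosClusterScheduler` patch:
```scala
import scala.collection.mutable

object RetryGuardSketch {
  // Simplified stand-in for the scheduler's per-driver state (hypothetical).
  final case class DriverState(submissionId: String, currentTaskId: String)

  val launchedDrivers = mutable.Map[String, DriverState]() // submissionId -> tracked task
  val retryList = mutable.ListBuffer[String]()             // drivers queued for relaunch

  def onTaskFailed(taskId: String): Unit = {
    // Recover the submission ID by stripping any "-retry-N" suffix (illustrative only).
    val submissionId = taskId.replaceAll("-retry-\\d+$", "")
    launchedDrivers.get(submissionId) match {
      case Some(state) if state.currentTaskId == taskId =>
        // The task we are actually tracking failed: queue exactly one retry.
        launchedDrivers.remove(submissionId)
        retryList += submissionId
      case Some(state) =>
        // Stale update for an outdated task: the driver was already relaunched, ignore it.
        println(s"Ignoring outdated task $taskId; current task is ${state.currentTaskId}")
      case None =>
        println(s"Unable to find driver for task $taskId")
    }
  }

  def main(args: Array[String]): Unit = {
    launchedDrivers("driver-0001") = DriverState("driver-0001", "driver-0001-retry-1")
    onTaskFailed("driver-0001")         // late TASK_FAILED from the re-registered agent: ignored
    onTaskFailed("driver-0001-retry-1") // the tracked task failed: one retry is queued
    println(retryList)                  // ListBuffer(driver-0001)
  }
}
```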
## How was this patch tested?
- Added a unit test to simulate behavior described above
- Tested manually on a DC/OS cluster by
```
- launching a --supervised spark job
- dcos node ssh <to the agent with the running spark-driver>
- systemctl stop dcos-mesos-slave
- docker kill <driver-container-id>
- [ wait until spark job is relaunched ]
- systemctl start dcos-mesos-slave
- [ observe spark driver is not relaunched as `-retry-2` ]
```
Log snippets are included below. Notice that the `-retry-1` task is already
running when the status update for the old task comes in afterward:
```
19/01/15 19:21:38 TRACE MesosClusterScheduler: Received offers from Mesos:
... [offers] ...
19/01/15 19:21:39 TRACE MesosClusterScheduler: Using offer 5d421001-0630-4214-9ecb-d5838a2ec149-O2532 to launch driver driver-20190115192138-0001 with taskId: value: "driver-20190115192138-0001"
...
19/01/15 19:21:42 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001 state=TASK_STARTING message=''
19/01/15 19:21:43 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001 state=TASK_RUNNING message=''
...
19/01/15 19:29:12 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001 state=TASK_LOST message='health check timed out' reason=REASON_SLAVE_REMOVED
...
19/01/15 19:31:12 TRACE MesosClusterScheduler: Using offer 5d421001-0630-4214-9ecb-d5838a2ec149-O2681 to launch driver driver-20190115192138-0001 with taskId: value: "driver-20190115192138-0001-retry-1"
...
19/01/15 19:31:15 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001-retry-1 state=TASK_STARTING message=''
19/01/15 19:31:16 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001-retry-1 state=TASK_RUNNING message=''
...
19/01/15 19:33:45 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001 state=TASK_FAILED message='Unreachable agent re-reregistered'
...
19/01/15 19:33:45 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001 state=TASK_FAILED message='Abnormal executor termination: unknown container' reason=REASON_EXECUTOR_TERMINATED
19/01/15 19:33:45 ERROR MesosClusterScheduler: Unable to find driver with driver-20190115192138-0001 in status update
...
19/01/15 19:33:47 TRACE MesosClusterScheduler: Using offer 5d421001-0630-4214-9ecb-d5838a2ec149-O2729 to launch driver driver-20190115192138-0001 with taskId: value: "driver-20190115192138-0001-retry-2"
...
19/01/15 19:33:50 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001-retry-2 state=TASK_STARTING message=''
19/01/15 19:33:51 INFO MesosClusterScheduler: Received status update: taskId=driver-20190115192138-0001-retry-2 state=TASK_RUNNING message=''
```
Closes #24276 from samvantran/SPARK-27347-duplicate-retries.
Authored-by: Sam Tran <stran@mesosphere.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit bcd3b61c4be98565352491a108e6394670a0f413)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: fd177267e185c493a076f8e245bd6d576f7bebf2)
The file was modified resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala (diff)
The file was modified resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala (diff)