SuccessChanges

Summary

  1. [SPARK-23433][SPARK-25250][CORE][BRANCH-2.3] Later created TaskSet (commit: a1ca5663725c278b6e3785042348819a25496fe4) (details)
  2. [SPARK-26604][CORE][BACKPORT-2.4] Clean up channel registration for (commit: c45f8da3af6000645ee76544940a6bdc5477884b) (details)
Commit a1ca5663725c278b6e3785042348819a25496fe4 by irashid
[SPARK-23433][SPARK-25250][CORE][BRANCH-2.3] Later created TaskSet
should learn about the finished partitions
## What changes were proposed in this pull request?
This is an optional solution for #22806 .
#21131 firstly implement that a previous successful completed task from
zombie TaskSetManager could also succeed the active TaskSetManager,
which based on an assumption that an active TaskSetManager always exists
for that stage when this happen. But that's not always true as an active
TaskSetManager may haven't been created when a previous task succeed,
and this is the reason why #22806 hit the issue.
This pr extends #21131 's behavior by adding stageIdToFinishedPartitions
into TaskSchedulerImpl, which recording the finished partition whenever
a task(from zombie or active) succeed. Thus, a later created active
TaskSetManager could also learn about the finished partition by looking
into stageIdToFinishedPartitions and won't launch any duplicate tasks.
## How was this patch tested?
Add.
Closes #24007 from Ngone51/dev-23433-25250-branch-2.3.
Authored-by: wuyi <ngone_5451@163.com> Signed-off-by: Imran Rashid
<irashid@cloudera.com>
(commit: a1ca5663725c278b6e3785042348819a25496fe4)
The file was modifiedcore/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala (diff)
The file was modifiedcore/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala (diff)
The file was modifiedcore/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala (diff)
Commit c45f8da3af6000645ee76544940a6bdc5477884b by vanzin
[SPARK-26604][CORE][BACKPORT-2.4] Clean up channel registration for
StreamManager
## What changes were proposed in this pull request?
This is mostly a clean backport of
https://github.com/apache/spark/pull/23521 to branch-2.4
## How was this patch tested?
I've tested this with a hack in `TransportRequestHandler` to force
`ChunkFetchRequest` to get dropped.
Then making a number of `ExternalShuffleClient.fetchChunk` requests
(which `OpenBlocks` then `ChunkFetchRequest`) and closing out of my test
harness. A heap dump later reveals that the `StreamState` references are
unreachable.
I haven't run this through the unit test suite, but doing that now.
Wanted to get this up as I think folks are waiting for it for 2.4.1
Closes #24013 from abellina/SPARK-26604_cherry_pick_2_4.
Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com> Co-authored-by:
Alessandro Bellina <abellina@yahoo-inc.com> Signed-off-by: Marcelo
Vanzin <vanzin@cloudera.com>
(cherry picked from commit 216eeec2bd319f1d6a1228c9bc8d8a579d5e6571)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
(commit: c45f8da3af6000645ee76544940a6bdc5477884b)
The file was modifiedcommon/network-common/src/main/java/org/apache/spark/network/server/StreamManager.java (diff)
The file was modifiedcommon/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java (diff)
The file was modifiedcommon/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java (diff)
The file was modifiedcore/src/main/scala/org/apache/spark/network/netty/NettyBlockRpcServer.scala (diff)
The file was modifiedcommon/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java (diff)
The file was modifiedcommon/network-shuffle/src/test/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandlerSuite.java (diff)
The file was modifiedcommon/network-common/src/test/java/org/apache/spark/network/server/OneForOneStreamManagerSuite.java (diff)
The file was modifiedcommon/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java (diff)