1. [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Had… (commit: ded902c3a90a9340e551091d554245df5982590c)
2. [SPARK-26680][SPARK-25767][SQL][BACKPORT-2.3] Eagerly create inputVars (commit: 373a627e99666ff047d41c8797a21deee84e23b9)
Commit ded902c3a90a9340e551091d554245df5982590c by vanzin
[SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Had…
## What changes were proposed in this pull request?
Updates the attempt ID used by FileFormatWriter. Tasks in different stage
attempts can use the same task attempt number and could conflict. Using
Spark's task attempt ID guarantees that Hadoop TaskAttemptID instances are
unique.
This is a backport of d5a97c1 to the 2.3 branch.
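The collision described above can be illustrated with a small simulation. This is a sketch in Python, not Spark code: the `TaskAttempt` class, the `launch` helper, and the shared counter are hypothetical stand-ins, and it assumes that the attempt number restarts at 0 in each stage attempt while the task attempt ID comes from a counter shared by all task attempts.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TaskAttempt:
    stage_attempt: int    # which attempt of the stage this task belongs to
    partition: int        # partition the task writes
    attempt_number: int   # per-partition retry counter (assumed to restart per stage attempt)
    task_attempt_id: int  # assumed to be unique across all task attempts


# Hypothetical stand-in for an application-wide ID counter.
_next_id = count()


def launch(stage_attempt: int, partition: int, attempt_number: int) -> TaskAttempt:
    """Create a task attempt and give it the next globally unique ID."""
    return TaskAttempt(stage_attempt, partition, attempt_number, next(_next_id))


def key_by_attempt_number(t: TaskAttempt) -> tuple:
    """Identifier built from the attempt number (the old behaviour, as described)."""
    return (t.partition, t.attempt_number)


def key_by_task_attempt_id(t: TaskAttempt) -> tuple:
    """Identifier built from the task attempt ID (the new behaviour, as described)."""
    return (t.partition, t.task_attempt_id)


# The same partition, first try, in two different stage attempts.
first = launch(stage_attempt=0, partition=3, attempt_number=0)
retry = launch(stage_attempt=1, partition=3, attempt_number=0)

collides = key_by_attempt_number(first) == key_by_attempt_number(retry)
unique = key_by_task_attempt_id(first) != key_by_task_attempt_id(retry)
print(collides, unique)
```

In this model the two attempts are indistinguishable when keyed by attempt number and distinct when keyed by task attempt ID, which is the property the patch relies on.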
## How was this patch tested?
Existing tests. Also validated that we no longer detect this failure
case in our logs after deployment.
Closes #23640 from rdblue/SPARK-26682-backport-to-2.3.
Authored-by: Ryan Blue
Signed-off-by: Marcelo Vanzin
(commit: ded902c3a90a9340e551091d554245df5982590c)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
Commit 373a627e99666ff047d41c8797a21deee84e23b9 by dongjoon
[SPARK-26680][SPARK-25767][SQL][BACKPORT-2.3] Eagerly create inputVars
while conditions are appropriate
## What changes were proposed in this pull request?
Back port of #22789 and #23617 to branch-2.3
When a user passes a Stream to groupBy, ```CodegenSupport.consume```
ends up lazily generating ```inputVars``` from a Stream, since the field
```output``` will be a Stream. At the time the map operation over
```output``` is called, conditions are correct. However, by the time the map
operation actually executes, conditions are no longer appropriate. The
closure used by the map operation ends up using a reference to the
partially created ```inputVars```. As a result, a StackOverflowError
occurs.
This PR ensures that ```inputVars``` is eagerly created while conditions
are appropriate. It seems this was also an issue with the code path for
creating ```inputVars``` from ```outputVars``` (SPARK-25767). I simply
extended the solution for that code path to encompass both code paths.
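The difference between lazy and eager creation can be sketched outside Spark. The example below is a Python analogy, not the actual codegen: `Context`, `gen_var`, and both builder functions are hypothetical, and Python's lazy `map` stands in for mapping over a Scala Stream. It does not reproduce the StackOverflowError; it only shows that a lazily mapped sequence observes whatever state exists when it is finally consumed.

```python
class Context:
    """Hypothetical stand-in for state that is only valid for a limited time."""

    def __init__(self) -> None:
        self.valid = True


def gen_var(ctx: Context, name: str) -> tuple:
    # Record whether the context was still valid when this variable was built.
    return (name, ctx.valid)


def lazy_input_vars(ctx: Context, output: list):
    # map() is lazy: gen_var runs only when the result is iterated.
    return map(lambda name: gen_var(ctx, name), output)


def eager_input_vars(ctx: Context, output: list) -> list:
    # A list comprehension forces every element immediately.
    return [gen_var(ctx, name) for name in output]


output = ["a", "b", "c"]

# Lazy path: the state changes before anything is consumed.
ctx = Context()
lazy = lazy_input_vars(ctx, output)
ctx.valid = False
lazy_result = list(lazy)

# Eager path: every variable is built while the state is still valid.
ctx = Context()
eager = eager_input_vars(ctx, output)
ctx.valid = False
eager_result = list(eager)

print(lazy_result)
print(eager_result)
```

Forcing the collection at the point where it is defined is the general shape of the fix described here; how the patch does this inside ```CodegenSupport.consume``` is not shown in this sketch.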
## How was this patch tested?
SQL unit tests, new test, Python tests.
Closes #23642 from bersprockets/SPARK-26680_branch23.
Authored-by: Bruce Robbins
Signed-off-by: Dongjoon Hyun
(commit: 373a627e99666ff047d41c8797a21deee84e23b9)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala