1. [SPARK-25591][PYSPARK][SQL][BRANCH-2.3] Avoid overwriting deserialized (commit: 70a99bac43fcc8a8ba148bec55ca318fbedf1546) (details)
2. [SPARK-26019][PYSPARK] Allow insecure py4j gateways (commit: 30a811b4b95448cc3a1236cc0902ab715320c457) (details)
Commit 70a99bac43fcc8a8ba148bec55ca318fbedf1546 by gurwls223
[SPARK-25591][PYSPARK][SQL][BRANCH-2.3] Avoid overwriting deserialized
## What changes were proposed in this pull request?
If we use accumulators in more than one UDF, it is possible to
overwrite deserialized accumulators and their values. We should check if
an accumulator was deserialized before overwriting it in accumulator
## How was this patch tested?
Added test.
Closes #23432 from viirya/SPARK-25591-2.3.
Authored-by: Liang-Chi Hsieh
Signed-off-by: Hyukjin Kwon
(commit: 70a99bac43fcc8a8ba148bec55ca318fbedf1546)
The file was modified python/pyspark/ (diff)
The file was modified python/pyspark/sql/ (diff)
Commit 30a811b4b95448cc3a1236cc0902ab715320c457 by gurwls223
[SPARK-26019][PYSPARK] Allow insecure py4j gateways
Spark always creates secure py4j connections between Java and Python,
but it also allows users to pass in their own connection. This restores
the ability for users to pass in an _insecure_ connection, though it
forces them to set the environment variable
'PYSPARK_ALLOW_INSECURE_GATEWAY=1', and it still issues a warning.
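The opt-in behaviour described above might look roughly like the following. This is a hedged sketch, not the code from the patch: `check_gateway` and its parameters are hypothetical, and only the environment variable name `PYSPARK_ALLOW_INSECURE_GATEWAY` comes from the commit message. A gateway without an auth token is rejected unless the variable is set to `1`, and even then a warning is emitted.

```python
import os
import warnings


def check_gateway(auth_token, environ=None):
    """Hypothetical guard: accept or reject a user-supplied gateway.

    `auth_token` stands in for whatever marks the connection as secure;
    `environ` defaults to the process environment and is injectable for tests.
    """
    environ = os.environ if environ is None else environ
    if auth_token:
        return  # secure gateway, nothing to check
    if environ.get("PYSPARK_ALLOW_INSECURE_GATEWAY", "0") != "1":
        raise ValueError(
            "Insecure gateway rejected; set PYSPARK_ALLOW_INSECURE_GATEWAY=1 "
            "to allow it."
        )
    # Explicit opt-in: allow the gateway but still warn.
    warnings.warn("Using an insecure py4j gateway.")
```

Passing the environment as a parameter keeps the sketch testable without mutating `os.environ`.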
Added test cases verifying the failure without the extra configuration,
and verifying things still work with an insecure configuration (in
particular, accumulators, as those were broken with an insecure py4j
gateway before).
For the tests, I added ways to create insecure gateways, but I tried to
put in protections to make sure they wouldn't get used incorrectly.
Closes #23337 from squito/SPARK-26019.
Authored-by: Imran Rashid
Signed-off-by: Hyukjin Kwon
(cherry picked from commit 1e99f4ec5d030b80971603f090afa4e51079c5e7)
Signed-off-by: Hyukjin Kwon
(commit: 30a811b4b95448cc3a1236cc0902ab715320c457)
The file was modified core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala (diff)
The file was modified python/pyspark/ (diff)
The file was modified python/pyspark/ (diff)
The file was modified python/pyspark/ (diff)
The file was modified core/src/main/scala/org/apache/spark/api/python/PythonGatewayServer.scala (diff)
The file was modified python/pyspark/ (diff)