1. [SPARK-24809][SQL] Serializing LongToUnsafeRowMap in executor may result (commit: 71eb7d4682a7e85e4de580ffe110da961d84817f) (details)
Commit 71eb7d4682a7e85e4de580ffe110da961d84817f by gatorsmile
[SPARK-24809][SQL] Serializing LongToUnsafeRowMap in executor may result
in data error
When the join key is a long or int in a broadcast join, Spark uses
`LongToUnsafeRowMap` to store the key-value pairs of the table that will
be broadcast. But when `LongToUnsafeRowMap` is broadcast to executors
and is too big to hold in memory, it is spilled to disk. At that point,
because `write` uses the variable `cursor` to determine how many bytes
of the `page` in `LongToUnsafeRowMap` to write out, and `cursor` is not
restored when deserializing, the executor writes nothing from the page
to disk.
## What changes were proposed in this pull request?
Restore the cursor value when deserializing.
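The failure mode above can be sketched as follows. This is a minimal, illustrative Scala example, not Spark's actual `LongToUnsafeRowMap` code; the class name `PageMap` and its fields are hypothetical stand-ins for the serialize-only-up-to-`cursor` pattern and the one-line fix:

```scala
import java.io.{Externalizable, ObjectInput, ObjectOutput}

// Illustrative sketch (not Spark's real code): only the first `cursor`
// bytes of `page` are serialized. If `cursor` is not restored on read,
// a later re-serialization (e.g. spilling to disk) writes zero bytes.
class PageMap extends Externalizable {
  var page: Array[Byte] = Array.empty
  var cursor: Int = 0 // number of bytes of `page` currently in use

  def this(data: Array[Byte]) = {
    this()
    page = data
    cursor = data.length
  }

  override def writeExternal(out: ObjectOutput): Unit = {
    out.writeInt(cursor)
    out.write(page, 0, cursor) // writes nothing when cursor == 0
  }

  override def readExternal(in: ObjectInput): Unit = {
    val len = in.readInt()
    page = new Array[Byte](len)
    in.readFully(page, 0, len)
    cursor = len // the fix: restore cursor; dropping this line
                 // reproduces the bug described above
  }
}
```

Without the `cursor = len` line, a deserialized copy round-trips to an empty page, which is the data-loss symptom the patch addresses.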
Author: liulijia <>
Closes #21772 from liutang123/SPARK-24809.
(cherry picked from commit 2c54aae1bc2fa3da26917c89e6201fb2108d9fab)
Signed-off-by: Xiao Li <>
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala (diff)