SuccessChanges

Summary

  1. [SPARK-30312][SQL] Preserve path permission and acl when truncate table (details)
  2. [SPARK-29748][PYTHON][SQL] Remove Row field sorting in PySpark for version 3.6+ (details)
Commit b5bc3e12a629e547e32e340ee0439bc53745d862 by dhyun
[SPARK-30312][SQL] Preserve path permission and acl when truncate table
### What changes were proposed in this pull request?
This patch proposes to preserve the existing permissions/ACLs of paths when
truncating a table/partition.
### Why are the changes needed?
When Spark SQL truncates a table, it deletes the paths of the
table/partitions and then re-creates new ones. If permissions/ACLs were set
on the paths, they are lost. We should preserve the permissions/ACLs if
possible.
### Does this PR introduce any user-facing change?
Yes. When truncating a table/partition, Spark will keep the permissions/ACLs
of the paths.
### How was this patch tested?
Unit test.
Manual test:
1. Create a table.
2. Manually change its permissions/ACLs.
3. Truncate the table.
4. Check the permissions/ACLs.
```scala
val df = Seq(1, 2, 3).toDF
df.write.mode("overwrite").saveAsTable("test.test_truncate_table")
val testTable = spark.table("test.test_truncate_table")
testTable.show()
+-----+
|value|
+-----+
|    1|
|    2|
|    3|
+-----+
// hdfs dfs -setfacl ...
// hdfs dfs -getfacl ...
sql("truncate table test.test_truncate_table")
// hdfs dfs -getfacl ...
val testTable2 = spark.table("test.test_truncate_table")
testTable2.show()
+-----+
|value|
+-----+
+-----+
```
![Screen Shot 2019-12-30 at 3 12 15 PM](https://user-images.githubusercontent.com/68855/71604577-c7875a00-2b17-11ea-913a-ba88096d20ab.jpg)
Closes #26956 from viirya/truncate-table-permission.
Lead-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala (diff)
Commit f372d1cf4fff535bcd0b0be0736da18037457fde by cutlerb
[SPARK-29748][PYTHON][SQL] Remove Row field sorting in PySpark for
version 3.6+
### What changes were proposed in this pull request?
This removes the sorting of PySpark SQL Row fields, which were previously
sorted alphabetically by name, for Python versions 3.6 and above. Field
order will now match the order in which the fields were entered. Rows will
be used like tuples and are applied to a schema by position. For Python
versions < 3.6, the order of kwargs is not guaranteed, so fields will still
be sorted automatically as in previous versions of Spark.
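As a minimal illustration (not taken from the PR), the sketch below shows the described ordering on Python 3.6+ with this change applied; the field names and values are made up:
```python
from pyspark.sql import Row

# Fields now keep the order in which they were passed, instead of being
# sorted alphabetically by name.
row = Row(name="Alice", age=11)
print(row)      # Row(name='Alice', age=11); previously sorted to Row(age=11, name='Alice')
print(row[0])   # positional access follows the declared order: 'Alice'
print(row.age)  # access by name still works: 11
```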
### Why are the changes needed?
The sorting caused inconsistent behavior: local Rows could be applied to a
schema by matching names, but once serialized, a Row could only be used by
position, and the fields were possibly in a different order.
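To make the positional application concrete, here is a hedged PySpark sketch; the schema, session settings, and values are illustrative, and the output assumes a Spark build that includes this change:
```python
from pyspark.sql import SparkSession, Row
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.master("local[1]").appName("row-order-demo").getOrCreate()

schema = StructType([
    StructField("name", StringType()),
    StructField("age", IntegerType()),
])

# With field sorting removed, the Row's fields line up with the schema by
# position; with the old alphabetical sorting the same call would have put
# `age` first and mis-assigned the columns.
df = spark.createDataFrame([Row(name="Alice", age=11)], schema)
df.show()

spark.stop()
```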
### Does this PR introduce any user-facing change?
Yes, Row fields are no longer sorted alphabetically; they keep the order in
which they were entered. For Python < 3.6, `kwargs` cannot guarantee the
order as entered, so `Row`s will still be sorted automatically.
The environment variable "PYSPARK_ROW_FIELD_SORTING_ENABLED" can be set to
override the construction of `Row` and maintain compatibility with
Spark 2.x.
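A small, hypothetical example of the compatibility flag; it sets the variable before `pyspark` is imported so the setting is picked up regardless of when the flag is read, and a distributed job would likely need it set on the executors as well:
```python
import os

# Re-enable Spark 2.x-style alphabetical field sorting (assumed to be read
# when pyspark is imported or when a Row is constructed).
os.environ["PYSPARK_ROW_FIELD_SORTING_ENABLED"] = "true"

from pyspark.sql import Row

# With the legacy sorting re-enabled, fields are sorted by name as in Spark 2.x:
print(Row(name="Alice", age=11))  # expected: Row(age=11, name='Alice')
```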
### How was this patch tested?
Existing tests are run with PYSPARK_ROW_FIELD_SORTING_ENABLED=true, and a new
test with unsorted fields was added for Python 3.6+.
Closes #26496 from BryanCutler/pyspark-remove-Row-sorting-SPARK-29748.
Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
The file was modified python/run-tests.py (diff)
The file was modified python/pyspark/sql/types.py (diff)
The file was modified docs/pyspark-migration-guide.md (diff)
The file was modified python/pyspark/sql/tests/test_types.py (diff)