Build #49049

Environment variables

Name | Value
ANDROID_HOME | /home/android-sdk/
AWS_ACCESS_KEY_ID | [*******]
AWS_SECRET_ACCESS_KEY | [*******]
BUILD_CAUSE | GHPRBCAUSE
BUILD_CAUSE_GHPRBCAUSE | true
BUILD_DISPLAY_NAME | #49049
BUILD_ID | 49049
BUILD_NUMBER | 49049
BUILD_TAG | jenkins-SparkPullRequestBuilder-K8s-49049
BUILD_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49049/
CLASSPATH | $CLASSPATH
DBUS_SESSION_BUS_ADDRESS | unix:path=/run/user/1001/bus
EXECUTOR_NUMBER | 3
GITHUB_OAUTH_KEY | [*******]
GIT_BRANCH | SPARK-35437
GIT_COMMIT | f61c9bf75331fa3bde378e7b115f4ba5773ac8e7
GIT_URL | https://github.com/apache/spark.git
HOME | /home/jenkins
HUDSON_HOME | /var/lib/jenkins
HUDSON_SERVER_COOKIE | 472906e9832aeb79
HUDSON_URL | https://amplab.cs.berkeley.edu/jenkins/
JAVA_HOME | /usr/java/latest
JENKINS_HOME | /var/lib/jenkins
JENKINS_SERVER_COOKIE | 472906e9832aeb79
JENKINS_URL | https://amplab.cs.berkeley.edu/jenkins/
JOB_BASE_NAME | SparkPullRequestBuilder-K8s
JOB_NAME | SparkPullRequestBuilder-K8s
JOB_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/
LANG | en_US.UTF-8
LOGNAME | jenkins
MOTD_SHOWN | pam
NODE_LABELS | research-jenkins-worker-04 ubuntu ubuntu20
NODE_NAME | research-jenkins-worker-04
OLDPWD | /home/jenkins
PATH | /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.6.3/bin/:/home/jenkins/gems/bin:/usr/local/go/bin:/home/jenkins/go-projects/bin:/home/jenkins/anaconda2/bin:/home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.6.3/bin/:/home/jenkins/gems/bin:/usr/local/go/bin:/home/jenkins/go-projects/bin:/home/jenkins/anaconda2/bin:$PATH
PWD | /home/jenkins
ROOT_BUILD_CAUSE | GHPRBCAUSE
ROOT_BUILD_CAUSE_GHPRBCAUSE | true
RUN_ARTIFACTS_DISPLAY_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49049/display/redirect?page=artifacts
RUN_CHANGES_DISPLAY_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49049/display/redirect?page=changes
RUN_DISPLAY_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49049/display/redirect
RUN_TESTS_DISPLAY_URL | https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49049/display/redirect?page=tests
SHELL | /bin/bash
SHLVL | 0
SSH_CLIENT | 192.168.10.11 37420 22
SSH_CONNECTION | 192.168.10.11 37420 192.168.10.24 22
USER | jenkins
WORKSPACE | /home/jenkins/workspace/SparkPullRequestBuilder-K8s
XDG_RUNTIME_DIR | /run/user/1001
XDG_SESSION_CLASS | user
XDG_SESSION_ID | 3
XDG_SESSION_TYPE | tty
_ | /usr/java/latest/bin/java
ghprbActualCommit | 1e13812eac9e44828e0bbb5b87ae29fab711eba1
ghprbActualCommitAuthor | sychen
ghprbActualCommitAuthorEmail | sychen@ctrip.com
ghprbAuthorRepoGitUrl | https://github.com/cxzl25/spark.git
ghprbCommentBody | null
ghprbCredentialsId | b7d94526-9e9b-435f-9275-d7dbf209f4a3
ghprbGhRepository | apache/spark
ghprbPullAuthorEmail |
ghprbPullAuthorLogin | cxzl25
ghprbPullAuthorLoginMention | @cxzl25
ghprbPullDescription | GitHub pull request #32583 of commit 1e13812eac9e44828e0bbb5b87ae29fab711eba1, no merge conflicts.
ghprbPullId | 32583
ghprbPullLink | https://github.com/apache/spark/pull/32583
ghprbPullLongDescription | ### What changes were proposed in this pull request?\r\nImprove partition filtering speed and reduce metastore pressure.\r\nWe can first pull all the partition names, filter by expressions, and then obtain detailed information about the corresponding partitions from the MetaStore Server.\r\n\r\n### Why are the changes needed?\r\nWhen `convertFilters` cannot take effect, cannot filter the queried partitions in advance on the hive MetaStore Server. At this time, `getAllPartitionsOf` will get all partition details.\r\n\r\nWhen the Hive client cannot use the server filter, it will first obtain the values of all partitions, and then filter.\r\n\r\nWhen we have a table with a lot of partitions and there is no way to filter it on the MetaStore Server, we will get all the partition details and filter it on the client side. This is slow and puts a lot of pressure on the MetaStore Server.\r\n\r\n\r\n\r\n\r\n### Does this PR introduce _any_ user-facing change?\r\nNo\r\n\r\n\r\n### How was this patch tested?\r\nAdd UT\r\n
ghprbPullTitle | [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side
ghprbSourceBranch | SPARK-35437
ghprbTargetBranch | master
ghprbTriggerAuthor |
ghprbTriggerAuthorEmail |
ghprbTriggerAuthorLogin |
ghprbTriggerAuthorLoginMention |
sha1 | origin/pr/32583/merge