v1.4.0

@weiting-chen released this 19 Jun 09:43
50dd117

Release Notes - Gluten version 1.4.0

Highlights

  • Supported Spark versions: 3.2.2, 3.3.1, 3.4.4 (upgraded), 3.5.2
  • Add support for more Spark functions, including date_format, make_date, map_filter, map_concat, from_json, btrim, array_append, and more
  • Add support for more Spark operators, including Range, CollectLimit, and more
  • Update OAP's Velox codebase to 2025/05/12
  • Join optimizations: BNLJ full outer join
  • Shuffle optimizations: RSS ShuffleReader optimization and bug fixes
  • RSS: Celeborn 0.5.4 (upgraded), Uniffle 0.9.2 (upgraded)
  • Query Plan: RAS cost model optimizations and refactoring
  • Datalake: Add Iceberg/Hudi tests
  • CI: Docker image and JDK version updates
  • Support dynamically adjusting the Stage Resource Profile
  • Support Query Trace
  • Add Qualification Tool
  • Fix OOM issues caused by some untracked memory
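
Several of the PR titles under "What's Changed" name Spark configuration keys (spark.gluten.enabled, spark.gluten.auto.adjustStageResource.enabled, spark.gluten.sql.columnar.partial.project). As a rough sketch only, the snippet below shows one way to turn such keys into `--conf` arguments for spark-submit. The helper name `to_spark_submit_args` and all values are hypothetical and chosen for illustration; consult the Gluten documentation for the actual defaults and any additional settings a deployment needs.

```python
# Config keys are quoted from PR titles in this release.
# The values are illustrative assumptions, not recommended settings.
GLUTEN_CONF = {
    "spark.gluten.enabled": "true",
    "spark.gluten.auto.adjustStageResource.enabled": "true",
    "spark.gluten.sql.columnar.partial.project": "true",
}


def to_spark_submit_args(conf):
    """Turn a {key: value} mapping into a flat list of `--conf key=value` args.

    Hypothetical helper: keys are emitted in sorted order so the output is stable.
    """
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


if __name__ == "__main__":
    # Print the arguments as a single string that could follow `spark-submit`.
    print(" ".join(to_spark_submit_args(GLUTEN_CONF)))
```

Keeping the settings in a plain mapping makes it easy to toggle one feature at a time when comparing behavior before and after an upgrade.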

What's Changed

  • [GLUTEN-8327][CORE][Part-3] Introduce the ConfigEntry to gluten config by @yikf in #8431
  • [VL] Fix wrong warning of "Memory overhead is set to ..." under default Spark config settings by @zhztheplayer in #8448
  • [GLUTEN-8385][VL] Support write compatible-hive bucket table for Spark3.4 and Spark3.5. by @yikf in #8386
  • Revert "[CH] Disable gluten arm ci" by @lwz9103 in #8460
  • [GLUTEN-8453] [VL] Allow Heavy Batch to be Processed by ColumnarCachedBatchSerializer by @ArnavBalyan in #8454
  • [CH] Add tools to dump ActionsDAG into tree graph by @taiyang-li in #8461
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_08) by @GlutenPerfBot in #8457
  • [VL] Update document of build gluten in Docker by @FelixYBW in #8459
  • [GLUTEN-8462][CORE] Raise a meaningful error when no component is found from classpath by @zhztheplayer in #8468
  • [GLUTEN-8453][VL] Follow-up to #8454 to add an ensureVeloxBatch API for limited use cases by @zhztheplayer in #8463
  • [VL] Refactor Velox.md by @FelixYBW in #8478
  • [GLUTEN-8465] [VL] Bump Celeborn to 0.5.3 by @SteNicholas in #8467
  • [GLUTEN-8455][VL] Fallback Scan for Encrypted Parquet Files by @ArnavBalyan in #8456
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_09) by @GlutenPerfBot in #8472
  • [CORE] Refactor columnar noop write rule by @jackylee-ch in #8422
  • [GLUTEN-8462][CH] Fixed the loading of Components and Backend by @gleonSun in #8464
  • [GLUTEN-8414][VL] Override doCanonicalize in ColumnarPartialProjectEx… by @lifulong in #8415
  • [GLUTEN-8397][CH][Part-2] Fix static_cast failed on macOS by @yxheartipp in #8485
  • [GLUTEN-8343][CH]Fix cast number to decimal and improve performance of it by @KevinyhZou in #8351
  • [GLUTEN-8481][VL] Clean up shuffle reader cpp code by @marin-ma in #8482
  • [Core] Bump version to 1.4.0-SNAPSHOT by @weiting-chen in #8452
  • [GLUTEN-8483][CORE] A stable and universal way to find component files by @zhztheplayer in #8486
  • [DOC][VL] Fix typo in microbenchmark.md by @marin-ma in #8495
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250110) by @kyligence-git in #8490
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_10) by @GlutenPerfBot in #8489
  • [GLUTEN-8476][VL] Fix allocate and free memory by @jkhaliqi in #8477
  • [GLUTEN-8503][VL] Fix macro parenthesis CVE by @jkhaliqi in #8504
  • [GLUTEN-8471][VL] Fix usage of uninitialized variables by @jkhaliqi in #8470
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_11) by @GlutenPerfBot in #8507
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_12) by @GlutenPerfBot in #8508
  • [GLUTEN-8497][VL] A bad test case that fails columnar table cache query by @zhztheplayer in #8498
  • [DOC] Update README.md by @PHILO-HE in #8444
  • [GLUTEN-8319][VL] Support date_format Spark function by @PHILO-HE in #8323
  • [GLUTEN-8487][VL] adding JDK11 based Centos8 image by @zhouyuan in #8513
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_14) by @GlutenPerfBot in #8522
  • [GLUTEN-8020][VL] Remove the libhdfs3 installation script required for static linking by @JkSelf in #8013
  • [GLUTEN-8532][VL] Fix parenthesis within macro by @jkhaliqi in #8533
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_15) by @GlutenPerfBot in #8536
  • [CORE] Use RAS's cost model for legacy transition planner to evaluate cost of transitions by @zhztheplayer in #8527
  • [GLUTEN-8487][VL] adding JDK17 based Centos8 image (#8513) by @zhouyuan in #8539
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250115) by @kyligence-git in #8537
  • [GLUTEN-8479][CORE][Part-1] Remove unnecessary config by @yikf in #8480
  • [GLUTEN-8520][VL] Fix bitwise operators by @jkhaliqi in #8521
  • [GLUTEN-8524][VL] Fix input output errors by @jkhaliqi in #8525
  • [GLUTEN-6876][VL] update spark 3.5.2 in doc by @FelixYBW in #8543
  • [GLUTEN-8455][VL] Port encrypted file checks to shim layer by @ArnavBalyan in #8501
  • [CORE][VL] Cost model code refactors by @zhztheplayer in #8541
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_16) by @GlutenPerfBot in #8546
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250116) by @kyligence-git in #8544
  • [GLUTEN-8432][CH]Remove duplicate output attributes of aggregate's child by @lgbo-ustc in #8450
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_17) by @GlutenPerfBot in #8553
  • [GLUTEN-8497][CORE] A unified CallInfo API to replace AdaptiveContext by @zhztheplayer in #8551
  • [GLUTEN-8529][CH]Fix get_json_object when path has asterisk by @KevinyhZou in #8540
  • [MINOR] Fix comment of function VeloxAggregateFunctionsBuilder.create by @zml1206 in #8549
  • [CORE] Optimize duplicated code for create rel node by @zml1206 in #8548
  • [GLUTEN-7706][CORE] Support Spark-344 + JDK17 by @zhouyuan in #7789
  • [GLUTEN-8475][VL] Fix C-style casts to C++-style by @jkhaliqi in #8474
  • [GLUTEN-8534][VL] Fix allowing loops to iterate beyond end of array by @jkhaliqi in #8535
  • [GLUTEN-8538][VL] Fix incorrect calculation of buffer size by @jkhaliqi in #8542
  • [CORE][CH] Support MicroBatchScanExec with KafkaScan in batch mode by @loneylee in #8321
  • [CORE][MIRROR] Change config.defaultValue.get.toString to config.defaultValueString by @jackylee-ch in #8572
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_18) by @GlutenPerfBot in #8561
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_19) by @GlutenPerfBot in #8563
  • [GLUTEN-8406][CH] Replace from_json(s, 'Map<String, String>')[k] with get_json_object(s, '$.k') by @lgbo-ustc in #8409
  • [GLUTEN-8479][CORE][Part-2] All configurations should be defined through ConfigEntry by @yikf in #8559
  • [VL] CMake configuration cleanup to remove variable VELOX_COMPONENTS_PATH by @zhztheplayer in #8579
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250121) by @kyligence-git in #8577
  • [DOC] Fix outdated operators in documentation by @ArnavBalyan in #8582
  • [GLUTEN-8379][VL] Support query trace by @jinchengchenghh in #8380
  • [GLUTEN-8266][VL][CI] Pre-install uniffle in docker image by @zhouyuan in #8578
  • [VL] Update the Scaladoc of Component API by @zhztheplayer in #8589
  • [GLUTEN-8455][VL] Support encrypted parquet fallback for 3.5 by @ArnavBalyan in #8560
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_22) by @GlutenPerfBot in #8587
  • [GLUTEN-8580][CORE][Part-1] Clean up unnecessary code related to input file expression by @zml1206 in #8584
  • [GLUTEN-8379][VL] Fix typo in query trace document by @jinchengchenghh in #8590
  • [GLUTEN-8580][CORE][Part-2] Don't validate project generated by PushDownInputFileExpression by @zml1206 in #8585
  • [GLUTEN-3620][VL] Support Range operator for Velox Backend by @ArnavBalyan in #8161
  • [CORE] Bump iceberg version of spark 3.3 to 1.5.0 by @j7nhai in #8418
  • [GLUTEN-7544][CORE] Add Qualification Tool by @srinivasst in #8484
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_23) by @GlutenPerfBot in #8594
  • [GLUTEN-8565][VL] Remove unused code in velox batches by @ArnavBalyan in #8602
  • [VL] Remove override of test in GlutenDynamicPartitionPruningSuite by @acvictor in #8575
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_24) by @GlutenPerfBot in #8603
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250124) by @kyligence-git in #8604
  • [GLUTEN-8018][VL] Support adjusting stage resource profile dynamically by @zjuwangg in #8209
  • [GLUTEN-8410][VL] Support null type in HashAggregate by @WangGuangxin in #8411
  • [GLUTEN-8581][VL] Fix Spark legacy date formatter under case insensitive configuration by @weixiuli in #8583
  • [GLUTEN-8609][VL] Remove force cleanup in build_velox.sh by @marin-ma in #8610
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_25) by @GlutenPerfBot in #8613
  • [CH][DOC] Fix Maven Build Gluten ClickHouse Command by @jlfsdtc in #8622
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_26) by @GlutenPerfBot in #8618
  • [GLUTEN-8611][VL] Set VELOX_GFLAGS_TYPE by checking GLUTEN_VCPKG_ENABLED in build_velox.sh by @marin-ma in #8612
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_27) by @GlutenPerfBot in #8625
  • [GLUTEN-8627][VL] Fix cpp build and build script on MacOS by @marin-ma in #8628
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_28) by @GlutenPerfBot in #8629
  • [VL] Update document, remove the experimental word for spark.gluten.enabled by @FelixYBW in #8615
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_29) by @marin-ma in #8636
  • [GLUTEN-8644][CI] Bump version of upload-artifact/download-artifact to v4 by @marin-ma in #8645
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_01_31) by @GlutenPerfBot in #8642
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250201) by @kyligence-git in #8647
  • [VL][MIRROR] Fix build failed on macOS with INSTALL_PREFIX not set by @jackylee-ch in #8654
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_01) by @GlutenPerfBot in #8646
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250204) by @kyligence-git in #8658
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_04) by @GlutenPerfBot in #8657
  • [GLUTEN-8623][CH] Support File meta and row index for parquet by @baibaichen in #8624
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_05) by @GlutenPerfBot in #8661
  • [GLUTEN-8631][UNIFFLE] Bump Uniffle to 0.9.2 by @SteNicholas in #8632
  • [GLUTEN-8655][CH] Refactor: remove clickhouse.lib.path by @baibaichen in #8656
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250205) by @kyligence-git in #8660
  • [GLUTEN-8574][VL]CI: adding Spark-344 unit tests on JDK8 and adding Spark-352 unit tests on JDK17 by @zhouyuan in #8591
  • [VL] Bump GHA upload/restore action by @zhouyuan in #8672
  • [VL] nit: Remove shadowed variables in SubstraitToVeloxPlan.cc by @zhztheplayer in #8677
  • Bump junit:junit from 4.12 to 4.13.1 in /tools/qualification-tool by @dependabot in #8667
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250207) by @kyligence-git in #8681
  • [VL] Enable make_date function by @zhli1142015 in #8683
  • [GLUTEN-8616] [VL] Make filescan limit for encrypted fallback configurable by @ArnavBalyan in #8621
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_06) by @GlutenPerfBot in #8664
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_07) by @GlutenPerfBot in #8680
  • [GLUTEN-8678] Fix jar name on macos by @marin-ma in #8679
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250208) by @kyligence-git in #8688
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_08) by @GlutenPerfBot in #8687
  • [GLUTEN-8689][VL] Enable some test cases in GlutenSQLQueryTestSuite by @marin-ma in #8690
  • [GLUTEN-8685][VL] Add null check to avoid core dump when rss push partition data size is large by @zjuwangg in #8686
  • [GLUTEN-8479][CORE][Part-3] Split backend configs to its corresponding modules by @yikf in #8586
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_09) by @GlutenPerfBot in #8691
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250210) by @kyligence-git in #8695
  • [GLUTEN-8598][CH] Fix diff for cast string to long by @exmy in #8701
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_11) by @GlutenPerfBot in #8697
  • [CORE-8569][CH] Support DeltaOptimizedWriterTransformer by @loneylee in #8570
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250211) by @kyligence-git in #8698
  • [VL] Fix timezone for Parquet timestamp write by @rui-mo in #8317
  • [GLUTEN-8675][CH] Rewrite union of multiple aggregates into one by @lgbo-ustc in #8676
  • [VL] Skip the velox download if velox_branch not exists by @FelixYBW in #8682
  • [GLUTEN-6067] Open spark 35 ut by @baibaichen in #8555
  • [GLUTEN-8696][CH] Fix arm building of benchmark by @taiyang-li in #8703
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_12) by @GlutenPerfBot in #8711
  • [GLUTEN-8528][CH]Support approx_count_distinct by @taiyang-li in #8550
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250213) by @kyligence-git in #8717
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_13) by @GlutenPerfBot in #8719
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250214) by @kyligence-git in #8726
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_14) by @GlutenPerfBot in #8725
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250215) by @kyligence-git in #8735
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_15) by @GlutenPerfBot in #8734
  • [GLUTEN-8705][CH] Enable MemorySpillScheduler by @lgbo-ustc in #8706
  • [GLUTEN-8492][CH] Offload RangeExec by @taiyang-li in #8518
  • [GLUTEN-8749][CH] Explicitly cast input data type for std::min by @yxheartipp in #8750
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_18) by @GlutenPerfBot in #8755
  • [GLUTEN-8704][CH] try accelerate some spark* function by optimizing tight loops by @taiyang-li in #8708
  • [GLUTEN-8723][CH] Fix slice unexpected exception by @taiyang-li in #8759
  • [GLUTEN-8151][CORE] Remove supportRangeExec api by @taiyang-li in #8760
  • [VL] Add config for whether to enable spill on Window by @liujiayi771 in #8766
  • [GLUTEN-8769][CH] Fix failed uts introduced by approx_count_distinct by @taiyang-li in #8765
  • [VL] nit: gluten-it: Fix non-POSIX shell warnings in centos-7-deps.sh by @zhztheplayer in #8753
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_19) by @GlutenPerfBot in #8768
  • [GLUTEN-8748][CH] Support function monotonically_increasing_id by @loneylee in #8771
  • [VL] Should convert kSpillReadBufferSize and kShuffleSpillDiskWriteBufferSize to number by @boneanxs in #8684
  • [VL] Skip PartialProjectRule if spark.gluten.sql.columnar.partial.project is false by @Yohahaha in #8773
  • [Gluten-8715][CH] Fix NaN diff by @zhanglistar in #8718
  • [GLUTEN-8779][VL][Minor] Code cleanup for native validation by @marin-ma in #8780
  • [VL] Add window spill metrics by @liujiayi771 in #8777
  • [GLUTEN-8699][CH]Metric for Shuffle Read Deserializer by @loudongfeng in #8700
  • [GLUTEN-8761][VL] Fix mode in UnsafeColumnarBuildSideRelation not being properly serialized by @zjuwangg in #8762
  • [GLUTEN-8434][CH] Function bloomFilterContains process improvement by @zhanglistar in #8435
  • [VL][DOC] Move irrelevant content from VeloxGlutenUI.md by @PHILO-HE in #8786
  • [GLUTEN-8668][VL] Support complex type in ColumnarPartialProject by @WangGuangxin in #8669
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250220) by @kyligence-git in #8783
  • [WIP][VL] Fix inconsistency issue of PartitionFile path unescaping & GPL issue by @yaooqinn in #8793
  • [GLUTEN-8788][CH] Avoid unnecessary const column materialization by @taiyang-li in #8789
  • [CH][CI] Parallel download clickhouse submodule by @lwz9103 in #8790
  • [GLUTEN-8709][VL] Support build on openEuler 24.03 LTS with Velox backend by @kevinw66 in #8710
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250221) by @kyligence-git in #8801
  • [GLUTEN-8795][CH] Support to use oss with gluten by @yxheartipp in #8796
  • [GLUTEN-8738][VL] Update GlutenSQLQueryTestSuite to exclude or overwrite failed queries for Spark3.5 by @marin-ma in #8739
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_22) by @GlutenPerfBot in #8810
  • [CH] add test for new native parquet reader by @liuneng1994 in #8797
  • [CH][DOC]Fix CMake Debug Configuration by @jlfsdtc in #8815
  • [VL] Add support for some Parquet write options to 3.4 / 3.5 to align with 3.2 / 3.3 by @zhztheplayer in #8816
  • [CORE] Add iceberg equality delete file proto definition by @liujiayi771 in #8778
  • [VL] Minor cleanups by @zhztheplayer in #8824
  • [GLUTEN-8794] Support logging as many fallback reasons as possible by @marin-ma in #8798
  • [GLUTEN-8721][VL] Native writer should keep the same compression with vanilla if hive.exec.compress.output is true by @yikf in #8722
  • [GLUTEN-8784][CH] Coalesce union of multiple scan-projects by @lgbo-ustc in #8785
  • [VL] Remove resolving ViewFs file path from scan validation by @PHILO-HE in #8829
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250226) by @kyligence-git in #8831
  • [DOC] Document how to debug Java/Scala and how to run a Java/Scala unit test by @PHILO-HE in #8841
  • [GLUTEN-8846][CH] [Part 0] Support reading Iceberg equality delete files by @baibaichen in #8847
  • [GLUTEN-8811][VL]Fix bucket scan when some partitionValue is empty by @jinchengchenghh in #8834
  • [CH][Arm] Disable -Wcast-qual warning to avoid errors with const qualifier loss in arm by @yxheartipp in #8850
  • Revert "[CORE] Change DISCLAIMER to DISCLAIMER-WIP" by @yaooqinn in #8845
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_02_25) by @GlutenPerfBot in #8826
  • [INFRA] Add missing license header for better ASF Policy compliance by @yaooqinn in #8860
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250301) by @kyligence-git in #8863
  • [GLUTEN-8738] Update GlutenSQLQueryTestSuite to match with the original file by @marin-ma in #8837
  • [VL] Enable spark function map_filter by @j7nhai in #8842
  • Revert "[VL] Enable spark function map_filter" by @baibaichen in #8869
  • [CH] optimize performance of sparkArraySort by @taiyang-li in #8844
  • [VL] Fix broken links for velox-backend-build-in-docker.md by @yaooqinn in #8875
  • [GLUTEN-8821][VL][DOC] Update scalar functions support and add automation script by @marin-ma in #8822
  • [GLUTEN-8565][VL] Support CollectLimit Operator by @ArnavBalyan in #8566
  • [DOC] Fix incorrect Gluten UI page title by @zjuwangg in #8879
  • [GLUTEN-8872][CH][Part-1] Support Delta Deletion Vectors read for CH backend by @zzcclp in #8873
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_02) by @GlutenPerfBot in #8867
  • [CORE] Make scalatest.testFailureIgnore configurable for convenience by @ccat3z in #8878
  • [GLUTEN-8859][CH] Take advantage of compareSubstrings to compare substrings by @lgbo-ustc in #8874
  • [DOC] Add doc about experimental feature using off-heap to store broadcast build relation by @zjuwangg in #8882
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250304) by @kyligence-git in #8887
  • [VL] Enable spark function map_filter by @j7nhai in #8883
  • [GLUTEN-8894][VL] Fix buffer overflow in jStringToCString on arm by @kevinw66 in #8895
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_04) by @GlutenPerfBot in #8886
  • [GLUTEN-5884][VL] change default load quantum to 8M for local SSD cache by @zhouyuan in #8880
  • [INFRA] Switch archive.a.o to closer.lua to avoid abuse while fetch spark resources by @yaooqinn in #8881
  • [GLUTEN-8836][CH] Fix partition values with escape char by @lwz9103 in #8840
  • Revert "[GLUTEN-5884][VL] change default load quantum to 8M for local… by @FelixYBW in #8900
  • [GLUTEN-8742][VL] Improve the cast validation logic on native side by @ArnavBalyan in #8743
  • [HOTFIX] Fix binary release name for Spark 3.5.2 with Scala 2.13 by @yaooqinn in #8901
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250305) by @kyligence-git in #8899
  • [GLUTEN-8909][VL] Allow dynamic configuration for spark.gluten.auto.adjustStageResource.enabled by @zjuwangg in #8910
  • [GLUTEN-8905][VL]Ignore some CSV flaky tests by @jinchengchenghh in #8906
  • [GLUTEN-8340][VL] Enable from_json function by @zhli1142015 in #8320
  • [CORE] Enlarge defaultRecursionLimit by pre-loading the protobuf class by @PHILO-HE in #8904
  • [CORE] Post messages to Gluten web UI only when it is enabled by @PHILO-HE in #8907
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250306) by @kyligence-git in #8917
  • [VL] Adding nightly release by @zhouyuan in #8915
  • [GLUTEN-8903][Function] Support btrim function by @xinghuayu007 in #8903
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_06) by @GlutenPerfBot in #8916
  • [VL] Use isType func to check type by @liujiayi771 in #8902
  • [GLUTEN-8799][VL]Support Iceberg with Gluten test framework by @jinchengchenghh in #8800
  • Add scala suffix to tar command by @yaooqinn in #8918
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250307) by @kyligence-git in #8925
  • [GLUTEN-8926][CH] MergeTree Parameter Configuration Optimization to Prevent Multithreading Competition for activeSession Being None by @gleonSun in #8927
  • [GLUTEN-8802][VL] Support build static/dynamic docker images for arm by @kevinw66 in #8803
  • [GLUTEN-8921][GLUTEN-8922][CH] Fix checkDecimalOverflowSparkOrNull and lead function by @lwz9103 in #8929
  • [DOC] Document Stage Level Resource Profile Adjustment feature by @zjuwangg in #8908
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250308) by @kyligence-git in #8936
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_07) by @GlutenPerfBot in #8924
  • [CI] fix spark352-scala2.13 test home by @zhouyuan in #8938
  • [GLUTEN-8313][VL] Enable json_array_length by @WangGuangxin in #8314
  • [GLUTEN-8802][VL] Add specific jdk version for CentOS 8 docker image by @kevinw66 in #8814
  • [GLUTEN-8633] [VL] Rewrite tests for Gluten ColumnarRange by @ArnavBalyan in #8634
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_10) by @GlutenPerfBot in #8942
  • [GLUTEN-8939] Fix IllegalAccessError when converting viewfs to hdfs by @wangyum in #8940
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_11) by @GlutenPerfBot in #8955
  • [INFRA] Derive from Apache Software Foundation Parent POM by @yaooqinn in #8930
  • [GLUTEN-8846][CH] [Part 1] Support Positional Deletes by @baibaichen in #8937
  • [GLUTEN-8872][CH][Part-2] Support Delta Deletion Vectors read for CH backend by @zzcclp in #8947
  • [VL] Enable map_concat function by @rui-mo in #8781
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_12) by @GlutenPerfBot in #8972
  • [CH] Simplify parsing substrait struct fields. by @lgbo-ustc in #8976
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250313) by @kyligence-git in #8979
  • [VL] Support casting varchar type to timestamp type by @PHILO-HE in #8357
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_13) by @GlutenPerfBot in #8978
  • [GLUTEN-8932][VL] Support all the Iceberg tests in folder source by @jinchengchenghh in #8952
  • [GLUTEN-8958][VL] Add offload rules for DeltaProjectExecTransformer and DeltaFilterExecTransformer by @dcoliversun in #8975
  • [VL] Update centos setup scripts to install new tzdata by @zhouyuan in #8988
  • [GLUTEN-8993][CELEBORN] Bump Celeborn version to 0.5.4 by @jackylee-ch in #8994
  • [VL] Support casting integral type to timestamp type by @PHILO-HE in #8593
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_14) by @GlutenPerfBot in #8989
  • [GLUTEN-3289][CH]Fix cast float to string by @KevinyhZou in #8092
  • [GLUTEN-8846][CH][Part 2] Add the test case for the iceberg MOR table with the equality deletion and the position deletion by @zzcclp in #8992
  • [CORE] Avoid ClassNotFoundException when loading a shaded protobuf class by @PHILO-HE in #8996
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250315) by @kyligence-git in #9007
  • [VL] Remove unnecessary build options by @PHILO-HE in #9021
  • [GLUTEN-8997][CH] Support regular expression delimiters for str_to_map by @lgbo-ustc in #8998
  • [GLUTEN-9019][VL] Add a check to fall back "cast decimal to timestamp" by @wForget in #9022
  • [GLUTEN-3620][VL] RangeExec support for fallback by user options by @ArnavBalyan in #8913
  • [GLUTEN-9015][VL] Support array_append function by @dcoliversun in #9016
  • [GLUTEN-8949][Core] Simplify synchronization from JniLibLoader by @ArnavBalyan in #8950
  • [GLUTEN-9027][VL] Make CI fail fast if native build job fails by @wForget in #9028
  • [GLUTEN-8995][CH] Fix column not found in row_number query by @KevinyhZou in #8999
  • [GLUTEN-9038][CH] Fix array_sort exception by @taiyang-li in #9040
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250318) by @kyligence-git in #9043
  • [VL][CI] Use a unified docker image for Spark tests by @PHILO-HE in #8605
  • [GLUTEN-8639][VL] Support casting from double/float to timestamp by @ArnavBalyan in #8640
  • [VL] Move hudi test package to org.apache.gluten.execution by @liujiayi771 in #9045
  • [GLUTEN-9032][CH] cast for values built from nothing type by @lgbo-ustc in #9042
  • [GLUTEN-8964][VL] Support BNLJ full outer join without condition by @WangGuangxin in #8965
  • [GLUTEN-9050][CH] Remove duplicated uri decode which causes runtime exceptions by @taiyang-li in #9051
  • [GLUTEN-8872][CH][Part-3] Fix reading bug for the update operation with the deletion vectors by @zzcclp in #9029
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_17) by @GlutenPerfBot in #9017
  • [GLUTEN-8945][VL] Pull out duplicate projections for HashProbe and FilterProject by @zml1206 in #8946
  • [GLUTEN-8437][VL] Fix the exception when verifying the PrestoPage header during the Presto deserialization process by @kerwin-zk in #9056
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_18) by @GlutenPerfBot in #9033
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250319) by @kyligence-git in #9053
  • [TEST] Remove useless param for runAndCompare by @zml1206 in #9048
  • [GLUTEN-8306][VL] Support GetStructField with scalar function as input by @WangGuangxin in #8606
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_19) by @GlutenPerfBot in #9063
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250320) by @kyligence-git in #9067
  • [VL] Add API for reserving global off-heap memory from Spark by @zhztheplayer in #9066
  • [GLUTEN-9010][VL] Fix GlutenCastSuite for Spark 34 and 35 by @ArnavBalyan in #9011
  • [GLUTEN-8948][VL] Fallback iceberg delete from scan by @jinchengchenghh in #8987
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_20) by @GlutenPerfBot in #9068
  • [GLUTEN-8956][VL] Add support for casting binary to string by @ArnavBalyan in #8957
  • [GLUTEN-9044][CH] Fix virtual columns in mergetree table by @lwz9103 in #9047
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_21) by @GlutenPerfBot in #9082
  • Revert "[GLUTEN-9032][CH] cast for values built from nothing type" by @lgbo-ustc in #9086
  • [VL] Minor refactor for cast expression validation by @PHILO-HE in #9084
  • [VL] Acquire off-heap global memory via the new API for off-heap broadcast exchange by @zhztheplayer in #9075
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_22) by @GlutenPerfBot in #9100
  • [VL][MINOR] Move HLL rewrite check to the beginning of the rule by @Yohahaha in #9071
  • [GLUTEN-9078][CORE] Simplify code of SoftAffinity by @WangGuangxin in #9079
  • [GLUTEN-8966][VL] Propagate HashAggregate's ignoreNullKeys when possible by @WangGuangxin in #8967
  • [CH][DOC] Update Clang to Clang-19 in CH backend by @jlfsdtc in #9110
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_24) by @GlutenPerfBot in #9105
  • [GLUTEN-8565][VL] Minor refactor for Columnar CollectLimit by @ArnavBalyan in #9097
  • [GLUTEN-9076][VL] Prioritize offloading supported hive udf in ColumnarPartialProject by @WangGuangxin in #9077
  • [TESTS] Disable Spark UI in some tests by @yaooqinn in #9109
  • [GLUTEN-9049][CH] Fix diff for cast complex type to string by @exmy in #9072
  • [GLUTEN-9093][CH] Add TryCastSuite for CH backend Spark 3.4 by @ArnavBalyan in #9094
  • [GLUTEN-9123][VL][CI] pin setuptools version in CI by @zhouyuan in #9124
  • [GLUTEN-9117][VL] Fix -DBUILD_BENCHMARKS=ON on macos by @marin-ma in #9118
  • [GLUTEN-8974][CH] Replace special join + aggregate case with any join by @lgbo-ustc in #9059
  • [GLUTEN-9120][VL][Minor] Remove s3 check for macOS in build script by @marin-ma in #9122
  • [GLUTEN-9076][VL][FOLLOWUP] Simplify code of HiveUDF by @WangGuangxin in #9127
  • [GLUTEN-9085][CH] Add UT for mergetree write stats by @lwz9103 in #9089
  • [VL] Optimize memory allocation for VeloxRssShuffleReader by @kerwin-zk in #9069
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_25) by @GlutenPerfBot in #9130
  • [CH] Fix kafka unstable ut by @loneylee in #9131
  • [VL][CI] Dump & upload logs for unit test to GitHub artifact by @yaooqinn in #9024
  • [VL][MIRROR] Migrate Velox runtime config flags to dynamic VeloxRuntime settings by @jackylee-ch in #9103
  • [VL] Enable filter push-down on nested field by @rui-mo in #7946
  • [VL] Account some C++ untracked memory allocations into Spark global off-heap memory by @zhztheplayer in #9115
  • [GLUTEN-9113][VL] Remove unused not_equal function mapping by @kevincmchen in #9114
  • [GLUTEN-9083][CH]Fix the nullability mismatch of nothing type by @lgbo-ustc in #9091
  • [VL] Rename some src-*/ source folders to src/ by @zhztheplayer in #9134
  • [VL] Improve native plan validation code by @PHILO-HE in #9092
  • [GLUTEN-1632][CH]Daily Update Clickhouse Version (20250326) by @kyligence-git in #9136
  • [GLUTEN-9057][VL] Avoid flatten unnecessary vector in ColumnarBatch.select by @WangGuangxin in #9058
  • [VL] Remove inlined children plan access during native validation of Exchange / CollectLimit by @zhztheplayer in #9145
  • [GLUTEN-9039][CH] Improve array_sort performance when only single argument is input by @taiyang-li in #9157
  • [VL] Fix reclaim size for shuffle by @yikf in #9143
  • [VL][CI] Add test for rss sort shuffle by @kerwin-zk in #9140
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_26) by @GlutenPerfBot in #9135
  • [VL] Use SPARK_COMPILE_VERSION instead of hardcoded for SparkShimDescriptor at compile time by @yaooqinn in #9132
  • [GLUTEN-9164][CH]Enable row group level bloom filter push down by @taiyang-li in #9165
  • [GLUTEN-9148] Fix shuffle file permission issue when using ColumnarShuffleManager by @wangyum in #9156
  • [VL] LocalPartitionWriter should discard the evict since it uses hash/sort evict by @yikf in #9167
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_27) by @GlutenPerfBot in #9147
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_28) by @GlutenPerfBot in #9161
  • [GLUTEN-6887][VL] Daily Update Velox Version (2025_03_29) by @GlutenPerfBot in #9172
  • [VL] Fix weekly build job by @PHILO-HE in #9168
  • [GLUTEN-9055][CH] Fix input_file_name diff from hive text table by @taiyang-li in #9142
  • [1.4][VL] update oap velox to gluten-1.4.0 by @weiting-chen in #9217
  • [1.4] preparing v1.4.0-rc0 release by @weiting-chen in #9260
  • [branch-1.4] Port PR #9200 #9320 #9368 #9209 #9262 by @weiting-chen in #9431
  • [branch-1.4][VL] Fix docker image name for 1.4 branch by @zhouyuan in #9479
  • [Branch-1.4][VL] Update Velox branch to make Velox compatible with old tzdata by @PHILO-HE in #9565
  • [Branch-1.4][VL] update docker image name for branch-1.4 by @zhouyuan in #9599
  • [branch-1.4][VL] Fix docker build by @PHILO-HE in #9607
  • [Branch-1.4][VL] Fix code ref in docker build by @PHILO-HE in #9614
  • [Branch-1.4][VL] Port fix: Fix build failure due to libelf vcpkg unavailable files (#9550) by @PHILO-HE in #9601
  • [Branch-1.4][GLUTEN-9383][VL] Backport: fix leak when growing capacity by @wForget in #9663
  • [Branch-1.4][VL] Port patch #9685: add default config.guess and config.sub by @PHILO-HE in #9707
  • [Branch-1.4] Port #9851 #9879 to fix release issue by @weiting-chen in #9895

Full Changelog: v1.3.0...v1.4.0