hudi.git
24 hours ago[MINOR] Fix deploy script for flink 1.15 (#6872) master
Shiyan Xu [Thu, 6 Oct 2022 02:52:38 +0000 (10:52 +0800)] 
[MINOR] Fix deploy script for flink 1.15 (#6872)

24 hours agoEnhancing README for multi-writer tests (#6870)
Sivabalan Narayanan [Thu, 6 Oct 2022 02:41:52 +0000 (19:41 -0700)] 
Enhancing README for multi-writer tests (#6870)

24 hours ago[HUDI-4970] Update kafka-connect readme and refactor HoodieConfig#create (#6857)
Sagar Sumit [Thu, 6 Oct 2022 02:41:35 +0000 (08:11 +0530)] 
[HUDI-4970] Update kafka-connect readme and refactor HoodieConfig#create (#6857)

25 hours agoRevert "[HUDI-4915] improve avro serializer/deserializer (#6788)" (#6809)
Yann Byron [Thu, 6 Oct 2022 02:40:55 +0000 (10:40 +0800)] 
Revert "[HUDI-4915] improve avro serializer/deserializer (#6788)" (#6809)

This reverts commit 79b3e2b899cc303490c22610fda0e5ac2013cf02.

40 hours ago[HUDI-4980] Calculate avg record size using commit only (#6864)
Shiyan Xu [Wed, 5 Oct 2022 10:59:19 +0000 (18:59 +0800)] 
[HUDI-4980] Calculate avg record size using commit only (#6864)

Calculate the average record size for the Spark upsert partitioner
from commit instants only. Previously the estimate also used
replacecommit instants. Those may be created by clustering, whose
average record sizes are inaccurately small, so the resulting
size underestimation could cause OOM.
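
A minimal sketch of the idea in this commit message, not Hudi's actual code. `InstantStats` and `averageRecordSize` are hypothetical names, and the fallback to a default size when no commit has records is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class AvgRecordSizeSketch {
    // Hypothetical summary of one completed instant; not a Hudi class.
    static final class InstantStats {
        final String action;        // e.g. "commit" or "replacecommit"
        final long bytesWritten;
        final long recordsWritten;

        InstantStats(String action, long bytesWritten, long recordsWritten) {
            this.action = action;
            this.bytesWritten = bytesWritten;
            this.recordsWritten = recordsWritten;
        }
    }

    // Average bytes per record over "commit" instants only.
    // Returns defaultSize when no commit instant has written any record.
    static long averageRecordSize(List<InstantStats> instants, long defaultSize) {
        long bytes = 0L;
        long records = 0L;
        for (InstantStats s : instants) {
            if (!"commit".equals(s.action)) {
                continue; // skip replacecommit (e.g. clustering) instants
            }
            bytes += s.bytesWritten;
            records += s.recordsWritten;
        }
        return records > 0 ? bytes / records : defaultSize;
    }

    public static void main(String[] args) {
        List<InstantStats> timeline = Arrays.asList(
            new InstantStats("commit", 1_000_000L, 1_000L),
            new InstantStats("replacecommit", 100_000L, 1_000L));
        // Only the commit instant counts: 1,000,000 bytes / 1,000 records.
        System.out.println(averageRecordSize(timeline, 1024L)); // prints 1000
    }
}
```

Had the replacecommit instant been included, the estimate in this example would drop to 550 bytes per record.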

47 hours ago[HOTFIX] Fix source release validate script (#6865)
Shiyan Xu [Wed, 5 Oct 2022 04:19:07 +0000 (12:19 +0800)] 
[HOTFIX] Fix source release validate script (#6865)

3 days ago[HUDI-4962] Move cloud dependencies to cloud modules (#6846)
Shiyan Xu [Mon, 3 Oct 2022 11:12:50 +0000 (19:12 +0800)] 
[HUDI-4962] Move cloud dependencies to cloud modules (#6846)

3 days ago[MINOR] Fix testUpdateRejectForClustering (#6852)
Zouxxyy [Mon, 3 Oct 2022 05:30:30 +0000 (13:30 +0800)] 
[MINOR] Fix testUpdateRejectForClustering (#6852)

3 days ago[HUDI-4966] Add a partition extractor to handle partition values with slashes (#6851)
Y Ethan Guo [Mon, 3 Oct 2022 05:06:45 +0000 (22:06 -0700)] 
[HUDI-4966] Add a partition extractor to handle partition values with slashes (#6851)

4 days ago[HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the...
Yann Byron [Sun, 2 Oct 2022 13:46:51 +0000 (21:46 +0800)] 
[HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the Row (#6805)

5 days ago[HUDI-4769] Option read.streaming.skip_compaction skips delta commit (#6848)
Danny Chan [Sun, 2 Oct 2022 01:08:03 +0000 (09:08 +0800)] 
[HUDI-4769] Option read.streaming.skip_compaction skips delta commit (#6848)

5 days ago[HUDI-4916] Implement change log feed for Flink (#6840)
Danny Chan [Sat, 1 Oct 2022 10:21:23 +0000 (18:21 +0800)] 
[HUDI-4916] Implement change log feed for Flink (#6840)

6 days ago[HUDI-4718] Add Kerberos kdestroy command support (#6810)
Zouxxyy [Fri, 30 Sep 2022 21:53:19 +0000 (05:53 +0800)] 
[HUDI-4718] Add Kerberos kdestroy command support (#6810)

6 days ago[HUDI-4957] Shade JOL in bundles to fix NoClassDefFoundError:GraphLayout (#6839)
Sagar Sumit [Fri, 30 Sep 2022 19:54:16 +0000 (01:24 +0530)] 
[HUDI-4957] Shade JOL in bundles to fix NoClassDefFoundError:GraphLayout (#6839)

6 days ago[HUDI-4850] Add incremental source from GCS to Hudi (#6665)
Pramod Biligiri [Fri, 30 Sep 2022 07:05:12 +0000 (12:35 +0530)] 
[HUDI-4850] Add incremental source from GCS to Hudi (#6665)

Adds an incremental source from GCS based on a similar design
as https://hudi.apache.org/blog/2021/08/23/s3-events-source

7 days ago[HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand (#6355)
冯健 [Thu, 29 Sep 2022 22:34:00 +0000 (06:34 +0800)] 
[HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand (#6355)

Co-authored-by: jian.feng <jian.feng@shopee.com>
7 days ago[HUDI-4237] Fixing empty partition-values being sync'd to HMS (#6821)
Alexey Kudinkin [Thu, 29 Sep 2022 16:07:56 +0000 (09:07 -0700)] 
[HUDI-4237] Fixing empty partition-values being sync'd to HMS (#6821)

Co-authored-by: dujunling <dujunling@bytedance.com>
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
7 days ago[HUDI-4308] READ_OPTIMIZED read mode will temporary loss of data when compaction...
aiden.dong [Thu, 29 Sep 2022 16:05:18 +0000 (00:05 +0800)] 
[HUDI-4308] READ_OPTIMIZED read mode will temporary loss of data when compaction (#6664)

Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
7 days ago[MINOR] Use base path URI in ITTestDataStreamWrite (#6826)
Sagar Sumit [Thu, 29 Sep 2022 15:29:14 +0000 (20:59 +0530)] 
[MINOR] Use base path URI in ITTestDataStreamWrite (#6826)

7 days ago[HUDI-4951] Fix incorrect use of Long.getLong() (#6828)
申胜利 [Thu, 29 Sep 2022 15:17:15 +0000 (23:17 +0800)] 
[HUDI-4951] Fix incorrect use of Long.getLong() (#6828)
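
The log does not show which call sites were changed. The sketch below only illustrates the general pitfall behind this kind of fix: `Long.getLong(String)` looks up a JVM system property by name, while `Long.parseLong(String)` parses the string itself. The class and method names are hypothetical.

```java
final class GetLongPitfall {
    // Looks up a system property NAMED s; returns null when no such property exists.
    static Long viaGetLong(String s) {
        return Long.getLong(s);
    }

    // Parses the decimal digits in s.
    static long viaParseLong(String s) {
        return Long.parseLong(s);
    }

    public static void main(String[] args) {
        // Assumes no system property called "42" is set in this JVM.
        System.out.println(viaGetLong("42"));   // prints null
        System.out.println(viaParseLong("42")); // prints 42
    }
}
```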

7 days ago[HUDI-4885] Adding org.apache.avro to hudi-hive-sync bundle (#6729)
Sivabalan Narayanan [Thu, 29 Sep 2022 09:06:32 +0000 (02:06 -0700)] 
[HUDI-4885] Adding org.apache.avro to hudi-hive-sync bundle (#6729)

7 days ago[HUDI-4861] Relaxing `MERGE INTO` constraints to permit limited casting operations...
Alexey Kudinkin [Thu, 29 Sep 2022 07:43:32 +0000 (00:43 -0700)] 
[HUDI-4861] Relaxing `MERGE INTO` constraints to permit limited casting operations w/in matched-on conditions (#6820)

7 days ago[HUDI-4936] Fix `as.of.instant` not recognized as hoodie config (#5616)
Leon Tsao [Thu, 29 Sep 2022 06:18:41 +0000 (14:18 +0800)] 
[HUDI-4936] Fix `as.of.instant` not recognized as hoodie config (#5616)

Co-authored-by: leon <leon@leondeMacBook-Pro.local>
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
7 days ago[HUDI-4722] Added locking metrics for Hudi (#6502)
jsbali [Thu, 29 Sep 2022 05:37:46 +0000 (11:07 +0530)] 
[HUDI-4722] Added locking metrics for Hudi (#6502)

8 days ago[HUDI-4934] Revert batch clean files (#6813)
Sivabalan Narayanan [Wed, 28 Sep 2022 22:51:45 +0000 (15:51 -0700)] 
[HUDI-4934] Revert batch clean files (#6813)

* Revert "[HUDI-4792] Batch clean files to delete (#6580)"
This reverts commit cbf9b83ca6d3dada14eea551a5bae25144ca0459.

8 days ago[HUDI-4734] Deltastreamer table config change validation (#6753)
Jon Vexler [Wed, 28 Sep 2022 21:12:27 +0000 (17:12 -0400)] 
[HUDI-4734] Deltastreamer table config change validation (#6753)

Co-authored-by: sivabalan <n.siva.b@gmail.com>
8 days ago[MINOR] fixing validate async operations to poll completed clean instances (#6814)
Sivabalan Narayanan [Wed, 28 Sep 2022 19:14:24 +0000 (12:14 -0700)] 
[MINOR] fixing validate async operations to poll completed clean instances (#6814)

8 days ago[HUDI-4687] Avoid setAccessible which breaks strong encapsulation (#6657)
Sagar Sumit [Wed, 28 Sep 2022 17:04:04 +0000 (22:34 +0530)] 
[HUDI-4687] Avoid setAccessible which breaks strong encapsulation (#6657)

Use JOL GraphLayout for estimating deep size.

8 days ago[HUDI-4924] Auto-tune dedup parallelism (#6802)
Y Ethan Guo [Wed, 28 Sep 2022 15:03:41 +0000 (08:03 -0700)] 
[HUDI-4924] Auto-tune dedup parallelism (#6802)

8 days ago[HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data ...
hj2016 [Wed, 28 Sep 2022 15:02:59 +0000 (23:02 +0800)] 
[HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data (#4015)

Co-authored-by: huangjing02 <huangjing02@bilibili.com>
Co-authored-by: sivabalan <n.siva.b@gmail.com>
9 days ago[HUDI-4453] Fix schema to include partition columns in bootstrap operation (#6676)
Y Ethan Guo [Wed, 28 Sep 2022 03:00:59 +0000 (20:00 -0700)] 
[HUDI-4453] Fix schema to include partition columns in bootstrap operation (#6676)

Turn off the type inference of the partition column to be consistent with
existing behavior. Add notes around partition column type inference.

9 days ago[HUDI-4913] Fix HoodieSnapshotExporter for writing to a different S3 bucket or FS...
Y Ethan Guo [Tue, 27 Sep 2022 19:21:19 +0000 (12:21 -0700)] 
[HUDI-4913] Fix HoodieSnapshotExporter for writing to a different S3 bucket or FS (#6785)

9 days ago[HUDI-4848] Fixing repair deprecated partition tool (#6731)
Sivabalan Narayanan [Tue, 27 Sep 2022 19:02:35 +0000 (12:02 -0700)] 
[HUDI-4848] Fixing repair deprecated partition tool (#6731)

9 days ago[HUDI-4923] Fix flaky TestHoodieReadClient.testReadFilterExistAfterBulkInsertPrepped...
Sivabalan Narayanan [Tue, 27 Sep 2022 11:13:26 +0000 (04:13 -0700)] 
[HUDI-4923] Fix flaky TestHoodieReadClient.testReadFilterExistAfterBulkInsertPrepped (#6801)

Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
9 days ago[HUDI-4907] Prevent single commit multi instant issue (#6766)
voonhous [Tue, 27 Sep 2022 07:52:23 +0000 (15:52 +0800)] 
[HUDI-4907] Prevent single commit multi instant issue (#6766)

Co-authored-by: TengHuo <teng_huo@outlook.com>
Co-authored-by: yuzhao.cyz <yuzhao.cyz@gmail.com>
9 days ago[HUDI-4904] Add support for unraveling proto schemas in ProtoClassBasedSchemaProvider...
Tim Brown [Tue, 27 Sep 2022 04:45:27 +0000 (21:45 -0700)] 
[HUDI-4904] Add support for unraveling proto schemas in ProtoClassBasedSchemaProvider (#6761)

If a user provides a recursive proto schema, the write to Parquet fails. Users need a way to specify how many levels of recursion to keep before the remaining data is truncated.

Main changes to existing code:

- ProtoClassBasedSchemaProvider tracks the number of times a message descriptor is seen within a branch of the schema traversal.
- Once that count exceeds the user-provided limit, the field is set to a preset record with two fields: 1) the remaining data serialized as a proto byte array, and 2) the descriptor's full name, for context about what is in that byte array.
- Conversion from proto to Avro now accounts for this truncation of the input.
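
The per-branch counting described above can be sketched as follows. This is an illustration only: the schema is a plain map from message type name to field type names (a hypothetical stand-in for proto descriptors), the `<truncated:...>` marker stands in for the preset record, and the exact point at which Hudi starts truncating is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RecursionLimitSketch {
    // Renders a type and its nested fields, truncating once a type has been
    // seen more than maxRecursion times on the current branch.
    static String unravel(String root, Map<String, List<String>> schema, int maxRecursion) {
        return unravel(root, schema, new HashMap<String, Integer>(), maxRecursion);
    }

    private static String unravel(String type, Map<String, List<String>> schema,
                                  Map<String, Integer> seen, int maxRecursion) {
        int count = seen.getOrDefault(type, 0);
        if (count > maxRecursion) {
            // Placeholder for the "remaining bytes + descriptor full name" record.
            return "<truncated:" + type + ">";
        }
        seen.put(type, count + 1);
        StringBuilder sb = new StringBuilder(type).append("{");
        List<String> fields = schema.getOrDefault(type, Collections.<String>emptyList());
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(",");
            }
            sb.append(unravel(fields.get(i), schema, seen, maxRecursion));
        }
        sb.append("}");
        seen.put(type, count); // restore: counts apply per branch, not globally
        return sb.toString();
    }

    public static void main(String[] args) {
        // A message type "Node" with a single field of its own type.
        Map<String, List<String>> schema =
            Collections.singletonMap("Node", Collections.singletonList("Node"));
        System.out.println(unravel("Node", schema, 1)); // prints Node{Node{<truncated:Node>}}
    }
}
```

Restoring the count on the way back up keeps sibling branches independent, which matches the "within a branch" wording of the commit message.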

10 days ago[MINOR] Update PR template with documentation update (#6748)
Y Ethan Guo [Mon, 26 Sep 2022 23:46:05 +0000 (16:46 -0700)] 
[MINOR] Update PR template with documentation update (#6748)

10 days ago[HUDI-4902] Set default partitioner for SIMPLE BUCKET index (#6759)
Manu [Mon, 26 Sep 2022 23:45:12 +0000 (07:45 +0800)] 
[HUDI-4902] Set default partitioner for SIMPLE BUCKET index (#6759)

10 days ago[HUDI-4718] Add Kerberos kinit command support. (#6719)
Paul Zhang [Mon, 26 Sep 2022 14:11:10 +0000 (22:11 +0800)] 
[HUDI-4718] Add Kerberos kinit command support. (#6719)

10 days ago[HUDI-4918] Fix bugs about when trying to show the non -existing key from env, NullPo...
Forus [Mon, 26 Sep 2022 14:05:34 +0000 (22:05 +0800)] 
[HUDI-4918] Fix bugs about when trying to show the non -existing key from env, NullPointException occurs. (#6794)

10 days ago[HUDI-4910] Fix unknown variable or type "Cast" (#6778)
KnightChess [Mon, 26 Sep 2022 14:03:40 +0000 (22:03 +0800)] 
[HUDI-4910] Fix unknown variable or type "Cast" (#6778)

10 days ago[HUDI-4914] Managed memory weight should be set when sort clustering is enabled ...
Nicholas Jiang [Mon, 26 Sep 2022 12:22:50 +0000 (20:22 +0800)] 
[HUDI-4914] Managed memory weight should be set when sort clustering is enabled (#6792)

10 days ago[HUDI-4760] Fixing repeated trigger of data file creations w/ clustering (#6561)
Sivabalan Narayanan [Mon, 26 Sep 2022 05:05:57 +0000 (22:05 -0700)] 
[HUDI-4760] Fixing repeated trigger of data file creations w/ clustering (#6561)

- In clustering, data file creation was triggered twice: the write status was not cached, and a validation step called isEmpty on the JavaRDD, which re-triggered the action. This patch fixes the double dereferencing.
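
A Spark-free analogy of the problem, assuming nothing about Hudi's actual fix: a lazy computation that is not cached runs once per consumer, so a validation check followed by the real use runs it twice. `memoize` and `runs` are hypothetical helpers and the `Supplier` stands in for the uncached RDD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class DoubleTriggerSketch {
    // Wraps a supplier so the underlying computation runs at most once.
    static <T> Supplier<T> memoize(final Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean done;

            @Override
            public synchronized T get() {
                if (!done) {
                    value = delegate.get();
                    done = true;
                }
                return value;
            }
        };
    }

    // Returns how many times the expensive "write" ran.
    static int runs(boolean cache) {
        final AtomicInteger writes = new AtomicInteger();
        Supplier<List<String>> writeStatuses = () -> {
            writes.incrementAndGet(); // stands in for creating data files
            return Arrays.asList("file-1", "file-2");
        };
        Supplier<List<String>> source = cache ? memoize(writeStatuses) : writeStatuses;
        boolean empty = source.get().isEmpty(); // validation step
        List<String> result = source.get();     // actual use of the result
        return writes.get();
    }

    public static void main(String[] args) {
        System.out.println(runs(false)); // prints 2
        System.out.println(runs(true));  // prints 1
    }
}
```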

10 days ago[HUDI-4830] Fix testNoGlobalConfFileConfigured when add hudi-defaults.conf in default...
Zouxxyy [Mon, 26 Sep 2022 04:28:55 +0000 (12:28 +0800)] 
[HUDI-4830] Fix testNoGlobalConfFileConfigured when add hudi-defaults.conf in default dir (#6652)

11 days ago[HUDI-3478] Implement CDC Read in Spark (#6727)
Yann Byron [Mon, 26 Sep 2022 01:06:26 +0000 (09:06 +0800)] 
[HUDI-3478] Implement CDC Read in Spark (#6727)

11 days ago[HUDI-4915] improve avro serializer/deserializer (#6788)
Yann Byron [Sun, 25 Sep 2022 15:42:44 +0000 (23:42 +0800)] 
[HUDI-4915] improve avro serializer/deserializer (#6788)

11 days ago[RFC-51][HUDI-3478] Update RFC: CDC support (#6256)
Shiyan Xu [Sun, 25 Sep 2022 08:32:33 +0000 (16:32 +0800)] 
[RFC-51][HUDI-3478] Update RFC: CDC support (#6256)

11 days ago[HUDI-4433] hudi-cli repair deduplicate not working with non-partitioned dataset...
ChanKyeong Won [Sun, 25 Sep 2022 06:35:46 +0000 (15:35 +0900)] 
[HUDI-4433] hudi-cli repair deduplicate not working with non-partitioned dataset (#6349)

The repair deduplicate command in hudi-cli
could not be run on a non-partitioned dataset,
so this change modifies the CLI parameter.

Co-authored-by: Xingjun Wang <wongxingjun@126.com>
12 days ago[MINOR] Simple logging fix in LockManager (#6765)
苏承祥 [Sat, 24 Sep 2022 22:49:12 +0000 (06:49 +0800)] 
[MINOR] Simple logging fix in LockManager (#6765)

Co-authored-by: 苏承祥 <sucx@tuya.com>
12 days ago[MINOR] retain avro's namespace (#6783)
Yann Byron [Sat, 24 Sep 2022 21:19:30 +0000 (05:19 +0800)] 
[MINOR] retain avro's namespace (#6783)

12 days ago[HUDI-4412] Fix multi writer INSERT_OVERWRITE NPE bug (#6130)
liujinhui [Sat, 24 Sep 2022 07:34:09 +0000 (15:34 +0800)] 
[HUDI-4412] Fix multi writer INSERT_OVERWRITE NPE bug (#6130)

There are two minor issues fixed here:

1. When an insert_overwrite operation is performed, the
    clusteringPlan in the requestedReplaceMetadata is
    null, so calling getFileIdsFromRequestedReplaceMetadata causes an NPE.

2. For an insert_overwrite operation, where inflightCommitMetadata != null,
    getOperationType should be obtained from getHoodieInflightReplaceMetadata;
    the original code throws a null pointer exception.
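
A minimal sketch of the null guard implied by the first item. The classes below are hypothetical stand-ins, not Hudi's metadata types, and returning an empty list for a missing clustering plan is an assumption about the intended behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReplaceMetadataSketch {
    // Hypothetical stand-in for a clustering plan.
    static final class ClusteringPlan {
        final List<String> fileIds;

        ClusteringPlan(List<String> fileIds) {
            this.fileIds = fileIds;
        }
    }

    // Hypothetical stand-in for requested replace metadata.
    static final class RequestedReplaceMetadata {
        final ClusteringPlan clusteringPlan; // null for insert_overwrite

        RequestedReplaceMetadata(ClusteringPlan clusteringPlan) {
            this.clusteringPlan = clusteringPlan;
        }
    }

    // Returns the file ids of the clustering plan, or an empty list when the
    // metadata carries no plan, instead of dereferencing null.
    static List<String> fileIds(RequestedReplaceMetadata metadata) {
        if (metadata == null || metadata.clusteringPlan == null) {
            return Collections.emptyList();
        }
        return metadata.clusteringPlan.fileIds;
    }

    public static void main(String[] args) {
        RequestedReplaceMetadata insertOverwrite = new RequestedReplaceMetadata(null);
        RequestedReplaceMetadata clustering =
            new RequestedReplaceMetadata(new ClusteringPlan(Arrays.asList("f1", "f2")));
        System.out.println(fileIds(insertOverwrite)); // prints []
        System.out.println(fileIds(clustering));      // prints [f1, f2]
    }
}
```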

12 days ago[MINOR] Fix a few typos in HoodieIndex (#6784)
Xingjun Wang [Sat, 24 Sep 2022 03:55:24 +0000 (11:55 +0800)] 
[MINOR] Fix a few typos in HoodieIndex (#6784)

Co-authored-by: xingjunwang <xingjunwang@tencent.com>
13 days ago[HUDI-4892] Fix hudi-spark3-bundle (#6735)
Y Ethan Guo [Fri, 23 Sep 2022 21:20:51 +0000 (14:20 -0700)] 
[HUDI-4892] Fix hudi-spark3-bundle (#6735)

13 days ago[HUDI-4899] Fixing compatibility w/ Spark 3.2.2 (#6755)
Alexey Kudinkin [Fri, 23 Sep 2022 19:58:30 +0000 (12:58 -0700)] 
[HUDI-4899] Fixing compatibility w/ Spark 3.2.2 (#6755)

13 days ago[HUDI-4906] Fix the local tests for hudi-flink (#6763)
Danny Chan [Fri, 23 Sep 2022 19:07:32 +0000 (03:07 +0800)] 
[HUDI-4906] Fix the local tests for hudi-flink (#6763)

13 days agoMerge pull request #6779 from wongxingjun/master
Prasanna Rajaperumal [Fri, 23 Sep 2022 16:39:55 +0000 (22:09 +0530)] 
Merge pull request #6779 from wongxingjun/master

Update HoodieIndex.java

13 days agoUpdate HoodieIndex.java 6779/head
Xingjun Wang [Fri, 23 Sep 2022 16:33:43 +0000 (00:33 +0800)] 
Update HoodieIndex.java

Fix a typo

13 days ago[MINOR] Drastically reducing concurrency level (to avoid CI flakiness) (#6754)
Alexey Kudinkin [Fri, 23 Sep 2022 14:06:20 +0000 (07:06 -0700)] 
[MINOR] Drastically reducing concurrency level (to avoid CI flakiness) (#6754)

13 days ago[HUDI-4903] Fix TestHoodieLogFormat`s minor typo (#6762)
wulei [Fri, 23 Sep 2022 12:44:56 +0000 (20:44 +0800)] 
[HUDI-4903] Fix TestHoodieLogFormat`s minor typo (#6762)

13 days ago[HUDI-3523] Introduce AddPrimitiveColumnSchemaPostProcessor to support add new primit...
wangxianghu [Fri, 23 Sep 2022 12:20:46 +0000 (20:20 +0800)] 
[HUDI-3523] Introduce AddPrimitiveColumnSchemaPostProcessor to support add new primitive column to the end of a schema (#6769)

13 days agoRevert "[HUDI-3523] Introduce AddColumnSchemaPostProcessor to support add columns...
wangxianghu [Fri, 23 Sep 2022 12:18:18 +0000 (20:18 +0800)] 
Revert "[HUDI-3523] Introduce AddColumnSchemaPostProcessor to support add columns to the end of a schema (#5031)" (#6768)

This reverts commit 092375fc1f058c7841d9d63cd04e842c062fae74.

13 days ago[HUDI-3523] Introduce AddColumnSchemaPostProcessor to support add columns to the...
wangxianghu [Fri, 23 Sep 2022 11:53:18 +0000 (19:53 +0800)] 
[HUDI-3523] Introduce AddColumnSchemaPostProcessor to support add columns to the end of a schema (#5031)

13 days ago[HUDI-4897] Refactor the merge handle in CDC mode (#6740)
Danny Chan [Fri, 23 Sep 2022 10:36:48 +0000 (18:36 +0800)] 
[HUDI-4897] Refactor the merge handle in CDC mode (#6740)

13 days ago[HUDI-4883] Supporting delete savepoint for MOR (#6744)
Sivabalan Narayanan [Fri, 23 Sep 2022 10:03:01 +0000 (03:03 -0700)] 
[HUDI-4883] Supporting delete savepoint for MOR (#6744)

Users can delete unnecessary savepoints
and unblock archival for MOR tables.

13 days ago[HUDI-4559] Support hiveSync command based on Call Produce Command (#6322)
ForwardXu [Fri, 23 Sep 2022 08:04:42 +0000 (16:04 +0800)] 
[HUDI-4559] Support hiveSync command based on Call Produce Command (#6322)

13 days ago[HUDI-4901] Add avro.version to Flink profiles (#6757)
Shawn Chang [Fri, 23 Sep 2022 06:22:19 +0000 (23:22 -0700)] 
[HUDI-4901] Add avro.version to Flink profiles (#6757)

* Add avro.version to Flink profiles

Co-authored-by: Shawn Chang <yxchang@amazon.com>
2 weeks ago[MINOR] Add .mvn directory to gitignore (#6746)
Rahil C [Fri, 23 Sep 2022 02:37:43 +0000 (19:37 -0700)] 
[MINOR] Add .mvn directory to gitignore (#6746)

Co-authored-by: Rahil Chertara <rchertar@amazon.com>
2 weeks ago[HUDI-3901] Correct the description of hoodie.index.type (#6749)
Y Ethan Guo [Fri, 23 Sep 2022 02:21:20 +0000 (19:21 -0700)] 
[HUDI-3901] Correct the description of hoodie.index.type (#6749)

2 weeks ago[HUDI-4851] Fixing handling of `UTF8String` w/in `InSet` operator (#6739)
Alexey Kudinkin [Thu, 22 Sep 2022 22:43:45 +0000 (15:43 -0700)] 
[HUDI-4851] Fixing handling of `UTF8String` w/in `InSet` operator (#6739)

Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2 weeks ago[HUDI-3478][HUDI-4887] Use Avro as the format of persisted cdc data (#6734)
Yann Byron [Thu, 22 Sep 2022 17:33:19 +0000 (01:33 +0800)] 
[HUDI-3478][HUDI-4887] Use Avro as the format of persisted cdc data (#6734)

2 weeks ago[HUDI-4363] Support Clustering row writer to improve performance (#6046)
RexAn [Thu, 22 Sep 2022 13:17:09 +0000 (21:17 +0800)] 
[HUDI-4363] Support Clustering row writer to improve performance (#6046)

2 weeks ago[HUDI-4792] Batch clean files to delete (#6580)
Nicolas Paris [Wed, 21 Sep 2022 21:41:03 +0000 (23:41 +0200)] 
[HUDI-4792] Batch clean files to delete (#6580)

This patch uses a batch call to get the file groups to delete during cleaning, instead of one call per partition.
This limits the number of calls to the view and should fix the problems with the metadata table when there are many partitions.
Fixes issue #6373

Co-authored-by: sivabalan <n.siva.b@gmail.com>
2 weeks ago[HUDI-4758] Add validations to java spark examples (#6615)
Jon Vexler [Wed, 21 Sep 2022 14:52:08 +0000 (10:52 -0400)] 
[HUDI-4758] Add validations to java spark examples (#6615)

2 weeks ago[HUDI-3983] Fix ClassNotFoundException when using hudi-spark-bundle to write table...
Manu [Wed, 21 Sep 2022 11:46:55 +0000 (19:46 +0800)] 
[HUDI-3983] Fix ClassNotFoundException when using hudi-spark-bundle to write table with hbase index (#6715)

2 weeks ago[HUDI-4729] Fix file group pending compaction cannot be queried when query _ro table...
shaoxiong.zhan [Wed, 21 Sep 2022 08:50:22 +0000 (16:50 +0800)] 
[HUDI-4729] Fix file group pending compaction cannot be queried when query _ro table (#6516)

A file group in pending compaction cannot be queried
when querying the _ro table with Spark. This commit fixes that.

Co-authored-by: zhanshaoxiong <shaoxiong0001@@gmail.com>
Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
2 weeks ago[DOCS] Improve the quick start guide for Kafka Connect Sink (#6708)
Y Ethan Guo [Tue, 20 Sep 2022 16:14:00 +0000 (09:14 -0700)] 
[DOCS] Improve the quick start guide for Kafka Connect Sink (#6708)

2 weeks ago[HUDI-4875] Fix NoSuchTableException when dropping temporary view after applied Hoodi...
dohongdayi [Tue, 20 Sep 2022 15:44:51 +0000 (23:44 +0800)] 
[HUDI-4875] Fix NoSuchTableException when dropping temporary view after applied HoodieSparkSessionExtension in Spark 3.2 (#6709)

2 weeks ago[HUDI-4326] Fix hive sync serde properties (#6722)
Shiyan Xu [Tue, 20 Sep 2022 12:22:30 +0000 (20:22 +0800)] 
[HUDI-4326] Fix hive sync serde properties (#6722)

2 weeks ago [HUDI-3478] Implement CDC Write in Spark (#6697)
Yann Byron [Tue, 20 Sep 2022 10:52:05 +0000 (18:52 +0800)] 
[HUDI-3478] Implement CDC Write in Spark (#6697)

2 weeks ago[MINOR] fix indent to make build pass (#6721)
Yann Byron [Tue, 20 Sep 2022 05:08:42 +0000 (13:08 +0800)] 
[MINOR] fix indent to make build pass (#6721)

2 weeks ago[HUDI-4326] add updateTableSerDeInfo for HiveSyncTool (#5920)
Kyle Zhike Chen [Tue, 20 Sep 2022 02:48:02 +0000 (10:48 +0800)] 
[HUDI-4326] add updateTableSerDeInfo for HiveSyncTool (#5920)

- This pull request fixes "[SUPPORT] Hudi spark datasource error after migrate from 0.8 to 0.11" (#5861).
- The issue arises because the table SerDeInfo is missing after the table is changed to a Spark data source table.

Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
2 weeks ago[HUDI-4877] Fix org.apache.hudi.index.bucket.TestHoodieSimpleBucketIndex#testTagLocat...
FocusComputing [Tue, 20 Sep 2022 01:31:50 +0000 (09:31 +0800)] 
[HUDI-4877] Fix org.apache.hudi.index.bucket.TestHoodieSimpleBucketIndex#testTagLocation not work correct issue (#6717)

Co-authored-by: xiaoxingstack <xiaoxingstack@didiglobal.com>
2 weeks ago[HUDI-4810] Fix log4j imports to use bridge API (#6710)
eric9204 [Mon, 19 Sep 2022 12:26:27 +0000 (20:26 +0800)] 
[HUDI-4810] Fix log4j imports to use bridge API (#6710)

Co-authored-by: dongsj <dongsj@asiainfo.com>
2 weeks ago[HUDI-4832] Fix drop partition meta sync (#6662)
Sagar Sumit [Mon, 19 Sep 2022 10:57:11 +0000 (16:27 +0530)] 
[HUDI-4832] Fix drop partition meta sync (#6662)

2 weeks ago[minor] following 3304, some code refactoring (#6713)
Danny Chan [Mon, 19 Sep 2022 07:44:29 +0000 (15:44 +0800)] 
[minor] following 3304, some code refactoring (#6713)

2 weeks ago[HUDI-4485] Bump spring shell to 2.1.1 in CLI (#6489)
Paul Zhang [Mon, 19 Sep 2022 06:30:00 +0000 (14:30 +0800)] 
[HUDI-4485] Bump spring shell to 2.1.1 in CLI (#6489)

Bumped spring shell to 2.1.1 and updated the default
value of the `pathRegex` parameter of show fsview all.

2 weeks ago[HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo… (#6630)
FocusComputing [Mon, 19 Sep 2022 06:16:24 +0000 (14:16 +0800)] 
[HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo… (#6630)

* [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in log file issue

Co-authored-by: xiaoxingstack <xiaoxingstack@didiglobal.com>
2 weeks ago[HUDI-3304] Support partial update payload (#4676)
冯健 [Mon, 19 Sep 2022 01:43:56 +0000 (09:43 +0800)] 
[HUDI-3304] Support partial update payload (#4676)

Co-authored-by: jian.feng <jian.feng@shopee.com>
2 weeks ago[HUDI-4870] Improve compaction config description (#6706)
Y Ethan Guo [Sun, 18 Sep 2022 17:03:16 +0000 (10:03 -0700)] 
[HUDI-4870] Improve compaction config description (#6706)

2 weeks ago[HUDI-4873] Report number of messages to be processed via metrics (#6271)
Volodymyr Burenin [Sat, 17 Sep 2022 22:59:25 +0000 (17:59 -0500)] 
[HUDI-4873] Report number of messages to be processed via metrics (#6271)

Co-authored-by: Volodymyr Burenin <volodymyr.burenin@cloudkitchens.com>
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2 weeks ago[HUDI-4828] Fix the extraction of record keys which may be cut out (#6650)
y0908105023 [Sat, 17 Sep 2022 22:57:12 +0000 (06:57 +0800)] 
[HUDI-4828] Fix the extraction of record keys which may be cut out (#6650)

Co-authored-by: yangshuo3 <yangshuo3@kingsoft.com>
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2 weeks ago[HUDI-3959] Rename class name for spark rdd reader (#5409)
simonsssu [Sat, 17 Sep 2022 22:16:52 +0000 (06:16 +0800)] 
[HUDI-3959] Rename class name for spark rdd reader (#5409)

Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2 weeks ago[HUDI-4757] Create pyspark examples (#6672)
Jon Vexler [Sat, 17 Sep 2022 19:42:04 +0000 (15:42 -0400)] 
[HUDI-4757] Create pyspark examples (#6672)

2 weeks ago[HUDI-4282] Repair IOException in CHDFS when check block corrupted in HoodieLogFileRe...
5herhom [Sat, 17 Sep 2022 19:19:23 +0000 (03:19 +0800)] 
[HUDI-4282] Repair IOException in CHDFS when check block corrupted in HoodieLogFileReader (#6031)

Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2 weeks ago[HUDI-4842] Support compaction strategy based on delta log file num (#6670)
苏承祥 [Sat, 17 Sep 2022 17:08:19 +0000 (01:08 +0800)] 
[HUDI-4842] Support compaction strategy based on delta log file num (#6670)

Co-authored-by: 苏承祥 <sucx@tuya.com>
2 weeks ago[HUDI-4736] Fix inflight clean action preventing clean service to continue when multi...
Y Ethan Guo [Sat, 17 Sep 2022 14:27:32 +0000 (07:27 -0700)] 
[HUDI-4736] Fix inflight clean action preventing clean service to continue when multiple cleans are not allowed (#6536)

2 weeks ago[HUDI-4865] Optimize HoodieAvroUtils#isMetadataField to use O(1) complexity (#6702)
Danny Chan [Sat, 17 Sep 2022 12:34:17 +0000 (20:34 +0800)] 
[HUDI-4865] Optimize HoodieAvroUtils#isMetadataField to use O(1) complexity (#6702)
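
The log does not show how the method was changed. One common way to get an expected O(1) membership check is a hash set lookup, sketched below. The class name is hypothetical, and the listed field names are Hudi's usual meta columns; the set that HoodieAvroUtils actually uses is an assumption here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class MetadataFieldLookupSketch {
    // Assumed set of meta column names; built once and never modified.
    private static final Set<String> META_FIELDS = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "_hoodie_commit_time",
            "_hoodie_commit_seqno",
            "_hoodie_record_key",
            "_hoodie_partition_path",
            "_hoodie_file_name")));

    // Hash lookup: expected constant time regardless of how many names are registered.
    static boolean isMetadataField(String fieldName) {
        return META_FIELDS.contains(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(isMetadataField("_hoodie_record_key")); // prints true
        System.out.println(isMetadataField("user_id"));            // prints false
    }
}
```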

2 weeks ago[HUDI-4841] Fix sort idempotency issue (#6669)
voonhous [Sat, 17 Sep 2022 07:38:58 +0000 (15:38 +0800)] 
[HUDI-4841] Fix sort idempotency issue (#6669)

2 weeks ago[HUDI-4864] Fix AWSDmsAvroPayload#combineAndGetUpdateValue when using MOR snapshot...
Rahil C [Sat, 17 Sep 2022 01:47:29 +0000 (18:47 -0700)] 
[HUDI-4864] Fix AWSDmsAvroPayload#combineAndGetUpdateValue when using MOR snapshot query after delete operations with test (#6688)

Co-authored-by: Rahil Chertara <rchertar@amazon.com>
2 weeks ago[HUDI-4856] Missing option for HoodieCatalogFactory (#6693)
Danny Chan [Sat, 17 Sep 2022 00:26:33 +0000 (08:26 +0800)] 
[HUDI-4856] Missing option for HoodieCatalogFactory (#6693)