arrow-datafusion.git
6 hours ago Consolidate coercion code in `datafusion_expr::type_coercion` and submodules (#3728) master
Andrew Lamb [Thu, 6 Oct 2022 10:33:25 +0000 (06:33 -0400)] 
Consolidate coercion code in `datafusion_expr::type_coercion` and submodules (#3728)

* Move function coercion to its own module

* Move binary into type coercion

* fmt

* More updates

* consolidate some more

* Move aggregates

19 hours ago Use column aliases specified by `WITH` statements (#3717)
Batuhan Taskaya [Wed, 5 Oct 2022 21:26:43 +0000 (00:26 +0300)] 
Use column aliases specified by `WITH` statements (#3717)

22 hours ago Fix aggregate type coercion bug (#3710)
Andrew Lamb [Wed, 5 Oct 2022 18:28:57 +0000 (14:28 -0400)] 
Fix aggregate type coercion bug (#3710)

* Do not change output expr name in `UnwrapCastInComparison`

* Update

* Update test

* Fix regression

* Update tests

* clippy

25 hours ago Skip filter push down on semi/anti joins (#3723)
Andy Grove [Wed, 5 Oct 2022 15:47:11 +0000 (09:47 -0600)] 
Skip filter push down on semi/anti joins (#3723)

27 hours ago Raise `Unsupported SQL type` for `Time(WithTimeZone)` and `Time(Tz)` (#3718)
Wei-Ting Kuo [Wed, 5 Oct 2022 13:24:45 +0000 (21:24 +0800)] 
Raise `Unsupported SQL type` for `Time(WithTimeZone)` and `Time(Tz)` (#3718)

* raise error for timetz

* fix test cases

36 hours ago Reject recursive CTEs before processing the sub-expressions (#3714)
Batuhan Taskaya [Wed, 5 Oct 2022 04:45:17 +0000 (07:45 +0300)] 
Reject recursive CTEs before processing the sub-expressions (#3714)

36 hours ago Make column name consistent between Expr::name and Display/Debug (#3712)
Andy Grove [Wed, 5 Oct 2022 04:18:58 +0000 (22:18 -0600)] 
Make column name consistent between Expr::name and Display/Debug (#3712)

* Remove # from Column Display format

* update expected results in tests

* update expected results in tests

* update expected results in tests

* update expected results in tests

* update expected results in tests

* fmt

* Update datafusion/core/src/execution/context.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 days ago MINOR: Add `Expr::canonical_name` and improve docs on `Expr::name` (#3706)
Andy Grove [Tue, 4 Oct 2022 16:00:05 +0000 (10:00 -0600)] 
MINOR: Add `Expr::canonical_name` and improve docs on `Expr::name` (#3706)

* Add Expr::canonical_name

* update docs

2 days ago bump sql-parser 0.25 (#3698)
xudong.w [Tue, 4 Oct 2022 14:30:38 +0000 (22:30 +0800)] 
bump sql-parser 0.25 (#3698)

* bump sql-parser 0.25

* Revert update to `parquet-testing` and `testing` sub modules

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 days ago `unwrap_cast_in_comparison`: fix bug which can find the field for the schema (#3699)
Kun Liu [Tue, 4 Oct 2022 12:37:30 +0000 (20:37 +0800)] 
`unwrap_cast_in_comparison`: fix bug which can find the field for the schema (#3699)

* the unwrap rule: don't throw an error when meeting an unsupported data type

* support data type in unwrap cast rule

2 days ago Support better dictionary coercion (#3688)
Andrew Lamb [Tue, 4 Oct 2022 12:27:24 +0000 (08:27 -0400)] 
Support better dictionary coercion (#3688)

2 days ago move `type coercion` for case when expr (#3676)
Kun Liu [Tue, 4 Oct 2022 09:38:18 +0000 (17:38 +0800)] 
move `type coercion` for case when expr (#3676)

* support type coercion in logical phase and remove it in the physical phase

* Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* format code

* change error message

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 days ago Improve docstrings in binary_rule.rs (#3687)
Andrew Lamb [Mon, 3 Oct 2022 23:32:22 +0000 (19:32 -0400)]
Improve docstrings in binary_rule.rs (#3687)

2 days ago Upgrade `arrow` `parquet` and `arrow-flight` to 24.0.0 (#3691)
Andrew Lamb [Mon, 3 Oct 2022 23:24:22 +0000 (19:24 -0400)] 
Upgrade `arrow` `parquet` and `arrow-flight` to 24.0.0 (#3691)

* Update to arrow 24.0.0

* Update pyo3 interface

* Update datafusion-cli lockfile

2 days ago Automate postrelease publishing to Homebrew (#3507)
Ian Alexander Joiner [Mon, 3 Oct 2022 18:45:21 +0000 (14:45 -0400)] 
Automate postrelease publishing to Homebrew (#3507)

* Done

* Autofile PR

* fix readme

Co-authored-by: Ian Joiner <ian.joiner@spaceandtime.io>
2 days ago Move optimizer init to optimizer crate (#3692)
Andy Grove [Mon, 3 Oct 2022 17:07:34 +0000 (11:07 -0600)] 
Move optimizer init to optimizer crate (#3692)

3 days ago [MINOR] Add `ScalarValue::new_utf8`, clean up creation of literals in casting tests...
Andrew Lamb [Mon, 3 Oct 2022 16:07:42 +0000 (12:07 -0400)] 
[MINOR] Add `ScalarValue::new_utf8`, clean up creation of literals in casting tests (#3680)

* Add ScalarValue::new_utf8, clean up construction in tests

* Cleanup

* some more minor cleanup

* Update unwrap_cast

3 days ago Simplification Rules for Modulo Operator (#3669)
askoa [Mon, 3 Oct 2022 13:46:46 +0000 (09:46 -0400)] 
Simplification Rules for Modulo Operator (#3669)

* Simplify Rules for Modulo Operator

* add divide by zero error

* fix PR comments

* remove error on mod by zero based on PR comment

* fix clippy issues

* add mod by zero back

Co-authored-by: askoa <askoa@local>
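
The bullets above name the rule but not its cases. As a rough illustration only, a modulo simplification could look like the sketch below. `ToyExpr`, `is_nullable`, and `simplify_modulo` are hypothetical names, not DataFusion's `Expr` API, and the specific rewrites (`NULL` propagation, `x % 1 => 0` for non-nullable `x`, an error for a literal zero divisor) are assumptions inferred from the commit bullets.

```rust
/// Toy expression type (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column { name: String, nullable: bool },
    Int(i64),
    Null,
    Modulo(Box<ToyExpr>, Box<ToyExpr>),
}

/// Conservative nullability check: anything not known to be non-null counts as nullable.
fn is_nullable(expr: &ToyExpr) -> bool {
    match expr {
        ToyExpr::Column { nullable, .. } => *nullable,
        ToyExpr::Int(_) => false,
        ToyExpr::Null | ToyExpr::Modulo(_, _) => true,
    }
}

/// Simplify a single modulo node; every other node passes through unchanged.
fn simplify_modulo(expr: ToyExpr) -> Result<ToyExpr, String> {
    match expr {
        ToyExpr::Modulo(left, right) => match (*left, *right) {
            // NULL on either side makes the whole expression NULL.
            (ToyExpr::Null, _) | (_, ToyExpr::Null) => Ok(ToyExpr::Null),
            // Assumed behavior: a literal zero divisor is reported as an error.
            (_, ToyExpr::Int(0)) => Err("Divide by zero".to_string()),
            // x % 1 is 0, but only when x cannot be NULL.
            (l, ToyExpr::Int(1)) if !is_nullable(&l) => Ok(ToyExpr::Int(0)),
            // No rule applies: rebuild the node unchanged.
            (l, r) => Ok(ToyExpr::Modulo(Box::new(l), Box::new(r))),
        },
        other => Ok(other),
    }
}
```

The nullability guard matters: rewriting `x % 1` to `0` for a nullable column would turn a `NULL` result into `0`.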
3 days ago Cache collected file statistics (#3649)
mateuszkj [Mon, 3 Oct 2022 13:46:00 +0000 (15:46 +0200)] 
Cache collected file statistics (#3649)

* Cache collected file statistics

* fix clippy in tests

3 days ago Update sqlparser to 0.24.0 (#3675)
Andrew Lamb [Mon, 3 Oct 2022 13:01:02 +0000 (09:01 -0400)] 
Update sqlparser to 0.24.0  (#3675)

* Update sqlparser requirement from 0.23 to 0.24

Updates the requirements on [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) to permit the latest version.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
* Update for sqlparser changes

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
3 days ago Disable code coverage (#3679)
Andrew Lamb [Mon, 3 Oct 2022 13:00:50 +0000 (09:00 -0400)] 
Disable code coverage (#3679)

4 days ago Fail if field lengths are not same in INTERSECT and EXCEPT (#3674)
askoa [Sat, 1 Oct 2022 23:24:53 +0000 (19:24 -0400)]
Fail if field lengths are not same in INTERSECT and EXCEPT (#3674)

* fail if field lengths are not same in INTERSECT and EXCEPT

* incorporate PR comment

* move test per PR comment

Co-authored-by: askoa <askoa@local>
5 days ago Simplify null division. (#3625)
Remzi Yang [Fri, 30 Sep 2022 20:07:45 +0000 (04:07 +0800)] 
Simplify null division. (#3625)

* fix

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix comment

Signed-off-by: remzi <13716567376yh@gmail.com>
6 days ago change pre_cast_lit_in_comparison to unwrap_cast_in_comparison (#3662)
Kun Liu [Fri, 30 Sep 2022 14:35:50 +0000 (22:35 +0800)] 
change pre_cast_lit_in_comparison to unwrap_cast_in_comparison (#3662)

* change pre_cast_lit_in_comparison to unwrap_cast_in_comparison

* change some test case

6 days ago restore the between for simplify expression (#3661)
Kun Liu [Fri, 30 Sep 2022 11:03:57 +0000 (19:03 +0800)] 
restore the between for simplify expression (#3661)

6 days ago add timestamptz (#3660)
Wei-Ting Kuo [Fri, 30 Sep 2022 11:00:19 +0000 (19:00 +0800)] 
add timestamptz (#3660)

6 days ago Custom window frame logic (support `ROWS`, `RANGE`, `PRECEDING` and `FOLLOWING` for...
Metehan Yıldırım [Fri, 30 Sep 2022 10:23:18 +0000 (13:23 +0300)] 
Custom window frame logic (support `ROWS`, `RANGE`, `PRECEDING` and `FOLLOWING` for window functions) (#3570)

* Custom window frame implementation (ROWS and RANGE)

* Revamp unit and integration tests for custom window frame logic.

* Add missing license texts, stabilize PSQL tests by fixing locale/collation settings

* Fix typo in test number assert, fix place of postgre_initdb_args

* Address PR reviews

* Fix treatment of NULL values

* Simplify bisection comparison subroutine

* Operations between different ScalarValues are retracted to the old functionality

* Readability improvements, use Internal for unreachable and/or bug-indicating errors

* Add missing uint8+uint8 and i8+i8 sum cases

Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai>
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
6 days ago make regexp_replace early abort with empty input (#3648)
Batuhan Taskaya [Thu, 29 Sep 2022 19:17:54 +0000 (22:17 +0300)] 
make regexp_replace early abort with empty input (#3648)

6 days ago remove the type coercion in the simplify_expressions rule (#3657)
Kun Liu [Thu, 29 Sep 2022 17:49:41 +0000 (01:49 +0800)] 
remove the type coercion in the simplify_expressions rule (#3657)

7 days ago move the `type coercion` to the beginning of the optimizer rule and support type...
Kun Liu [Thu, 29 Sep 2022 14:57:02 +0000 (22:57 +0800)] 
move the `type coercion` to the beginning of the optimizer rule and support type coercion for subquery (#3636)

* support subquery for type coercion

* support subquery

* move the type coercion to the beginning of the rules

* fix all test case

* fix test

* remove useless code

* add subquery in type coercion

* address comments

* fix test

* support case #3565

7 days ago support cast/try_cast expr in reduceOuterJoin (#3621)
AssHero [Thu, 29 Sep 2022 06:57:45 +0000 (14:57 +0800)] 
support cast/try_cast expr in reduceOuterJoin (#3621)

* support cast/try_cast expr in reduceOuterJoin

* add new test case for cast/try_cast expr in reduceOuterJoin

7 days ago Add documentation for querying S3 data with CLI (#3631)
Andy Grove [Wed, 28 Sep 2022 20:56:50 +0000 (14:56 -0600)] 
Add documentation for querying S3 data with CLI (#3631)

* Add documentation for querying S3 data with CLI

* add s3 example

* update test

* fix example, use AWS_REGION

* prettier

* toml fmt

7 days ago Add serialization of `ScalarValue::Struct` (#3536)
Andrew Lamb [Wed, 28 Sep 2022 20:16:21 +0000 (16:16 -0400)] 
Add serialization of `ScalarValue::Struct` (#3536)

* Add serialization of `ScalarValue::Struct`

* Remove explicit is_null encoding

* Restore submodules

7 days ago add rules (#3627)
Remzi Yang [Wed, 28 Sep 2022 19:08:24 +0000 (03:08 +0800)] 
add rules (#3627)

Signed-off-by: remzi <13716567376yh@gmail.com>
8 days ago Check each query has same number of columns when building the UNION plan (#3638)
Remzi Yang [Wed, 28 Sep 2022 14:43:32 +0000 (22:43 +0800)] 
Check each query has same number of columns when building the UNION plan (#3638)

8 days ago [feat] Support using offset index in ParquetRecordBatchStream when pu… (#3616)
Yang Jiang [Wed, 28 Sep 2022 09:41:06 +0000 (17:41 +0800)] 
[feat] Support using offset index in ParquetRecordBatchStream when pu… (#3616)

* [feat] Support using offset index in ParquetRecordBatchStream when pushing down RowFilter.

Signed-off-by: yangjiang <yangjiang@ebay.com>
* Update datafusion/core/src/physical_plan/file_format/parquet.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Signed-off-by: yangjiang <yangjiang@ebay.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
8 days ago Use arrow row format in SortPreservingMerge (#3386)
Raphael Taylor-Davies [Tue, 27 Sep 2022 18:20:55 +0000 (19:20 +0100)] 
Use arrow row format in SortPreservingMerge (#3386)

9 days ago Optimize `regex_replace` for scalar patterns (#3614)
Batuhan Taskaya [Tue, 27 Sep 2022 06:32:42 +0000 (09:32 +0300)] 
Optimize `regex_replace` for scalar patterns (#3614)

* Optimize `regex_replace` for scalar patterns

* Change the hot-path on `regexp_replace` to only variadic source (#2)

9 days ago Document ObjectStoreProvider (#3619)
Raphael Taylor-Davies [Tue, 27 Sep 2022 01:24:26 +0000 (02:24 +0100)] 
Document ObjectStoreProvider (#3619)

9 days ago MINOR: the .tbl files don't have headers. after conversion to parquet, it was missing...
Kirk Mitchener [Mon, 26 Sep 2022 18:58:54 +0000 (14:58 -0400)] 
MINOR: the .tbl files don't have headers. after conversion to parquet, it was missing a row. (#3620)

10 days ago Stop wasting time in CI on MIRI runs (#3610)
Andrew Lamb [Mon, 26 Sep 2022 11:22:47 +0000 (07:22 -0400)] 
Stop wasting time in CI on MIRI runs (#3610)

10 days ago Simplify `concat_ws(null, ..)` to `null` (#3608)
Remzi Yang [Mon, 26 Sep 2022 11:19:21 +0000 (19:19 +0800)] 
Simplify `concat_ws(null, ..)` to `null` (#3608)

* simplify when separator is null

Signed-off-by: remzi <13716567376yh@gmail.com>
* fix clippy

Signed-off-by: remzi <13716567376yh@gmail.com>
* check nulls in other positions are not simplified to scalar null

Signed-off-by: remzi <13716567376yh@gmail.com>
* fmt

Signed-off-by: remzi <13716567376yh@gmail.com>
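
The rewrite described above can be pictured with a small sketch. `Arg`, `Simplified`, and `simplify_concat_ws` are hypothetical names introduced for illustration and are not DataFusion types. The only behavior taken from the commit is that a null separator collapses the call to a scalar null, while nulls in other argument positions are left alone.

```rust
/// Toy argument type (hypothetical; not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Null,
    Str(String),
    Column(String),
}

/// Result of the rewrite: either a scalar NULL or the original call.
#[derive(Debug, Clone, PartialEq)]
enum Simplified {
    Null,
    ConcatWs { sep: Arg, args: Vec<Arg> },
}

/// Rewrite `concat_ws(sep, args...)` to NULL only when the separator is NULL.
fn simplify_concat_ws(sep: Arg, args: Vec<Arg>) -> Simplified {
    match sep {
        // A NULL separator makes the whole call NULL.
        Arg::Null => Simplified::Null,
        // NULLs among the other arguments do not trigger the rewrite.
        sep => Simplified::ConcatWs { sep, args },
    }
}
```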
11 days ago MINOR: improve docstrings on SessionContext (#3603)
Andrew Lamb [Sun, 25 Sep 2022 11:51:15 +0000 (07:51 -0400)] 
MINOR: improve docstrings on SessionContext (#3603)

12 days ago Prevent memory overflows (and spills) on sorts with a fixed limit (#3593)
Batuhan Taskaya [Sat, 24 Sep 2022 16:36:26 +0000 (19:36 +0300)] 
Prevent memory overflows (and spills) on sorts with a fixed limit (#3593)

12 days ago Merge s3_build_error and s3_success into one test (#3602)
Rito Takeuchi [Sat, 24 Sep 2022 12:36:56 +0000 (21:36 +0900)] 
Merge s3_build_error and s3_success into one test (#3602)

12 days ago add function SessionContext::register_batch to register a single RecordBatch as a...
BaymaxHWY [Sat, 24 Sep 2022 11:51:59 +0000 (19:51 +0800)] 
add function SessionContext::register_batch to register a single RecordBatch as a table (#3600)

12 days ago remove type coercion in the binary physical expr (#3396)
Kun Liu [Sat, 24 Sep 2022 11:51:12 +0000 (19:51 +0800)] 
remove type coercion in the binary physical expr (#3396)

* remove type coercion binary from phy

* fix test case

* revert the fix for #3387

* type coercion before simplify expression

* complete remove the type coercion in the physical plan

* refactor

* merge master

* refactor

* do type coercion in the simplify expression

* Add comments

* fix: fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
12 days ago feat: use ObjectStoreProvider to provide s3 and gcs object stores to the cli (#3540)
gorkem [Fri, 23 Sep 2022 19:28:12 +0000 (12:28 -0700)] 
feat: use ObjectStoreProvider to provide s3 and gcs object stores to the cli (#3540)

13 days ago Fix docs.rs (#3580)
Brent Gardner [Fri, 23 Sep 2022 12:35:48 +0000 (05:35 -0700)] 
Fix docs.rs (#3580)

* I think this will fix docs.rs but no idea how to test

* Generate to correct location

* Clippy

* Alphabetize

* Feature -> cfg

* New clippy is more demanding

13 days ago [CI] Fix the newly added linting errors to make clippy happy (#3598)
Batuhan Taskaya [Fri, 23 Sep 2022 09:08:38 +0000 (12:08 +0300)] 
[CI] Fix the newly added linting errors to make clippy happy (#3598)

13 days ago update datafusion cli deps (#3588)
Jiayu Liu [Fri, 23 Sep 2022 00:50:04 +0000 (08:50 +0800)] 
update datafusion cli deps (#3588)

13 days ago Use consistent name for `TimeUnit::Millisecond` (#3575)
Andrew Lamb [Thu, 22 Sep 2022 16:56:28 +0000 (12:56 -0400)] 
Use consistent name for  `TimeUnit::Millisecond` (#3575)

13 days ago Add serialization of `ScalarValue::IntervalMonthDayNano` (#3535)
Andrew Lamb [Thu, 22 Sep 2022 16:55:46 +0000 (12:55 -0400)] 
Add serialization of `ScalarValue::IntervalMonthDayNano` (#3535)

2 weeks ago Update cranelift* dependencies `0.87` --> `0.88` (#3586)
Andrew Lamb [Thu, 22 Sep 2022 16:21:22 +0000 (12:21 -0400)] 
Update cranelift* dependencies `0.87` --> `0.88` (#3586)

* Update cranelift* dependencies `0.87` --> `0.88`

* fix other instance

2 weeks ago Config support type conversion (#3522)
comphead [Thu, 22 Sep 2022 16:21:01 +0000 (09:21 -0700)] 
Config support type conversion (#3522)

2 weeks ago Make ObjectStoreProvider fallible (#3584)
Raphael Taylor-Davies [Thu, 22 Sep 2022 11:13:40 +0000 (12:13 +0100)] 
Make ObjectStoreProvider fallible (#3584)

2 weeks ago fix coercion of null for decimal math in binary_rules (#3549)
Kirk Mitchener [Thu, 22 Sep 2022 11:01:22 +0000 (07:01 -0400)] 
fix coercion of null for decimal math in binary_rules (#3549)

* fix

* revert refactor -- handedness matters to decimal rules

2 weeks ago fix and tests (#3567)
Kirk Mitchener [Thu, 22 Sep 2022 10:49:21 +0000 (06:49 -0400)] 
fix and tests (#3567)

2 weeks ago Add Dask SQL to list of projects powered by DataFusion (#3581)
Andy Grove [Thu, 22 Sep 2022 00:03:07 +0000 (18:03 -0600)] 
Add Dask SQL to list of projects powered by DataFusion (#3581)

2 weeks ago Fix logical plan serialization (#3574)
Dan Harris [Wed, 21 Sep 2022 21:00:18 +0000 (17:00 -0400)] 
Fix logical plan serialization  (#3574)

* Fix issue in type coercion optimizer

* Update datafusion/optimizer/src/type_coercion.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 weeks ago Fix build (#3576)
Andrew Lamb [Wed, 21 Sep 2022 17:57:59 +0000 (13:57 -0400)] 
Fix build (#3576)

2 weeks ago Reduce dependencies of `datafusion-sql` crate (#3566)
Matthijs Brobbel [Wed, 21 Sep 2022 17:50:12 +0000 (19:50 +0200)] 
Reduce dependencies of `datafusion-sql` crate (#3566)

* Remove tokio dependency from datafusion-sql crate

* Remove hashbrown dependency from datafusion-sql crate

* Remove ahash dependency from datafusion-sql crate

* Reduce features in common, expr and sql crates

* Disable default features of arrow dependency in jit crate

* Enable `snappy` feature of `apache-avro` dependency in `common` crate

* Fix Cargo.toml feature ordering

2 weeks ago remove is_dictionary checks from binary_rule.rs that have been implemented in arrow...
Kirk Mitchener [Wed, 21 Sep 2022 17:48:42 +0000 (13:48 -0400)] 
remove is_dictionary checks from binary_rule.rs that have been implemented in arrow-rs already (#3552)

2 weeks ago Use `fetch` limit in get_sorted_iter (#3545)
Daniël Heres [Wed, 21 Sep 2022 17:07:19 +0000 (19:07 +0200)] 
Use `fetch` limit in get_sorted_iter (#3545)

* Add fetch, fix length

* Add fetch, fix length

* Simplify implementation a bit

* Simplify

* Doc

* Reorder

* Move parallel sort to planner

* Simplify a bit more

* Update datafusion/core/src/physical_plan/planner.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 weeks ago Add serialization of `ScalarValue::Binary` and `ScalarValue::LargeBinary`, `ScalarVal...
Andrew Lamb [Wed, 21 Sep 2022 16:37:19 +0000 (12:37 -0400)] 
Add serialization of `ScalarValue::Binary` and `ScalarValue::LargeBinary`, `ScalarValue::Time64` (#3534)

2 weeks ago Update pbjson-types requirement from 0.3 to 0.5 (#3560)
dependabot[bot] [Wed, 21 Sep 2022 13:54:32 +0000 (14:54 +0100)] 
Update pbjson-types requirement from 0.3 to 0.5 (#3560)

Updates the requirements on [pbjson-types](https://github.com/influxdata/pbjson) to permit the latest version.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson-types
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 weeks ago Update pbjson requirement from 0.3 to 0.5 (#3559)
dependabot[bot] [Wed, 21 Sep 2022 13:53:07 +0000 (14:53 +0100)] 
Update pbjson requirement from 0.3 to 0.5 (#3559)

Updates the requirements on [pbjson](https://github.com/influxdata/pbjson) to permit the latest version.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 weeks ago Update pbjson-build requirement from 0.3 to 0.5 (#3558)
dependabot[bot] [Wed, 21 Sep 2022 13:52:57 +0000 (14:52 +0100)] 
Update pbjson-build requirement from 0.3 to 0.5 (#3558)

Updates the requirements on [pbjson-build](https://github.com/influxdata/pbjson) to permit the latest version.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson-build
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 weeks ago enable q19 in TPCH (#3553)
Kirk Mitchener [Wed, 21 Sep 2022 10:48:27 +0000 (06:48 -0400)] 
enable q19 in TPCH (#3553)

2 weeks ago type coercion: support is/is_not_`bool`/like/unknown expr (#3510)
Kun Liu [Wed, 21 Sep 2022 01:43:25 +0000 (09:43 +0800)] 
type coercion: support is/is_not_`bool`/like/unknown expr (#3510)

* type coercion: support the boolean op, is_true, is_not_true, is_false, is_not_false

* Apply suggestions from code review

Co-authored-by: Daniël Heres <danielheres@gmail.com>
* support like, unknown for type coercion

* Update datafusion/optimizer/src/type_coercion.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 weeks ago Make ParquetScanOptions public and add method to get a reference from ParquetExec...
Dan Harris [Tue, 20 Sep 2022 19:47:17 +0000 (15:47 -0400)] 
Make ParquetScanOptions public and add method to get a reference from ParquetExec (#3551)

2 weeks ago Convert more cross joins to inner joins (Address performance/execution plan of TPCH...
Dhamotharan Sritharan [Tue, 20 Sep 2022 18:34:28 +0000 (00:04 +0530)] 
Convert more cross joins to inner joins (Address performance/execution plan of TPCH query 19) (#3482)

* Address performance/execution plan of TPCH query 19

    * Added the new optimizer rule reduce_cross_join which would convert cross joins to inner joins if the filter has the join predicates for the corresponding tables.

* updating plan change after merging with latest master

* minor fixes based on CI

- fixing the build failure in CI -> Rust/ clippy
- Fixing the testcases

* improved based on review comments

* Added more testcases

-- Added more testcases and fixed minor fixes
-- Addressed review comments

* resolved clippy and lint failures

* resolved clippy failure
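
The commit body above describes `reduce_cross_join` as converting a cross join into an inner join when the filter contains join predicates for the corresponding tables. A minimal sketch of that idea follows. `Pred`, `Plan`, `scan_name`, and `is_join_key` are hypothetical types and helpers invented for illustration, not DataFusion's `LogicalPlan`, and this toy version only handles a filter directly above a cross join of two table scans.

```rust
/// Toy predicate (hypothetical): either `t1.c1 = t2.c2` or something opaque.
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    /// Equality between two (table, column) references.
    ColEq { left: (String, String), right: (String, String) },
    Other(String),
}

/// Toy logical plan (hypothetical; not DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    CrossJoin(Box<Plan>, Box<Plan>),
    InnerJoin { left: Box<Plan>, right: Box<Plan>, on: Vec<Pred> },
    Filter { input: Box<Plan>, preds: Vec<Pred> },
}

fn scan_name(plan: &Plan) -> Option<&str> {
    match plan {
        Plan::Scan(name) => Some(name.as_str()),
        _ => None,
    }
}

/// True when the predicate equates a column of `l` with a column of `r`.
fn is_join_key(pred: &Pred, l: &str, r: &str) -> bool {
    match pred {
        Pred::ColEq { left, right } => {
            (left.0 == l && right.0 == r) || (left.0 == r && right.0 == l)
        }
        Pred::Other(_) => false,
    }
}

/// Turn `Filter(CrossJoin(a, b))` into `InnerJoin(a, b)` when the filter
/// holds equality predicates linking the two sides.
fn reduce_cross_join(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { input, preds } => match *input {
            Plan::CrossJoin(left, right) => {
                let names = match (scan_name(&left), scan_name(&right)) {
                    (Some(l), Some(r)) => Some((l.to_string(), r.to_string())),
                    _ => None,
                };
                let (l_name, r_name) = match names {
                    Some(n) => n,
                    // This sketch leaves anything but scan-vs-scan untouched.
                    None => {
                        return Plan::Filter {
                            input: Box::new(Plan::CrossJoin(left, right)),
                            preds,
                        }
                    }
                };
                // Split the filter into join keys and remaining predicates.
                let (keys, rest): (Vec<Pred>, Vec<Pred>) = preds
                    .into_iter()
                    .partition(|p| is_join_key(p, &l_name, &r_name));
                if keys.is_empty() {
                    return Plan::Filter {
                        input: Box::new(Plan::CrossJoin(left, right)),
                        preds: rest,
                    };
                }
                let join = Plan::InnerJoin { left, right, on: keys };
                if rest.is_empty() {
                    join
                } else {
                    Plan::Filter { input: Box::new(join), preds: rest }
                }
            }
            other => Plan::Filter { input: Box::new(other), preds },
        },
        other => other,
    }
}
```

For a TPCH query 19 shaped input, a filter holding `part.p_partkey = lineitem.l_partkey` over a cross join of `part` and `lineitem` would become an inner join on that key, with the remaining predicates kept in a filter above it.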

2 weeks ago Bug fix: expr_visitor was not visiting aggregate filter expressions (#3548)
Andy Grove [Tue, 20 Sep 2022 18:10:07 +0000 (12:10 -0600)] 
Bug fix: expr_visitor was not visiting aggregate filter expressions (#3548)

2 weeks ago Push down limit to sort (#3530)
Daniël Heres [Tue, 20 Sep 2022 13:11:42 +0000 (15:11 +0200)] 
Push down limit to sort (#3530)

* Push down limit to sort

Support skip, fix test

Fmt

Add limit directly after sort

Update comment

Simplify parallel sort by using new pushdown

Clippy

* Update datafusion/core/src/physical_plan/sorts/sort.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
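
The commit above does not show how the sort uses the pushed-down limit. One generic way a sort can benefit from a known limit is a bounded-heap top-k, sketched below. `sort_with_fetch` is a hypothetical helper, and this is not a description of DataFusion's `SortExec`.

```rust
use std::collections::BinaryHeap;

/// Return the `fetch` smallest values in ascending order while holding
/// at most `fetch` values in memory at any time.
fn sort_with_fetch(input: impl IntoIterator<Item = i64>, fetch: usize) -> Vec<i64> {
    if fetch == 0 {
        return Vec::new();
    }
    // Max-heap: the root is the largest value currently kept.
    let mut heap: BinaryHeap<i64> = BinaryHeap::with_capacity(fetch);
    for v in input {
        if heap.len() < fetch {
            heap.push(v);
        } else if let Some(mut max) = heap.peek_mut() {
            // Replace the current maximum when a smaller value arrives;
            // the heap is repaired when `max` goes out of scope.
            if v < *max {
                *max = v;
            }
        }
    }
    heap.into_sorted_vec()
}
```

Memory stays proportional to the limit rather than to the input size, which is the property a fixed limit makes possible.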
2 weeks ago Add additional pruning tests with casts, handle unsupported predicates better (#3454)
Andrew Lamb [Tue, 20 Sep 2022 09:58:57 +0000 (05:58 -0400)] 
Add additional pruning tests with casts, handle unsupported predicates better (#3454)

* Add tests for pruning, support pruning with constant expressions

* Use downcast_any!

* chore: Remove unneeded use

2 weeks ago Upgrade to arrow 23.0.0 (#3483)
Andrew Lamb [Tue, 20 Sep 2022 09:55:17 +0000 (05:55 -0400)] 
Upgrade to arrow 23.0.0 (#3483)

* Changes for API

* Update avro code for API changes

* Use `divide_opt` kernel

* Update update_arrow_deps.py

* Update arrow dependency to 23.0.0

* Use nicer RecordBatchOptions API

* cleanups

* fix: update

2 weeks ago Actually test that `ScalarValue`s are the same after round trip serialization (#3537)
Andrew Lamb [Tue, 20 Sep 2022 07:14:45 +0000 (03:14 -0400)] 
Actually test that `ScalarValue`s are the same after round trip serialization (#3537)

2 weeks ago feat: Union types coercion (#3513)
George Andronchik [Mon, 19 Sep 2022 21:49:37 +0000 (05:49 +0800)] 
feat: Union types coercion (#3513)

1

2 weeks ago Add Debug to TableReference and ResolvedTableReference (#3533)
Andy Grove [Mon, 19 Sep 2022 18:38:58 +0000 (12:38 -0600)] 
Add Debug to TableReference and ResolvedTableReference (#3533)

2 weeks ago Add support for `ScalarValue::Dictionary` to datafusion-proto (#3532)
Andrew Lamb [Mon, 19 Sep 2022 18:13:54 +0000 (14:13 -0400)] 
Add support for `ScalarValue::Dictionary` to datafusion-proto (#3532)

2 weeks ago Execute sort in parallel when a limit is used after sort (#3527)
Daniël Heres [Mon, 19 Sep 2022 12:48:01 +0000 (14:48 +0200)] 
Execute sort in parallel when a limit is used after sort (#3527)

* Parallel sort

* Move it to optimization rule

* Add rule

* Improve rule

* Remove bench

* Fix doc

* Fix indent

2 weeks ago MINOR: reduce pub in OptimizerConfig (#3525)
Andrew Lamb [Mon, 19 Sep 2022 05:57:44 +0000 (01:57 -0400)] 
MINOR: reduce pub in OptimizerConfig (#3525)

2 weeks ago [DataFrame] - Add cache function for DataFrame (#3512)
Francis Du [Sun, 18 Sep 2022 15:32:04 +0000 (23:32 +0800)] 
[DataFrame] - Add cache function for DataFrame (#3512)

* feat: add cache function for dataframe

* fix: function doc typo

* fix: test issue

2 weeks ago Change `downcast_any!` macro so it does not need to use `use std::any::type_name...
Andrew Lamb [Sat, 17 Sep 2022 11:00:07 +0000 (07:00 -0400)] 
Change `downcast_any!` macro so it does not need to use `use std::any::type_name;` (#3484)

2 weeks ago fix divide by zero not throwing proper error for decimal (#3517)
Kirk Mitchener [Sat, 17 Sep 2022 10:46:16 +0000 (06:46 -0400)] 
fix divide by zero not throwing proper error for decimal (#3517)

* fix divide by zero for decimal, add tests

* check for actual error strings

* move scalar divide by zero check out of the loop. fix precision in test.

2 weeks ago MINOR: Add more execs to list of supported execs (#3519)
Andy Grove [Sat, 17 Sep 2022 03:49:39 +0000 (21:49 -0600)] 
MINOR: Add more execs to list of supported execs (#3519)

* prettier

* add more execs to docs

* Revert

* Update datafusion/core/src/lib.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Update datafusion/core/src/lib.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Update datafusion/core/src/lib.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Update datafusion/core/src/lib.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Update datafusion/core/src/lib.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2 weeks ago Make FileStream and FileOpener public (#3514)
Dan Harris [Fri, 16 Sep 2022 18:37:12 +0000 (14:37 -0400)] 
Make FileStream and FileOpener public (#3514)

2 weeks ago Add additional DATE_PART units (#3503)
Jon Mease [Fri, 16 Sep 2022 14:03:08 +0000 (10:03 -0400)] 
Add additional DATE_PART units (#3503)

* Add dow support to date_part

* Add doy support to date_part

* Add quarter support to date_part

2 weeks ago prettier (#3504)
Andy Grove [Fri, 16 Sep 2022 10:45:16 +0000 (04:45 -0600)] 
prettier (#3504)

2 weeks ago MINOR: remove unused dependencies (#3508)
Ruihang Xia [Fri, 16 Sep 2022 07:55:44 +0000 (15:55 +0800)] 
MINOR: remove unused dependencies (#3508)

* MINOR: remove unused dependencies

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* revert rand

* add newline

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2 weeks ago add time_zone into ConfigOptions (#3485)
Wei-Ting Kuo [Thu, 15 Sep 2022 17:04:17 +0000 (01:04 +0800)] 
add time_zone into ConfigOptions (#3485)

* add time_zone into ConfigOptions

* fix debug leftover

* Update datafusion/core/src/config.rs

Co-authored-by: Kun Liu <liukun@apache.org>
3 weeks ago Add BitwiseXor in function from_proto_binary_op (#3496)
askoa [Thu, 15 Sep 2022 12:43:22 +0000 (08:43 -0400)] 
Add BitwiseXor in function from_proto_binary_op (#3496)

3 weeks ago make the function from_proto_binary_op public (#3490)
askoa [Thu, 15 Sep 2022 01:44:57 +0000 (21:44 -0400)] 
make the function from_proto_binary_op public (#3490)

Co-authored-by: askoa <askoa@local>
3 weeks ago minor: fix bug in `downcast_value!` macro (#3486)
Andrew Lamb [Thu, 15 Sep 2022 01:44:35 +0000 (21:44 -0400)] 
minor: fix bug in `downcast_value!` macro (#3486)

3 weeks ago change the null type in the row filter (#3470)
Kun Liu [Thu, 15 Sep 2022 00:56:22 +0000 (08:56 +0800)] 
change the null type in the row filter (#3470)

3 weeks ago Add Parseable as Datafusion user (#3471)
Nitish Tiwari [Wed, 14 Sep 2022 20:54:13 +0000 (02:24 +0530)] 
Add Parseable as Datafusion user (#3471)

3 weeks ago add FixedSizeBinary support to create_hashes (#3458)
Morgan Cassels [Wed, 14 Sep 2022 17:52:25 +0000 (10:52 -0700)] 
add FixedSizeBinary support to create_hashes (#3458)

* add FixedSizeBinary support to create_hashes

* remove equality and inequality assertions about FixedSizeBinary hashes because feature force_hash_collisions changes hash values

Co-authored-by: Morgan Cassels <morgan@urbanlogiq.com>
3 weeks ago [minor] Remove unused arg in macro in Inlist (#3474)
Yang Jiang [Wed, 14 Sep 2022 17:49:50 +0000 (01:49 +0800)] 
[minor] Remove unused arg in macro in Inlist (#3474)

* remove unused arg in macro in Inlist

* fix fmt

3 weeks ago inlist: move type coercion to logical phase (#3472)
Kun Liu [Wed, 14 Sep 2022 15:04:48 +0000 (23:04 +0800)] 
inlist: move type coercion to logical phase (#3472)

3 weeks ago Support ShowVariable Statement (#3455)
Wei-Ting Kuo [Wed, 14 Sep 2022 12:47:07 +0000 (20:47 +0800)] 
Support ShowVariable Statement (#3455)

* support "SHOW VARIABLE;"

* fix test case

* fix comment

* fix clippy

* rename settings -> df_settings