druid.git
2 days ago Web console: fix DQT import (#13159) master
Vadim Ogievetsky [Fri, 30 Sep 2022 16:31:06 +0000 (09:31 -0700)] 
Web console: fix DQT import (#13159)

* fix dqt import

* update licenses

* update tests

3 days ago Fix over-replication caused by balancing when inventory is not updated yet (#13114)
Kashif Faraz [Thu, 29 Sep 2022 06:36:23 +0000 (12:06 +0530)] 
Fix over-replication caused by balancing when inventory is not updated yet (#13114)

* Add coordinator test framework

* Remove outdated changes

* Add more tests

* Add option to auto-sync inventory

* Minor cleanup

* Fix inspections

* Add README for simulations, add SegmentLoadingNegativeTest

* Fix over-replication from balancing

* Fix README

* Cleanup unnecessary fields from DruidCoordinator

* Add a test

* Fix DruidCoordinatorTest

* Remove unused import

* Fix CuratorDruidCoordinatorTest

* Remove test log4j2.xml

4 days ago Fix assertion error in sql planning for latest aggregators (#13151)
Abhishek Agarwal [Wed, 28 Sep 2022 15:31:32 +0000 (21:01 +0530)] 
Fix assertion error in sql planning for latest aggregators (#13151)

* Fix sql planning bug for latest aggregators

* change test name

* Fix error messages

* fix error message again

4 days ago Upgrade kafka version to 3.2.3 to fix CVE (#13142)
AmatyaAvadhanula [Wed, 28 Sep 2022 05:17:09 +0000 (10:47 +0530)] 
Upgrade kafka version to 3.2.3 to fix CVE (#13142)

Upgrade to 3.2.3 to fix CVE: https://nvd.nist.gov/vuln/detail/CVE-2022-34917

4 days ago Correct nested columns example (#13150)
Jill Osborne [Wed, 28 Sep 2022 05:09:56 +0000 (06:09 +0100)] 
Correct nested columns example (#13150)

5 days ago Add a note to the documentation about pre-built HLLSketches (#13088)
David Palmer [Tue, 27 Sep 2022 02:29:39 +0000 (15:29 +1300)] 
Add a note to the documentation about pre-built HLLSketches (#13088)

* add a note to the documentation about pre-built HLLSketches

Druid actually supports ingesting a pre-generated sketch column by using
the HLLSketchMerge aggregator. However, this functionality was
previously not made clear in the documentation.

* copyedit from the King's English to American English

* add suggested style changes

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

5 days ago Fix documentation bug about injective lookups (#13147)
Apoorv Gupta [Tue, 27 Sep 2022 02:16:48 +0000 (19:16 -0700)] 
Fix documentation bug about injective lookups (#13147)

replace mapping to `unique keys` with mapping to `unique values`.

5 days ago Add BIG_SUM SQL function (#13102)
Sam Rash [Tue, 27 Sep 2022 01:02:25 +0000 (18:02 -0700)] 
Add BIG_SUM SQL function (#13102)

This adds a sql function, "BIG_SUM", that uses
CompressedBigDecimal to do a sum. Other misc changes:

1. handle NumberFormatExceptions when parsing a string (default to set
   to 0, configurable in agg factory to be strict and throw on error)
2. format pom file (whitespace) + add dependency
3. scaleUp -> scale and always require scale as a parameter

5 days ago Add JsonInputFormat option to assume newline delimited JSON, improve parse exception...
Jonathan Wei [Tue, 27 Sep 2022 00:51:04 +0000 (19:51 -0500)] 
Add JsonInputFormat option to assume newline delimited JSON, improve parse exception handling for multiline JSON (#13089)

* Add JsonInputFormat option to assume newline delimited JSON, improve handling for non-NDJSON

* Fix serde and docs

* Add PR comment check

5 days ago Grab the thread name in a poisoned pool (#13143)
imply-cheddar [Tue, 27 Sep 2022 00:09:10 +0000 (09:09 +0900)] 
Grab the thread name in a poisoned pool (#13143)

8 days ago Fix the Injector creation in HadoopTask (#13138)
Laksh Singla [Sat, 24 Sep 2022 05:08:25 +0000 (10:38 +0530)] 
Fix the Injector creation in HadoopTask (#13138)

* Injector fix in HadoopTask

* Log the ExtensionsConfig while instantiating the HadoopTask

* Log the config in the run() method instead of the ctor

9 days ago Suppress Calcite CVE (#13119)
Adarsh Sanjeev [Fri, 23 Sep 2022 10:53:26 +0000 (16:23 +0530)] 
Suppress Calcite CVE (#13119)

* Suppress Calcite CVE

* Update comment

10 days ago better spec conversion with issues (#13136)
Vadim Ogievetsky [Thu, 22 Sep 2022 17:46:57 +0000 (10:46 -0700)] 
better spec conversion with issues (#13136)

10 days ago initialize all counters for stages with input (#13137)
Vadim Ogievetsky [Thu, 22 Sep 2022 15:10:50 +0000 (08:10 -0700)] 
initialize all counters for stages with input (#13137)

10 days ago Add IT for MSQ task engine using the new IT framework (#12992)
Laksh Singla [Thu, 22 Sep 2022 10:39:47 +0000 (16:09 +0530)] 
Add IT for MSQ task engine using the new IT framework (#12992)

* first test, serde causing problems

* serde working

* insert and select check

* Add cluster annotations for MSQ test cases

* Add cluster config for MSQ

* Add MSQ config to the pom.xml

* cleanup unnecessary changes

* Remove model classes

* Comments, checkstyle, check queries from file

* fixup test case name

* build failure fix

* review changes

* build failure fix

* Trigger Build

* Log the mismatch in QueryResultsVerifier

* Trigger Build

* Change the signature of the results verifier

* review changes

* LGTM fix

* build, change pom

* Trigger Build

* Trigger Build

* trigger build with minimal pom changes

* guice fix in tests

* travis.yml

10 days ago Optimize CompressedBigDecimal compareTo() (#13086)
Sam Rash [Thu, 22 Sep 2022 03:31:02 +0000 (20:31 -0700)] 
Optimize CompressedBigDecimal compareTo() (#13086)

Optimizes the compareTo() function in
CompressedBigDecimal. It directly compares the int[] rather than
creating BigDecimal objects and using its compareTo.

It handles unequal sized CBDs, but does require
the scales to match.
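
The direct array comparison described above can be sketched roughly as follows. This is an illustration only, not the CompressedBigDecimal code: it assumes each value is a non-empty, little-endian, two's-complement `int[]` and that both values already share the same scale. The class and method names are hypothetical.

```java
final class IntArrayCompare {
    // Hypothetical helper: word i of the value, sign-extended past the end of
    // the array so that arrays of unequal length can be compared directly.
    private static int wordAt(int[] words, int i) {
        if (i < words.length) {
            return words[i];
        }
        return words[words.length - 1] < 0 ? -1 : 0;
    }

    // Compares two little-endian two's-complement magnitudes of equal scale
    // without allocating BigDecimal objects.
    static int compare(int[] lhs, int[] rhs) {
        int n = Math.max(lhs.length, rhs.length);
        // The most significant word carries the sign, so compare it as signed.
        int l = wordAt(lhs, n - 1);
        int r = wordAt(rhs, n - 1);
        if (l != r) {
            return Integer.compare(l, r);
        }
        // Remaining words are plain base-2^32 digits, so compare them as unsigned.
        for (int i = n - 2; i >= 0; i--) {
            l = wordAt(lhs, i);
            r = wordAt(rhs, i);
            if (l != r) {
                return Integer.compareUnsigned(l, r);
            }
        }
        return 0;
    }
}
```

Sign extension is what lets a one-word value be compared against a two-word value; the scale check is left to the caller, matching the stated requirement that scales must match.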

10 days ago append to existing callout (#13130)
Vadim Ogievetsky [Thu, 22 Sep 2022 02:39:28 +0000 (19:39 -0700)]
append to existing callout (#13130)

10 days ago update log4j example (#13095)
Charles Smith [Thu, 22 Sep 2022 01:46:49 +0000 (18:46 -0700)] 
update log4j example (#13095)

* update log4j example

* fix some style issues

* Update docs/configuration/logging.md

Co-authored-by: Frank Chen <frankchen@apache.org>

10 days ago fix: fix broken postgres link (#13135)
317brian [Thu, 22 Sep 2022 01:46:20 +0000 (18:46 -0700)] 
fix: fix broken postgres link (#13135)

10 days ago fix: follow naming convention for msq task engine (#13127)
317brian [Thu, 22 Sep 2022 01:46:06 +0000 (18:46 -0700)] 
fix: follow naming convention for msq task engine (#13127)

* fix: follow naming convention for msq task engine

* more fixes

* add back in experimental

* fix anchor

11 days ago Update pull-deps docs with correct repo list. (#13134)
Gian Merlino [Wed, 21 Sep 2022 19:16:57 +0000 (12:16 -0700)] 
Update pull-deps docs with correct repo list. (#13134)

There is only one default remote repo at this time.

11 days ago Add KafkaConfigOverrides extension point (#13122)
Jonathan Wei [Wed, 21 Sep 2022 06:17:19 +0000 (01:17 -0500)] 
Add KafkaConfigOverrides extension point (#13122)

* Add KafkaConfigOverrides extension point

* X

11 days ago spatial-filters (#13124)
Katya Macedo [Wed, 21 Sep 2022 05:48:36 +0000 (00:48 -0500)] 
spatial-filters (#13124)

11 days ago Add test framework to simulate segment loading and balancing (#13074)
Kashif Faraz [Wed, 21 Sep 2022 04:21:58 +0000 (09:51 +0530)] 
Add test framework to simulate segment loading and balancing (#13074)

Fixes #12822

The framework added here makes it easy to write tests that verify the behaviour and interactions
of the following entities under various conditions:
- `DruidCoordinator`
- `HttpLoadQueuePeon`, `LoadQueueTaskMaster`
- coordinator duties: `BalanceSegments`, `RunRules`, `UnloadUnusedSegments`, etc.
- datasource retention rules: `LoadRule`, `DropRule`

Changes:
Add the following main classes:
- `CoordinatorSimulation` and related interfaces to dictate behaviour of simulation
- `CoordinatorSimulationBuilder` to build a simulation.
- `BlockingExecutorService` to keep submitted tasks in queue and execute them
  only when explicitly invoked.

Add tests:
- `CoordinatorSimulationBaseTest`, `SegmentLoadingTest`, `SegmentBalancingTest`
- `SegmentLoadingNegativeTest` to contain tests which assert the existing erroneous behaviour
of segment loading. Once the behaviour is fixed, these tests will be moved to the regular
`SegmentLoadingTest`.

Please refer to the README.md in `org.apache.druid.server.coordinator.simulate` for more details
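
The idea behind `BlockingExecutorService`, holding submitted tasks in a queue until the test explicitly asks for them to run, can be sketched as below. This is a simplified illustration under assumptions, not the class added by this commit: `QueueingExecutor` and `runAllPending` are hypothetical names, and only the plain `Executor` interface is implemented.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

final class QueueingExecutor implements Executor {
    // Tasks wait here until the test decides to run them.
    private final Queue<Runnable> pending = new ArrayDeque<>();

    @Override
    public synchronized void execute(Runnable task) {
        // Nothing runs at submission time; the task is only recorded.
        pending.add(task);
    }

    // Runs every queued task on the calling thread and returns how many ran.
    // Tasks submitted by a running task are picked up in the same call.
    public synchronized int runAllPending() {
        int executed = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
            executed++;
        }
        return executed;
    }
}
```

Deferring execution this way lets a test step through coordinator behaviour deterministically instead of racing background threads.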

11 days ago Clarified the behaviour of SQL COUNT(DISTINCT dim) on multi-value dimensions (#13128)
hosswald [Wed, 21 Sep 2022 01:03:34 +0000 (03:03 +0200)] 
Clarified the behaviour of SQL COUNT(DISTINCT dim) on multi-value dimensions (#13128)

* Clarified the behaviour of COUNT(DISTINCT column) on multi-value columns

* Update docs/querying/sql-aggregations.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Vadim Ogievetsky <vadimon@gmail.com>

11 days ago fix quickstart (#13126)
Vadim Ogievetsky [Wed, 21 Sep 2022 00:44:21 +0000 (17:44 -0700)] 
fix quickstart (#13126)

12 days ago Move JDK11 ITs to cron stage (#13075)
Abhishek Agarwal [Tue, 20 Sep 2022 16:18:52 +0000 (21:48 +0530)] 
Move JDK11 ITs to cron stage (#13075)

* Move JDK11 ITs to cron stage

* Make cron run on release branches

* Review comments

* fix spelling

12 days ago be consistent about referring to the web console by its name (#13118)
Vadim Ogievetsky [Mon, 19 Sep 2022 22:02:17 +0000 (15:02 -0700)] 
be consistent about referring to the web console by its name (#13118)

13 days ago Improve an MSQ planning error message (#13113)
Frank Chen [Mon, 19 Sep 2022 15:11:54 +0000 (23:11 +0800)]
Improve an MSQ planning error message (#13113)

13 days ago Getting extension list from pom (#13073)
abhagraw [Mon, 19 Sep 2022 09:44:21 +0000 (15:14 +0530)] 
Getting extension list from pom (#13073)

* Getting extension list from pom

* Trigger Build

13 days ago nested column serializer performance improvement for sparse columns (#13101)
Clint Wylie [Mon, 19 Sep 2022 08:37:48 +0000 (01:37 -0700)] 
nested column serializer performance improvement for sparse columns (#13101)

13 days ago Convert the Druid planner to use statement handlers (#12905)
Paul Rogers [Mon, 19 Sep 2022 06:28:45 +0000 (08:28 +0200)] 
Convert the Druid planner to use statement handlers (#12905)

* Converted Druid planner to use statement handlers

Converts the large collection of if-statements for statement
types into a set of classes: one per supported statement type.
Cleans up a few error messages.

* Revisions from review comments

* Build fix

* Build fix

* Resolve merge conflict.

* More merges with QueryResponse PR

* More parameterized type cleanup

Forces a rebuild due to a flaky test

13 days ago fix html tags in docs (#13117)
Vadim Ogievetsky [Mon, 19 Sep 2022 02:40:33 +0000 (19:40 -0700)] 
fix html tags in docs (#13117)

* fix html tags in docs

* revert not null

2 weeks ago Kill task: Don't include markAsUnused unless set. (#13104)
Gian Merlino [Sat, 17 Sep 2022 21:03:34 +0000 (14:03 -0700)] 
Kill task: Don't include markAsUnused unless set. (#13104)

Cleans up the serialized JSON.

2 weeks ago Web console: correctly escape path based flatten specs (#13105)
Vadim Ogievetsky [Sat, 17 Sep 2022 21:02:42 +0000 (14:02 -0700)] 
Web console: correctly escape path based flatten specs (#13105)

* fix path generation

* do escape

* fix replace

* fix replace for good

2 weeks ago Docs: Clarify the situation with SELECT. (#13109)
Gian Merlino [Sat, 17 Sep 2022 17:47:57 +0000 (10:47 -0700)] 
Docs: Clarify the situation with SELECT. (#13109)

2 weeks ago Add clarification around docker environment #8926 (#13084)
Charles Smith [Sat, 17 Sep 2022 12:44:24 +0000 (05:44 -0700)] 
Add clarification around docker environment #8926 (#13084)

* Add clarification around docker environment #8926

* fix spelling

* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>
* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>
* fix nano quickstart

Co-authored-by: Frank Chen <frankchen@apache.org>

2 weeks ago kafka consumer: custom serializer can't be configured after its instantiation (...
Ellen Shen [Sat, 17 Sep 2022 12:42:21 +0000 (05:42 -0700)]
kafka consumer: custom serializer can't be configured after its instantiation (#12960) (#13097)

* allow Kafka custom serializer to be configured

  * add unit tests

Co-authored-by: ellen shen <ellenshen@apple.com>

2 weeks ago Various documentation updates. (#13107)
Gian Merlino [Sat, 17 Sep 2022 04:58:11 +0000 (21:58 -0700)] 
Various documentation updates. (#13107)

* Various documentation updates.

1) Split out "data management" from "ingestion". Break it into thematic pages.

2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
   all conceptual content is in concepts.md and all syntax content is in reference.md.
   Shorten the known issues page to the most interesting ones.

3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
   index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.

4) Rename various mentions of "Druid console" to "web console".

5) Add additional information to ingestion/partitioning.md.

6) Remove a mention of Tranquility.

7) Remove a note about upgrading to Druid 0.10.1.

8) Remove no-longer-relevant task types from ingestion/tasks.md.

9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.

10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
    places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
    in the sidebar.

11) Make all br tags self-closing.

12) Certain other cosmetic changes.

13) Update to node-sass 7.

* make travis use node12 for docs

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>

2 weeks ago support kafka lookups (#13098)
Vadim Ogievetsky [Fri, 16 Sep 2022 22:25:25 +0000 (15:25 -0700)] 
support kafka lookups (#13098)

2 weeks ago Allocate numCorePartitions using only used segments (#13070)
AmatyaAvadhanula [Fri, 16 Sep 2022 13:46:36 +0000 (19:16 +0530)] 
Allocate numCorePartitions using only used segments (#13070)

* Allocate numCorePartitions using only used segments

* Add corePartition checks in existing test

* Separate committedMaxId and overallMaxId

* Fix bug: replace overall with committed

2 weeks ago Doc fixes around msq (#13090)
Vadim Ogievetsky [Fri, 16 Sep 2022 09:15:26 +0000 (02:15 -0700)] 
Doc fixes around msq (#13090)

* remove things that do not apply

* fix more things

* pin node to a working version

* fix

* fixes

* known issues tidy up

* revert auto formatting changes

* remove management-uis page which is 100% lies

* don't mention the Coordinator console (that no longer exists)

* goodies

* fix typo

2 weeks ago split up NestedDataColumnSerializer into separate files (#13096)
Clint Wylie [Fri, 16 Sep 2022 08:28:47 +0000 (01:28 -0700)] 
split up NestedDataColumnSerializer into separate files (#13096)

* split up NestedDataColumnSerializer into separate files

* fix it

2 weeks ago Documentation: Update spatial indexing example (#12555)
Katya Macedo [Fri, 16 Sep 2022 02:32:19 +0000 (21:32 -0500)] 
Documentation: Update spatial indexing example (#12555)

* fix spatial indexing example

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update text and example

* Format JSON example

* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/development/geo.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Accept review suggestions

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Frank Chen <frankchen@apache.org>

2 weeks ago Docs – README.md update around documentation contributions (#12850)
Peter Marshall [Fri, 16 Sep 2022 02:31:06 +0000 (03:31 +0100)] 
Docs – README.md update around documentation contributions (#12850)

* Update README.md

Expansion on the process and where everything is.

* Update README.md

Switcheroo and a typo fix.

* Update README.md

Header link update to take to the H2.

* Update README.md

Reverted docs link after feedback

* Update README.md

Amended language.

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update README.md

PR term update

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

2 weeks ago Some improvements about Docker (#13059)
Frank Chen [Fri, 16 Sep 2022 01:25:52 +0000 (09:25 +0800)] 
Some improvements about Docker (#13059)

2 weeks ago link to error docs (#13094)
Vadim Ogievetsky [Thu, 15 Sep 2022 22:06:08 +0000 (15:06 -0700)] 
link to error docs (#13094)

2 weeks ago Docs - README.md community channels removal + link (#12843)
Peter Marshall [Thu, 15 Sep 2022 12:52:46 +0000 (13:52 +0100)] 
Docs - README.md community channels removal + link (#12843)

* README.md community channels

Removed explicit links to project channels in favour of a link direct to the Community page on druid.apache.org.

Updated nav to match remaining headings in the README.

* Update README.md

Reintroduced the old section and amended the nav bar to point back to the community section.

* Incorporated suggested wording from @paul-rogers with some stylistic blahness
* Updated Slack phraseology to be closer to the Google User Groups header wording and called out specific channels
* Added new wording re: events and articles with link to the repo to contribute them

2 weeks ago Initialize NullValueHandlingConfig for failed tests (#13078)
Atul Mohan [Thu, 15 Sep 2022 12:47:10 +0000 (05:47 -0700)]
Initialize NullValueHandlingConfig for failed tests (#13078)

* Initialize null handling

* Refactor nullhandlingconfig init

2 weeks ago Faster fix for dangling tasks upon supervisor termination (#13072)
AmatyaAvadhanula [Thu, 15 Sep 2022 10:01:14 +0000 (15:31 +0530)] 
Faster fix for dangling tasks upon supervisor termination (#13072)

This commit fixes issues with delayed supervisor termination during certain transient states.
Tasks can be created during supervisor termination and left behind since the cleanup may
not consider these newly added tasks.

#12178 added a lock for the entire process of task creation to prevent such dangling tasks.
But it also introduced a deadlock scenario as follows:
- An invocation of `runInternal` is in progress.
- A `stop` request comes, acquires `stateChangeLock` and submits a `ShutdownNotice`
- `runInternal` keeps waiting to acquire the `stateChangeLock`
- `ShutdownNotice` remains stuck in the notice queue because `runInternal` is still running
- After some timeout, the supervisor goes through a forced termination

Fix:
 * `SeekableStreamSupervisor.runInternal` - do not try to acquire lock if supervisor is already stopping
 * `SupervisorStateManager.maybeSetState` - do not allow transitions from STOPPING state
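
The first part of the fix, skipping lock acquisition once a stop has been requested, can be illustrated with the sketch below. It is a toy model under assumptions, not the `SeekableStreamSupervisor` code: `StoppableRunner`, `runOnce` and the use of a `ReentrantLock` with a volatile flag are all hypothetical stand-ins chosen for brevity.

```java
import java.util.concurrent.locks.ReentrantLock;

final class StoppableRunner {
    private final ReentrantLock stateChangeLock = new ReentrantLock();
    // Volatile so a run can observe a stop request without taking the lock.
    private volatile boolean stopping = false;
    private int completedRuns = 0;

    // Marks the runner as stopping while holding the state-change lock.
    void stop() {
        stateChangeLock.lock();
        try {
            stopping = true;
        } finally {
            stateChangeLock.unlock();
        }
    }

    // Returns false without touching the lock when a stop was already requested.
    boolean runOnce() {
        if (stopping) {
            return false;
        }
        stateChangeLock.lock();
        try {
            // Re-check under the lock: stop() may have won the race.
            if (stopping) {
                return false;
            }
            completedRuns++;
            return true;
        } finally {
            stateChangeLock.unlock();
        }
    }

    // Accessor used by single-threaded callers to inspect progress.
    int completedRuns() {
        return completedRuns;
    }
}
```

Once `stop()` has been called, later runs return immediately rather than queueing up behind the lock, which is the behaviour the first bullet of the fix describes.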

2 weeks ago Move web-console dependency declaration from druid-server to druid-distribution ...
Frank Chen [Thu, 15 Sep 2022 02:39:30 +0000 (10:39 +0800)] 
Move web-console dependency declaration from druid-server to druid-distribution (#12501)

* Move web-console dependency from druid-server to distribution

* Add a test to check if the web-console is correctly integrated

* exclude web-console from 'other integration tests'

* Revert "exclude web-console from 'other integration tests'"

This reverts commit 8d72225544f83514c344b5ecd9c69c9b3114ee33.

* Revert "Add a test to check if the web-console is correctly integrated"

This reverts commit d6ac8f3087b22515b03e42fd57d24e7f3ddca254.

2 weeks ago Update Snappy to 1.1.8.4. (#13081)
Gian Merlino [Wed, 14 Sep 2022 22:13:47 +0000 (15:13 -0700)] 
Update Snappy to 1.1.8.4. (#13081)

* Update Snappy to 1.1.8.4.

Prior to this, because snappy-java wasn't included in dependencyManagement,
we actually shipped multiple different versions for different extensions,
ranging from 1.1.7.1 to 1.1.8.4. Now, we standardize on 1.1.8.4.

Among other things, this enables the tests to pass on M1 Macs.

* Update snappy-java versions in licenses.yaml.

2 weeks ago fix JsonParserIteratorTest (#13083)
Clint Wylie [Wed, 14 Sep 2022 03:49:57 +0000 (20:49 -0700)] 
fix JsonParserIteratorTest (#13083)

2 weeks ago Provide service specific log4j overrides in containerized deployments (#13020)
Atul Mohan [Wed, 14 Sep 2022 03:47:11 +0000 (20:47 -0700)] 
Provide service specific log4j overrides in containerized deployments (#13020)

* Provide service specific log4j overrides

* Clarify comments

* Add docs

2 weeks ago Compressed Big Decimal Cleanup and Extension (#13048)
sr [Wed, 14 Sep 2022 02:14:31 +0000 (19:14 -0700)] 
Compressed Big Decimal Cleanup and Extension (#13048)

1. remove unnecessary generic type from CompressedBigDecimal
2. support Number input types
3. support aggregator reading supported input types directly (uningested
   data)
4. fix scaling bug in buffer aggregator

2 weeks ago Avoid ClassCastException when getting values from `QueryContext` (#13022)
Frank Chen [Tue, 13 Sep 2022 10:00:09 +0000 (18:00 +0800)] 
Avoid ClassCastException when getting values from `QueryContext` (#13022)

* Use safe conversion methods

* Rename method

* Add getContextAsBoolean

* Update test case

* Remove generic from getContextValue

* Update catch-handler

* Add test

* Resolve comments

* Replace 'getContextXXX' with 'getQueryContext().getAsXXXX'

2 weeks ago Web console: better detection for arrays containing objects (#13077)
Vadim Ogievetsky [Tue, 13 Sep 2022 01:50:29 +0000 (18:50 -0700)] 
Web console: better detection for arrays containing objects (#13077)

* better detection for arrays containing objects

* include boolean also

2 weeks ago Expressions: fixes for round-trips of floating point literals, Long.MIN_VALUE literal...
Gian Merlino [Tue, 13 Sep 2022 00:06:20 +0000 (17:06 -0700)] 
Expressions: fixes for round-trips of floating point literals, Long.MIN_VALUE literals, Shuttle.visitAll. (#13037)

* SQL: Fix round-trips of floating point literals.

When writing RexLiterals into Druid expressions, we now write non-integer
numeric literals in such a way that ensures they are parsed as doubles
on the other end.

* Updates from code review, and some additional stuff inspired by the
investigation.

- Remove unnecessary formatting code from DruidExpression.doubleLiteral:
  it handles things just fine with its default behavior.

- Fix a problem where expression literals could not represent Long.MIN_VALUE.
  Now, integer literals start life off as BigIntegerExpr instead of LongExpr,
  and are converted to LongExpr during flattening. This is necessary because,
  in order to avoid ambiguity between unary minus and negative literals, our
  grammar does not actually have true negative literals. Negative numbers must
  be represented as unary minus next to a positive literal.

- Fix a bug introduced in #12230 where shuttle.visitAll(args) delegated
  to shuttle.visit(arg) instead of arg.visit(shuttle). The latter does
  a recursive visitation, which is the intended behavior.

* Style fixes.

* Move regexp to the right place.
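
The Long.MIN_VALUE problem described above comes down to the magnitude 9223372036854775808 not fitting in a long, so the positive literal has to be held in a wider type until the unary minus is applied. A minimal sketch of that idea, using a hypothetical `negateLiteral` helper rather than Druid's `BigIntegerExpr` and `LongExpr` classes:

```java
import java.math.BigInteger;

final class MinValueLiteral {
    // Parses a positive integer literal as a BigInteger, applies unary minus,
    // and only then narrows to long. longValueExact throws ArithmeticException
    // if the negated value still does not fit in a long.
    static long negateLiteral(String positiveDigits) {
        BigInteger magnitude = new BigInteger(positiveDigits);
        return magnitude.negate().longValueExact();
    }

    // Shows why parsing the positive part directly as a long cannot work
    // for the magnitude of Long.MIN_VALUE.
    static boolean fitsInLong(String positiveDigits) {
        try {
            Long.parseLong(positiveDigits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Narrowing after negation corresponds to the commit's description of converting to a long-typed expression during flattening.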

2 weeks ago Cleaner JSON for various input sources and formats. (#13064)
Gian Merlino [Mon, 12 Sep 2022 17:29:31 +0000 (10:29 -0700)] 
Cleaner JSON for various input sources and formats. (#13064)

* Cleaner JSON for various input sources and formats.

Add JsonInclude to various properties, to avoid population of default
values in serialized JSON.

Also fixes a bug in OrcInputFormat: it was not writing binaryAsString,
so the property would be lost on serde.

* Additional test cases.

2 weeks ago Create a copy of the shared JDBC context (#13049)
Paul Rogers [Mon, 12 Sep 2022 17:27:56 +0000 (19:27 +0200)] 
Create a copy of the shared JDBC context (#13049)

2 weeks ago export com.sun.management.internal (#13068)
Frank Chen [Mon, 12 Sep 2022 16:03:22 +0000 (00:03 +0800)] 
export com.sun.management.internal (#13068)

2 weeks ago Expose HTTP Response headers from SqlResource (#13052)
imply-cheddar [Mon, 12 Sep 2022 08:40:06 +0000 (17:40 +0900)] 
Expose HTTP Response headers from SqlResource (#13052)

* Expose HTTP Response headers from SqlResource

This change makes the SqlResource expose HTTP response
headers in the same way that the QueryResource exposes them.

Fundamentally, the change is to pipe the QueryResponse
object all the way through to the Resource so that it can
populate response headers.  There is also some code
cleanup around DI, as there was a superfluous FactoryFactory
class muddying things up.

3 weeks ago Enable msq for docker by default (#13069)
Frank Chen [Sun, 11 Sep 2022 15:30:32 +0000 (23:30 +0800)] 
Enable msq for docker by default (#13069)

3 weeks ago Bump the version of Druid docker image from 0.16.0-incubating to latest (#13058)
Benedict Jin [Sat, 10 Sep 2022 08:36:00 +0000 (16:36 +0800)] 
Bump the version of Druid docker image from 0.16.0-incubating to latest (#13058)

3 weeks ago adjust docs and images (#13067)
Vadim Ogievetsky [Sat, 10 Sep 2022 08:35:19 +0000 (01:35 -0700)] 
adjust docs and images (#13067)

3 weeks ago fix number of expected functions (#13050)
Vadim Ogievetsky [Fri, 9 Sep 2022 20:42:01 +0000 (13:42 -0700)] 
fix number of expected functions (#13050)

3 weeks ago Add ARRAY_QUANTILE function. (#13061)
Gian Merlino [Fri, 9 Sep 2022 18:29:20 +0000 (11:29 -0700)] 
Add ARRAY_QUANTILE function. (#13061)

* Add ARRAY_QUANTILE function.

Expected usage is like: ARRAY_QUANTILE(ARRAY_AGG(x), 0.9).

* Fix test.

3 weeks ago quote columns, datasources in auto complete if needed (#13060)
Vadim Ogievetsky [Fri, 9 Sep 2022 18:22:40 +0000 (11:22 -0700)] 
quote columns, datasources in auto complete if needed (#13060)

3 weeks ago prometheus-emitter supports sending metrics to pushgateway regularly … (#13034)
DENNIS [Fri, 9 Sep 2022 12:46:14 +0000 (20:46 +0800)] 
prometheus-emitter supports sending metrics to pushgateway regularly … (#13034)

* prometheus-emitter supports sending metrics to pushgateway regularly and continuously

* spell check fix

* Optimization variable name and related documents

* Update docs/development/extensions-contrib/prometheus.md

OK, it looks more conspicuous

Co-authored-by: Frank Chen <frankchen@apache.org>
* Update doc

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Frank Chen <frankchen@apache.org>
* When PrometheusEmitter is closed, close the scheduler

* Ensure that registeredMetrics is thread safe.

* Local variable name optimization

* Remove unnecessary white space characters

Co-authored-by: Frank Chen <frankchen@apache.org>

3 weeks ago Update tutorial-kafka.md (#13056)
sachidananda007 [Fri, 9 Sep 2022 02:06:19 +0000 (07:36 +0530)] 
Update tutorial-kafka.md (#13056)

* Update tutorial-kafka.md

Added missing command to the doc for zookeeper before starting kafka

* Update docs/tutorials/tutorial-kafka.md

Co-authored-by: Frank Chen <frankchen@apache.org>

3 weeks ago improve nested column serializer (#13051)
Clint Wylie [Fri, 9 Sep 2022 01:32:53 +0000 (18:32 -0700)] 
improve nested column serializer (#13051)

changes:
* long and double value columns are now written directly, at the same time as writing out the 'intermediary' dictionaryid column with unsorted ids
* remove reverse value lookup from GlobalDictionaryIdLookup since it is no longer needed

3 weeks ago Improve doc and configuration of prometheus emitter (#13028)
Frank Chen [Thu, 8 Sep 2022 18:20:34 +0000 (02:20 +0800)] 
Improve doc and configuration of prometheus emitter (#13028)

* Improve doc and validation

* Add configuration for peon tasks

* Update doc

* Update test case

* Fix typo

* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Update docs/development/extensions-contrib/prometheus.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

3 weeks ago fix bug in /status/properties filtering (#13045)
Lucas Capistrant [Thu, 8 Sep 2022 00:45:28 +0000 (19:45 -0500)] 
fix bug in /status/properties filtering (#13045)

* fix bug in /status/properties filtering

* Refactor tests to use jackson for parsing druid.server.hiddenProperties instead of hacky string modifications

* make javadoc more descriptive using example

* add in a sanity assertion that raw properties keyset size is greater than filtered properties keyset size

3 weeks ago Fix web-console message in MSQ data loader (#12996)
Kashif Faraz [Wed, 7 Sep 2022 20:34:10 +0000 (02:04 +0530)] 
Fix web-console message in MSQ data loader (#12996)

* Fix typo in web-console message

* Prettify the changes

3 weeks ago default to no compare (#13041)
Vadim Ogievetsky [Wed, 7 Sep 2022 15:28:28 +0000 (08:28 -0700)] 
default to no compare (#13041)

3 weeks ago MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor. (#13036)
Gian Merlino [Wed, 7 Sep 2022 14:02:21 +0000 (07:02 -0700)] 
MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor. (#13036)

* MSQ extension: Fix over-capacity write in ScanQueryFrameProcessor.

Frame processors are meant to write only one output frame per cycle.
The ScanQueryFrameProcessor would write two when reading from a channel
if the input frame cursor cycled and then the output frame filled up
while reading from the next frame.

This patch fixes the bug, and adds a test. It also makes some adjustments
to the processor code in order to make it easier to test.

* Add license header.

3 weeks ago Disallow timeseries queries with ETERNITY interval and non-ALL granularity (#12944)
Rohan Garg [Wed, 7 Sep 2022 11:15:08 +0000 (16:45 +0530)] 
Disallow timeseries queries with ETERNITY interval and non-ALL granularity (#12944)

3 weeks ago Add query/time metric for SQL queries from router (#12867)
Rohan Garg [Wed, 7 Sep 2022 08:24:46 +0000 (13:54 +0530)] 
Add query/time metric for SQL queries from router (#12867)

* Add query/time metric for SQL queries from router

* Fix query cancel bug when user has overridden native query-id in a SQL query

3 weeks ago Add interpolation to JsonConfigurator (#13023)
Adam Peck [Wed, 7 Sep 2022 07:18:01 +0000 (01:18 -0600)] 
Add interpolation to JsonConfigurator (#13023)

* Add interpolation to JsonConfigurator

* Fix checkstyle

* Fix tests by removing commons-text override

* Add back commons-text without version

* Remove unused hadoopDir configs

* Move some stuff to hopefully pass coverage

3 weeks ago more consistent expression error messages (#12995)
Clint Wylie [Wed, 7 Sep 2022 06:21:38 +0000 (23:21 -0700)] 
more consistent expression error messages (#12995)

* more consistent expression error messages

* review stuff

* add NamedFunction for Function, ApplyFunction, and ExprMacro to share common stuff

* fixes

* add expression transform name to transformer failure, better parse_json error messaging

3 weeks ago Improve String Last/First Storage Efficiency (#12879)
sr [Wed, 7 Sep 2022 03:00:54 +0000 (20:00 -0700)] 
Improve String Last/First Storage Efficiency (#12879)

- Add classes for writing cell values in LZ4 block-compressed format.
  Payloads are indexed by element number for efficient random lookup.
- Update SerializablePairLongStringComplexMetricSerde to use block
  compression.
- SerializablePairLongStringComplexMetricSerde also delta-encodes the
  Long using 2-pass encoding: it buffers first to find the min/max
  numbers, then delta-encodes as integers if possible.

Entry points for block-compressed storage of byte[] payloads are the
CellWriter and CellReader classes. See
SerializablePairLongStringComplexMetricSerde for how these are used,
along with how to do full column-based storage (delta encoding here),
which includes 2-pass encoding to compute a column header.
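
The 2-pass delta encoding described above can be sketched roughly as
follows. This is an illustration only, not the code from #12879: the
class DeltaEncodingSketch, its Encoded holder, and the fallback to raw
longs are all hypothetical names and choices made for this example.

```java
// Hypothetical sketch of 2-pass delta encoding; not Druid's actual classes.
final class DeltaEncodingSketch {

    /** Holds either int deltas relative to a base value, or the raw longs. */
    static final class Encoded {
        final long base;      // minimum value; plays the role of a column header
        final int[] deltas;   // non-null when every delta fits in an int
        final long[] raw;     // non-null when the value range is too wide

        private Encoded(long base, int[] deltas, long[] raw) {
            this.base = base;
            this.deltas = deltas;
            this.raw = raw;
        }

        boolean usesIntDeltas() {
            return deltas != null;
        }

        long get(int i) {
            return deltas != null ? base + deltas[i] : raw[i];
        }
    }

    static Encoded encode(long[] values) {
        if (values.length == 0) {
            return new Encoded(0L, new int[0], null);
        }

        // Pass 1: scan the buffered values to find the min and max.
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }

        // Check whether (max - min) fits in an int, guarding against long overflow.
        boolean fits;
        try {
            fits = Math.subtractExact(max, min) <= Integer.MAX_VALUE;
        } catch (ArithmeticException e) {
            fits = false;
        }
        if (!fits) {
            return new Encoded(0L, null, values.clone());
        }

        // Pass 2: store each value as an int offset from the minimum.
        int[] deltas = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            deltas[i] = (int) (values[i] - min);
        }
        return new Encoded(min, deltas, null);
    }
}
```

The first pass is what makes a header possible: the base value has to be
known before any delta can be written.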

3 weeks ago Nested columns documentation (#12946)
Jill Osborne [Tue, 6 Sep 2022 21:42:18 +0000 (22:42 +0100)] 
Nested columns documentation (#12946)

Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: brian.le <brian.le@imply.io>
3 weeks ago remove mentions of DruidQueryRel from docs (#13033)
Vadim Ogievetsky [Tue, 6 Sep 2022 20:37:27 +0000 (13:37 -0700)] 
remove mentions of DruidQueryRel from docs (#13033)

* remove mentions of DruidQueryRel

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
3 weeks ago Add CTA and fix typo (#13009)
Vadim Ogievetsky [Tue, 6 Sep 2022 18:16:50 +0000 (11:16 -0700)] 
Add CTA and fix typo (#13009)

* Add CTA and fix typo

* resolve hostname better

3 weeks ago Web console: upgrade the console to use node 16 (#13017)
Vadim Ogievetsky [Tue, 6 Sep 2022 18:15:23 +0000 (11:15 -0700)] 
Web console: upgrade the console to use node 16 (#13017)

* upgrade the console to use node 16

* run npm audit fix

3 weeks ago msq: add multi-stage-query docs (#12983)
317brian [Tue, 6 Sep 2022 17:36:09 +0000 (10:36 -0700)] 
msq: add multi-stage-query docs (#12983)

* msq: add multi-stage-query docs

* add screenshots

add back theta sketches tutorial

change filename

fix filename

fix link

fix headings

* fixes

* fixes

* fix spelling issues and update spell file

* address feedback from karan

* add missing guardrail to known issues

* update blurb

* fix typo

* remove durable storage info

* update titles

* Restore en.json

* Update query view

* address comments from vad

* Update docs/multi-stage-query/msq-known-issues.md

finish sentence

* add apache license to docs

* add apache license to docs

Co-authored-by: Katya Macedo <katya.macedo@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
3 weeks ago Fix compiler error: The project was not built since its build path is incomplete...
Didip Kerabat [Tue, 6 Sep 2022 15:19:41 +0000 (08:19 -0700)] 
Fix compiler error: The project was not built since its build path is incomplete. Cannot find the class file for org.slf4j.Logger. Fix the build path then try building this project (#13029)

Co-authored-by: Didip Kerabat <didip@apple.com>
3 weeks ago compressed big decimal - module (#10705)
senthilkv [Tue, 6 Sep 2022 07:06:57 +0000 (03:06 -0400)] 
compressed big decimal - module (#10705)

Compressed Big Decimal is an extension that provides a mutable big
decimal value that can be used to accumulate values without losing
precision or reallocating memory. This type supports absolute-precision
arithmetic on large numbers in applications that require a high level
of accuracy, such as financial applications and currency-based
transactions. It helps avoid rounding issues in which potentially
large amounts of money can be lost.

Accumulation requires that the two numbers have the same scale,
but does not require that they are of the same size. If the value
being accumulated has a larger underlying array than this value
(the result), then the higher-order bits are dropped, similar to what
happens when adding a long to an int and storing the result in an
int. A compressed big decimal holds its data in an embedded array.

Compressed big decimal is an absolute-number-based complex type
modeled on Java's BigDecimal. It supports all the functionality that
Java BigDecimal supports. Java BigDecimal is immutable, so using it in
an accumulator creates a new object on every update and can cause
significant garbage collection overhead. Compressed big decimal avoids
this by mutating the value in the accumulator in place.
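
The long-to-int analogy in the accumulation paragraph can be shown with
plain Java primitives. This is only an illustration of the analogy;
TruncationSketch and addAndStore are hypothetical names and are not part
of the extension.

```java
// Illustrates the "adding a long to an int" analogy; not extension code.
final class TruncationSketch {

    /**
     * Adds a 64-bit value to a 32-bit accumulator and stores the result
     * back in 32 bits. The narrowing cast keeps only the low 32 bits, so
     * any higher-order bits of the sum are silently dropped.
     */
    static int addAndStore(int accumulator, long value) {
        return (int) (accumulator + value);
    }
}
```

For example, addAndStore(1, (1L << 32) + 5L) returns 6: bit 32 of the
sum does not fit in the int and is discarded, leaving 1 + 5.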

3 weeks ago Suppress false CVEs (#13026)
Abhishek Agarwal [Tue, 6 Sep 2022 06:16:56 +0000 (11:46 +0530)] 
Suppress false CVEs (#13026)

* Suppress CVEs

* Add more suppressions

4 weeks ago Ease of hiding sensitive properties from /status/proper… (#12950)
zemin [Fri, 2 Sep 2022 13:51:25 +0000 (15:51 +0200)] 
Ease of hiding sensitive properties from /status/proper… (#12950)

* apache#12063 Ease of hiding sensitive properties from /status/properties endpoint

* apache#12063 Ease of hiding sensitive properties from /status/properties endpoint

* apache#12063 Ease of hiding sensitive properties from /status/properties endpoint

using one property for hiding properties, updated the index.md to document hiddenProperties

* apache#12063 Ease of hiding sensitive properties from /status/properties endpoint

Added java docs

* apache#12063 Ease of hiding sensitive properties from /status/properties endpoint

Add "password", "key", "token", "pwd" as default druid.server.hiddenProperties

fixed typo and removed redundant space

Co-authored-by: zemin <zemin.piao@adyen.com>
4 weeks ago Web console: don't crash if cookies are totally disabled (#13013)
Vadim Ogievetsky [Thu, 1 Sep 2022 23:10:23 +0000 (16:10 -0700)] 
Web console: don't crash if cookies are totally disabled (#13013)

* fix local storage detection

* fix numeric input dialog

4 weeks ago Improve range partitioning docs. (#13016)
Gian Merlino [Thu, 1 Sep 2022 22:21:30 +0000 (15:21 -0700)] 
Improve range partitioning docs. (#13016)

Two improvements:

- Use a realistic targetRowsPerSegment, so if people copy and paste
  the example from the docs, it will generate reasonable segments.
- Spell "countryName" correctly.

4 weeks ago Bump the version of Maven in the Dockerfile (#11994)
Benedict Jin [Wed, 31 Aug 2022 14:54:24 +0000 (22:54 +0800)] 
Bump the version of Maven in the Dockerfile (#11994)

4 weeks ago Make console e2e tests run in band so as to not hog task slots (#13004)
Vadim Ogievetsky [Wed, 31 Aug 2022 04:55:53 +0000 (21:55 -0700)] 
Make console e2e tests run in band so as to not hog task slots (#13004)

* increase e2e timeline

* get rid of pull deps

* increase post index task timeout

* boost msq e2e timeout

* run in band

4 weeks ago don't show transform actions on * queries (#13005)
Vadim Ogievetsky [Wed, 31 Aug 2022 04:54:18 +0000 (21:54 -0700)] 
don't show transform actions on * queries (#13005)

4 weeks ago Add Java 17 information to documentation. (#12990)
Gian Merlino [Tue, 30 Aug 2022 19:32:49 +0000 (12:32 -0700)] 
Add Java 17 information to documentation. (#12990)

The docs say Java 17 support is experimental, and give tips on running
successfully with Java 17.

This patch also removes java.base/jdk.internal.perf and
jdk.management/com.sun.management.internal from the list of required
exports and opens, because they were formerly needed for JvmMonitor,
which was rewritten in #12481 to use MXBeans instead.

4 weeks ago FrameFile: Java 17 compatibility. (#12987)
Gian Merlino [Tue, 30 Aug 2022 18:13:47 +0000 (11:13 -0700)] 
FrameFile: Java 17 compatibility. (#12987)

* FrameFile: Java 17 compatibility.

DataSketches Memory.map is not Java 17 compatible, and from discussions
with the team, is challenging to make compatible with 17 while also
retaining compatibility with 8 and 11. So, in this patch, we switch away
from Memory.map and instead use the builtin JDK mmap functionality. Since
it only supports maps up to Integer.MAX_VALUE, we also implement windowing
in FrameFile, such that we can still handle large files.

Other changes:

1) Add two new "map" functions to FileUtils, which we use in this patch.
2) Add a footer checksum to the FrameFile format. Individual frames
   already have checksums, but the footer was missing one.
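
The windowing idea can be sketched with the JDK's own FileChannel.map,
which accepts at most Integer.MAX_VALUE bytes per mapping. This is a
minimal illustration under assumptions, not the FrameFile or FileUtils
code from this patch: WindowedMapSketch, readByte, and the fixed
windowSize parameter are hypothetical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of windowed memory mapping; not Druid's FrameFile.
final class WindowedMapSketch {

    /**
     * Reads one byte at an absolute file position by mapping only the
     * window that contains it, so no single mapping exceeds windowSize.
     */
    static byte readByte(Path file, long position, int windowSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            if (position < 0 || position >= fileSize) {
                throw new IndexOutOfBoundsException("position " + position);
            }

            // Align the window start to a multiple of windowSize.
            long windowStart = (position / windowSize) * windowSize;
            // The last window may be shorter than windowSize.
            long windowLength = Math.min(windowSize, fileSize - windowStart);

            MappedByteBuffer window =
                channel.map(FileChannel.MapMode.READ_ONLY, windowStart, windowLength);
            return window.get((int) (position - windowStart));
        }
    }
}
```

A real reader would presumably keep the current window mapped and remap
only when a read crosses a window boundary, rather than mapping on every
call as this sketch does.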

* Changes for static analysis.

* wip

* Fixes.

4 weeks ago Building druid-it-tools and running for travis in it.sh (#12957)
abhagraw [Tue, 30 Aug 2022 07:18:07 +0000 (12:48 +0530)] 
Building druid-it-tools and running for travis in it.sh (#12957)

* Building druid-it-tools and running for travis in it.sh

* Addressing comments

* Updating druid-it-image pom to point to correct it-tools

* Updating all it-tools references to druid-it-tools

* Adding dist back to it.sh travis

* Trigger Build

* Disabling batchIndex tests and commenting out user specific code

* Fixing checkstyle and intellij inspection errors

* Replacing tabs with spaces in it.sh

* Enabling old batch index tests with indexer

4 weeks ago Fix accounting of bytesAdded in ReadableByteChunksFrameChannel. (#12988)
Gian Merlino [Tue, 30 Aug 2022 01:25:28 +0000 (18:25 -0700)] 
Fix accounting of bytesAdded in ReadableByteChunksFrameChannel. (#12988)

* Fix accounting of bytesAdded in ReadableByteChunksFrameChannel.

Could cause WorkerInputChannelFactory to get into an infinite loop when
reading the footer of a frame file.

* Additional tests.

4 weeks ago Remove dependency on jvm-attach. (#12989)
Gian Merlino [Mon, 29 Aug 2022 21:18:33 +0000 (14:18 -0700)] 
Remove dependency on jvm-attach. (#12989)

This dependency was no longer needed after #12481, but remained because
it was used for a (now useless) test. This patch removes the test and
the dependency.