samza.git
10 hours agoFix more broken links master
Jagadish [Tue, 11 Dec 2018 08:02:39 +0000 (00:02 -0800)] 
Fix more broken links

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #853 from vjagadish1989/website-reorg40

10 hours agoFix links in release documentation
Jagadish [Tue, 11 Dec 2018 07:27:12 +0000 (23:27 -0800)] 
Fix links in release documentation

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #852 from vjagadish1989/website-reorg38

2 days agoMinor typos/reword for meetups page
Jagadish [Sun, 9 Dec 2018 05:07:41 +0000 (21:07 -0800)] 
Minor typos/reword for meetups page

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #850 from vjagadish1989/website-reorg37

2 days agoAdd videos and descriptions from the last Samza meet-up
Jagadish [Sun, 9 Dec 2018 04:41:23 +0000 (20:41 -0800)] 
Add videos and descriptions from the last Samza meet-up

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #849 from vjagadish1989/website-reorg36

3 days agoSAMZA-2028: Samza-SQL Diagnostics: add metrics to Join and Aggregate operators
Shenoda Guirguis [Fri, 7 Dec 2018 23:05:43 +0000 (15:05 -0800)] 
SAMZA-2028: Samza-SQL Diagnostics: add metrics to Join and Aggregate operators

Adding metrics to the Join and Aggregate operators concludes the first phase (adding metrics) of Samza-SQL Diagnostics.

Author: Shenoda Guirguis <sguirguis@linkedin.com>

Reviewers: atoomula

Closes #848 from shenodaguirguis/joinmetrics

4 days agoSAMZA-1835: Consolidate all processorId generation code.
Shanthoosh Venkataraman [Fri, 7 Dec 2018 02:11:38 +0000 (18:11 -0800)] 
SAMZA-1835: Consolidate all processorId generation code.

Currently, the processorId creation function createProcessorId() is repeated in three different implementations of `JobCoordinator`, viz. `ZkJobCoordinator`, `PassthroughJobCoordinator`, and `AzureJobCoordinator`. Here are a few problems that stem from this duplication.

1. `ProcessorId` is passed into the `MetricsReporterFactory` through the factory create method: `MetricsReporter getMetricsReporter(String name, String processorId, Config config);`. Custom `MetricsReporter` implementations currently use the processorId as a component in the generated metric names. Metrics reporters are instantiated from `LocalApplicationRunner` and `processorId` is currently passed in as null to `MetricsReporterFactory.getMetricsReporter`. This corrupts the generated metric names.
2. `ZkJobCoordinator`, `ZkUtils`, `ZkLeaderElector` and different downstream components of `LocalApplicationRunner` currently instantiate and manage their own private reporters, rather than sharing the common `MetricsRegistry` managed by `LocalApplicationRunner`. Since there is no common namespace and reporter shared by downstream components of `LocalApplicationRunner`, generating metrics dashboards for standalone is cumbersome.

This PR comprises the following changes:

1. Moved the processorId generation to `LocalApplicationRunner`, which now injects the generated `processorId` into all the downstream layers.
2. Deprecated the getProcessorId API in the `JobCoordinator` interface.
3. Added the `processorId` and `metricsRegistry` arguments to the `getJobCoordinator` method of the `JobCoordinatorFactory` interface.
4. Fixed the unit tests and added unit tests for `LocalApplicationRunner.createProcessorId`.
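
The shape of this change can be sketched as follows. This is a minimal illustration, not the actual Samza code: `ProcessorIdSketch`, `Reporter`, `Coordinator`, and the id format are hypothetical stand-ins for the runner, `MetricsReporter`, and `JobCoordinator`.

```java
import java.util.UUID;

// Hypothetical stand-ins; these are not the real Samza interfaces.
class ProcessorIdSketch {
  interface Reporter { String metricName(String metric); }
  interface Coordinator { String processorId(); }

  // Generate the id exactly once, in the runner (the id format is an assumption).
  static String createProcessorId() {
    return "processor-" + UUID.randomUUID();
  }

  // Downstream components receive the id instead of generating their own,
  // so metric names never contain a null component.
  static Reporter newReporter(String processorId) {
    return metric -> processorId + "." + metric;
  }

  static Coordinator newCoordinator(String processorId) {
    return () -> processorId;
  }

  public static void main(String[] args) {
    String processorId = createProcessorId();
    Reporter reporter = newReporter(processorId);
    Coordinator coordinator = newCoordinator(processorId);
    System.out.println(reporter.metricName("process-calls"));
    System.out.println(coordinator.processorId());
  }
}
```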

Author: Shanthoosh Venkataraman <svenkata@linkedin.com>
Author: Shanthoosh Venkataraman <spvenkat@usc.edu>
Author: svenkata <svenkataraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #844 from shanthoosh/SAMZA-1835

4 days agoSAMZA-2002: SamzaSQL Diagnostics: instrument rest of operators (except join & aggrega...
Shenoda Guirguis [Thu, 6 Dec 2018 18:56:38 +0000 (10:56 -0800)] 
SAMZA-2002: SamzaSQL Diagnostics: instrument rest of operators (except join & aggregate) and at Query level

Second phase of instrumenting SamzaSQL operators to add and maintain metrics. All operators, except join and aggregate, are instrumented to add Processing Time and Input Rate metrics. Whenever output rate could be different (e.g., filter operator) the output rate is also added. At query level, we have Query Latency, and input and output rates.

Author: Shenoda Guirguis <sguirguis@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>, Aditya Toomula <atoomula@linkedin.com>

Closes #831 from shenodaguirguis/addmetrics.3

5 days agoSAMZA-2030: Config mock
Boris S [Thu, 6 Dec 2018 17:54:55 +0000 (09:54 -0800)] 
SAMZA-2030: Config mock

Fix getOption of ScalaMapConfig to support mocking.

Author: Boris S <bshkolnik@linkedin.com>
Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Closes #847 from sborya/ConfigMock

5 days agoSAMZA-2012: Add API for wiring an external context through to application processing...
Cameron Lee [Thu, 6 Dec 2018 00:41:34 +0000 (16:41 -0800)] 
SAMZA-2012: Add API for wiring an external context through to application processing code

This PR also refactors TestSamzaSqlRemoteTable to be in samza-test instead of samza-sql, since it seems to actually be an integration test. It is useful to move that test in this PR so that tests that may need an external context can be consolidated.

Author: Cameron Lee <calee@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>, Shanthoosh Venkatraman <svenkatr@linkedin.com>

Closes #829 from cameronlee314/external_context

5 days agoSAMZA-2019: for 1 partition broadcast topic generate topic#0 config
Boris S [Wed, 5 Dec 2018 22:13:50 +0000 (14:13 -0800)] 
SAMZA-2019: for 1 partition broadcast topic generate topic#0 config

+ address a few review comments

Author: Boris S <bshkolnik@linkedin.com>
Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: xiliu <xiliu@linkedin.com>

Closes #846 from sborya/isBroadcast1

5 days agoSAMZA-2019: add Is broadcast per stream config
Boris S [Wed, 5 Dec 2018 19:54:44 +0000 (11:54 -0800)] 
SAMZA-2019: add Is broadcast per stream config

Author: Boris S <bshkolnik@linkedin.com>
Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: xinyuiscool <xiliu@linkedin.com>, Prateek M <prateekm@apache.org>

Closes #837 from sborya/isBroadcast

5 days agoSAMZA-1973: Unify the TaskNameGrouper interface for yarn and standalone.
Shanthoosh Venkataraman [Wed, 5 Dec 2018 18:56:55 +0000 (10:56 -0800)] 
SAMZA-1973: Unify the TaskNameGrouper interface for yarn and standalone.

This patch consists of the following changes:
* Unify the different methods present in the TaskNameGrouper interface. This will enable us to have a single interface method usable for both the yarn and standalone models.
* Generate locationId aware task assignment to processors in standalone.
* Move the task assignment persistence logic from a custom `TaskNameGrouper` implementation to `JobModelManager`, so that this works for any kind of custom group.
* General code clean up in `JobModelManager`,  `TaskAssignmentManager` and in other samza internal classes.
* Read/write taskLocality of the processors in standalone.
* Updated the existing java docs and added java docs where they were missing.

Testing:
* Fixed the existing unit tests broken by the changes.
* Added new unit tests for the functionality changed or added as part of this patch.
* Tested this patch with a sample job from `hello-samza` project and verified that it works as expected.

Please refer to [SEP-11](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957309) for more details.

Author: Shanthoosh Venkataraman <spvenkat@usc.edu>
Author: Shanthoosh Venkataraman <svenkata@linkedin.com>
Author: svenkata <svenkataraman@linkedin.com>

Reviewers: Prateek M<pmaheshw@linkedin.com>

Closes #790 from shanthoosh/task_name_grouper_changes

6 days agoSAMZA-2026: Refactor remote table API to separate retry policy settings
Wei Song [Tue, 4 Dec 2018 23:51:58 +0000 (15:51 -0800)] 
SAMZA-2026: Refactor remote table API to separate retry policy settings

As per subject, the goal is to make configuration of retry policies consistent with other APIs.

Author: Wei Song <wsong@linkedin.com>

Reviewers: Aditya Toomula <atoomula@linkedin.com>

Closes #842 from weisong44/SAMZA-2026

6 days agoSAMZA-2021: Adding an API to rel converter to filter out system messages.
Aditya Toomula [Tue, 4 Dec 2018 23:50:07 +0000 (15:50 -0800)] 
SAMZA-2021: Adding an API to rel converter to filter out system messages.

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: srinipunuru

Closes #839 from atoomula/system and squashes the following commits:

0dcba87b [Aditya Toomula] Adding an API to rel converter to filter out system messages.
2bee3ba4 [Aditya Toomula] Adding an API to rel converter to filter out system messages.

6 days agoSAMZA-2025: InputOperatorImpl should work with filtering InputTransformer
Deepthi Sridharan [Tue, 4 Dec 2018 23:47:25 +0000 (15:47 -0800)] 
SAMZA-2025: InputOperatorImpl should work with filtering InputTransformer

InputOperatorImpl should handle the case where InputTransformer returns a null record. This makes it easy to have a simple filtering operation as part of the transformer.
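
A minimal sketch of the behavior described above, assuming a transformer modeled as a plain `Function`; `NullFilteringInputSketch` and `handle` are hypothetical names, and the real `InputOperatorImpl` and `InputTransformer` have different signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; the real InputOperatorImpl and InputTransformer differ.
class NullFilteringInputSketch {
  // Applies the transformer and drops every message for which it returns null.
  static <I, O> List<O> handle(List<I> messages, Function<I, O> transformer) {
    List<O> out = new ArrayList<>();
    for (I message : messages) {
      O transformed = transformer.apply(message);
      if (transformed != null) { // null means "filter this message out"
        out.add(transformed);
      }
    }
    return out;
  }
}
```

With this shape, a transformer such as `s -> s.isEmpty() ? null : s.toUpperCase()` doubles as a filter for empty records.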

Author: Deepthi Sridharan <desridharan@linkedin.com>

Reviewers: atoomula, prateekm

Closes #841 from DEEPTHIKORAT/tranformer

6 days agoMinor fix to some config variable names and accessor methods.
Prateek Maheshwari [Tue, 4 Dec 2018 22:08:57 +0000 (14:08 -0800)] 
Minor fix to some config variable names and accessor methods.

Author: Prateek Maheshwari <pmaheshwari@apache.org>

Reviewers: Jagadish<jagadish@apache.org>

Closes #840 from prateekm/fix-config-names

6 days agoSAMZA-1989: SystemStreamGrouper interface change for SEP-5
Shanthoosh Venkataraman [Tue, 4 Dec 2018 21:53:56 +0000 (13:53 -0800)] 
SAMZA-1989: SystemStreamGrouper interface change for SEP-5

Samza users may need to increase the partition count of the input streams of their stateful samza jobs. For example, Kafka needs to limit the maximum size of each partition to scale up its performance. Thus the number of partitions of a Kafka topic needs to be expanded to reduce the partition size if the average byte-in-rate or retention time of the Kafka topic has doubled.

In order to perform a join between streams, stateful jobs generally have to route the partitions from the different input streams to the same task of a container. However, when an input stream is repartitioned, the key space of each partition gets redistributed. This causes the stateful jobs to produce erroneous results.

So if the partition count of an input stream is increased, users have to manually purge the changelog topics and local RocksDB state of their stateful jobs. This results in increased operational complexity and data loss.

This patch takes a first stab at solving the above problem and comprises the following changes:

* Introduced a new group method in the `SystemStreamPartitionGrouper` interface to generate task assignments that factor in the partition expansion of input streams.
* Introduced a `StreamPartitionMapper` abstraction to allow the user to plug in the input stream partitioning function.
* Fixed the existing unit tests and added new unit tests to validate the new grouper changes.

In a follow-up PR shortly, these grouper changes will be integrated with `JobModelManager`. (Waiting for PR 790 to land first, since it makes significant changes to `JobModelManager`.)
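
To illustrate what a pluggable partition mapping function could look like, here is a sketch under two assumptions that this commit message does not state: keys are assigned by `hash(key) % partitionCount` with a non-negative hash, and the new partition count is a multiple of the old one. `PartitionExpansionSketch` and `previousPartition` are hypothetical names, not the `StreamPartitionMapper` API.

```java
// Hypothetical sketch; not the StreamPartitionMapper API from the patch.
class PartitionExpansionSketch {
  // Maps a partition after expansion back to the partition that held the
  // same keys before expansion. If hash % currentCount == p and currentCount
  // is a multiple of previousCount, then hash % previousCount == p % previousCount.
  static int previousPartition(int currentPartition, int previousCount, int currentCount) {
    if (previousCount <= 0 || currentCount % previousCount != 0) {
      throw new IllegalArgumentException("current count must be a positive multiple of previous count");
    }
    if (currentPartition < 0 || currentPartition >= currentCount) {
      throw new IllegalArgumentException("partition out of range: " + currentPartition);
    }
    return currentPartition % previousCount;
  }
}
```

Under these assumptions, routing a new partition to the task that owned its previous partition keeps each key with the task that already holds its state.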

Author: Shanthoosh Venkataraman <spvenkat@usc.edu>

Reviewers: Prateek M<pmaheshw@linkedin.com>, Ray Matharu<rmatharu@linkedin.com>, Daniel Nishimura<dnishimura@linkedin.com>

Closes #803 from shanthoosh/SEP-5

7 days agoMoving store test to TestContainerStorageManager from TestSamzaContainer
Ray Matharu [Tue, 4 Dec 2018 02:08:36 +0000 (18:08 -0800)] 
Moving store test to TestContainerStorageManager from TestSamzaContainer

There was a test in TestSamzaContainer that needs to be moved to TestContainerStorageManager because the restore logic is moved there.

Minor change in TestSamzaContainer and ContainerStorageManager

Author: Ray Matharu <rmatharu@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #838 from rmatharu/storeTest-fix

10 days agoSAMZA-2017: Update committer doc
Aditya Toomula [Sat, 1 Dec 2018 01:12:51 +0000 (17:12 -0800)] 
SAMZA-2017: Update committer doc

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: vjagadish1989

Closes #836 from atoomula/new

10 days agoSAMZA-2018: State restore improvements
Ray Matharu [Sat, 1 Dec 2018 01:07:23 +0000 (17:07 -0800)] 
SAMZA-2018: State restore improvements

This PR makes the following changes:

* Consumer consolidation to ensure 1 storeConsumer per system; previously there was 1 consumer per SSP per store.
* Refactoring stores to use ContainerStorageManager with parallelization for restoration, and serial execution of sysConsumers start, stop, register, etc.

Author: Ray Matharu <rmatharu@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #823 from rmatharu/consumerConsolidate

10 days agoSAMZA-570: Enabling auto-discovery of regex input topics
Ray Matharu [Sat, 1 Dec 2018 01:06:30 +0000 (17:06 -0800)] 
SAMZA-570: Enabling auto-discovery of regex input topics

This PR makes the following changes

* Enriches StreamPartitionCountMonitor to periodically monitor input-regexes to match to actual inputs and stop the job when a new input stream is discovered.

* Add a new API to SysAdmin to allow listing of all streams, e.g., Kafka topics. The KafkaSysAdmin implementation of this uses KafkaConsumer's listTopics API. (Even if listTopics returned 1 million topics at 100 bytes per topic, the temporary memory overhead would be 100 MB.)

* Added config job.coordinator.monitor-input-regex.frequency.ms for the monitoring frequency, and job.coordinator.monitor-input-regex.%s for each input system. Users can then choose desired regex for each input system, e.g., job.coordinator.monitor-input-regex.kafka=test-.*.

* We can later enrich the RegexTopicGen rewriter to add a monitor-input-regex config to allow periodic monitoring.

* Tested: Unit test for SPCM and tested with test jobs on local grid.
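
The regex check described above can be sketched as follows. This is an illustration only: `InputRegexMonitorSketch` and `newlyDiscovered` are hypothetical names, and the real monitor obtains the stream list from the system admin rather than from a passed-in set.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical sketch; not the StreamPartitionCountMonitor implementation.
class InputRegexMonitorSketch {
  // Returns streams that fully match the regex but are not yet inputs of the job.
  // A non-empty result is the signal on which the job would be stopped.
  static Set<String> newlyDiscovered(Set<String> allStreams, Set<String> currentInputs, String regex) {
    Pattern pattern = Pattern.compile(regex);
    Set<String> discovered = new TreeSet<>();
    for (String stream : allStreams) {
      if (pattern.matcher(stream).matches() && !currentInputs.contains(stream)) {
        discovered.add(stream);
      }
    }
    return discovered;
  }
}
```

For example, with the regex `test-.*` from the config above, a newly created topic `test-b` would be reported while an unrelated topic `other` would not.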

Author: Ray Matharu <rmatharu@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #796 from rmatharu/newtopic-test

10 days agoSAMZA-2014: Samza-sql: Support table as both source (for join) and destination in...
Aditya Toomula [Fri, 30 Nov 2018 21:40:05 +0000 (13:40 -0800)] 
SAMZA-2014: Samza-sql: Support table as both source (for join) and destination in the same application

While parsing queries in an application, within SamzaSqlApplicationConfig, we collect all input sources and output sources from all queries and create descriptors for input sources first, followed by output sources. But there can be only one table descriptor instance per table. A writable table is also a readable table, but the reverse is not true. If we go through input sources first, we end up creating a readable table descriptor and cannot create a writable table descriptor for the same table when we go through output sources (the code would be ugly if we had to achieve this). There are a couple of ways to solve this:
- Always make a table readable and writable.
- Go through output sources first, followed by input sources.

Choosing the second option, as making a table always read-writable does not make sense.
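
The effect of the ordering can be sketched as below. `TableDescriptorOrderSketch`, `Access`, and `describe` are hypothetical names used only for illustration; the real code builds Samza table descriptors rather than enum values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the SamzaSqlApplicationConfig implementation.
class TableDescriptorOrderSketch {
  enum Access { READ_ONLY, READ_WRITE }

  // One descriptor per table: the first registration wins, so outputs are
  // processed first and a table used as both source and sink stays writable.
  static Map<String, Access> describe(List<String> inputs, List<String> outputs) {
    Map<String, Access> descriptors = new LinkedHashMap<>();
    for (String table : outputs) {
      descriptors.putIfAbsent(table, Access.READ_WRITE);
    }
    for (String table : inputs) {
      descriptors.putIfAbsent(table, Access.READ_ONLY);
    }
    return descriptors;
  }
}
```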

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #834 from atoomula/table

10 days agoSAMZA-2015: Refactor timer handling in tables to be consistent with stores
Wei Song [Fri, 30 Nov 2018 20:52:59 +0000 (12:52 -0800)] 
SAMZA-2015: Refactor timer handling in tables to be consistent with stores

Currently, when the timer is disabled, we do not instantiate timer instances for tables, which introduces the potential for NPEs in the future. This change refactors tables to use the same approach as the store implementation, based on HighResolutionClock.
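
One way to avoid null timers is to always supply a clock and make the disabled clock a no-op. The sketch below assumes that approach; `TableTimerSketch` and its `Clock` interface are hypothetical stand-ins, and the constant-zero clock is an assumption about how a disabled clock might behave.

```java
// Hypothetical sketch; Clock stands in for a high-resolution clock interface.
class TableTimerSketch {
  interface Clock { long nanoTime(); }

  // Never returns null: when timers are disabled the clock always reads 0,
  // so callers can time operations unconditionally without null checks.
  static Clock clockFor(boolean timerEnabled) {
    return timerEnabled ? System::nanoTime : () -> 0L;
  }

  // Measures an operation with whichever clock was configured.
  static long timeNanos(Clock clock, Runnable operation) {
    long start = clock.nanoTime();
    operation.run();
    return clock.nanoTime() - start;
  }
}
```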

Author: Wei Song <wsong@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #835 from weisong44/SAMZA-2015

10 days agoFix minor issue with operator spec graph traversal
Ahmed Abdul Hamid [Fri, 30 Nov 2018 18:33:35 +0000 (10:33 -0800)] 
Fix minor issue with operator spec graph traversal

Remove redundant traversal loop added by mistake during conflict merge.

Author: Ahmed Abdul Hamid <ahabdulh@ahabdulh-mn1.linkedin.biz>

Reviewers: Aditya Toomla <atoomla@linkedin.com>

Closes #833 from ahmedahamid/master

11 days agoSAMZA-2013: Account for cycles in graph traversal within Execution Planner
Ahmed Abdul Hamid [Fri, 30 Nov 2018 03:23:46 +0000 (19:23 -0800)] 
SAMZA-2013: Account for cycles in graph traversal within Execution Planner

Author: Ahmed Abdul Hamid <ahabdulh@ahabdulh-mn1.linkedin.biz>

Reviewers: Aditya Toomla <atoomla@linkedin.com>

Closes #832 from ahmedahamid/master

11 days agoSAMZA-2010: Handle null value in LocalReadWriteTable.putAll()
Wei Song [Thu, 29 Nov 2018 23:31:58 +0000 (15:31 -0800)] 
SAMZA-2010: Handle null value in LocalReadWriteTable.putAll()

To be consistent with put(), null values in the input should be treated as delete operations.
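
A minimal sketch of these semantics, using an in-memory map in place of the real store; `PutAllNullSketch` is a hypothetical class, not `LocalReadWriteTable`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; an in-memory map stands in for the backing store.
class PutAllNullSketch {
  private final Map<String, String> store = new HashMap<>();

  // A null value deletes the key; any other value overwrites it.
  void put(String key, String value) {
    if (value == null) {
      store.remove(key);
    } else {
      store.put(key, value);
    }
  }

  // Delegates to put() so that null values behave identically in both methods.
  void putAll(Map<String, String> entries) {
    entries.forEach(this::put);
  }

  String get(String key) {
    return store.get(key);
  }
}
```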

Author: Wei Song <wsong@linkedin.com>

Reviewers: Ahmed Abdul Hamid <ahabdulhamid@linkedin.com>

Closes #827 from weisong44/SAMZA-2010

11 days agoSAMZA-1638: Recreate SystemProducer on KafkaCheckpointManager.writeCheckpoint failures.
Shanthoosh Venkataraman [Thu, 29 Nov 2018 19:53:39 +0000 (11:53 -0800)] 
SAMZA-1638: Recreate SystemProducer on KafkaCheckpointManager.writeCheckpoint failures.

The retry loop in the existing `KafkaCheckpointManager` implementation retries using the same `SystemProducer` instance on exception and does not recreate it.

When an irrecoverable exception occurs within the `SystemProducer`, all subsequent send invocations on that `SystemProducer` instance fail. This makes the entire retry loop in `KafkaCheckpointManager` pointless.

This patch consists of the following changes:
1. This patch addresses the above problem by recreating the `SystemProducer` instance on failure and adds a unit test to verify the functionality.
2. Minor code cleanup in classes: `TestKafkaCheckpointManager` and `KafkaCheckpointManager`.
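
The retry pattern described above can be sketched as follows. `RecreateProducerSketch`, `Producer`, and `sendWithRetry` are hypothetical names; the real `KafkaCheckpointManager` works with `SystemProducer` and has its own retry and backoff policy.

```java
import java.util.function.Supplier;

// Hypothetical sketch; Producer stands in for a SystemProducer-like client.
class RecreateProducerSketch {
  interface Producer { void send(String message) throws Exception; }

  // Retries up to maxAttempts and builds a fresh producer after each failure,
  // because the failed instance may be permanently broken.
  // Returns the attempt number on which the send succeeded.
  static int sendWithRetry(Supplier<Producer> factory, String message, int maxAttempts) {
    Producer producer = factory.get();
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        producer.send(message);
        return attempt;
      } catch (Exception e) {
        last = e;
        producer = factory.get(); // do not reuse the failed instance
      }
    }
    throw new IllegalStateException("send failed after " + maxAttempts + " attempts", last);
  }
}
```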

Author: Shanthoosh Venkataraman <spvenkat@usc.edu>
Author: Shanthoosh Venkataraman <svenkata@linkedin.com>

Reviewers: Dong Lin <lindong28@gmail.com>

Closes #792 from shanthoosh/kafka_checkpoint_manager_fix

12 days agoSAMZA-1976: MetadataStore API cleanup.
Shanthoosh Venkataraman [Thu, 29 Nov 2018 17:46:13 +0000 (09:46 -0800)] 
SAMZA-1976: MetadataStore API cleanup.

This PR consists of the following changes:
* Switched all the API methods from using byte[] as the key type to String.
* Fixed `CoordinatorMetadataStore`, `ZkMetadataStore` tests due to the type change of key.

Namespace unification for the different metadata stored in the standalone and YARN models will follow shortly in a follow-up PR.

Author: Shanthoosh Venkataraman <spvenkat@usc.edu>

Reviewers: Prateek <prateekm@linkedin.com>

Closes #791 from shanthoosh/metadata_store_api_cleanup

12 days agoSAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite...
Aditya Toomula [Wed, 28 Nov 2018 23:53:30 +0000 (15:53 -0800)] 
SAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite plan.

This is required for supporting schema evolution without failing the jobs.

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #821 from atoomula/modify and squashes the following commits:

17b4b1c1 [Aditya Toomula] SAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite plan.
65be581a [Aditya Toomula] SAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite plan.
fb50ee81 [Aditya Toomula] SAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite plan.
9fff9573 [Aditya Toomula] SAMZA-2007: Samza-sql - Fix Samza-SQL to not expect LogicalTableModify in the Calcite plan.
f3e887c6 [Aditya Toomula] dummy

13 days agoSAMZA-2004: Add ability to disable table metrics
Wei Song [Tue, 27 Nov 2018 18:40:34 +0000 (10:40 -0800)] 
SAMZA-2004: Add ability to disable table metrics

For jobs with very high throughput, it is desirable to disable metrics on tables. This change introduces an option on the table descriptor to disable all metrics for a table.

Author: Wei Song <wsong@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #822 from weisong44/SAMZA-2004-2

2 weeks agoUpdate version to 1.0.0 in docs
Jagadish [Tue, 27 Nov 2018 13:41:40 +0000 (05:41 -0800)] 
Update version to 1.0.0 in docs

2 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Tue, 27 Nov 2018 12:00:16 +0000 (04:00 -0800)] 
Merge branch 'master' of https://github.com/apache/samza

2 weeks agoEnsure that DOC pages for older releases are easy to discover
Jagadish [Tue, 27 Nov 2018 11:59:13 +0000 (03:59 -0800)] 
Ensure that DOC pages for older releases are easy to discover

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #820 from vjagadish1989/website-reorg35

2 weeks agoMake older release pages better discoverable
Jagadish [Tue, 27 Nov 2018 11:57:43 +0000 (03:57 -0800)] 
Make older release pages better discoverable

2 weeks agoUse consistent font /heading sizes for all pages
Jagadish [Tue, 27 Nov 2018 11:21:25 +0000 (03:21 -0800)] 
Use consistent font /heading sizes for all pages

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #819 from vjagadish1989/website-reorg34

2 weeks agoUse consistent font /heading sizes for all pages
Jagadish [Tue, 27 Nov 2018 11:18:22 +0000 (03:18 -0800)] 
Use consistent font /heading sizes for all pages

2 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Tue, 27 Nov 2018 09:47:09 +0000 (01:47 -0800)] 
Merge branch 'master' of https://github.com/apache/samza

2 weeks agoCommit for website publish for 1.0.0
Jagadish [Tue, 27 Nov 2018 09:46:54 +0000 (01:46 -0800)] 
Commit for website publish for 1.0.0

2 weeks agoClean up docs for standalone
Jagadish [Tue, 27 Nov 2018 08:28:39 +0000 (00:28 -0800)] 
Clean up docs for standalone

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #817 from vjagadish1989/website-reorg32

2 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Tue, 27 Nov 2018 08:27:11 +0000 (00:27 -0800)] 
Merge branch 'master' of https://github.com/apache/samza

2 weeks agoClean up standalone docs
Jagadish [Tue, 27 Nov 2018 08:26:06 +0000 (00:26 -0800)] 
Clean up standalone docs

2 weeks agoSAMZA-2006: Removed config from table provider constructor
Wei Song [Mon, 26 Nov 2018 23:32:31 +0000 (15:32 -0800)] 
SAMZA-2006: Removed config from table provider constructor

With the latest API change in Samza 1.0, config can be obtained from the Context object during init(); therefore we do not need to pass it in the constructor.

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #816 from weisong44/SAMZA-2006

2 weeks agoClean-up open source docs for Samza SQL
Jagadish [Mon, 26 Nov 2018 22:20:24 +0000 (14:20 -0800)] 
Clean-up open source docs for Samza SQL

atoomula srinipunuru FYI..

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #815 from vjagadish1989/website-reorg31

2 weeks agoClean up open-source documentation for Samza SQL
Jagadish [Mon, 26 Nov 2018 22:18:01 +0000 (14:18 -0800)] 
Clean up open-source documentation for Samza SQL

2 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Mon, 26 Nov 2018 22:17:14 +0000 (14:17 -0800)] 
Merge branch 'master' of https://github.com/apache/samza

2 weeks agoSAMZA-1998: Table API refactoring
Wei Song [Wed, 21 Nov 2018 01:22:18 +0000 (17:22 -0800)] 
SAMZA-1998: Table API refactoring

Table API refactoring
     - Removed TableSpec
     - Consolidated configuration generation for tables to table descriptors
     - Refactored constructor so that only local tables require serdes
     - Removed table provider for RocksDB- and in-memory tables, and added LocalTableProvider
     - Updates to unit tests
     - Various refactoring

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #807 from weisong44/SAMZA-1998

2 weeks agoSAMZA-1997: Samza-sql diagnostics - instrument project operator
Shenoda Guirguis [Tue, 20 Nov 2018 22:05:47 +0000 (14:05 -0800)] 
SAMZA-1997: Samza-sql diagnostics - instrument project operator

When users use Samza-SQL, they use a high-level declarative language (SQL) for ease and speed of implementation of their Samza job. Therefore, monitoring the job should provide metrics at this high/logical level. This is the goal of the Samza-SQL diagnostics project. As a first small step, we start by instrumenting the Project operator to provide run-time metrics.

Author: Shenoda Guirguis <sguirgui@sguirgui-ld2.linkedin.biz>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>, Aditya Toomula <atoomula@linkedin.com>

Closes #806 from shenodaguirguis/samza-sql-diagnostics

2 weeks agoSAMZA-2001: Samza-sql: Handle null records in rel converter and in joins
Aditya Toomula [Tue, 20 Nov 2018 21:18:57 +0000 (13:18 -0800)] 
SAMZA-2001: Samza-sql: Handle null records in rel converter and in joins

* Synced some AvroRelConverter fixes from linkedin version.
* Null value handling in AvroRelConverter and Join function.
* Null value handling in Table API StreamTableJoinOperatorImpl class.

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #812 from atoomula/nulljoins

2 weeks agoSAMZA-1994: Table API: Add missed key lookups metric for table reads
Aditya Toomula [Tue, 20 Nov 2018 21:02:13 +0000 (13:02 -0800)] 
SAMZA-1994: Table API: Add missed key lookups metric for table reads

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>, Wei Song <wsong@linkedin.com>

Closes #811 from atoomula/metric

2 weeks agoSAMZA-1707: Samza onTimer method triggering before init
Xinyu Liu [Tue, 20 Nov 2018 20:31:56 +0000 (12:31 -0800)] 
SAMZA-1707: Samza onTimer method triggering before init

There was a bug where a timer registered with a very short delay might not be invoked, since invocation depends on the creation of the run loop. This patch fixes the problem by double-checking the ready timers when the run loop is created (i.e., when the listener is registered).
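
The fix can be illustrated with a small sketch in which timers that fire before the listener exists are buffered and replayed on registration. `ReadyTimerSketch`, `fire`, and `register` are hypothetical names, not the actual Samza timer classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual Samza timer registry.
class ReadyTimerSketch {
  interface Listener { void onReady(String key); }

  private final List<String> ready = new ArrayList<>();
  private Listener listener;

  // Called when a timer fires; this may happen before the run loop registers.
  synchronized void fire(String key) {
    if (listener != null) {
      listener.onReady(key);
    } else {
      ready.add(key); // remember it until a listener exists
    }
  }

  // On registration, re-check the timers that became ready in the meantime.
  synchronized void register(Listener newListener) {
    listener = newListener;
    for (String key : ready) {
      newListener.onReady(key);
    }
    ready.clear();
  }
}
```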

Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Prateek M <prateekm@apache.org>

Closes #810 from xinyuiscool/SAMZA-1707

3 weeks agoSAMZA-1999: Fix NullPointerException when sink is log.outputstream
Weiqing Yang [Mon, 19 Nov 2018 17:59:23 +0000 (09:59 -0800)] 
SAMZA-1999: Fix NullPointerException when sink is log.outputstream

## What changes were proposed in this pull request?
This PR fixes a bug that throws a NullPointerException when the sink is log.outputstream.

## How was this patch tested?
Passes the build and current tests.
Tested in the Samza SQL shell.

Author: Weiqing Yang <yangweiqing001@gmail.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #808 from weiqingy/SAMZA-1999

3 weeks agoSAMZA-2000: update contributor page
Hai Lu [Fri, 16 Nov 2018 22:13:03 +0000 (14:13 -0800)] 
SAMZA-2000: update contributor page

Trivial commit to test committer workflow

Author: Hai Lu <halu@linkedin.com>

Reviewers: xiliu <xiliu@apache.org>

Closes #809 from lhaiesp/master

3 weeks agoSAMZA-1972: Make Operator Timer metrics calculation configurable
xinyuiscool [Fri, 16 Nov 2018 00:01:53 +0000 (16:01 -0800)] 
SAMZA-1972: Make Operator Timer metrics calculation configurable

This patch introduces two changes:
1. Make the timer metrics in OperatorImpl optional, and disabled by default. Adding TimerMetrics has a significant performance impact on jobs with a large number of operators, so it should be turned on for debugging only.
2. Register operator-level metrics on the container metrics registry. The task-level registry has too many metrics, which are usually ignored by users. Having them at the container level reduces the total amount of metrics published as well as the memory footprint.

Tested with hello-samza; works as expected.

Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #805 from xinyuiscool/SAMZA-1972

3 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Wed, 14 Nov 2018 03:17:26 +0000 (19:17 -0800)] 
Merge branch 'master' of https://github.com/apache/samza

3 weeks agoUpdated API documentation for high and low level APIs.
Prateek Maheshwari [Wed, 14 Nov 2018 02:17:17 +0000 (18:17 -0800)] 
Updated API documentation for high and low level APIs.

vjagadish1989 nickpan47 Please take a look.

Author: Prateek Maheshwari <pmaheshwari@apache.org>

Reviewers: Jagadish<jagadish@apache.org>

Closes #802 from prateekm/api-docs

4 weeks agoSAMZA-1986: Samza-sql: Use system name along with stream name for streamId
Aditya Toomula [Mon, 12 Nov 2018 19:42:51 +0000 (11:42 -0800)] 
SAMZA-1986: Samza-sql: Use system name along with stream name for streamId

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #800 from atoomula/streamid

4 weeks agoSAMZA-1988: Properly suffix modules with direct Scala dependencies with the Scala...
Daniel Nishimura [Mon, 12 Nov 2018 19:11:28 +0000 (11:11 -0800)] 
SAMZA-1988: Properly suffix modules with direct Scala dependencies with the Scala version.

List of modules without a Scala version suffix that have direct Scala dependencies and the direct Scala API calls in each module are in the JIRA ticket: https://issues.apache.org/jira/browse/SAMZA-1988

Author: Daniel Nishimura <dnishimura@linkedin.com>

Reviewers: Sanil Jain <snjain@linkedin.com>

Closes #801 from dnishimura/samza-1988-scala-version-suffixes

4 weeks agoSAMZA-1978: Use samza offset reset value in kafka consumer
Boris S [Sat, 10 Nov 2018 00:23:05 +0000 (16:23 -0800)] 
SAMZA-1978: Use samza offset reset value in kafka consumer

Author: Boris S <bshkolnik@linkedin.com>
Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #753 from sborya/UseSamazResetInKafka

4 weeks agoSAMZA-1616: Samza-Sql - Support remote table for stream-table join
Aditya Toomula [Fri, 9 Nov 2018 16:30:59 +0000 (08:30 -0800)] 
SAMZA-1616: Samza-Sql - Support remote table for stream-table join

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #794 from atoomula/remote

4 weeks agoSAMZA-1981: Consolidate table descriptors to samza-api
Wei Song [Thu, 8 Nov 2018 22:04:28 +0000 (14:04 -0800)] 
SAMZA-1981: Consolidate table descriptors to samza-api

As per subject, table descriptors moved are
 - LocalTableDescriptor
 - RemoteTableDescriptor
 - HybridTableDescriptor
 - GuavaCacheTableDescriptor
 - CachingTableDescriptor

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #799 from weisong44/SAMZA-1981

4 weeks agoSAMZA-1980: Rename LocalStoreBackedTable to LocalTable
Wei Song [Wed, 7 Nov 2018 22:46:17 +0000 (14:46 -0800)] 
SAMZA-1980: Rename LocalStoreBackedTable to LocalTable

As per subject, this is to keep naming of local tables consistent with other table types.

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #798 from weisong44/SAMZA-1980 and squashes the following commits:

51c17ff4 [Wei Song] Renamed LocalStoreBackedTable to LocalTable
9c121207 [Wei Song] Merge remote-tracking branch 'upstream/master'
89bfc14c [Wei Song] Merge remote-tracking branch 'upstream/master'
a53e5628 [Wei Song] SAMZA-1964 Make getTableSpec() in RemoteTableDescriptor reentrant
c9e8bf7c [Wei Song] Merge remote-tracking branch 'upstream/master'
7c777fec [Wei Song] Merge remote-tracking branch 'upstream/master'
a06e8ec2 [Wei Song] Merge remote-tracking branch 'upstream/master'
2c679c39 [Wei Song] Merge remote-tracking branch 'upstream/master'
a56c28dc [Wei Song] Merge remote-tracking branch 'upstream/master'
097958c8 [Wei Song] Merge remote-tracking branch 'upstream/master'
05822f0a [Wei Song] Merge remote-tracking branch 'upstream/master'
f7480505 [Wei Song] Merge remote-tracking branch 'upstream/master'
7706ab1f [Wei Song] Merge remote-tracking branch 'upstream/master'
f5731b10 [Wei Song] Merge remote-tracking branch 'upstream/master'
1e5de45a [Wei Song] Merge remote-tracking branch 'upstream/master'
c85604e0 [Wei Song] Merge remote-tracking branch 'upstream/master'
242d8442 [Wei Song] Merge remote-tracking branch 'upstream/master'
ec7d8409 [Wei Song] Merge remote-tracking branch 'upstream/master'
e19b4dc9 [Wei Song] Merge remote-tracking branch 'upstream/master'
8ee78441 [Wei Song] Merge remote-tracking branch 'upstream/master'
1c6a2eae [Wei Song] Merge remote-tracking branch 'upstream/master'
a6c94add [Wei Song] Merge remote-tracking branch 'upstream/master'
41299b5b [Wei Song] Merge remote-tracking branch 'upstream/master'
239a0950 [Wei Song] Merge remote-tracking branch 'upstream/master'
eca00204 [Wei Song] Merge remote-tracking branch 'upstream/master'
51562391 [Wei Song] Merge remote-tracking branch 'upstream/master'
de708f5e [Wei Song] Merge remote-tracking branch 'upstream/master'
df2f8d7b [Wei Song] Merge remote-tracking branch 'upstream/master'
f28b491d [Wei Song] Merge remote-tracking branch 'upstream/master'
4782c61d [Wei Song] Merge remote-tracking branch 'upstream/master'
0440f75f [Wei Song] Merge remote-tracking branch 'upstream/master'
aae0f380 [Wei Song] Merge remote-tracking branch 'upstream/master'
a15a7c9a [Wei Song] Merge remote-tracking branch 'upstream/master'
5cbf9af9 [Wei Song] Merge remote-tracking branch 'upstream/master'
3f7ed71f [Wei Song] Added self to committer list

4 weeks agoSAMZA-1921: upgrade to use the latest java AdminClient.
Boris S [Wed, 7 Nov 2018 22:39:59 +0000 (14:39 -0800)] 
SAMZA-1921: upgrade to use the latest java AdminClient.

In this PR, I've refactored the create/clear stream methods.

Author: Boris S <bshkolnik@linkedin.com>
Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>
Author: svenkata <svenkataraman@linkedin.com>

Reviewers: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Closes #789 from sborya/JavaAdminClient

5 weeks agoCleanup docs for HDFS connector
Jagadish [Sat, 3 Nov 2018 00:35:20 +0000 (17:35 -0700)] 
Cleanup docs for HDFS connector

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #793 from vjagadish1989/website-reorg30

5 weeks agoCleanup docs for HDFS connector
Jagadish [Sat, 3 Nov 2018 00:33:26 +0000 (17:33 -0700)] 
Cleanup docs for HDFS connector

5 weeks agoSAMZA-1952: StreamPartitionCountMonitor for standalone.
Shanthoosh Venkataraman [Fri, 2 Nov 2018 16:38:30 +0000 (09:38 -0700)] 
SAMZA-1952: StreamPartitionCountMonitor for standalone.

This patch adds the capability to detect a partition count change in the input streams of stateless standalone jobs and to trigger a re-balancing phase (which accounts for the new partitions of the input streams and distributes them among the live processors of the group).

Existing partition count detection of input streams is broken in YARN for stateful jobs. This will be addressed for both YARN and standalone as a part of #622.
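
The detection step described above can be sketched as a small polling component. This is a simplified illustration only, not the code from this patch: `PartitionCountWatcher`, its `poll()` method, and the callback are hypothetical names, and the sketch assumes partition counts are available as a map from stream name to count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch; not Samza's StreamPartitionCountMonitor.
class PartitionCountWatcher {
    private final Supplier<Map<String, Integer>> metadataFetcher;
    private final Consumer<String> onChange;
    private Map<String, Integer> lastSeen;

    PartitionCountWatcher(Supplier<Map<String, Integer>> metadataFetcher, Consumer<String> onChange) {
        this.metadataFetcher = metadataFetcher;
        this.onChange = onChange;
        // Snapshot the initial counts so later polls have a baseline to compare against.
        this.lastSeen = new HashMap<>(metadataFetcher.get());
    }

    // One polling pass: invokes the callback for every stream whose partition
    // count differs from the previous snapshot and returns how many changed.
    int poll() {
        Map<String, Integer> current = metadataFetcher.get();
        int changed = 0;
        for (Map.Entry<String, Integer> entry : current.entrySet()) {
            Integer previous = lastSeen.get(entry.getKey());
            if (previous != null && !previous.equals(entry.getValue())) {
                onChange.accept(entry.getKey());
                changed++;
            }
        }
        lastSeen = new HashMap<>(current);
        return changed;
    }
}

class PartitionCountWatcherDemo {
    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("page-views", 4);
        List<String> rebalanceRequests = new ArrayList<>();

        PartitionCountWatcher watcher = new PartitionCountWatcher(() -> counts, rebalanceRequests::add);
        watcher.poll();               // nothing changed yet
        counts.put("page-views", 8);  // simulate the stream being expanded
        watcher.poll();               // change detected, callback invoked
        System.out.println("streams needing re-balance: " + rebalanceRequests);
    }
}
```

In a real deployment the polling pass would presumably run on a schedule, and the callback would start the re-balancing phase rather than append to a list.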

Author: Shanthoosh Venkataraman <spvenkat@usc.edu>

Reviewers: Boris Shkolnik <boryas@apache.org>

Closes #726 from shanthoosh/stream_partition_count_monitor_for_standalone

5 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Thu, 1 Nov 2018 22:38:32 +0000 (15:38 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

5 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Wed, 31 Oct 2018 21:22:21 +0000 (14:22 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

5 weeks agoy
Boris S [Wed, 31 Oct 2018 21:22:04 +0000 (14:22 -0700)] 
y

Author: Boris S <boryas@apache.org>
Author: Boris S <bshkolnik@linkedin.com>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Ray Matharu <rmatharu@linkedin.com>

Closes #779 from sborya/RemoveGetKafkaSystemConsumerConfig

5 weeks agoSAMZA-1943 Remove ExtendedSystemAdmin and deprecated getNewestOffsets method.
Boris S [Wed, 31 Oct 2018 20:45:28 +0000 (13:45 -0700)] 
SAMZA-1943 Remove ExtendedSystemAdmin and deprecated getNewestOffsets method.

Author: Boris S <boryas@apache.org>
Author: Boris S <bshkolnik@linkedin.com>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Bharath Kumarasubramanian <bkumarasubramanian@linkedin.com>

Closes #782 from sborya/removeExtendedSystemAdmin

5 weeks agoJavadoc cleanup for new Application, Descriptor, Context and Table APIs - Part 2
Prateek Maheshwari [Wed, 31 Oct 2018 20:36:54 +0000 (13:36 -0700)] 
Javadoc cleanup for new Application, Descriptor, Context and Table APIs - Part 2

Currently, we don't allow imports for use only in javadocs. This requires using FQNs in link tags, which is not very readable. Checkstyle's UnusedImports rule has an option to allow imports for use in javadoc comments (processJavadoc=true, should be read as "check javadocs for import usage == true").

AFAICT, there's no good way to change the check's properties within a submodule. This PR adds both versions (strict and relaxed) to the Checkstyle configuration, and disables the strict validation for samza-api only.

This PR also updates the javadocs to use the class names with imports.
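
The readability difference between the two styles can be seen in a small example. This is an illustrative sketch only; `JavadocLinkExample` and its methods are hypothetical and do not come from samza-api.

```java
import java.util.List;
import java.util.Map; // referenced only from the javadoc of countRelaxed()

class JavadocLinkExample {
    /**
     * Strict style: the link spells out the fully qualified name,
     * e.g. {@link java.util.concurrent.ConcurrentHashMap}, because an import
     * used only in this comment would be reported as unused.
     */
    static int countStrict(List<String> names) {
        return names.size();
    }

    /**
     * Relaxed style: {@link Map} resolves through the import above, which a
     * javadoc-aware unused-import check accepts.
     */
    static int countRelaxed(List<String> names) {
        return names.size();
    }
}
```

Both methods behave identically; only the form of the `{@link ...}` tags differs.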

Author: Prateek Maheshwari <pmaheshwari@apache.org>

Reviewers: Cameron Lee <calee@linkedin.com>

Closes #760 from prateekm/javadoc-cleanup

5 weeks agoSAMZA-1970: Support for physical names in InMemorySystem
Sanil15 [Wed, 31 Oct 2018 19:41:40 +0000 (12:41 -0700)] 
SAMZA-1970: Support for physical names in InMemorySystem

If the explicit `super.` qualifier is missing, Java resolves the call to this.withPhysicalName, so the method calls itself until it fails with a StackOverflowError.
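
The pitfall can be reproduced in isolation. The following is a minimal sketch with hypothetical classes (`BaseDescriptor`, `BrokenDescriptor`, `FixedDescriptor`), not the actual Samza descriptors: an override that means to delegate to its parent but omits `super.` ends up calling itself.

```java
// Hypothetical stand-ins for a descriptor hierarchy; not the real Samza classes.
class BaseDescriptor {
    protected String physicalName;

    public BaseDescriptor withPhysicalName(String physicalName) {
        this.physicalName = physicalName;
        return this;
    }

    public String getPhysicalName() {
        return physicalName;
    }
}

class BrokenDescriptor extends BaseDescriptor {
    @Override
    public BrokenDescriptor withPhysicalName(String physicalName) {
        // Bug: the unqualified call resolves to this.withPhysicalName, i.e. this same method.
        withPhysicalName(physicalName);
        return this;
    }
}

class FixedDescriptor extends BaseDescriptor {
    @Override
    public FixedDescriptor withPhysicalName(String physicalName) {
        // Fix: explicitly delegate to the parent implementation.
        super.withPhysicalName(physicalName);
        return this;
    }
}

class PhysicalNameDemo {
    // Returns true if setting the physical name recurses until the stack overflows.
    static boolean overflows(BaseDescriptor descriptor) {
        try {
            descriptor.withPhysicalName("some-stream");
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("broken overflows: " + overflows(new BrokenDescriptor()));
        System.out.println("fixed overflows: " + overflows(new FixedDescriptor()));
    }
}
```

The covariant return type is what makes the override necessary in a fluent API: the subclass wants `withPhysicalName` to return its own type, so it must re-declare the method and forward to the parent.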

Author: Sanil15 <sanil.jain15@gmail.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #788 from Sanil15/SAMZA-1970-edit

5 weeks agoSAMZA-1971: Fix NPE in partition key computation for InMemorySystemProducer
bharathkk [Wed, 31 Oct 2018 19:41:20 +0000 (12:41 -0700)] 
SAMZA-1971: Fix NPE in partition key computation for InMemorySystemProducer

Author: bharathkk <codin.martial@gmail.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #786 from bharathkk/fix-inmemory-partitionkey-npe

5 weeks agoClose iterators to time-series store on deletes
Jagadish [Wed, 31 Oct 2018 01:48:40 +0000 (18:48 -0700)] 
Close iterators to time-series store on deletes

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #787 from vjagadish1989/website-reorg29

5 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Wed, 31 Oct 2018 01:46:53 +0000 (18:46 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

5 weeks agoClose timeseries store iterators on deletes
Jagadish [Wed, 31 Oct 2018 01:45:37 +0000 (18:45 -0700)] 
Close timeseries store iterators on deletes

5 weeks agoUpdated RELEASE instructions.
Prateek Maheshwari [Tue, 30 Oct 2018 23:29:53 +0000 (16:29 -0700)] 
Updated RELEASE instructions.

Author: Prateek Maheshwari <pmaheshwari@apache.org>

Reviewers: Jagadish <jagadish@apache.org>

Closes #783 from prateekm/release-docs

5 weeks agoAdding release notes for Samza 1.0
rmatharu@linkedin.com [Tue, 30 Oct 2018 22:54:25 +0000 (15:54 -0700)] 
Adding release notes for Samza 1.0

Author: rmatharu@linkedin.com <rmatharu@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #776 from rmatharu/releasenotes

5 weeks agoFix to make Samza SQL applications work after the Runner refactoring
Srinivasulu Punuru [Tue, 30 Oct 2018 20:33:33 +0000 (13:33 -0700)] 
Fix to make Samza SQL applications work after the Runner refactoring

After a recent change in Samza, the constructor signature for ApplicationRunner changed, but SamzaSQLApplicationRunner was not updated to match. This fixes the constructor of SamzaSQLApplicationRunner to use the new signature.
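
A mismatch like this typically surfaces only at run time when the runner class is instantiated by reflection. The sketch below illustrates that failure mode under this assumption; `Runner`, `StaleRunner`, `UpdatedRunner`, and `RunnerFactory` are hypothetical types and do not reflect the actual Samza ApplicationRunner API or its constructor arguments.

```java
import java.lang.reflect.Constructor;
import java.util.Collections;
import java.util.Map;

// Hypothetical types for illustration; not the Samza ApplicationRunner API.
interface Runner {
    String describe();
}

// Still declares the old single-argument constructor.
class StaleRunner implements Runner {
    public StaleRunner(Map<String, String> config) {
    }

    public String describe() {
        return "stale";
    }
}

// Declares the constructor shape the factory now expects.
class UpdatedRunner implements Runner {
    private final String appName;

    public UpdatedRunner(String appName, Map<String, String> config) {
        this.appName = appName;
    }

    public String describe() {
        return "runner for " + appName;
    }
}

class RunnerFactory {
    // Looks up the (String, Map) constructor reflectively, the way a framework
    // that loads runner classes by name would.
    static Runner create(Class<? extends Runner> runnerClass, String appName, Map<String, String> config)
            throws ReflectiveOperationException {
        Constructor<? extends Runner> ctor = runnerClass.getDeclaredConstructor(String.class, Map.class);
        return ctor.newInstance(appName, config);
    }
}

class RunnerFactoryDemo {
    public static void main(String[] args) throws Exception {
        Map<String, String> config = Collections.emptyMap();
        System.out.println(RunnerFactory.create(UpdatedRunner.class, "my-sql-app", config).describe());
        try {
            RunnerFactory.create(StaleRunner.class, "my-sql-app", config);
        } catch (NoSuchMethodException e) {
            // The compiler never sees this mismatch; it only fails at lookup time.
            System.out.println("stale runner cannot be instantiated: " + e);
        }
    }
}
```

Because the lookup happens by parameter types at run time, a runner that keeps the old constructor compiles cleanly and fails only when the factory tries to create it.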

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Aditya Toomula <atoomula@linkedin.com>

Closes #784 from srinipunuru/sql-app-fix.1

5 weeks agoatoomula and prateekm FYI..
Jagadish [Tue, 30 Oct 2018 19:11:56 +0000 (12:11 -0700)] 
atoomula and prateekm FYI..

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #785 from vjagadish1989/website-reorg28

5 weeks agoCleanup documentation for Kinesis
Jagadish [Tue, 30 Oct 2018 19:06:54 +0000 (12:06 -0700)] 
Cleanup documentation for Kinesis

5 weeks agoCleanup documentation for Kinesis
Jagadish [Tue, 30 Oct 2018 19:05:49 +0000 (12:05 -0700)] 
Cleanup documentation for Kinesis

5 weeks agoSAMZA-1968: Samza-sql - Change Calcite sql type for samza sql rel message __key__...
Aditya Toomula [Tue, 30 Oct 2018 18:25:00 +0000 (11:25 -0700)] 
SAMZA-1968: Samza-sql - Change Calcite sql type for samza sql rel message __key__ to accept any format

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #774 from atoomula/keyformat

6 weeks agoAdd physical name support for InMemoryStreamDescriptors
Sanil15 [Tue, 30 Oct 2018 02:52:04 +0000 (19:52 -0700)] 
Add physical name support for InMemoryStreamDescriptors

Author: Sanil15 <sanil.jain15@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #781 from Sanil15/SAMZA-1970

6 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Mon, 29 Oct 2018 23:37:07 +0000 (16:37 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

6 weeks agoCleanup the EventHubs connector section. Use System/Stream Descriptors
Jagadish [Mon, 29 Oct 2018 23:31:11 +0000 (16:31 -0700)] 
Cleanup the EventHubs connector section. Use System/Stream Descriptors

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #780 from vjagadish1989/website-reorg27

6 weeks agoCleanup the EventHubs connector section. Use System/Stream Descriptors
Jagadish [Mon, 29 Oct 2018 23:29:39 +0000 (16:29 -0700)] 
Cleanup the EventHubs connector section. Use System/Stream Descriptors

6 weeks agoCleanup the EventHubs connector section. Use System/Stream Descriptors
Jagadish [Mon, 29 Oct 2018 23:28:14 +0000 (16:28 -0700)] 
Cleanup the EventHubs connector section. Use System/Stream Descriptors

6 weeks agoCleanup the connectors-overview and Kafka-connector sections. Use System/StreamDescri...
Jagadish [Sun, 28 Oct 2018 18:29:07 +0000 (11:29 -0700)] 
Cleanup the connectors-overview and Kafka-connector sections. Use System/StreamDescriptors

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #778 from vjagadish1989/website-reorg26

6 weeks agoCleanup the connectors overview and Kafka connector sections. Use System/Stream Descr...
Jagadish [Sun, 28 Oct 2018 18:26:46 +0000 (11:26 -0700)] 
Cleanup the connectors overview and Kafka connector sections. Use System/Stream Descriptors

6 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Sat, 27 Oct 2018 18:18:52 +0000 (11:18 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

6 weeks agoDOCS: Clean-up the section on YARN deployments
Jagadish [Sat, 27 Oct 2018 18:18:08 +0000 (11:18 -0700)] 
DOCS: Clean-up the section on YARN deployments

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #777 from vjagadish1989/website-reorg25

6 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Sat, 27 Oct 2018 18:15:08 +0000 (11:15 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

6 weeks agoCleanup the deployment section for Samza 1.0
Jagadish [Sat, 27 Oct 2018 18:12:38 +0000 (11:12 -0700)] 
Cleanup the deployment section for Samza 1.0

6 weeks agoSAMZA-1967: Tests failing when Job uses any serde other than NoOp
Sanil15 [Sat, 27 Oct 2018 00:51:58 +0000 (17:51 -0700)] 
SAMZA-1967: Tests failing when Job uses any serde other than NoOp

Context: Serde is configured in JobNodeConfigurationGenerator and any StreamDescriptor#toConfig does not generate key and msg serde configs

Problem: Tests failing when Job uses any serde other than NoOp, since ApplicationDescriptor serdes take precedence in absence of any user-supplied configs

Solution: Passing null msg and key serde configs in userConfigs for StreamDescriptors ensures ApplicationDescriptor generated serde configs don't take precedence

prateekm rmatharu please take a look

Author: Sanil15 <sanil.jain15@gmail.com>

Reviewers: Prateek M<pmaheshw@linkedin.com>

Closes #764 from Sanil15/SAMZA-1967

6 weeks agoPrint the logical plan during query planning
Srinivasulu Punuru [Thu, 25 Oct 2018 21:32:38 +0000 (14:32 -0700)] 
Print the logical plan during query planning

Minor fix to print the logical plan.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Aditya Toomula <atoomula@linkedin.com>

Closes #763 from srinipunuru/print.1

6 weeks agoSAMZA-1903: Samza-sql - Fix stream-table join to work with udfs
Aditya Toomula [Thu, 25 Oct 2018 21:28:19 +0000 (14:28 -0700)] 
SAMZA-1903: Samza-sql - Fix stream-table join to work with udfs

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #762 from atoomula/udf

6 weeks agoSAMZA-1927: Samza-sql - always repartition the stream denoted by stream-table join.
Aditya Toomula [Thu, 25 Oct 2018 21:16:24 +0000 (14:16 -0700)] 
SAMZA-1927: Samza-sql - always repartition the stream denoted by stream-table join.

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srinivasulu Punuru <spunuru@linkedin.com>

Closes #676 from atoomula/dsl3 and squashes the following commits:

e86ef83c [Aditya Toomula] Adding metadatastream prefix config. This will be used to reset both the intermediate streams and changelogstore streams by changing the prefix name.
14450264 [Aditya Toomula] Adding metadatastream prefix config. This will be used to reset both the intermediate streams and changelogstore streams by changing the prefix name.
c3289673 [Aditya Toomula] Adding changelogstreamname prefix config. This will be used to reset the state by changing the prefix name.
804b07a1 [Aditya Toomula] SAMZA-1927: Samza-sql - always repartition the stream denoted by stream-table join.

6 weeks agoClean up the deployment options section.
Jagadish [Wed, 24 Oct 2018 22:20:02 +0000 (15:20 -0700)] 
Clean up the deployment options section.

- Reword for consistency, style, tone

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #761 from vjagadish1989/website-reorg24

6 weeks agoMerge branch 'master' of https://github.com/apache/samza
Jagadish [Wed, 24 Oct 2018 22:18:41 +0000 (15:18 -0700)] 
Merge branch 'master' of https://github.com/apache/samza

6 weeks agoClean up the deployment models section
Jagadish [Wed, 24 Oct 2018 22:18:04 +0000 (15:18 -0700)] 
Clean up the deployment models section