samza.git
21 months agoSAMZA-1214: Bug - system-scoped stream default configs may not be honored
Jacob Maes [Mon, 1 May 2017 20:52:13 +0000 (13:52 -0700)] 
SAMZA-1214: Bug - system-scoped stream default configs may not be honored

* Re-introduced deprecated system-stream configs into config table
* Fixed position of task.consumer.batch.size in config table
* Moved system-scoped defaults from StreamConfig to SystemConfig

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #150 from jmakes/samza-1214

21 months agoSAMZA-1250: JobRunner.kill doesn't terminate cleanly with YarnJob.
Jacob Maes [Mon, 1 May 2017 20:44:54 +0000 (13:44 -0700)] 
SAMZA-1250: JobRunner.kill doesn't terminate cleanly with YarnJob.

1. The ClientHelper now checks inactive application IDs so it can get status for terminated jobs in addition to running jobs
2. JobRunner.kill() waits for any finish, not just successful finish.
3. A killed job is now considered successful.

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #152 from jmakes/samza-1250

21 months agoSAMZA-1219; Add metrics for operator message received and execution times
Prateek Maheshwari [Thu, 27 Apr 2017 22:55:11 +0000 (15:55 -0700)] 
SAMZA-1219; Add metrics for operator message received and execution times

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #142 from prateekm/operator-metrics

21 months agoSamza 1214: Allow users to set a default replication.factor for intermediate topics
Jacob Maes [Thu, 27 Apr 2017 22:05:28 +0000 (15:05 -0700)] 
Samza 1214: Allow users to set a default replication.factor for intermediate topics

* Add a new "systems.sysName.default.stream.*" config structure that allows users to set system-wide defaults for streams.
* More thorough testing of system defaults and stream defaults
* Removed the old migration config from the config table since there's no code to support it.
* Moved 2 kafka-specific config accessors out of JobConfig and into KafkaConfig
* Removed duplicate impl of getChangelogStream()

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #141 from jmakes/samza-1214

21 months agoSAMZA-1210: Fixing merge issue and container id generations.
Boris Shkolnik [Thu, 27 Apr 2017 21:37:08 +0000 (14:37 -0700)] 
SAMZA-1210: Fixing merge issue and container id generations.

This PR is just for fixing issues introduced by a merge and changing container id generation.

Author: Boris Shkolnik <boryas@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #122 from sborya/mergeFixes

21 months agoSAMZA-1245: Make stream samza.physical.name config name string public
Xinyu Liu [Thu, 27 Apr 2017 20:21:11 +0000 (13:21 -0700)] 
SAMZA-1245: Make stream samza.physical.name config name string public

For certain system such as hdfs, the physical stream name might need to be finalized during the config generation. In order to do that, we will need to expose the stream samza.physical.name config string.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #145 from xinyuiscool/SAMZA-1245

21 months agoSAMZA-1200: Scala compile for samza-core fails with ambiguous reference error for...
Prateek Maheshwari [Wed, 26 Apr 2017 23:58:55 +0000 (16:58 -0700)] 
SAMZA-1200: Scala compile for samza-core fails with ambiguous reference error for some compiler versions

... for some compiler versions.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #143 from prateekm/logging-compile-fix

21 months agoSAMZA-1026: HDFS System Producer should not have Kafka dependency
Prateek Maheshwari [Wed, 26 Apr 2017 23:55:25 +0000 (16:55 -0700)] 
SAMZA-1026: HDFS System Producer should not have Kafka dependency

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <nramesh@linkedin.com>

Closes #144 from prateekm/hdfs-kafka-dependency

21 months agoSAMZA-1233: Create SystemAdmin only for JobCoordinator System in SamzaTaskProxy
Shanthoosh Venkataraman [Wed, 26 Apr 2017 16:05:41 +0000 (09:05 -0700)] 
SAMZA-1233: Create SystemAdmin only for JobCoordinator System in SamzaTaskProxy

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #138 from shanthoosh/fix_samza_task_proxy

21 months agoSAMZA-1220 : Add thread name to SamzaContainer shutdown hook and prevent shutdown...
Navina Ramesh [Tue, 25 Apr 2017 23:27:02 +0000 (16:27 -0700)] 
SAMZA-1220 : Add thread name to SamzaContainer shutdown hook and prevent shutdown deadlock

* SamzaContainerExceptionHandler is written in Java and used by LocalContainerRunner.java

Author: nramesh <nramesh@linkedin.com>

Reviewers: Jagadish Venkataraman <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #139 from navina/SAMZA-1220

21 months agoSAMZA-1202; Multiple calls to getInputStream() result in non-deterministic behavior
vjagadish1989 [Tue, 25 Apr 2017 00:15:47 +0000 (17:15 -0700)] 
SAMZA-1202; Multiple calls to getInputStream() result in non-deterministic behavior

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>

Closes #136 from vjagadish1989/samza-1202

21 months agoSAMZA-1229; Disk space monitor should only count data in use by the container
Prateek Maheshwari [Mon, 24 Apr 2017 22:03:00 +0000 (15:03 -0700)] 
SAMZA-1229; Disk space monitor should only count data in use by the container

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #134 from prateekm/disk-space-monitor

21 months agoSAMZA-1226: relax type parameters in MessageStream functions
Yi Pan (Data Infrastructure) [Fri, 21 Apr 2017 23:14:41 +0000 (16:14 -0700)] 
SAMZA-1226: relax type parameters in MessageStream functions

relax the type parameter in user supplied functions in fluent API

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Navina Ramesh <navina@apache.org>, Jacob Maes <jmaes@linkedin.com>

Closes #133 from nickpan47/SAMZA-1226 and squashes the following commits:

b8d3461 [Yi Pan (Data Infrastructure)] SAMZA-1226: cleanup code example in StreamApplication javadoc
93fa471 [Yi Pan (Data Infrastructure)] SAMZA-1226: added more unit tests for type-cast functions
18e1e9f [Yi Pan (Data Infrastructure)] SAMZA-1226: address review feedbacks
b5da53b [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1226
7981b83 [Yi Pan (Data Infrastructure)] SAMZA-1226: relax type parameters in MessageStream functions

22 months agoSAMZA-868: support elasticsearch version 2.x
Jiri Humpolicek [Thu, 20 Apr 2017 08:25:31 +0000 (01:25 -0700)] 
SAMZA-868: support elasticsearch version 2.x

22 months agoSAMZA-1176: Intermittent TestJoinOperator unit test failure
Prateek Maheshwari [Wed, 19 Apr 2017 21:26:00 +0000 (14:26 -0700)] 
SAMZA-1176: Intermittent TestJoinOperator unit test failure

Join TTL was set too low (10 ms). `joinRetainsMatchedMessagesReverse` will fail if the execution time between line 176 and 186 is longer than that. Increased TTL to 1 min (will not affect test execution time).

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #131 from prateekm/join-test-fix

22 months agoSAMZA-1213 - StreamProcessorLifeCycleAware interface should not use processorId
Navina Ramesh [Wed, 19 Apr 2017 19:33:37 +0000 (12:33 -0700)] 
SAMZA-1213 - StreamProcessorLifeCycleAware interface should not use processorId

Refactoring LocalApplicationRunner s.t. each processor has its own listener instance, instead of a single listener keeping track of all processors.

Author: Navina Ramesh <navina@apache.org>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Xinyu Liu <xiliu@linkedin.com>

Closes #125 from navina/SAMZA-1213

22 months agoSAMZA-1209; Improve error handling in LocalStoreMonitor
Shanthoosh Venkataraman [Wed, 19 Apr 2017 06:18:13 +0000 (23:18 -0700)] 
SAMZA-1209; Improve error handling in LocalStoreMonitor

Changes:

1. Add opt-in configuration to continue garbage collection of local stores
   when there’s a failure in garbage collecting one local store.
2. Handle failures gracefully. In getJobModel, JobModel is expected as return response.
   Incase of failures, returned error-message Map is deserialized to JobModel
   resulting in ClassCastException. Handle 2xx, failures separately. Log error
   messages returned properly in httpGet call.
3. Fix getTasks url format.
4. Minor code cleanup(Switch to using ',' as seperator for job.status.servers configuration
   instead of '.' for ease of use).
5. Fix disabled tests in TestLocalStoreMonitor.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Closes #124 from shanthoosh/fix_config_format_in_localstore_monitor

22 months agoSAMZA-1217: Fix broken build due to TestCheckstyle problem.
Shanthoosh Venkataraman [Wed, 19 Apr 2017 04:23:39 +0000 (21:23 -0700)] 
SAMZA-1217: Fix broken build due to TestCheckstyle problem.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #130 from shanthoosh/master

22 months agoSAMZA-1157: Serialization/deserialization throwables should not be suppressed
Yi Pan (Data Infrastructure) [Wed, 19 Apr 2017 00:15:32 +0000 (17:15 -0700)] 
SAMZA-1157: Serialization/deserialization throwables should not be suppressed

…ppressed if user don't configure to drop those errors

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #128 from nickpan47/fix-serde-error-suppressed

22 months agoSAMZA-1215, SAMZA-1216: remove ProcessKiller and change the ZooKeeper connection...
Yi Pan (Data Infrastructure) [Wed, 19 Apr 2017 00:04:29 +0000 (17:04 -0700)] 
SAMZA-1215, SAMZA-1216: remove ProcessKiller and change the ZooKeeper connection string in test

… test to use 127.0.0.1 in connect string

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #129 from nickpan47/fix-some-tests

22 months agoSAMZA-1182 : Disable flaky tests in TestAsyncRunLoop
Shanthoosh Venkataraman [Tue, 18 Apr 2017 22:05:27 +0000 (15:05 -0700)] 
SAMZA-1182 : Disable flaky tests in TestAsyncRunLoop

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #126 from shanthoosh/disable_all_async_run_loop_tests

22 months agoSAMZA-1205; Disable flaky tests in TestJmxServer.
Shanthoosh Venkataraman [Mon, 17 Apr 2017 21:25:11 +0000 (14:25 -0700)] 
SAMZA-1205; Disable flaky tests in TestJmxServer.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org?

Closes #120 from shanthoosh/disable_jmx_server_tests

22 months agoSAMZA-1211; Remove Thread.sleep() from TestJoinOperator tests
Prateek Maheshwari [Mon, 17 Apr 2017 19:09:25 +0000 (12:09 -0700)] 
SAMZA-1211; Remove Thread.sleep() from TestJoinOperator tests

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #123 from prateekm/join-test-no-sleep

22 months agoSAMZA-1145: Provide Ability To Confgure The Default Number Of Changel…
James Lent [Fri, 14 Apr 2017 19:09:07 +0000 (12:09 -0700)] 
SAMZA-1145: Provide Ability To Confgure The Default Number Of Changel…

…og Replicas

Author: James Lent <jlent@nc.rr.com>

Reviewers: Yi Pan <nickpan47@apache.org>, Jagadish <vjagadish1989@gmail.com>

Closes #86 from jwlent55/SAMZA-1145

22 months agoSAMZA-1195 fix bug in Samza application master on Kerberos
Chen Song [Fri, 14 Apr 2017 19:04:08 +0000 (12:04 -0700)] 
SAMZA-1195 fix bug in Samza application master on Kerberos

Author: Chen Song <chens_cs@hotmail.com>

Reviewers: Yi Pan <nickpan47@apache.org>

Closes #119 from garlicbulb-puzhuo/SAMZA-1195

22 months agoSAMZA-1183: Fix flaky tests in TestAsyncRunLoop
Shanthoosh Venkataraman [Thu, 13 Apr 2017 22:44:13 +0000 (15:44 -0700)] 
SAMZA-1183: Fix flaky tests in TestAsyncRunLoop

Changes:

* Moved all states locally within the test methods(no global state, since tests can be run parallely).
* Increased the timeout waiting for all task threads to finish.
* Changed message chooser mock to return null at the end(When EOS message is used a final msg, during some runs task.process gets invoked more than once for it. This is due to test setup, other tests within this test class also do the same.)

How was it verified:

Ran the test class repeatedly to check for failures(there were none) using following script.

`
while [ $i -lt 30 ] do     sleep 3;     i=`expr $i + 1`;     ./gradlew :samza-core:clean :samza-core:test -Dtest.single=TestAsyncRunLoop  done
`

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #108 from shanthoosh/fix-async-run-loop-tests-1

22 months agoSAMZA-1132: LocalApplicationRunner for StreamApplication
Xinyu Liu [Thu, 13 Apr 2017 22:39:23 +0000 (15:39 -0700)] 
SAMZA-1132: LocalApplicationRunner for StreamApplication

LocalApplicationRunner runs the StreamApplication locally on every node that the application is deployed to. LocalRunner.start() is blocking until the process is killed or user invoke LocalRunner.stop() from another thread. It can be configured to use Zk for coordination.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #117 from xinyuiscool/SAMZA-1132

22 months agoSAMZA-1208 - IllegalFormatConversionException in LocalContainerRunner
Navina Ramesh [Thu, 13 Apr 2017 01:40:01 +0000 (18:40 -0700)] 
SAMZA-1208 - IllegalFormatConversionException in LocalContainerRunner

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #121 from navina/SAMZA-1208

22 months agoSAMZA-1170: Integration testing harness for StreamApplications
vjagadish1989 [Wed, 12 Apr 2017 00:56:54 +0000 (17:56 -0700)] 
SAMZA-1170: Integration testing harness for StreamApplications

- Added an integration testing harness for testing `StreamApplication`s. This is convenient for running and interacting with Kafka brokers, Zk servers and Samza components locally.
- Added the following integration tests that use this harness to test actual `StreamApplication`s:
1. Keyed Tumbling Window
2. Keyed Session Window
3. Repartition with Keyed Session Window

Bonus: A couple of additional bug-fixes that were blockers for this test.

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>, Jacob Maes<jmaes@linkedin.com>

Closes #96 from vjagadish1989/integration-tests

22 months agoSAMZA-1198: disable the flaky test TestZkBarrierForVersionUpgrade.testZkBarrierForVer...
Fred Ji [Mon, 10 Apr 2017 20:53:03 +0000 (13:53 -0700)] 
SAMZA-1198: disable the flaky test TestZkBarrierForVersionUpgrade.testZkBarrierForVersionUpgrade

We are seeing the fails sometimes from this test, disabling it for build success first. See details in SAMZA-1198, and the JIRA for fix is in SAMZA-1193.

Author: Fred Ji <fji@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #116 from fredji97/disableFlaky

22 months agoSamza-1187 : TestZkProcessorLatch tests share state causing transient failures in...
Navina Ramesh [Sat, 8 Apr 2017 00:00:28 +0000 (17:00 -0700)] 
Samza-1187 : TestZkProcessorLatch tests share state causing transient failures in CI builds

* Removed testSingleCountdown - didn't understand what was being tested
* The timeout behavior for ZkProcessorLatch has not been documented. Hence, I have fixed testLatchExpires as best I could understand.
* Removed redundant/unused member variables from ZkProcessorLatch

Author: navina <navina@apache.org>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #112 from navina/SAMZA-1187

22 months agoSAMZA-1126 - Semantics of processorId in Samza
Navina Ramesh [Fri, 7 Apr 2017 22:22:13 +0000 (15:22 -0700)] 
SAMZA-1126 - Semantics of processorId in Samza

Implementation based on [SEP-1](https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza)

Author: navina <navina@apache.org>

Reviewers: Yi Pan <nickpan47@gmail.com>, Jacob Maes <jmaes@linkedin.com>, Jagadish <jagadish@apache.org>

Closes #103 from navina/SAMZA-1126

22 months agoSAMZA-1194 - Commenting out testJmxReporter flaky test
Shanthoosh Venkataraman [Fri, 7 Apr 2017 22:17:10 +0000 (15:17 -0700)] 
SAMZA-1194 - Commenting out testJmxReporter flaky test

TestJmxReporter brings up an JmxServer and uses it to test the reporter behavior. However, due to failures in bringing up/connecting to JMX server, this fails intermittently. Temporarily disabling it.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #118 from shanthoosh/disable_jmx_reporter_tests

22 months agoSAMZA-1089: Enable YarnJob and ClientHelper to kill a job by name rather than YARN...
Jacob Maes [Fri, 7 Apr 2017 21:32:02 +0000 (14:32 -0700)] 
SAMZA-1089: Enable YarnJob and ClientHelper to kill a job by name rather than YARN ApplicationID

Missed a couple files in the previous commit to enable YarnJob to kill and get status of a Job based on the job name rather than the YARN ApplicationName.

These changes have been manually verified in a Yarn cluster at LI.

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #114 from jmakes/samza-1089-3

22 months agoSAMZA-1192: Fixed TestJoinOperator test failure on JDK 1.8.0_05
Prateek Maheshwari [Thu, 6 Apr 2017 22:35:39 +0000 (15:35 -0700)] 
SAMZA-1192: Fixed TestJoinOperator test failure on JDK 1.8.0_05

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #115 from prateekm/join-test-fix

22 months agoSAMZA-1178: Generate JSON from StreamPlan
Xinyu Liu [Thu, 6 Apr 2017 16:58:38 +0000 (09:58 -0700)] 
SAMZA-1178: Generate JSON from StreamPlan

As the first step to visualize the StreamGraph/Plan, this patch generates a json representation of it. For the example StreamGraph in TestPlanJsonGenerator, here is the json that is generated (https://github.com/xinyuiscool/examples/blob/master/example_plan.json). This patch also introduces the immutable StreamPlan and did some code cleanup.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #110 from xinyuiscool/SAMZA-1178

22 months agoSAMZA-1143; Include fs.<scheme>.impl.* subkeys to YarnConfiguration used in YarnJobFa...
Fred Ji [Thu, 6 Apr 2017 05:28:38 +0000 (22:28 -0700)] 
SAMZA-1143; Include fs.<scheme>.impl.* subkeys to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager

SAMZA-1143 Include fs.&lt;scheme&gt;.impl.* subkeys, in addition tofs.&lt;scheme&gt;.impl, to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager.

When there are additional subconfigurations under fs.myScheme.impl, such as fs.myScheme.impl.client, we need to keep the set of configuration completed in  YarnJobFactory and YarnClusterResourceManager. When the context is set for localizing the resource in ClientHelper and YarnContainerRunner, it may rely on this configuration to get the FileStatus information, which may or may not depends on fs.&lt;scheme&gt;.impl and all possible fs.&lt;scheme&gt;.impl.* sub-configuration.

This is an enhanced feature to the PR#90.

Author: Fred Ji <fji@linkedin.com>
Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #97 from fredji97/fsImplSubkeys

22 months agoSAMZA-1191; Fixed flaky test: TestExponentialSleepStrategy testThreadInterruptInRetryLoop
Prateek Maheshwari [Thu, 6 Apr 2017 05:26:09 +0000 (22:26 -0700)] 
SAMZA-1191; Fixed flaky test: TestExponentialSleepStrategy testThreadInterruptInRetryLoop

It's possible that the interruptee thread (see `#interruptedThread`) gets pre-empted before it has a chance to run the operation (increment `iterations`) and then gets interrupted, causing these assertions to fail.

I think these assertions also aren't critical for the tests which I presume want to test interrupt propagation behavior, so removing them in this change.

vjagadish1989 & xinyuiscool, please take a look.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #113 from prateekm/flaky-ess-test-fix

22 months agoSAMZA-1094 SAMZA-1101 SAMZA-1159; Remove MessageEnvelope from public operator APIs...
Prateek Maheshwari [Wed, 5 Apr 2017 23:42:15 +0000 (16:42 -0700)] 
SAMZA-1094 SAMZA-1101 SAMZA-1159; Remove MessageEnvelope from public operator APIs. : Delay the creation of SinkFunction for output streams. : Move StreamSpec from a public API to an internal class.

Removed the MessageEnvelope and OutputStream interfaces from public operator APIs.
Moved the creation of SinkFunction for output streams to SinkOperatorSpec.
Moved StreamSpec from a public API to an internal class.

Additionally,
1. Removed references to StreamGraph in OperatorSpecs. It was being used to getNextOpId(). MessageStreamsImpl now gets the ID and gives it to OperatorSpecs itself.
2. Updated and cleaned up the StreamGraphBuilder examples.
3. Renamed SinkOperatorSpec to OutputOperatorSpec since its used by sink, sendTo and partitionBy.

nickpan47 and xinyuiscool, please take a look.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #92 from prateekm/message-envelope-removal

22 months agoSAMZA-1189: Fix file system closed issue on hdfs system producer
Jacob Maes [Wed, 5 Apr 2017 03:37:32 +0000 (20:37 -0700)] 
SAMZA-1189: Fix file system closed issue on hdfs system producer

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #111 from jmakes/samza-1189

22 months agoSamza 1186: Rename Processor to Job
Xinyu Liu [Tue, 4 Apr 2017 01:24:33 +0000 (18:24 -0700)] 
Samza 1186: Rename Processor to Job

Now we have the top level Samza application, and each stage is called a job, the previous introduced "processor" naming should be renamed as Job. This includes renaming PrcessorGraph to JobGraph, and ProcessorNode to JobNode, etc.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #109 from xinyuiscool/SAMZA-1186

22 months agoSAMZA-1089: Runner should support kill and status commands
Jacob Maes [Mon, 3 Apr 2017 16:50:27 +0000 (09:50 -0700)] 
SAMZA-1089: Runner should support kill and status commands

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>,Xinyu Liu <xiliu@linkedin.com>

Closes #106 from jmakes/samza-1089-2

22 months agoSAMZA-1138; Yarn capability check is broken
Maksim Logvinenko [Sun, 2 Apr 2017 23:18:43 +0000 (16:18 -0700)] 
SAMZA-1138; Yarn capability check is broken

After migration from `yarn.container.*` properties to `cluster-manager.container.*` properties we have to use either of them and `ClusterManagerConfig` provides backward compatibility for these properties. But in `YarnClusterResourceManage`r only old properties are used (from `YarnConfig`), hence if job config migrated to new `cluster-manager.*` properties names then check will be evaluated against default values, not against actual values.

Author: Maksim Logvinenko <mlogvinenko@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #84 from logarithm/yarn-properties-fix

22 months agoSAMZA-1181: Fix AppMaster hang after submitting jobs to Yarn
Shanthoosh Venkataraman [Fri, 31 Mar 2017 22:58:22 +0000 (15:58 -0700)] 
SAMZA-1181: Fix AppMaster hang after submitting jobs to Yarn

Currently when a job is submitted to Yarn, it is going to hang after AppMaster is created. The log shows that it hangs during bootstrapping from Coordinator stream. Further debugging shows that the jobs hang in the second time of bootstrap while reading locality data from LocalityManager. The sequence is the following:
1. JobModelManager creates CoordinatorStreamConsumer, and bootstrap it,
2. LocalityManager writes locality info into coordinator stream
3. JobModelManager closes CoordinatorStreamConsumer
4. Later localityManager bootstraps CoordinatorStreamConsumer again
Step 3 is the problem here. Since CoordinatorStreamConsumer is still held by LocalityManager, it cannot be closed prematurely. Step 3 is introduced in SAMZA-1154, as a refactoring of JobModelManager for task rest end point. To fix this issue, we will revert this change of step 3.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #104 from shanthoosh/master

22 months agoSAMZA-1176; Make TestJoinOperator unit tests safe for concurrent execution
Prateek Maheshwari [Fri, 31 Mar 2017 21:00:30 +0000 (14:00 -0700)] 
SAMZA-1176; Make TestJoinOperator unit tests safe for concurrent execution

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #105 from prateekm/join-tests

22 months agoSAMZA-1182 - Commenting out some of the flaky tests
Navina Ramesh [Fri, 31 Mar 2017 20:48:35 +0000 (13:48 -0700)] 
SAMZA-1182 - Commenting out some of the flaky tests

Author: navina <navina@apache.org>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>, Jagadish Venkataraman <vjagadish1989@gmail.com>

Closes #107 from navina/SAMZA-1182

22 months ago SAMZA-1135 - Support scala 2.12
vjagadish1989 [Fri, 31 Mar 2017 17:26:08 +0000 (10:26 -0700)] 
SAMZA-1135 - Support scala 2.12

    Added support for scala 2.12

    Author: Maxim Logvinenko <mlogvinenko@gmail.com>

    Reviewers: Jagadish <jagadish@apache.org>,Prateek Maheshwari <prateekm@linkedin.com>

    Closes #82 from metamx:scala-2.12

22 months agoSAMZA-1175 - Removing CoordinationService from JobCoordinatorFactory interface
Navina Ramesh [Wed, 29 Mar 2017 23:25:16 +0000 (16:25 -0700)] 
SAMZA-1175 - Removing CoordinationService from JobCoordinatorFactory interface

Removing CoordinationService from JobCoordinatorFactory interface

Author: navina <navina@apache.org>

Reviewers: Xinyu Liu <xinyuliu.us@apache.org>,Boris Shkolnik <boryas@apache.org>

Closes #102 from navina/SAMZA-1175

22 months agoSAMZA-1161: Adding metrics into LocalStoreMonitor.
Shanthoosh Venkataraman [Wed, 29 Mar 2017 21:07:34 +0000 (14:07 -0700)] 
SAMZA-1161: Adding metrics into LocalStoreMonitor.

Rocksdb LocalStoreMonitor is responsible for clearing up unused local task partition stores. Metrics are required to understand the behavior of this monitor in production(especially when it's clearing up unused rocksdb local stores).

The following two metrics will be emitted from this monitor:
a) Total disk space cleared in bytes.
b) Total number of rocksdb task partition stores cleared.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #95 from shanthoosh/metrics_into_local_store_monitor

22 months agoSAMZA-1167: New streamId-specific configs do not override equivalent system-scoped...
Jacob Maes [Wed, 29 Mar 2017 20:57:26 +0000 (13:57 -0700)] 
SAMZA-1167: New streamId-specific configs do not override equivalent system-scoped configs

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #101 from jmakes/samza-1167

22 months agoFix an import issue in unit test
Xinyu Liu [Wed, 29 Mar 2017 20:43:10 +0000 (13:43 -0700)] 
Fix an import issue in unit test

22 months agoSAMZA-1172: Fix for the topological sort to handle single-node loop
Xinyu Liu [Wed, 29 Mar 2017 17:49:25 +0000 (10:49 -0700)] 
SAMZA-1172: Fix for the topological sort to handle single-node loop

In the processor graph, the topological sort missed adding to the visited set during graph traversal. This caused wrong graph being generated for single-node loop. This is fixed in the patch.

Also fixed the maxPartition method not handling empty collection correctly.

Added a few new unit tests for these. Also adjust the timing of previous async commit unit tests so it can run more reliably. Long term wise we need to fix the timer inside the AsyncRunLoop tests.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #100 from xinyuiscool/SAMZA-1172

22 months agoSAMZA-1151 - Coordination Service
Boris Shkolnik [Wed, 29 Mar 2017 17:44:49 +0000 (10:44 -0700)] 
SAMZA-1151 - Coordination Service

Author: Boris Shkolnik <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@apache.org>, Navina Ramesh <navina@apache.org>

Closes #91 from sborya/CoordinationService

22 months agoSAMZA-1171: Rewrite config in ApplicationRunnerMain when creating ApplicationRunner
Xinyu Liu [Tue, 28 Mar 2017 00:33:42 +0000 (17:33 -0700)] 
SAMZA-1171: Rewrite config in ApplicationRunnerMain when creating ApplicationRunner

The config needs to be rewritten before passing down to the ApplicationRunner. This is a bug that was introduced during some refactoring/cleanup of the config in the ApplicationRunner interface.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #98 from xinyuiscool/SAMZA-1171

22 months agoSAMZA-1143; Universal config support for localized resource
Fred Ji [Mon, 27 Mar 2017 17:34:12 +0000 (10:34 -0700)] 
SAMZA-1143; Universal config support for localized resource

More details in https://issues.apache.org/jira/browse/SAMZA-1143

Tests: ./gradlew clean check successful and all unit tests passed

Author: Fred Ji <fji@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #90 from fredji97/universalLocalizer

22 months agoSAMZA-1137: Instantiate ApplicationRunner in SamzaContainer
Xinyu Liu [Fri, 24 Mar 2017 20:49:30 +0000 (13:49 -0700)] 
SAMZA-1137: Instantiate ApplicationRunner in SamzaContainer

Create an ApplicationRunner in SamzaContainer to provide StreamSpecs for fluent API.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #94 from xinyuiscool/SAMZA-1137

22 months agoSAMZA-1134:Simplify barrier for zk version upgrade.
Boris Shkolnik [Thu, 23 Mar 2017 21:25:53 +0000 (14:25 -0700)] 
SAMZA-1134:Simplify barrier for zk version upgrade.

see SAMZA-1134 for details.

Author: Boris Shkolnik <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #81 from sborya/SimplifyBarrierTO

22 months agoSAMZA-1149; upgrade mockito from 1.8.4 to 1.10.19
Fred Ji [Wed, 22 Mar 2017 20:45:02 +0000 (13:45 -0700)] 
SAMZA-1149; upgrade mockito from 1.8.4 to 1.10.19

More details: https://issues.apache.org/jira/browse/SAMZA-1149
Test: gradlew clean check successfully. All unit tests passed.

Author: Fred Ji <fji@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Navina Ramesh <navina@apache.org>

Closes #89 from fredji97/mockitoUpgrade

22 months agosamza-1156: Improve ApplicationRunner method naming and class structure
Jacob Maes [Wed, 22 Mar 2017 14:59:08 +0000 (07:59 -0700)] 
samza-1156: Improve ApplicationRunner method naming and class structure

navina xinyuiscool nickpan47 take a look

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Closes #93 from jmakes/app-runner-class-hier

23 months agoSAMZA-1158 Adding monitor to clean up stale local stores of tasks
Shanthoosh Venkataraman [Mon, 20 Mar 2017 22:18:27 +0000 (15:18 -0700)] 
SAMZA-1158 Adding monitor to clean up stale local stores of tasks

23 months agoSAMZA-1154: Tasks endpoint to list the complete details of all tasks related to a job
Shanthoosh Venkataraman [Mon, 20 Mar 2017 17:43:29 +0000 (10:43 -0700)] 
SAMZA-1154: Tasks endpoint to list the complete details of all tasks related to a job

23 months agoSAMZA-1108: Implementation of Windows and various kinds of Triggers
vjagadish1989 [Sun, 19 Mar 2017 19:25:40 +0000 (12:25 -0700)] 
SAMZA-1108: Implementation of Windows and various kinds of Triggers

* Implemented various triggers and the orchestration logic of the window operator.
* Implemented wire-up of window and the flow of messages through various trigger implementations.
* Implementations for count, time, timeSinceFirst, timeSinceLast, Any, Repeating triggers.

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>, Prateek Maheshwari <pmaheshw@linkedin.com>, Chris Pettitt <cpettitt@linkedin.com>

Closes #66 from vjagadish1989/window-impl

23 months agoSAMZA-1093: instantiating StreamOperatorTask in SamzaContainer
Yi Pan (Data Infrastructure) [Sat, 18 Mar 2017 07:28:41 +0000 (00:28 -0700)] 
SAMZA-1093: instantiating StreamOperatorTask in SamzaContainer

SAMZA-1093: Instantiating StreamOperatorTask in SamzaContainer

- minor refactor to get the TaskFactory based on task class configuration
- add instantiation of StreamOperatorTask in SamzaContainer

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Xinyu Liu <xinyu@apache.org>, Navina Ramesh <nramesh@linkedin.com>, vjagadish1989 <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>, Chris Pettitt <cpettitt@linkedin.com>

Closes #65 from nickpan47/SAMZA-1093 and squashes the following commits:

9daf451 [Yi Pan (Data Infrastructure)] SAMZA-1093: added javadoc for TaskFactoryUtil class
078fee3 [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
e7be967 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review comments
b625beb [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
df27dc4 [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
91b1ba8 [Yi Pan (Data Infrastructure)] SAMZA-1093: fix merge errors. Removing 4 renamed old files
12a1529 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review feedbacks. Opened SAMZA-1137 for ApplicationRunner initialization in SamzaContainer
ac08d55 [Yi Pan (Data Infrastructure)] WIP: SAMZA-1093: rebase w/ ApplicationRunner patch.
9e1e1f6 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review feedbacks.
1c5158d [Yi Pan (Data Infrastructure)] SAMZA-1093: code ready, more unit tests pending
c613334 [Yi Pan (Data Infrastructure)] SAMZA-1093: fix import
6404964 [Yi Pan (Data Infrastructure)] SAMZA-1093: Instantiating StreamOperatorTask in SamzaContainer
b932454 [Yi Pan (Data Infrastructure)] Merge branch 'master' into old-SAMZA-1093
82b839f [Yi Pan (Data Infrastructure)] SAMZA-1093: instantiating StreamOperatorTask in SamzaContainer (WIP)

23 months agoSAMZA-1131: RemoteApplicationRunner for cluster-based Samza applications
Xinyu Liu [Fri, 17 Mar 2017 18:54:57 +0000 (11:54 -0700)] 
SAMZA-1131: RemoteApplicationRunner for cluster-based Samza applications

RemoteApplicationRunner starts the Samza StreamApplication on the remote cluster, e.g. Yarn. It uses ExecutionPlanner for planning physical execution, and JobRunner to start each stage of the application.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #88 from xinyuiscool/SAMZA-1131

23 months agoSAMZA-1140; Non blocking commit in Async Runloop
Shanthoosh Venkataraman [Fri, 17 Mar 2017 18:46:34 +0000 (11:46 -0700)] 
SAMZA-1140; Non blocking commit in Async Runloop

Adds non blocking commit into AsyncRunLoop. Clients can enable it by setting an opt-in config task.async.commit (which is disabled by default). Contains the following list of changes for AsyncStreamTask type of tasks.

a) Enable commit for a task,  when there're uncompleted callbacks from a task. Progressive commit of finished callbacks, when there are messages in flight for a task.
b) Enable processing of messages for a task, when there're commits in progress from a task.

Documentation for this change will be done in a separate PR.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #85 from shanthoosh/asyncCommitSupport

23 months agoSAMZA-1146; TaskCallbackManager commit fix.
Shanthoosh Venkataraman [Fri, 17 Mar 2017 18:42:34 +0000 (11:42 -0700)] 
SAMZA-1146; TaskCallbackManager commit fix.

Each task callback in samza belongs to different SystemStreamPartition. When multiple callbacks in contagious order are available for commit, callback with highest sequence number is chosen for commit. This will prevent checkpointing of completed callbacks that has commit request and doesn't have highest sequence number. Upon task restart this will lead to duplicate reprocessing of already processed messages (since completed callbacks for some SystemStreamPartition's aren't committed earlier).

This PR fixes it and commits all completed callbacks that has commit request defined. Added a test to verify the behavior.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>
Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>
Author: vjagadish1989 <jvenkatr@linkedin.com>
Author: Boris Shkolnik <boryas@apache.org>
Author: Prateek Maheshwari <pmaheshw@linkedin.com>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Chen Song <csong@appnexus.com>
Author: Tommy Becker <tobecker@tivo.com>
Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #87 from shanthoosh/Fixing_CallBackManager_Commit

23 months agoResolve unit test compilation issue due to conflicting commits at the same time
Xinyu Liu [Thu, 16 Mar 2017 22:18:37 +0000 (15:18 -0700)] 
Resolve unit test compilation issue due to conflicting commits at the same time

23 months agoSAMZA-1067; Physical execution graph and planner for fluent API
Xinyu Liu [Thu, 16 Mar 2017 01:27:01 +0000 (18:27 -0700)] 
SAMZA-1067; Physical execution graph and planner for fluent API

Initial commit for the physical graph and plan. Design is there: https://issues.apache.org/jira/secure/attachment/12856670/SAMZA-1067.0.pdf.

The commit includes:

1) Physical ProcessorGraph, where each processor represents a physical execution unit (e.g. a job in Yarn).
2) A planner does the following:
   - create ProcessorGraph from StreamGraph. For this phase, the graph only contains a single node (single stage);
   - figure out the partitions of intermediate topics
   - create the topics

Please note currently the planner is used in the remote runner for now. Further changes/refactoring/cleanup are expected to be integrated with local runner.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jagadish Venkatraman <jvenkatraman@linkedin.com>

Closes #75 from xinyuiscool/SAMZA-1067

23 months agoFix an import issue on TestJoinOperator
vjagadish1989 [Tue, 14 Mar 2017 21:00:19 +0000 (14:00 -0700)] 
Fix an import issue on TestJoinOperator

23 months agoSAMZA-1091; Implement key-based inner join operator with no time constraints
Prateek Maheshwari [Tue, 14 Mar 2017 20:44:54 +0000 (13:44 -0700)] 
SAMZA-1091; Implement key-based inner join operator with no time constraints

Author: Prateek Maheshwari <pmaheshw@linkedin.com>
Author: Prateek Maheshwari <prateekm@utexas.edu>

Reviewers: Jagadish Venkatraman <jagadish@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #60 from prateekm/master

23 months agoSAMZA-1100; Exception when using a stream as both bootstrap and broadcast.
Shanthoosh Venkataraman [Mon, 13 Mar 2017 19:07:45 +0000 (12:07 -0700)] 
SAMZA-1100; Exception when using a stream as both bootstrap and broadcast.

When a task input stream is used as both broadcast and bootstrap stream in a samza job, Bootstrappingchooser marks the stream as bootstrapped when a single task finishes consuming all the SystemStreamPartitions(This happens when all the starting offset for each partition in the input stream is of type upcoming). This patch fixes this, by marking a stream as bootstrapped, only when all the systemStreamPartitions in a input stream is consumed by all the expected tasks.

More details here : https://issues.apache.org/jira/browse/SAMZA-1100

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>

Closes #68 from shanthoosh/master

23 months agoSAMZA-1112; BrokerProxy does not log fatal errors
Tommy Becker [Sat, 11 Mar 2017 14:56:48 +0000 (06:56 -0800)] 
SAMZA-1112; BrokerProxy does not log fatal errors

Add an UncaughtExceptionHandler to the broker proxy thread so
failures there get logged.

Author: Tommy Becker <tobecker@tivo.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #80 from twbecker/SAMZA-1112

23 months agoSAMZA-1123; Create intermediate stream in partitionBy() operator
Xinyu Liu [Fri, 10 Mar 2017 22:08:18 +0000 (14:08 -0800)] 
SAMZA-1123; Create intermediate stream in partitionBy() operator

For partitionBy() operator, Samza generates an intermediate stream with id based on operator name and id, and system based on config. The intermediate streams will be materialized later by different execution environments. For example, if the intermediate stream is a Kafka stream, the topic will be created before the application starts.

Also renamed the config from "job.runner.class" to "app.runner.class".

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #79 from xinyuiscool/SAMZA-1123

23 months agoSAMZA-1124; Job coordinator with time out
Boris Shkolnik [Thu, 9 Mar 2017 01:16:51 +0000 (17:16 -0800)] 
SAMZA-1124; Job coordinator with time out

If a processor doesn't join the barrier for the TimeOut time - the barrier is cancelled. All the processor should unsubscribe from it.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <boryas@apache.org>
Author: navina <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #77 from sborya/JobCoordinatorWithTO and squashes the following commits:

ed055dd [Boris Shkolnik] checkstyle
1468902 [Boris Shkolnik] renambed a method
579d2e7 [Boris Shkolnik] Merge branch 'JobCoordinator' into JobCoordinatorWithTO
54ab688 [Boris Shkolnik] removed JavaJobConfig.java
3f95d46 [Boris Shkolnik] checkstyle
75f9a94 [Boris Shkolnik] added test for time out
b73ba32 [Boris Shkolnik] merge + test
5d67be0 [Boris Shkolnik] removed extra empty lines
9cf3c3e [Boris Shkolnik] addressed review comments
c47031d [Boris Shkolnik] added timeout for ZkBarrierForVersionUpgrade
93a55a9 [Boris Shkolnik] use write directly in the barrier when all joined
f89a037 [Boris Shkolnik] addressed some notes
4219720 [Boris Shkolnik] added JavaJobConfig
659ae7c [Boris Shkolnik] add ZkJobCoordinatorFactory
520a083 [Boris Shkolnik] added ZKJobCoordinatorFactory
a42218e [Boris Shkolnik] checkstyle
02ca658 [Boris Shkolnik] typo
c1fd0b2 [Boris Shkolnik] merge cleanup
1c8eef4 [Boris Shkolnik] merge cleanup
a7e014a [Boris Shkolnik] merged
b4a0642 [Boris Shkolnik] typo
33bacdd [Boris Shkolnik] merge
d1582a9 [Boris Shkolnik] missed method name change
4c481a2 [Boris Shkolnik] merge
b811f3d [Boris Shkolnik] changed to private final
5d43419 [Boris Shkolnik] addressed review comments
1a0c54d [Boris Shkolnik] fixed test
e19d77b [Boris Shkolnik] renamed method
0c5edab [Boris Shkolnik] makey tryBecomeALeader async
b34f6b7 [Boris Shkolnik] added java doc
c2305b6 [Boris Shkolnik] cleanup
f83fc57 [Boris Shkolnik] merge
f8c8a6d [Boris Shkolnik] make a smaller PR for publish functionality only
251aad7 [Boris Shkolnik] removed unneeded interface for real
9892dee [Boris Shkolnik] removed unneeded interface
f20d15f [Boris Shkolnik] some updates to JobCoordinator
bd53c07 [Boris Shkolnik] deleteing already committed files
18198d1 [Boris Shkolnik] added test for zk barrier
9dba992 [Boris Shkolnik] moved the Test in to test subdir
c16d864 [Boris Shkolnik] added test
817a7b6 [Boris Shkolnik] merge complete
1e5947f [Boris Shkolnik] merged
6506b48 [Boris Shkolnik] merged with latest
4290b13 [Boris Shkolnik] merged
43eb076 [Boris Shkolnik] checkstyle errors
e0c44fe [Boris Shkolnik] merged
8e8d833 [Boris Shkolnik] merge
e59d38c [Boris Shkolnik] merge with ZkController
6a71cf6 [Boris Shkolnik] renamed the listners
6cbcf6e [Boris Shkolnik] review comments
efbee84 [Boris Shkolnik] Checkstyle errors
132300c [Boris Shkolnik] converted ZkLeaderElector.tryBecomeLeader to async method
13a05d7 [Boris Shkolnik] merge
2d59e0c [Boris Shkolnik] merge
82c819b [Boris Shkolnik] merge
7ebe9a6 [Boris Shkolnik] check style
41b2e46 [Boris Shkolnik] refactoring to match the new ZkUtils constructor
b07d63a [Boris Shkolnik] merge JavaJobConfig
bdc953b [Boris Shkolnik] merged
4301372 [Boris Shkolnik] added tests
4d48d83 [Boris Shkolnik] merge
c9bb475 [Boris Shkolnik] added tests for ZkUtils
592e9bb [Boris Shkolnik] added missing functionality for ZkControllerImpl into zkUtils and zkKeyBuilder
b473a6e [Boris Shkolnik] Added the new file ZkControllerListener.java
3412ed4 [Boris Shkolnik] Renamed ZkListener to ZkControllerListener
fabddc9 [Boris Shkolnik] merge
ad9108a [Boris Shkolnik] merge with ScheduleAfterDebounceTime
ba583d6 [Boris Shkolnik] merge
3d6b993 [Boris Shkolnik] cleaned up
fe69e70 [Boris Shkolnik] merge
7f8125b [Boris Shkolnik] Merge branch 'master' into ZkTestUtils
eaf04bb [Boris Shkolnik] added more comments
358ae6b [Boris Shkolnik] Added test and addressed review comments
9b22eb6 [Boris Shkolnik] JobModelPublish
0ef90b6 [Boris Shkolnik] Merge branch 'JobModel' into JobModelPublish
9c59048 [Boris Shkolnik] merge
017fe79 [Boris Shkolnik] added tests
5c8aa20 [Boris Shkolnik] JobModel Generation using SimpleGroupByContainerCount
2c841e1 [Boris Shkolnik] added awaitStart
eeb69ca [Boris Shkolnik] Merge branch 'ZkBarrier' into JobCoordinator
dc26bd2 [Boris Shkolnik] added BarrierForVersionUpgrade
cfdb4c7 [Boris Shkolnik] Merge branch 'ZkController' into ZkBarrier
b28ba14 [Boris Shkolnik] ZkBarrier
c8d26ba [Boris Shkolnik] merged
efc4d03 [Boris Shkolnik] ZkController
c9b3fe4 [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime' into ZkController
f0cae7b [Boris Shkolnik] Merge branch 'LeaderElector' into ZkController
3df0def [Boris Shkolnik] ZkControllerImpl
4801613 [Boris Shkolnik] cleanup
d32045b [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime1' into JobCoordinator
32f96b4 [Boris Shkolnik] added ScheduleAfterDebounce
ce83409 [Boris Shkolnik] cleanup
d7a7ccb [Boris Shkolnik] Merge branch 'LeaderElector' into ScheduleAfterDebounceTime
9c6b20a [Boris Shkolnik] added ZkListener
5f867c0 [Boris Shkolnik] added Apache license info
d0687b9 [Boris Shkolnik] Merge branch 'LeaderElector' into JobCoordinator
372829f [Boris Shkolnik] Merge branch 'master' into JobCoordinator
14db43d [Boris Shkolnik] added TestZkStreamProcessor - main manual test
b3b27c6 [Boris Shkolnik] Merge branch 'LeaderElector' into TestZkStreamProcessor
6649c80 [navina] Fixing ZkUtils close(). No need to close underlying connection explicitly
7f17e26 [Boris Shkolnik] ZkTestUtils
f904cd3 [Boris Shkolnik] ScheduleAfterDebounceTime
d126b10 [Boris Shkolnik] ScheduleAfterDebounceTime
63d8d60 [Boris Shkolnik] JavaJobConfig
ff15501 [Boris Shkolnik] JavaJobConfig
d20bacf [Boris Shkolnik] ZkTestUtils
737eb2f [Boris Shkolnik] ZkTestUtils
8e2d6c1 [Boris Shkolnik] add main manual test
7a47f84 [Boris Shkolnik] add main manual test
fa2186b [Boris Shkolnik] added ZkController
a0a7409 [Boris Shkolnik] Merge branch 'master' of https://github.com/navina/samza
edda60d [navina] Removing an unintended change to the grouper
6dd6b8d [navina] Adding tests for ZkLeaderElector
1734f8f [navina] Adding tests for ZkUtils
317cf16 [navina] Adding tests for ZkKeyBuilder
aaaf24e [navina] Adding EmbeddedZookeeper for testing
37c2c8b [navina] Extracting files related to LeaderElection
76b5167 [Boris Shkolnik] added new line at then end

23 months agoSAMZA-1121; StreamAppender should not propagate exceptions to the caller
Prateek Maheshwari [Wed, 8 Mar 2017 08:26:46 +0000 (00:26 -0800)] 
SAMZA-1121; StreamAppender should not propagate exceptions to the caller

StreamAppender#append currently propagates any exceptions while sending messages to the underlying logging system to the calling code. Since users don't expect log statements to throw exceptions, this can cause unexpected failures scenarios. We should catch exceptions and log to stderr instead.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #78 from prateekm/stream-appender-fix

23 months agoSAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner
Xinyu Liu [Tue, 7 Mar 2017 23:02:30 +0000 (15:02 -0800)] 
SAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner

Some refactoring/cleanup:

- rename ExecutionEnvironment to ApplicationRunner, including all the subclasses.
- rename the package to be org.apache.samza.runtime
- rename the StandalondApplicationRunner to be LocalApplicationRunner

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #76 from xinyuiscool/SAMZA-1122 and squashes the following commits:

cff5206 [Xinyu Liu] Merge branch 'SAMZA-1122' of https://github.com/xinyuiscool/samza into SAMZA-1122
c341d3d [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner
6a71205 [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner

23 months agoSAMZA-1096: StreamSpec constructors in the ExecutionEnvironments
Jacob Maes [Mon, 6 Mar 2017 22:52:18 +0000 (14:52 -0800)] 
SAMZA-1096: StreamSpec constructors in the ExecutionEnvironments

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>,Xinyu Liu <xiliu@linkedin.com>,Navina Ramesh <navina@apache.org>

Closes #74 from jmakes/samza-1096

23 months agoSAMZA-1107:Job model publish
Boris Shkolnik [Wed, 1 Mar 2017 21:49:29 +0000 (13:49 -0800)] 
SAMZA-1107:Job model publish

add utils for publishing job model and job model version to ZK.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <boryas@apache.org>
Author: navina <navina@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>, Fred Ji <fredji97@yahoo.com>

Closes #67 from sborya/JobModelPublish1

23 months agoSAMZA-1103: ZkBarrier
Boris Shkolnik [Wed, 1 Mar 2017 01:56:50 +0000 (17:56 -0800)] 
SAMZA-1103: ZkBarrier

SAMZA-1103: Barrier for JobModel upgrades. When all the processors got notification about the new JobModel, only after that they can start using the new model.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <boryas@apache.org>
Author: navina <navina@apache.org>

Reviewers: Fred Ji <fredji97@yahoo.com>, Navina Ramesh <navina@apache.org>, Xiliu Liu <xiliu@linkedin.com>

Closes #61 from sborya/ZkBarrier

23 months agoFix a rendering issue in the Samza security web-page
vjagadish1989 [Tue, 28 Feb 2017 01:38:17 +0000 (17:38 -0800)] 
Fix a rendering issue in the Samza security web-page

23 months agoSAMZA-1104; fix yarn security page link from index.html page
Chen Song [Tue, 28 Feb 2017 01:31:29 +0000 (17:31 -0800)] 
SAMZA-1104; fix yarn security page link from index.html page

Author: Chen Song <csong@appnexus.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #62 from garlicbulb-puzhuo/SAMZA-1104

23 months agoFixing checkstyle error in StreamGraphImpl causing build failures
navina [Sat, 25 Feb 2017 02:17:29 +0000 (18:17 -0800)] 
Fixing checkstyle error in StreamGraphImpl causing build failures

23 months agoSAMZA-1102: Zk controller
Boris Shkolnik [Thu, 23 Feb 2017 22:02:05 +0000 (14:02 -0800)] 
SAMZA-1102: Zk controller

SAMZA-1102: Added ZKController and ZkControllerImpl

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: navina <navina@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>, Fred Ji <fji@apache.org>, Xinyu Liu <xiliu@linkedin.com>

Closes #50 from sborya/ZkController

23 months agoSAMZA-1092: replace stream spec in fluent API
Yi Pan (Data Infrastructure) [Thu, 23 Feb 2017 20:48:56 +0000 (12:48 -0800)] 
SAMZA-1092: replace stream spec in fluent API

Replaced the StreamSpec class w/ the new one from SAMZA-1075.

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #58 from nickpan47/replace-stream-spec and squashes the following commits:

761ebb5 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class in system package
df953c2 [Yi Pan (Data Infrastructure)] SAMZA-1092: fix unit test
71331d8 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class
2fb19e9 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class in fluent API
ed3ad8e [Yi Pan (Data Infrastructure)] WIP: replace stream spec in fluent API

23 months agoSAMZA-1099: Documentation updates for Samza 0.12 release (for master branch)
vjagadish1989 [Wed, 22 Feb 2017 18:53:12 +0000 (10:53 -0800)] 
SAMZA-1099: Documentation updates for Samza 0.12 release (for master  branch)

23 months agoSAMZA-1097: update master branch to use 0.13.0-SNAPSHOT version
Yi Pan (Data Infrastructure) [Wed, 22 Feb 2017 01:16:58 +0000 (17:16 -0800)] 
SAMZA-1097: update master branch to use 0.13.0-SNAPSHOT version

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: jagadish <jvenkatr@linkedin.com>

Closes #59 from nickpan47/SAMZA-1097

23 months agoSAMZA-1075: fix partitionCount assertion from PR53
Jacob Maes [Wed, 22 Feb 2017 00:49:37 +0000 (16:49 -0800)] 
SAMZA-1075: fix partitionCount assertion from PR53

nickpan47 here's the fix for the issue you found in PR53

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #57 from jmakes/samza-1075-2

2 years agoFix hyphens in url for committer instructions
Jacob Maes [Fri, 17 Feb 2017 23:37:22 +0000 (15:37 -0800)] 
Fix hyphens in url for committer instructions

2 years agoSAMZA-1075: Refactor SystemAdmin Interface to expose a common method to create streams
Jacob Maes [Fri, 17 Feb 2017 20:49:19 +0000 (12:49 -0800)] 
SAMZA-1075: Refactor SystemAdmin Interface to expose a common method to create streams

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Closes #53 from jmakes/samza-1075

2 years agoSAMZA-1073: moving all operator classes into samza-core
Yi Pan (Data Infrastructure) [Thu, 16 Feb 2017 23:04:01 +0000 (15:04 -0800)] 
SAMZA-1073: moving all operator classes into samza-core

2 years agoSAMZA-1086; New Grouper for ZK based standalone.
Boris Shkolnik [Thu, 16 Feb 2017 19:40:12 +0000 (11:40 -0800)] 
SAMZA-1086; New Grouper for ZK based standalone.

SAMZA-1086.
Create new grouper with support for arbitrary container ids.
Add support for this list of container IDs in the JobModelManager.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Fred Ji <fredji97@yahoo.com>

Closes #52 from sborya/JobModel

2 years agoSAMZA-1073: top-level fluent API
Yi Pan (Data Infrastructure) [Thu, 16 Feb 2017 18:18:09 +0000 (10:18 -0800)] 
SAMZA-1073: top-level fluent API

`Initial draft of top-level fluent API for operator DAGs

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Jacob Maes <jmaes@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #51 from nickpan47/samza-fluent-api-v1 and squashes the following commits:

001be63 [Yi Pan (Data Infrastructure)] SAMZA-1073: Addressing review feedbacks. Change StreamGraphFactory to StreamGraphBuilder.
373048a [Yi Pan (Data Infrastructure)] SAMZA-1073: top-level fluent API `

2 years agoSAMZA-1087: Schedule after debounce time
Boris Shkolnik [Thu, 16 Feb 2017 01:17:01 +0000 (17:17 -0800)] 
SAMZA-1087: Schedule after debounce time

SAMZA-1087: Allows scheduling an action (a Runnable) after some de-bounce delay.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <boryas@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>, Fred Ji <fji@linkedin.com>

Closes #49 from sborya/ScheduleAfterDebounceTime1

2 years agoSAMZA-1082 : Implement Leader Election using ZK
navina [Mon, 13 Feb 2017 21:32:06 +0000 (13:32 -0800)] 
SAMZA-1082 : Implement Leader Election using ZK

Simple implementation of leader election recipe along with unit tests

Author: navina <navina@apache.org>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>, Fred Ji <fji@linkedin.com>

Closes #48 from navina/LeaderElector

2 years agoSAMZA-1083 Do not load task stores which are older than delete tombstones during...
Shanthoosh Venkataraman [Thu, 9 Feb 2017 00:07:27 +0000 (16:07 -0800)] 
SAMZA-1083 Do not load task stores which are older than delete tombstones during container startup

2 years agoFix integration test config issue in 0.12 release candidate release-0.12.0-rc2
vjagadish1989 [Mon, 6 Feb 2017 23:39:56 +0000 (15:39 -0800)] 
Fix integration test config issue in 0.12 release candidate

2 years agoFix javadoc issues introduced by Hdfs consumer release-0.12.0-rc1
Xinyu Liu [Thu, 2 Feb 2017 22:31:23 +0000 (14:31 -0800)] 
Fix javadoc issues introduced by Hdfs consumer

2 years agoUpdate Samza version to prepare for 0.12.0 Samza release
vjagadish1989 [Thu, 2 Feb 2017 22:09:40 +0000 (14:09 -0800)] 
Update Samza version to prepare for 0.12.0 Samza release

2 years agoAdd release PGP keys for Jagadish <jagadish@apache.org> release-0.12.0-rc0
vjagadish1989 [Wed, 1 Feb 2017 21:37:23 +0000 (13:37 -0800)] 
Add release PGP keys for Jagadish <jagadish@apache.org>

2 years agoUpdate README.md to point to the most recent hello-samza release
vjagadish1989 [Tue, 31 Jan 2017 23:36:18 +0000 (15:36 -0800)] 
Update README.md to point to the most recent hello-samza release