samza.git
14 months agoSAMZA-1268: Javadoc cleanup for public APIs for 0.13 release
Prateek Maheshwari [Tue, 9 May 2017 02:24:03 +0000 (19:24 -0700)] 
SAMZA-1268: Javadoc cleanup for public APIs for 0.13 release

This PR cleans up javadocs for Fluent API classes in the samza-api module.
Also updates the TaskContext (existing) and ContextManager (new) interfaces to add support for setting an pass-through user-defined context.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #169 from prateekm/api-docs-cleanup

14 months agoSAMZA-1150 : Handling Error propagation between ZkJobCoordinator & DebounceTimer
Navina Ramesh [Tue, 9 May 2017 00:58:55 +0000 (17:58 -0700)] 
SAMZA-1150 : Handling Error propagation between ZkJobCoordinator & DebounceTimer

This PR depends on PR #153
* Treats all errors in jobcoordinator as FATAL and shuts-down the streamprocessor
* [Bug] Fixed bug reported in SAMZA-1241
* Introduced a callback to be associated with the timer (same callback for every Runnable failure)

**TBD**: some more unit tests

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@apache.org>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #166 from navina/SAMZA-1150

14 months agoSAMZA-1206; Fix TestJMXServer.
Shanthoosh Venkataraman [Tue, 9 May 2017 00:22:14 +0000 (17:22 -0700)] 
SAMZA-1206; Fix TestJMXServer.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh<navina@apache.org>

Closes #164 from shanthoosh/fix-test-jmx-server-1

14 months agoSAMZA-1196; Fix TestJmxReporter
Shanthoosh Venkataraman [Tue, 9 May 2017 00:19:36 +0000 (17:19 -0700)] 
SAMZA-1196; Fix TestJmxReporter

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #140 from shanthoosh/fix-test-jmx-reporter

14 months agoSAMZA-1232; Log configuration value in RunLoopFactory
Shanthoosh Venkataraman [Tue, 9 May 2017 00:18:38 +0000 (17:18 -0700)] 
SAMZA-1232; Log configuration value in RunLoopFactory

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #137 from shanthoosh/adding_logging_into_asyncrunloop

14 months agoDisabled a few flaky tests and added corresponding tickets to fix.
Prateek Maheshwari [Mon, 8 May 2017 20:29:23 +0000 (13:29 -0700)] 
Disabled a few flaky tests and added corresponding tickets to fix.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #171 from prateekm/disable-flaky-test

14 months agoSAMZA-1267: ApplicationRunner#getLocalRunner returns null
Xinyu Liu [Mon, 8 May 2017 16:39:43 +0000 (09:39 -0700)] 
SAMZA-1267: ApplicationRunner#getLocalRunner returns null

Remove ApplicationRunner#getLocalRunner and clean up any usage examples.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #168 from xinyuiscool/SAMZA-1267

14 months agoSAMZA-1155; Validate users configure window.ms when using the fluent API
vjagadish1989 [Sat, 6 May 2017 02:21:29 +0000 (19:21 -0700)] 
SAMZA-1155; Validate users configure window.ms when using the fluent API

We will compute triggering duration as follows:
- If user configures `task.window.ms` we will honor it as the triggering duration
- If not, we will use the `GCD(windowTriggerDurations, joinTTLs)` as the triggering duration.

Changes in this PR:
- Common Interface for all time based triggers
- Additional APIs in `StreamGraphImpl` to recursively traverse all `OperatorSpec`s
- Recursive computation of `triggerInterval` for each `WindowOperatorSpec`
- Tests for all the above

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Jacob Maes <jmaes@linkedin.com>, Xinyu Liu <xinyu@apache.org>

Closes #160 from vjagadish1989/samza-1155

14 months agoSAMZA-1266: Unable to use MetricsSnapshotReporterFactory with fluent API
Jacob Maes [Fri, 5 May 2017 22:48:32 +0000 (15:48 -0700)] 
SAMZA-1266: Unable to use MetricsSnapshotReporterFactory with fluent API

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Navina Ramesh <navina@apache.org>

Closes #167 from jmakes/samza-1266

14 months agoSAMZA-1251 - Remove DebounceTimer dependency from ZkLeaderElector & ZkController
Navina Ramesh [Fri, 5 May 2017 20:50:08 +0000 (13:50 -0700)] 
SAMZA-1251 - Remove DebounceTimer dependency from ZkLeaderElector & ZkController

Addresses the following:
* Makes LeaderElectionListener to be explicitly registered by the caller
* Removes debouncetimer dependency from ZkLeaderElector implementation
* [Bug] onBecomeLeader was scheduling a task in timer under "OnBecomeLeader", when it should actually be the same as "OnProcessorChange". Otherwise, it will not cancel when there is a new OnProcessorChange event.
* [Transient Test Failure] `TestScheduleAfterDebounceTime` tests were relying on timing controlled by sleep. Fixed it by using latch

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Boris Shkolnik <boryas@apache.org>

Closes #153 from navina/SAMZA-1251

14 months agoSAMZA-1228 : StreamProcessor should stop JmxServer
Navina Ramesh [Fri, 5 May 2017 20:22:35 +0000 (13:22 -0700)] 
SAMZA-1228 : StreamProcessor should stop JmxServer

This is not the solution posted in SAMZA-1228. For now, we are moving jmxserver lifecycle to be within the container. Ideally, it should be within the Streamprocessor so that the job coordinator can also be associated with the same instance.

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #162 from navina/SAMZA-1228

14 months agoSAMZA-1222: Clean up LocalApplicationRunner
Xinyu Liu [Thu, 4 May 2017 22:17:56 +0000 (15:17 -0700)] 
SAMZA-1222: Clean up LocalApplicationRunner

Clean up the LocalApplicationRunner based on the further feedback. The changes include the following:
1. Remove the processorId from the JobCoordinatorFactory/JobCoordinator interfaces
2. LocalApplicationRunner.run() is non-blocking. Add LocalApplicationRunner.waitForFinish() for blocking for completion
3. Remove the config for CooridnatorServiceFactory, and now the CoordinatorService is created based on the type of JobCoordinator.
4. Clean up the StreamProcessor life cycle listener logic inside LocalApplicationRunner.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #135 from xinyuiscool/SAMZA-1222

14 months agoSAMZA-1263: Samza Fluent: NPE is streams.id.samza.system is missing
Jacob Maes [Thu, 4 May 2017 20:57:50 +0000 (13:57 -0700)] 
SAMZA-1263: Samza Fluent: NPE is streams.id.samza.system is missing

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #165 from jmakes/samza-1263

14 months agoSAMZA-1247: MessageStreamImpl#merge shouldn't mutate input collection
Prateek Maheshwari [Thu, 4 May 2017 19:28:58 +0000 (12:28 -0700)] 
SAMZA-1247: MessageStreamImpl#merge shouldn't mutate input collection

Also fixes
SAMZA-1253: MessageStream.merge operator broken for nested types

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>, Jagadish <jvenkatr@linkedin.com>

Closes #159 from prateekm/merge-fixes

14 months agoSAMZA-1224 : Revert job coordinator factory config to the old format
Navina Ramesh [Thu, 4 May 2017 06:17:59 +0000 (23:17 -0700)] 
SAMZA-1224 : Revert job coordinator factory config to the old format

We didn't release since adding this config. So, it is ok to change the format now.

Author: nramesh <nramesh@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #146 from navina/SAMZA-1224

14 months agoSAMZA-1259: LocalApplicationRunner throws exception when configured with a ProcessorI...
Sean McCauliff [Thu, 4 May 2017 05:58:51 +0000 (22:58 -0700)] 
SAMZA-1259: LocalApplicationRunner throws exception when configured with a ProcessorIdGenerator

Minor fix.  Unit tests were updated and passed.

Author: Sean McCauliff <smccauliff@linkedin.com>

Reviewers: Navina Ramesh <nramesh@linkedin.com>

Closes #161 from smccauliff/samza-1259

14 months agoSAMZA-1257: make sure a dataChange listener in LeaderElection is always initialized.
Boris Shkolnik [Thu, 4 May 2017 00:03:08 +0000 (17:03 -0700)] 
SAMZA-1257: make sure a dataChange listener in LeaderElection is always initialized.

xinyuiscool, navina please review.

Author: Boris Shkolnik <boryas@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #156 from sborya/SAMZA-1257

14 months agoSAMZA-1212 - Refactor interaction between StreamProcessor, JobCoordinator and SamzaCo...
Navina Ramesh [Wed, 3 May 2017 22:10:13 +0000 (15:10 -0700)] 
SAMZA-1212 - Refactor interaction between StreamProcessor, JobCoordinator and SamzaContainer

See SAMZA-1212 for motivation toward this refactoring.
Changes here are:
* Removed awaitStart (blocking) method in StreamProcessor, JobCoordinator and SamzaContainer
* Introduced SamzaContainerListener and JobCoordinatorListener interface implemented by StreamProcessor
* Introduced SamzaContainerStatus to handler failures and lifecycle using Listener interfaces

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #148 from navina/SAMZA-1212

14 months agoSAMZA-1256: Improve trace logging for troubleshooting the fluent API
Jacob Maes [Wed, 3 May 2017 20:54:35 +0000 (13:54 -0700)] 
SAMZA-1256: Improve trace logging for troubleshooting the fluent API

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #155 from jmakes/operator-trace-logging

14 months agoSAMZA-1246: ApplicatonRunner.stats() should include exception in case of failure
Xinyu Liu [Wed, 3 May 2017 19:17:22 +0000 (12:17 -0700)] 
SAMZA-1246: ApplicatonRunner.stats() should include exception in case of failure

Current when ApplicationRunner.stats() only returns the enum representing the status. It also need to include the exception if the status is failure.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #154 from xinyuiscool/SAMZA-1246

14 months agofix HdfsSystemAdmin when staging directory is empty
Hai Lu [Tue, 2 May 2017 19:09:55 +0000 (12:09 -0700)] 
fix HdfsSystemAdmin when staging directory is empty

getSystemStreamMetadata has the potential side effect to persist metadata to a staging directory on hdfs. This could fail if staging directory is empty. This patch addresses the issue with test to cover the scenario.

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #151 from lhaiesp/master

14 months agoSAMZA-1204: Visualize StreamGraph and ExecutionPlan
Xinyu Liu [Tue, 2 May 2017 17:09:22 +0000 (10:09 -0700)] 
SAMZA-1204: Visualize StreamGraph and ExecutionPlan

Once a Samza application (using fluent API) is deployed, an execution plan will be generated by the ExecutionPlanner. The plan JSON will be written to a file (plan.json) under the ./plan directory, which also contains the plan.html and javscripts (js folder).

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>
Author: xinyuiscool <xinyuliu.us@gmail.com>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #127 from xinyuiscool/SAMZA-1204

14 months agoSAMZA-1248 - Fix StandAlone barrier start list.
Boris Shkolnik [Tue, 2 May 2017 01:32:48 +0000 (18:32 -0700)] 
SAMZA-1248 - Fix StandAlone barrier start list.

14 months agoSAMZA-1249: Fix equality for WindowKey for Non-keyed tumbling windows
vjagadish1989 [Mon, 1 May 2017 20:57:25 +0000 (13:57 -0700)] 
SAMZA-1249: Fix equality for WindowKey for Non-keyed tumbling windows

- Fix a `ClassCastException` and an NPE when using Tumbling window without keys
- Fix equality and hashCode for `WindowKey`
- Refactor the `TestWindowOperator` unit tests using simpler types and a mock `MessageCollector`.

More details in `SAMZA-1249`

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Jacob Maes <jmaes@linkedin.com>

Closes #149 from vjagadish1989/samza-1249

14 months agoSAMZA-1214: Bug - system-scoped stream default configs may not be honored
Jacob Maes [Mon, 1 May 2017 20:52:13 +0000 (13:52 -0700)] 
SAMZA-1214: Bug - system-scoped stream default configs may not be honored

* Re-introduced deprecated system-stream configs into config table
* Fixed position of task.consumer.batch.size in config table
* Moved system-scoped defaults from StreamConfig to SystemConfig

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #150 from jmakes/samza-1214

14 months agoSAMZA-1250: JobRunner.kill doesn't terminate cleanly with YarnJob.
Jacob Maes [Mon, 1 May 2017 20:44:54 +0000 (13:44 -0700)] 
SAMZA-1250: JobRunner.kill doesn't terminate cleanly with YarnJob.

1. The ClientHelper now checks inactive application IDs so it can get status for terminated jobs in addition to running jobs
2. JobRunner.kill() waits for any finish, not just successful finish.
3. A killed job is now considered successful.

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #152 from jmakes/samza-1250

14 months agoSAMZA-1219; Add metrics for operator message received and execution times
Prateek Maheshwari [Thu, 27 Apr 2017 22:55:11 +0000 (15:55 -0700)] 
SAMZA-1219; Add metrics for operator message received and execution times

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #142 from prateekm/operator-metrics

14 months agoSamza 1214: Allow users to set a default replication.factor for intermediate topics
Jacob Maes [Thu, 27 Apr 2017 22:05:28 +0000 (15:05 -0700)] 
Samza 1214: Allow users to set a default replication.factor for intermediate topics

* Add a new "systems.sysName.default.stream.*" config structure that allows users to set system-wide defaults for streams.
* More thorough testing of system defaults and stream defaults
* Removed the old migration config from the config table since there's no code to support it.
* Moved 2 kafka-specific config accessors out of JobConfig and into KafkaConfig
* Removed duplicate impl of getChangelogStream()

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #141 from jmakes/samza-1214

14 months agoSAMZA-1210: Fixing merge issue and container id generations.
Boris Shkolnik [Thu, 27 Apr 2017 21:37:08 +0000 (14:37 -0700)] 
SAMZA-1210: Fixing merge issue and container id generations.

This PR is just for fixing issues introduced by a merge and changing container id generation.

Author: Boris Shkolnik <boryas@apache.org>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #122 from sborya/mergeFixes

14 months agoSAMZA-1245: Make stream samza.physical.name config name string public
Xinyu Liu [Thu, 27 Apr 2017 20:21:11 +0000 (13:21 -0700)] 
SAMZA-1245: Make stream samza.physical.name config name string public

For certain system such as hdfs, the physical stream name might need to be finalized during the config generation. In order to do that, we will need to expose the stream samza.physical.name config string.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #145 from xinyuiscool/SAMZA-1245

14 months agoSAMZA-1200: Scala compile for samza-core fails with ambiguous reference error for...
Prateek Maheshwari [Wed, 26 Apr 2017 23:58:55 +0000 (16:58 -0700)] 
SAMZA-1200: Scala compile for samza-core fails with ambiguous reference error for some compiler versions

... for some compiler versions.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #143 from prateekm/logging-compile-fix

14 months agoSAMZA-1026: HDFS System Producer should not have Kafka dependency
Prateek Maheshwari [Wed, 26 Apr 2017 23:55:25 +0000 (16:55 -0700)] 
SAMZA-1026: HDFS System Producer should not have Kafka dependency

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <nramesh@linkedin.com>

Closes #144 from prateekm/hdfs-kafka-dependency

14 months agoSAMZA-1233: Create SystemAdmin only for JobCoordinator System in SamzaTaskProxy
Shanthoosh Venkataraman [Wed, 26 Apr 2017 16:05:41 +0000 (09:05 -0700)] 
SAMZA-1233: Create SystemAdmin only for JobCoordinator System in SamzaTaskProxy

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #138 from shanthoosh/fix_samza_task_proxy

14 months agoSAMZA-1220 : Add thread name to SamzaContainer shutdown hook and prevent shutdown...
Navina Ramesh [Tue, 25 Apr 2017 23:27:02 +0000 (16:27 -0700)] 
SAMZA-1220 : Add thread name to SamzaContainer shutdown hook and prevent shutdown deadlock

* SamzaContainerExceptionHandler is written in Java and used by LocalContainerRunner.java

Author: nramesh <nramesh@linkedin.com>

Reviewers: Jagadish Venkataraman <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #139 from navina/SAMZA-1220

14 months agoSAMZA-1202; Multiple calls to getInputStream() result in non-deterministic behavior
vjagadish1989 [Tue, 25 Apr 2017 00:15:47 +0000 (17:15 -0700)] 
SAMZA-1202; Multiple calls to getInputStream() result in non-deterministic behavior

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>

Closes #136 from vjagadish1989/samza-1202

14 months agoSAMZA-1229; Disk space monitor should only count data in use by the container
Prateek Maheshwari [Mon, 24 Apr 2017 22:03:00 +0000 (15:03 -0700)] 
SAMZA-1229; Disk space monitor should only count data in use by the container

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #134 from prateekm/disk-space-monitor

14 months agoSAMZA-1226: relax type parameters in MessageStream functions
Yi Pan (Data Infrastructure) [Fri, 21 Apr 2017 23:14:41 +0000 (16:14 -0700)] 
SAMZA-1226: relax type parameters in MessageStream functions

relax the type parameter in user supplied functions in fluent API

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Navina Ramesh <navina@apache.org>, Jacob Maes <jmaes@linkedin.com>

Closes #133 from nickpan47/SAMZA-1226 and squashes the following commits:

b8d3461 [Yi Pan (Data Infrastructure)] SAMZA-1226: cleanup code example in StreamApplication javadoc
93fa471 [Yi Pan (Data Infrastructure)] SAMZA-1226: added more unit tests for type-cast functions
18e1e9f [Yi Pan (Data Infrastructure)] SAMZA-1226: address review feedbacks
b5da53b [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1226
7981b83 [Yi Pan (Data Infrastructure)] SAMZA-1226: relax type parameters in MessageStream functions

14 months agoSAMZA-868: support elasticsearch version 2.x
Jiri Humpolicek [Thu, 20 Apr 2017 08:25:31 +0000 (01:25 -0700)] 
SAMZA-868: support elasticsearch version 2.x

15 months agoSAMZA-1176: Intermittent TestJoinOperator unit test failure
Prateek Maheshwari [Wed, 19 Apr 2017 21:26:00 +0000 (14:26 -0700)] 
SAMZA-1176: Intermittent TestJoinOperator unit test failure

Join TTL was set too low (10 ms). `joinRetainsMatchedMessagesReverse` will fail if the execution time between line 176 and 186 is longer than that. Increased TTL to 1 min (will not affect test execution time).

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #131 from prateekm/join-test-fix

15 months agoSAMZA-1213 - StreamProcessorLifeCycleAware interface should not use processorId
Navina Ramesh [Wed, 19 Apr 2017 19:33:37 +0000 (12:33 -0700)] 
SAMZA-1213 - StreamProcessorLifeCycleAware interface should not use processorId

Refactoring LocalApplicationRunner s.t. each processor has its own listener instance, instead of a single listener keeping track of all processors.

Author: Navina Ramesh <navina@apache.org>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Xinyu Liu <xiliu@linkedin.com>

Closes #125 from navina/SAMZA-1213

15 months agoSAMZA-1209; Improve error handling in LocalStoreMonitor
Shanthoosh Venkataraman [Wed, 19 Apr 2017 06:18:13 +0000 (23:18 -0700)] 
SAMZA-1209; Improve error handling in LocalStoreMonitor

Changes:

1. Add opt-in configuration to continue garbage collection of local stores
   when there’s a failure in garbage collecting one local store.
2. Handle failures gracefully. In getJobModel, JobModel is expected as return response.
   Incase of failures, returned error-message Map is deserialized to JobModel
   resulting in ClassCastException. Handle 2xx, failures separately. Log error
   messages returned properly in httpGet call.
3. Fix getTasks url format.
4. Minor code cleanup(Switch to using ',' as seperator for job.status.servers configuration
   instead of '.' for ease of use).
5. Fix disabled tests in TestLocalStoreMonitor.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Closes #124 from shanthoosh/fix_config_format_in_localstore_monitor

15 months agoSAMZA-1217: Fix broken build due to TestCheckstyle problem.
Shanthoosh Venkataraman [Wed, 19 Apr 2017 04:23:39 +0000 (21:23 -0700)] 
SAMZA-1217: Fix broken build due to TestCheckstyle problem.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #130 from shanthoosh/master

15 months agoSAMZA-1157: Serialization/deserialization throwables should not be suppressed
Yi Pan (Data Infrastructure) [Wed, 19 Apr 2017 00:15:32 +0000 (17:15 -0700)] 
SAMZA-1157: Serialization/deserialization throwables should not be suppressed

…ppressed if user don't configure to drop those errors

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #128 from nickpan47/fix-serde-error-suppressed

15 months agoSAMZA-1215, SAMZA-1216: remove ProcessKiller and change the ZooKeeper connection...
Yi Pan (Data Infrastructure) [Wed, 19 Apr 2017 00:04:29 +0000 (17:04 -0700)] 
SAMZA-1215, SAMZA-1216: remove ProcessKiller and change the ZooKeeper connection string in test

… test to use 127.0.0.1 in connect string

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #129 from nickpan47/fix-some-tests

15 months agoSAMZA-1182 : Disable flaky tests in TestAsyncRunLoop
Shanthoosh Venkataraman [Tue, 18 Apr 2017 22:05:27 +0000 (15:05 -0700)] 
SAMZA-1182 : Disable flaky tests in TestAsyncRunLoop

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #126 from shanthoosh/disable_all_async_run_loop_tests

15 months agoSAMZA-1205; Disable flaky tests in TestJmxServer.
Shanthoosh Venkataraman [Mon, 17 Apr 2017 21:25:11 +0000 (14:25 -0700)] 
SAMZA-1205; Disable flaky tests in TestJmxServer.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org?

Closes #120 from shanthoosh/disable_jmx_server_tests

15 months agoSAMZA-1211; Remove Thread.sleep() from TestJoinOperator tests
Prateek Maheshwari [Mon, 17 Apr 2017 19:09:25 +0000 (12:09 -0700)] 
SAMZA-1211; Remove Thread.sleep() from TestJoinOperator tests

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #123 from prateekm/join-test-no-sleep

15 months agoSAMZA-1145: Provide Ability To Confgure The Default Number Of Changel…
James Lent [Fri, 14 Apr 2017 19:09:07 +0000 (12:09 -0700)] 
SAMZA-1145: Provide Ability To Confgure The Default Number Of Changel…

…og Replicas

Author: James Lent <jlent@nc.rr.com>

Reviewers: Yi Pan <nickpan47@apache.org>, Jagadish <vjagadish1989@gmail.com>

Closes #86 from jwlent55/SAMZA-1145

15 months agoSAMZA-1195 fix bug in Samza application master on Kerberos
Chen Song [Fri, 14 Apr 2017 19:04:08 +0000 (12:04 -0700)] 
SAMZA-1195 fix bug in Samza application master on Kerberos

Author: Chen Song <chens_cs@hotmail.com>

Reviewers: Yi Pan <nickpan47@apache.org>

Closes #119 from garlicbulb-puzhuo/SAMZA-1195

15 months agoSAMZA-1183: Fix flaky tests in TestAsyncRunLoop
Shanthoosh Venkataraman [Thu, 13 Apr 2017 22:44:13 +0000 (15:44 -0700)] 
SAMZA-1183: Fix flaky tests in TestAsyncRunLoop

Changes:

* Moved all states locally within the test methods(no global state, since tests can be run parallely).
* Increased the timeout waiting for all task threads to finish.
* Changed message chooser mock to return null at the end(When EOS message is used a final msg, during some runs task.process gets invoked more than once for it. This is due to test setup, other tests within this test class also do the same.)

How was it verified:

Ran the test class repeatedly to check for failures(there were none) using following script.

`
while [ $i -lt 30 ] do     sleep 3;     i=`expr $i + 1`;     ./gradlew :samza-core:clean :samza-core:test -Dtest.single=TestAsyncRunLoop  done
`

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #108 from shanthoosh/fix-async-run-loop-tests-1

15 months agoSAMZA-1132: LocalApplicationRunner for StreamApplication
Xinyu Liu [Thu, 13 Apr 2017 22:39:23 +0000 (15:39 -0700)] 
SAMZA-1132: LocalApplicationRunner for StreamApplication

LocalApplicationRunner runs the StreamApplication locally on every node that the application is deployed to. LocalRunner.start() is blocking until the process is killed or user invoke LocalRunner.stop() from another thread. It can be configured to use Zk for coordination.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #117 from xinyuiscool/SAMZA-1132

15 months agoSAMZA-1208 - IllegalFormatConversionException in LocalContainerRunner
Navina Ramesh [Thu, 13 Apr 2017 01:40:01 +0000 (18:40 -0700)] 
SAMZA-1208 - IllegalFormatConversionException in LocalContainerRunner

Author: Navina Ramesh <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #121 from navina/SAMZA-1208

15 months agoSAMZA-1170: Integration testing harness for StreamApplications
vjagadish1989 [Wed, 12 Apr 2017 00:56:54 +0000 (17:56 -0700)] 
SAMZA-1170: Integration testing harness for StreamApplications

- Added an integration testing harness for testing `StreamApplication`s. This is convenient for running and interacting with Kafka brokers, Zk servers and Samza components locally.
- Added the following integration tests that use this harness to test actual `StreamApplication`s:
1. Keyed Tumbling Window
2. Keyed Session Window
3. Repartition with Keyed Session Window

Bonus: A couple of additional bug-fixes that were blockers for this test.

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>, Jacob Maes<jmaes@linkedin.com>

Closes #96 from vjagadish1989/integration-tests

15 months agoSAMZA-1198: disable the flaky test TestZkBarrierForVersionUpgrade.testZkBarrierForVer...
Fred Ji [Mon, 10 Apr 2017 20:53:03 +0000 (13:53 -0700)] 
SAMZA-1198: disable the flaky test TestZkBarrierForVersionUpgrade.testZkBarrierForVersionUpgrade

We are seeing the fails sometimes from this test, disabling it for build success first. See details in SAMZA-1198, and the JIRA for fix is in SAMZA-1193.

Author: Fred Ji <fji@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #116 from fredji97/disableFlaky

15 months agoSamza-1187 : TestZkProcessorLatch tests share state causing transient failures in...
Navina Ramesh [Sat, 8 Apr 2017 00:00:28 +0000 (17:00 -0700)] 
Samza-1187 : TestZkProcessorLatch tests share state causing transient failures in CI builds

* Removed testSingleCountdown - didn't understand what was being tested
* The timeout behavior for ZkProcessorLatch has not been documented. Hence, I have fixed testLatchExpires as best I could understand.
* Removed redundant/unused member variables from ZkProcessorLatch

Author: navina <navina@apache.org>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #112 from navina/SAMZA-1187

15 months agoSAMZA-1126 - Semantics of processorId in Samza
Navina Ramesh [Fri, 7 Apr 2017 22:22:13 +0000 (15:22 -0700)] 
SAMZA-1126 - Semantics of processorId in Samza

Implementation based on [SEP-1](https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza)

Author: navina <navina@apache.org>

Reviewers: Yi Pan <nickpan47@gmail.com>, Jacob Maes <jmaes@linkedin.com>, Jagadish <jagadish@apache.org>

Closes #103 from navina/SAMZA-1126

15 months agoSAMZA-1194 - Commenting out testJmxReporter flaky test
Shanthoosh Venkataraman [Fri, 7 Apr 2017 22:17:10 +0000 (15:17 -0700)] 
SAMZA-1194 - Commenting out testJmxReporter flaky test

TestJmxReporter brings up an JmxServer and uses it to test the reporter behavior. However, due to failures in bringing up/connecting to JMX server, this fails intermittently. Temporarily disabling it.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #118 from shanthoosh/disable_jmx_reporter_tests

15 months agoSAMZA-1089: Enable YarnJob and ClientHelper to kill a job by name rather than YARN...
Jacob Maes [Fri, 7 Apr 2017 21:32:02 +0000 (14:32 -0700)] 
SAMZA-1089: Enable YarnJob and ClientHelper to kill a job by name rather than YARN ApplicationID

Missed a couple files in the previous commit to enable YarnJob to kill and get status of a Job based on the job name rather than the YARN ApplicationName.

These changes have been manually verified in a Yarn cluster at LI.

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #114 from jmakes/samza-1089-3

15 months agoSAMZA-1192: Fixed TestJoinOperator test failure on JDK 1.8.0_05
Prateek Maheshwari [Thu, 6 Apr 2017 22:35:39 +0000 (15:35 -0700)] 
SAMZA-1192: Fixed TestJoinOperator test failure on JDK 1.8.0_05

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #115 from prateekm/join-test-fix

15 months agoSAMZA-1178: Generate JSON from StreamPlan
Xinyu Liu [Thu, 6 Apr 2017 16:58:38 +0000 (09:58 -0700)] 
SAMZA-1178: Generate JSON from StreamPlan

As the first step to visualize the StreamGraph/Plan, this patch generates a json representation of it. For the example StreamGraph in TestPlanJsonGenerator, here is the json that is generated (https://github.com/xinyuiscool/examples/blob/master/example_plan.json). This patch also introduces the immutable StreamPlan and did some code cleanup.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #110 from xinyuiscool/SAMZA-1178

15 months agoSAMZA-1143; Include fs.<scheme>.impl.* subkeys to YarnConfiguration used in YarnJobFa...
Fred Ji [Thu, 6 Apr 2017 05:28:38 +0000 (22:28 -0700)] 
SAMZA-1143; Include fs.<scheme>.impl.* subkeys to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager

SAMZA-1143 Include fs.&lt;scheme&gt;.impl.* subkeys, in addition tofs.&lt;scheme&gt;.impl, to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager.

When there are additional subconfigurations under fs.myScheme.impl, such as fs.myScheme.impl.client, we need to keep the set of configuration completed in  YarnJobFactory and YarnClusterResourceManager. When the context is set for localizing the resource in ClientHelper and YarnContainerRunner, it may rely on this configuration to get the FileStatus information, which may or may not depends on fs.&lt;scheme&gt;.impl and all possible fs.&lt;scheme&gt;.impl.* sub-configuration.

This is an enhanced feature to the PR#90.

Author: Fred Ji <fji@linkedin.com>
Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #97 from fredji97/fsImplSubkeys

15 months agoSAMZA-1191; Fixed flaky test: TestExponentialSleepStrategy testThreadInterruptInRetryLoop
Prateek Maheshwari [Thu, 6 Apr 2017 05:26:09 +0000 (22:26 -0700)] 
SAMZA-1191; Fixed flaky test: TestExponentialSleepStrategy testThreadInterruptInRetryLoop

It's possible that the interruptee thread (see `#interruptedThread`) gets pre-empted before it has a chance to run the operation (increment `iterations`) and then gets interrupted, causing these assertions to fail.

I think these assertions also aren't critical for the tests which I presume want to test interrupt propagation behavior, so removing them in this change.

vjagadish1989 & xinyuiscool, please take a look.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #113 from prateekm/flaky-ess-test-fix

15 months agoSAMZA-1094 SAMZA-1101 SAMZA-1159; Remove MessageEnvelope from public operator APIs...
Prateek Maheshwari [Wed, 5 Apr 2017 23:42:15 +0000 (16:42 -0700)] 
SAMZA-1094 SAMZA-1101 SAMZA-1159; Remove MessageEnvelope from public operator APIs. : Delay the creation of SinkFunction for output streams. : Move StreamSpec from a public API to an internal class.

Removed the MessageEnvelope and OutputStream interfaces from public operator APIs.
Moved the creation of SinkFunction for output streams to SinkOperatorSpec.
Moved StreamSpec from a public API to an internal class.

Additionally,
1. Removed references to StreamGraph in OperatorSpecs. It was being used to getNextOpId(). MessageStreamsImpl now gets the ID and gives it to OperatorSpecs itself.
2. Updated and cleaned up the StreamGraphBuilder examples.
3. Renamed SinkOperatorSpec to OutputOperatorSpec since its used by sink, sendTo and partitionBy.

nickpan47 and xinyuiscool, please take a look.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #92 from prateekm/message-envelope-removal

15 months agoSAMZA-1189: Fix file system closed issue on hdfs system producer
Jacob Maes [Wed, 5 Apr 2017 03:37:32 +0000 (20:37 -0700)] 
SAMZA-1189: Fix file system closed issue on hdfs system producer

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #111 from jmakes/samza-1189

15 months agoSamza 1186: Rename Processor to Job
Xinyu Liu [Tue, 4 Apr 2017 01:24:33 +0000 (18:24 -0700)] 
Samza 1186: Rename Processor to Job

Now we have the top level Samza application, and each stage is called a job, the previous introduced "processor" naming should be renamed as Job. This includes renaming PrcessorGraph to JobGraph, and ProcessorNode to JobNode, etc.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jake Maes <jmakes@apache.org>

Closes #109 from xinyuiscool/SAMZA-1186

15 months agoSAMZA-1089: Runner should support kill and status commands
Jacob Maes [Mon, 3 Apr 2017 16:50:27 +0000 (09:50 -0700)] 
SAMZA-1089: Runner should support kill and status commands

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>,Xinyu Liu <xiliu@linkedin.com>

Closes #106 from jmakes/samza-1089-2

15 months agoSAMZA-1138; Yarn capability check is broken
Maksim Logvinenko [Sun, 2 Apr 2017 23:18:43 +0000 (16:18 -0700)] 
SAMZA-1138; Yarn capability check is broken

After migration from `yarn.container.*` properties to `cluster-manager.container.*` properties we have to use either of them and `ClusterManagerConfig` provides backward compatibility for these properties. But in `YarnClusterResourceManage`r only old properties are used (from `YarnConfig`), hence if job config migrated to new `cluster-manager.*` properties names then check will be evaluated against default values, not against actual values.

Author: Maksim Logvinenko <mlogvinenko@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #84 from logarithm/yarn-properties-fix

15 months agoSAMZA-1181: Fix AppMaster hang after submitting jobs to Yarn
Shanthoosh Venkataraman [Fri, 31 Mar 2017 22:58:22 +0000 (15:58 -0700)] 
SAMZA-1181: Fix AppMaster hang after submitting jobs to Yarn

Currently when a job is submitted to Yarn, it is going to hang after AppMaster is created. The log shows that it hangs during bootstrapping from Coordinator stream. Further debugging shows that the jobs hang in the second time of bootstrap while reading locality data from LocalityManager. The sequence is the following:
1. JobModelManager creates CoordinatorStreamConsumer, and bootstrap it,
2. LocalityManager writes locality info into coordinator stream
3. JobModelManager closes CoordinatorStreamConsumer
4. Later localityManager bootstraps CoordinatorStreamConsumer again
Step 3 is the problem here. Since CoordinatorStreamConsumer is still held by LocalityManager, it cannot be closed prematurely. Step 3 is introduced in SAMZA-1154, as a refactoring of JobModelManager for task rest end point. To fix this issue, we will revert this change of step 3.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #104 from shanthoosh/master

15 months agoSAMZA-1176; Make TestJoinOperator unit tests safe for concurrent execution
Prateek Maheshwari [Fri, 31 Mar 2017 21:00:30 +0000 (14:00 -0700)] 
SAMZA-1176; Make TestJoinOperator unit tests safe for concurrent execution

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #105 from prateekm/join-tests

15 months agoSAMZA-1182 - Commenting out some of the flaky tests
Navina Ramesh [Fri, 31 Mar 2017 20:48:35 +0000 (13:48 -0700)] 
SAMZA-1182 - Commenting out some of the flaky tests

Author: navina <navina@apache.org>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>, Jagadish Venkataraman <vjagadish1989@gmail.com>

Closes #107 from navina/SAMZA-1182

15 months ago SAMZA-1135 - Support scala 2.12
vjagadish1989 [Fri, 31 Mar 2017 17:26:08 +0000 (10:26 -0700)] 
SAMZA-1135 - Support scala 2.12

    Added support for scala 2.12

    Author: Maxim Logvinenko <mlogvinenko@gmail.com>

    Reviewers: Jagadish <jagadish@apache.org>,Prateek Maheshwari <prateekm@linkedin.com>

    Closes #82 from metamx:scala-2.12

15 months agoSAMZA-1175 - Removing CoordinationService from JobCoordinatorFactory interface
Navina Ramesh [Wed, 29 Mar 2017 23:25:16 +0000 (16:25 -0700)] 
SAMZA-1175 - Removing CoordinationService from JobCoordinatorFactory interface

Removing CoordinationService from JobCoordinatorFactory interface

Author: navina <navina@apache.org>

Reviewers: Xinyu Liu <xinyuliu.us@apache.org>,Boris Shkolnik <boryas@apache.org>

Closes #102 from navina/SAMZA-1175

15 months agoSAMZA-1161: Adding metrics into LocalStoreMonitor.
Shanthoosh Venkataraman [Wed, 29 Mar 2017 21:07:34 +0000 (14:07 -0700)] 
SAMZA-1161: Adding metrics into LocalStoreMonitor.

Rocksdb LocalStoreMonitor is responsible for clearing up unused local task partition stores. Metrics are required to understand the behavior of this monitor in production(especially when it's clearing up unused rocksdb local stores).

The following two metrics will be emitted from this monitor:
a) Total disk space cleared in bytes.
b) Total number of rocksdb task partition stores cleared.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #95 from shanthoosh/metrics_into_local_store_monitor

15 months agoSAMZA-1167: New streamId-specific configs do not override equivalent system-scoped...
Jacob Maes [Wed, 29 Mar 2017 20:57:26 +0000 (13:57 -0700)] 
SAMZA-1167: New streamId-specific configs do not override equivalent system-scoped configs

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #101 from jmakes/samza-1167

15 months agoFix an import issue in unit test
Xinyu Liu [Wed, 29 Mar 2017 20:43:10 +0000 (13:43 -0700)] 
Fix an import issue in unit test

15 months agoSAMZA-1172: Fix for the topological sort to handle single-node loop
Xinyu Liu [Wed, 29 Mar 2017 17:49:25 +0000 (10:49 -0700)] 
SAMZA-1172: Fix for the topological sort to handle single-node loop

In the processor graph, the topological sort missed adding to the visited set during graph traversal. This caused wrong graph being generated for single-node loop. This is fixed in the patch.

Also fixed the maxPartition method not handling empty collection correctly.

Added a few new unit tests for these. Also adjust the timing of previous async commit unit tests so it can run more reliably. Long term wise we need to fix the timer inside the AsyncRunLoop tests.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #100 from xinyuiscool/SAMZA-1172

15 months agoSAMZA-1151 - Coordination Service
Boris Shkolnik [Wed, 29 Mar 2017 17:44:49 +0000 (10:44 -0700)] 
SAMZA-1151 - Coordination Service

Author: Boris Shkolnik <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@apache.org>, Navina Ramesh <navina@apache.org>

Closes #91 from sborya/CoordinationService

15 months agoSAMZA-1171: Rewrite config in ApplicationRunnerMain when creating ApplicationRunner
Xinyu Liu [Tue, 28 Mar 2017 00:33:42 +0000 (17:33 -0700)] 
SAMZA-1171: Rewrite config in ApplicationRunnerMain when creating ApplicationRunner

The config needs to be rewritten before passing down to the ApplicationRunner. This is a bug that was introduced during some refactoring/cleanup of the config in the ApplicationRunner interface.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #98 from xinyuiscool/SAMZA-1171

15 months agoSAMZA-1143; Universal config support for localized resource
Fred Ji [Mon, 27 Mar 2017 17:34:12 +0000 (10:34 -0700)] 
SAMZA-1143; Universal config support for localized resource

More details in https://issues.apache.org/jira/browse/SAMZA-1143

Tests: ./gradlew clean check successful and all unit tests passed

Author: Fred Ji <fji@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #90 from fredji97/universalLocalizer

15 months agoSAMZA-1137: Instantiate ApplicationRunner in SamzaContainer
Xinyu Liu [Fri, 24 Mar 2017 20:49:30 +0000 (13:49 -0700)] 
SAMZA-1137: Instantiate ApplicationRunner in SamzaContainer

Create an ApplicationRunner in SamzaContainer to provide StreamSpecs for fluent API.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #94 from xinyuiscool/SAMZA-1137

15 months agoSAMZA-1134:Simplify barrier for zk version upgrade.
Boris Shkolnik [Thu, 23 Mar 2017 21:25:53 +0000 (14:25 -0700)] 
SAMZA-1134:Simplify barrier for zk version upgrade.

see SAMZA-1134 for details.

Author: Boris Shkolnik <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>

Reviewers: Navina Ramesh <navina@apache.org>

Closes #81 from sborya/SimplifyBarrierTO

15 months agoSAMZA-1149; upgrade mockito from 1.8.4 to 1.10.19
Fred Ji [Wed, 22 Mar 2017 20:45:02 +0000 (13:45 -0700)] 
SAMZA-1149; upgrade mockito from 1.8.4 to 1.10.19

More details: https://issues.apache.org/jira/browse/SAMZA-1149
Test: gradlew clean check successfully. All unit tests passed.

Author: Fred Ji <fji@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Navina Ramesh <navina@apache.org>

Closes #89 from fredji97/mockitoUpgrade

15 months agosamza-1156: Improve ApplicationRunner method naming and class structure
Jacob Maes [Wed, 22 Mar 2017 14:59:08 +0000 (07:59 -0700)] 
samza-1156: Improve ApplicationRunner method naming and class structure

navina xinyuiscool nickpan47 take a look

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Closes #93 from jmakes/app-runner-class-hier

15 months agoSAMZA-1158 Adding monitor to clean up stale local stores of tasks
Shanthoosh Venkataraman [Mon, 20 Mar 2017 22:18:27 +0000 (15:18 -0700)] 
SAMZA-1158 Adding monitor to clean up stale local stores of tasks

15 months agoSAMZA-1154: Tasks endpoint to list the complete details of all tasks related to a job
Shanthoosh Venkataraman [Mon, 20 Mar 2017 17:43:29 +0000 (10:43 -0700)] 
SAMZA-1154: Tasks endpoint to list the complete details of all tasks related to a job

16 months agoSAMZA-1108: Implementation of Windows and various kinds of Triggers
vjagadish1989 [Sun, 19 Mar 2017 19:25:40 +0000 (12:25 -0700)] 
SAMZA-1108: Implementation of Windows and various kinds of Triggers

* Implemented various triggers and the orchestration logic of the window operator.
* Implemented wire-up of window and the flow of messages through various trigger implementations.
* Implementations for count, time, timeSinceFirst, timeSinceLast, Any, Repeating triggers.

Author: vjagadish1989 <jvenkatr@linkedin.com>

Reviewers: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>, Prateek Maheshwari <pmaheshw@linkedin.com>, Chris Pettitt <cpettitt@linkedin.com>

Closes #66 from vjagadish1989/window-impl

16 months agoSAMZA-1093: instantiating StreamOperatorTask in SamzaContainer
Yi Pan (Data Infrastructure) [Sat, 18 Mar 2017 07:28:41 +0000 (00:28 -0700)] 
SAMZA-1093: instantiating StreamOperatorTask in SamzaContainer

SAMZA-1093: Instantiating StreamOperatorTask in SamzaContainer

- minor refactor to get the TaskFactory based on task class configuration
- add instantiation of StreamOperatorTask in SamzaContainer

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Xinyu Liu <xinyu@apache.org>, Navina Ramesh <nramesh@linkedin.com>, vjagadish1989 <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>, Chris Pettitt <cpettitt@linkedin.com>

Closes #65 from nickpan47/SAMZA-1093 and squashes the following commits:

9daf451 [Yi Pan (Data Infrastructure)] SAMZA-1093: added javadoc for TaskFactoryUtil class
078fee3 [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
e7be967 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review comments
b625beb [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
df27dc4 [Yi Pan (Data Infrastructure)] Merge branch 'master' into SAMZA-1093
91b1ba8 [Yi Pan (Data Infrastructure)] SAMZA-1093: fix merge errors. Removing 4 renamed old files
12a1529 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review feedbacks. Opened SAMZA-1137 for ApplicationRunner initialization in SamzaContainer
ac08d55 [Yi Pan (Data Infrastructure)] WIP: SAMZA-1093: rebase w/ ApplicationRunner patch.
9e1e1f6 [Yi Pan (Data Infrastructure)] SAMZA-1093: addressing review feedbacks.
1c5158d [Yi Pan (Data Infrastructure)] SAMZA-1093: code ready, more unit tests pending
c613334 [Yi Pan (Data Infrastructure)] SAMZA-1093: fix import
6404964 [Yi Pan (Data Infrastructure)] SAMZA-1093: Instantiating StreamOperatorTask in SamzaContainer
b932454 [Yi Pan (Data Infrastructure)] Merge branch 'master' into old-SAMZA-1093
82b839f [Yi Pan (Data Infrastructure)] SAMZA-1093: instantiating StreamOperatorTask in SamzaContainer (WIP)

16 months agoSAMZA-1131: RemoteApplicationRunner for cluster-based Samza applications
Xinyu Liu [Fri, 17 Mar 2017 18:54:57 +0000 (11:54 -0700)] 
SAMZA-1131: RemoteApplicationRunner for cluster-based Samza applications

RemoteApplicationRunner starts the Samza StreamApplication on the remote cluster, e.g. Yarn. It uses ExecutionPlanner for planning physical execution, and JobRunner to start each stage of the application.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #88 from xinyuiscool/SAMZA-1131

16 months agoSAMZA-1140; Non blocking commit in Async Runloop
Shanthoosh Venkataraman [Fri, 17 Mar 2017 18:46:34 +0000 (11:46 -0700)] 
SAMZA-1140; Non blocking commit in Async Runloop

Adds non blocking commit into AsyncRunLoop. Clients can enable it by setting an opt-in config task.async.commit (which is disabled by default). Contains the following list of changes for AsyncStreamTask type of tasks.

a) Enable commit for a task,  when there're uncompleted callbacks from a task. Progressive commit of finished callbacks, when there are messages in flight for a task.
b) Enable processing of messages for a task, when there're commits in progress from a task.

Documentation for this change will be done in a separate PR.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #85 from shanthoosh/asyncCommitSupport

16 months agoSAMZA-1146; TaskCallbackManager commit fix.
Shanthoosh Venkataraman [Fri, 17 Mar 2017 18:42:34 +0000 (11:42 -0700)] 
SAMZA-1146; TaskCallbackManager commit fix.

Each task callback in samza belongs to different SystemStreamPartition. When multiple callbacks in contagious order are available for commit, callback with highest sequence number is chosen for commit. This will prevent checkpointing of completed callbacks that has commit request and doesn't have highest sequence number. Upon task restart this will lead to duplicate reprocessing of already processed messages (since completed callbacks for some SystemStreamPartition's aren't committed earlier).

This PR fixes it and commits all completed callbacks that has commit request defined. Added a test to verify the behavior.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>
Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>
Author: vjagadish1989 <jvenkatr@linkedin.com>
Author: Boris Shkolnik <boryas@apache.org>
Author: Prateek Maheshwari <pmaheshw@linkedin.com>
Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Chen Song <csong@appnexus.com>
Author: Tommy Becker <tobecker@tivo.com>
Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #87 from shanthoosh/Fixing_CallBackManager_Commit

16 months agoResolve unit test compilation issue due to conflicting commits at the same time
Xinyu Liu [Thu, 16 Mar 2017 22:18:37 +0000 (15:18 -0700)] 
Resolve unit test compilation issue due to conflicting commits at the same time

16 months agoSAMZA-1067; Physical execution graph and planner for fluent API
Xinyu Liu [Thu, 16 Mar 2017 01:27:01 +0000 (18:27 -0700)] 
SAMZA-1067; Physical execution graph and planner for fluent API

Initial commit for the physical graph and plan. Design is there: https://issues.apache.org/jira/secure/attachment/12856670/SAMZA-1067.0.pdf.

The commit includes:

1) Physical ProcessorGraph, where each processor represents a physical execution unit (e.g. a job in Yarn).
2) A planner does the following:
   - create ProcessorGraph from StreamGraph. For this phase, the graph only contains a single node (single stage);
   - figure out the partitions of intermediate topics
   - create the topics

Please note currently the planner is used in the remote runner for now. Further changes/refactoring/cleanup are expected to be integrated with local runner.

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Jagadish Venkatraman <jvenkatraman@linkedin.com>

Closes #75 from xinyuiscool/SAMZA-1067

16 months agoFix an import issue on TestJoinOperator
vjagadish1989 [Tue, 14 Mar 2017 21:00:19 +0000 (14:00 -0700)] 
Fix an import issue on TestJoinOperator

16 months agoSAMZA-1091; Implement key-based inner join operator with no time constraints
Prateek Maheshwari [Tue, 14 Mar 2017 20:44:54 +0000 (13:44 -0700)] 
SAMZA-1091; Implement key-based inner join operator with no time constraints

Author: Prateek Maheshwari <pmaheshw@linkedin.com>
Author: Prateek Maheshwari <prateekm@utexas.edu>

Reviewers: Jagadish Venkatraman <jagadish@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #60 from prateekm/master

16 months agoSAMZA-1100; Exception when using a stream as both bootstrap and broadcast.
Shanthoosh Venkataraman [Mon, 13 Mar 2017 19:07:45 +0000 (12:07 -0700)] 
SAMZA-1100; Exception when using a stream as both bootstrap and broadcast.

When a task input stream is used as both broadcast and bootstrap stream in a samza job, Bootstrappingchooser marks the stream as bootstrapped when a single task finishes consuming all the SystemStreamPartitions(This happens when all the starting offset for each partition in the input stream is of type upcoming). This patch fixes this, by marking a stream as bootstrapped, only when all the systemStreamPartitions in a input stream is consumed by all the expected tasks.

More details here : https://issues.apache.org/jira/browse/SAMZA-1100

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek Maheshwari<prateekm@linkedin.com>

Closes #68 from shanthoosh/master

16 months agoSAMZA-1112; BrokerProxy does not log fatal errors
Tommy Becker [Sat, 11 Mar 2017 14:56:48 +0000 (06:56 -0800)] 
SAMZA-1112; BrokerProxy does not log fatal errors

Add an UncaughtExceptionHandler to the broker proxy thread so
failures there get logged.

Author: Tommy Becker <tobecker@tivo.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #80 from twbecker/SAMZA-1112

16 months agoSAMZA-1123; Create intermediate stream in partitionBy() operator
Xinyu Liu [Fri, 10 Mar 2017 22:08:18 +0000 (14:08 -0800)] 
SAMZA-1123; Create intermediate stream in partitionBy() operator

For partitionBy() operator, Samza generates an intermediate stream with id based on operator name and id, and system based on config. The intermediate streams will be materialized later by different execution environments. For example, if the intermediate stream is a Kafka stream, the topic will be created before the application starts.

Also renamed the config from "job.runner.class" to "app.runner.class".

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #79 from xinyuiscool/SAMZA-1123

16 months agoSAMZA-1124; Job coordinator with time out
Boris Shkolnik [Thu, 9 Mar 2017 01:16:51 +0000 (17:16 -0800)] 
SAMZA-1124; Job coordinator with time out

If a processor doesn't join the barrier for the TimeOut time - the barrier is cancelled. All the processor should unsubscribe from it.

Author: Boris Shkolnik <bshkolni@bshkolni-ld1.linkedin.biz>
Author: Boris Shkolnik <boryas@apache.org>
Author: navina <navina@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #77 from sborya/JobCoordinatorWithTO and squashes the following commits:

ed055dd [Boris Shkolnik] checkstyle
1468902 [Boris Shkolnik] renambed a method
579d2e7 [Boris Shkolnik] Merge branch 'JobCoordinator' into JobCoordinatorWithTO
54ab688 [Boris Shkolnik] removed JavaJobConfig.java
3f95d46 [Boris Shkolnik] checkstyle
75f9a94 [Boris Shkolnik] added test for time out
b73ba32 [Boris Shkolnik] merge + test
5d67be0 [Boris Shkolnik] removed extra empty lines
9cf3c3e [Boris Shkolnik] addressed review comments
c47031d [Boris Shkolnik] added timeout for ZkBarrierForVersionUpgrade
93a55a9 [Boris Shkolnik] use write directly in the barrier when all joined
f89a037 [Boris Shkolnik] addressed some notes
4219720 [Boris Shkolnik] added JavaJobConfig
659ae7c [Boris Shkolnik] add ZkJobCoordinatorFactory
520a083 [Boris Shkolnik] added ZKJobCoordinatorFactory
a42218e [Boris Shkolnik] checkstyle
02ca658 [Boris Shkolnik] typo
c1fd0b2 [Boris Shkolnik] merge cleanup
1c8eef4 [Boris Shkolnik] merge cleanup
a7e014a [Boris Shkolnik] merged
b4a0642 [Boris Shkolnik] typo
33bacdd [Boris Shkolnik] merge
d1582a9 [Boris Shkolnik] missed method name change
4c481a2 [Boris Shkolnik] merge
b811f3d [Boris Shkolnik] changed to private final
5d43419 [Boris Shkolnik] addressed review comments
1a0c54d [Boris Shkolnik] fixed test
e19d77b [Boris Shkolnik] renamed method
0c5edab [Boris Shkolnik] makey tryBecomeALeader async
b34f6b7 [Boris Shkolnik] added java doc
c2305b6 [Boris Shkolnik] cleanup
f83fc57 [Boris Shkolnik] merge
f8c8a6d [Boris Shkolnik] make a smaller PR for publish functionality only
251aad7 [Boris Shkolnik] removed unneeded interface for real
9892dee [Boris Shkolnik] removed unneeded interface
f20d15f [Boris Shkolnik] some updates to JobCoordinator
bd53c07 [Boris Shkolnik] deleteing already committed files
18198d1 [Boris Shkolnik] added test for zk barrier
9dba992 [Boris Shkolnik] moved the Test in to test subdir
c16d864 [Boris Shkolnik] added test
817a7b6 [Boris Shkolnik] merge complete
1e5947f [Boris Shkolnik] merged
6506b48 [Boris Shkolnik] merged with latest
4290b13 [Boris Shkolnik] merged
43eb076 [Boris Shkolnik] checkstyle errors
e0c44fe [Boris Shkolnik] merged
8e8d833 [Boris Shkolnik] merge
e59d38c [Boris Shkolnik] merge with ZkController
6a71cf6 [Boris Shkolnik] renamed the listners
6cbcf6e [Boris Shkolnik] review comments
efbee84 [Boris Shkolnik] Checkstyle errors
132300c [Boris Shkolnik] converted ZkLeaderElector.tryBecomeLeader to async method
13a05d7 [Boris Shkolnik] merge
2d59e0c [Boris Shkolnik] merge
82c819b [Boris Shkolnik] merge
7ebe9a6 [Boris Shkolnik] check style
41b2e46 [Boris Shkolnik] refactoring to match the new ZkUtils constructor
b07d63a [Boris Shkolnik] merge JavaJobConfig
bdc953b [Boris Shkolnik] merged
4301372 [Boris Shkolnik] added tests
4d48d83 [Boris Shkolnik] merge
c9bb475 [Boris Shkolnik] added tests for ZkUtils
592e9bb [Boris Shkolnik] added missing functionality for ZkControllerImpl into zkUtils and zkKeyBuilder
b473a6e [Boris Shkolnik] Added the new file ZkControllerListener.java
3412ed4 [Boris Shkolnik] Renamed ZkListener to ZkControllerListener
fabddc9 [Boris Shkolnik] merge
ad9108a [Boris Shkolnik] merge with ScheduleAfterDebounceTime
ba583d6 [Boris Shkolnik] merge
3d6b993 [Boris Shkolnik] cleaned up
fe69e70 [Boris Shkolnik] merge
7f8125b [Boris Shkolnik] Merge branch 'master' into ZkTestUtils
eaf04bb [Boris Shkolnik] added more comments
358ae6b [Boris Shkolnik] Added test and addressed review comments
9b22eb6 [Boris Shkolnik] JobModelPublish
0ef90b6 [Boris Shkolnik] Merge branch 'JobModel' into JobModelPublish
9c59048 [Boris Shkolnik] merge
017fe79 [Boris Shkolnik] added tests
5c8aa20 [Boris Shkolnik] JobModel Generation using SimpleGroupByContainerCount
2c841e1 [Boris Shkolnik] added awaitStart
eeb69ca [Boris Shkolnik] Merge branch 'ZkBarrier' into JobCoordinator
dc26bd2 [Boris Shkolnik] added BarrierForVersionUpgrade
cfdb4c7 [Boris Shkolnik] Merge branch 'ZkController' into ZkBarrier
b28ba14 [Boris Shkolnik] ZkBarrier
c8d26ba [Boris Shkolnik] merged
efc4d03 [Boris Shkolnik] ZkController
c9b3fe4 [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime' into ZkController
f0cae7b [Boris Shkolnik] Merge branch 'LeaderElector' into ZkController
3df0def [Boris Shkolnik] ZkControllerImpl
4801613 [Boris Shkolnik] cleanup
d32045b [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime1' into JobCoordinator
32f96b4 [Boris Shkolnik] added ScheduleAfterDebounce
ce83409 [Boris Shkolnik] cleanup
d7a7ccb [Boris Shkolnik] Merge branch 'LeaderElector' into ScheduleAfterDebounceTime
9c6b20a [Boris Shkolnik] added ZkListener
5f867c0 [Boris Shkolnik] added Apache license info
d0687b9 [Boris Shkolnik] Merge branch 'LeaderElector' into JobCoordinator
372829f [Boris Shkolnik] Merge branch 'master' into JobCoordinator
14db43d [Boris Shkolnik] added TestZkStreamProcessor - main manual test
b3b27c6 [Boris Shkolnik] Merge branch 'LeaderElector' into TestZkStreamProcessor
6649c80 [navina] Fixing ZkUtils close(). No need to close underlying connection explicitly
7f17e26 [Boris Shkolnik] ZkTestUtils
f904cd3 [Boris Shkolnik] ScheduleAfterDebounceTime
d126b10 [Boris Shkolnik] ScheduleAfterDebounceTime
63d8d60 [Boris Shkolnik] JavaJobConfig
ff15501 [Boris Shkolnik] JavaJobConfig
d20bacf [Boris Shkolnik] ZkTestUtils
737eb2f [Boris Shkolnik] ZkTestUtils
8e2d6c1 [Boris Shkolnik] add main manual test
7a47f84 [Boris Shkolnik] add main manual test
fa2186b [Boris Shkolnik] added ZkController
a0a7409 [Boris Shkolnik] Merge branch 'master' of https://github.com/navina/samza
edda60d [navina] Removing an unintended change to the grouper
6dd6b8d [navina] Adding tests for ZkLeaderElector
1734f8f [navina] Adding tests for ZkUtils
317cf16 [navina] Adding tests for ZkKeyBuilder
aaaf24e [navina] Adding EmbeddedZookeeper for testing
37c2c8b [navina] Extracting files related to LeaderElection
76b5167 [Boris Shkolnik] added new line at then end

16 months agoSAMZA-1121; StreamAppender should not propagate exceptions to the caller
Prateek Maheshwari [Wed, 8 Mar 2017 08:26:46 +0000 (00:26 -0800)] 
SAMZA-1121; StreamAppender should not propagate exceptions to the caller

StreamAppender#append currently propagates any exceptions while sending messages to the underlying logging system to the calling code. Since users don't expect log statements to throw exceptions, this can cause unexpected failures scenarios. We should catch exceptions and log to stderr instead.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #78 from prateekm/stream-appender-fix

16 months agoSAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner
Xinyu Liu [Tue, 7 Mar 2017 23:02:30 +0000 (15:02 -0800)] 
SAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner

Some refactoring/cleanup:

- rename ExecutionEnvironment to ApplicationRunner, including all the subclasses.
- rename the package to be org.apache.samza.runtime
- rename the StandalondApplicationRunner to be LocalApplicationRunner

Author: Xinyu Liu <xiliu@xiliu-ld.linkedin.biz>

Reviewers: Prateek Maheshwari <pmaheshwari@linkedin.com>

Closes #76 from xinyuiscool/SAMZA-1122 and squashes the following commits:

cff5206 [Xinyu Liu] Merge branch 'SAMZA-1122' of https://github.com/xinyuiscool/samza into SAMZA-1122
c341d3d [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner
6a71205 [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner