23 months agoResolve unit test compilation issue due to conflicting commits at the same time
Xinyu Liu [Thu, 16 Mar 2017 22:18:37 +0000 (15:18 -0700)] 
Resolve unit test compilation issue due to conflicting commits at the same time

23 months agoSAMZA-1067; Physical execution graph and planner for fluent API
Xinyu Liu [Thu, 16 Mar 2017 01:27:01 +0000 (18:27 -0700)] 
SAMZA-1067; Physical execution graph and planner for fluent API

Initial commit for the physical graph and plan. Design is there:

The commit includes:

1) Physical ProcessorGraph, where each processor represents a physical execution unit (e.g. a job in Yarn).
2) A planner does the following:
   - create ProcessorGraph from StreamGraph. For this phase, the graph only contains a single node (single stage);
   - figure out the partitions of intermediate topics
   - create the topics

Please note currently the planner is used in the remote runner for now. Further changes/refactoring/cleanup are expected to be integrated with local runner.

Author: Xinyu Liu <>

Reviewers: Jagadish Venkatraman <>

Closes #75 from xinyuiscool/SAMZA-1067

23 months agoFix an import issue on TestJoinOperator
vjagadish1989 [Tue, 14 Mar 2017 21:00:19 +0000 (14:00 -0700)] 
Fix an import issue on TestJoinOperator

23 months agoSAMZA-1091; Implement key-based inner join operator with no time constraints
Prateek Maheshwari [Tue, 14 Mar 2017 20:44:54 +0000 (13:44 -0700)] 
SAMZA-1091; Implement key-based inner join operator with no time constraints

Author: Prateek Maheshwari <>
Author: Prateek Maheshwari <>

Reviewers: Jagadish Venkatraman <>, Yi Pan <>

Closes #60 from prateekm/master

23 months agoSAMZA-1100; Exception when using a stream as both bootstrap and broadcast.
Shanthoosh Venkataraman [Mon, 13 Mar 2017 19:07:45 +0000 (12:07 -0700)] 
SAMZA-1100; Exception when using a stream as both bootstrap and broadcast.

When a task input stream is used as both broadcast and bootstrap stream in a samza job, Bootstrappingchooser marks the stream as bootstrapped when a single task finishes consuming all the SystemStreamPartitions(This happens when all the starting offset for each partition in the input stream is of type upcoming). This patch fixes this, by marking a stream as bootstrapped, only when all the systemStreamPartitions in a input stream is consumed by all the expected tasks.

More details here :

Author: Shanthoosh Venkataraman <>

Reviewers: Prateek Maheshwari<>

Closes #68 from shanthoosh/master

23 months agoSAMZA-1112; BrokerProxy does not log fatal errors
Tommy Becker [Sat, 11 Mar 2017 14:56:48 +0000 (06:56 -0800)] 
SAMZA-1112; BrokerProxy does not log fatal errors

Add an UncaughtExceptionHandler to the broker proxy thread so
failures there get logged.

Author: Tommy Becker <>

Reviewers: Jagadish <>

Closes #80 from twbecker/SAMZA-1112

23 months agoSAMZA-1123; Create intermediate stream in partitionBy() operator
Xinyu Liu [Fri, 10 Mar 2017 22:08:18 +0000 (14:08 -0800)] 
SAMZA-1123; Create intermediate stream in partitionBy() operator

For partitionBy() operator, Samza generates an intermediate stream with id based on operator name and id, and system based on config. The intermediate streams will be materialized later by different execution environments. For example, if the intermediate stream is a Kafka stream, the topic will be created before the application starts.

Also renamed the config from "job.runner.class" to "app.runner.class".

Author: Xinyu Liu <>

Reviewers: Prateek Maheshwari <>

Closes #79 from xinyuiscool/SAMZA-1123

23 months agoSAMZA-1124; Job coordinator with time out
Boris Shkolnik [Thu, 9 Mar 2017 01:16:51 +0000 (17:16 -0800)] 
SAMZA-1124; Job coordinator with time out

If a processor doesn't join the barrier for the TimeOut time - the barrier is cancelled. All the processor should unsubscribe from it.

Author: Boris Shkolnik <>
Author: Boris Shkolnik <>
Author: navina <>

Reviewers: Xinyu Liu <>

Closes #77 from sborya/JobCoordinatorWithTO and squashes the following commits:

ed055dd [Boris Shkolnik] checkstyle
1468902 [Boris Shkolnik] renambed a method
579d2e7 [Boris Shkolnik] Merge branch 'JobCoordinator' into JobCoordinatorWithTO
54ab688 [Boris Shkolnik] removed
3f95d46 [Boris Shkolnik] checkstyle
75f9a94 [Boris Shkolnik] added test for time out
b73ba32 [Boris Shkolnik] merge + test
5d67be0 [Boris Shkolnik] removed extra empty lines
9cf3c3e [Boris Shkolnik] addressed review comments
c47031d [Boris Shkolnik] added timeout for ZkBarrierForVersionUpgrade
93a55a9 [Boris Shkolnik] use write directly in the barrier when all joined
f89a037 [Boris Shkolnik] addressed some notes
4219720 [Boris Shkolnik] added JavaJobConfig
659ae7c [Boris Shkolnik] add ZkJobCoordinatorFactory
520a083 [Boris Shkolnik] added ZKJobCoordinatorFactory
a42218e [Boris Shkolnik] checkstyle
02ca658 [Boris Shkolnik] typo
c1fd0b2 [Boris Shkolnik] merge cleanup
1c8eef4 [Boris Shkolnik] merge cleanup
a7e014a [Boris Shkolnik] merged
b4a0642 [Boris Shkolnik] typo
33bacdd [Boris Shkolnik] merge
d1582a9 [Boris Shkolnik] missed method name change
4c481a2 [Boris Shkolnik] merge
b811f3d [Boris Shkolnik] changed to private final
5d43419 [Boris Shkolnik] addressed review comments
1a0c54d [Boris Shkolnik] fixed test
e19d77b [Boris Shkolnik] renamed method
0c5edab [Boris Shkolnik] makey tryBecomeALeader async
b34f6b7 [Boris Shkolnik] added java doc
c2305b6 [Boris Shkolnik] cleanup
f83fc57 [Boris Shkolnik] merge
f8c8a6d [Boris Shkolnik] make a smaller PR for publish functionality only
251aad7 [Boris Shkolnik] removed unneeded interface for real
9892dee [Boris Shkolnik] removed unneeded interface
f20d15f [Boris Shkolnik] some updates to JobCoordinator
bd53c07 [Boris Shkolnik] deleteing already committed files
18198d1 [Boris Shkolnik] added test for zk barrier
9dba992 [Boris Shkolnik] moved the Test in to test subdir
c16d864 [Boris Shkolnik] added test
817a7b6 [Boris Shkolnik] merge complete
1e5947f [Boris Shkolnik] merged
6506b48 [Boris Shkolnik] merged with latest
4290b13 [Boris Shkolnik] merged
43eb076 [Boris Shkolnik] checkstyle errors
e0c44fe [Boris Shkolnik] merged
8e8d833 [Boris Shkolnik] merge
e59d38c [Boris Shkolnik] merge with ZkController
6a71cf6 [Boris Shkolnik] renamed the listners
6cbcf6e [Boris Shkolnik] review comments
efbee84 [Boris Shkolnik] Checkstyle errors
132300c [Boris Shkolnik] converted ZkLeaderElector.tryBecomeLeader to async method
13a05d7 [Boris Shkolnik] merge
2d59e0c [Boris Shkolnik] merge
82c819b [Boris Shkolnik] merge
7ebe9a6 [Boris Shkolnik] check style
41b2e46 [Boris Shkolnik] refactoring to match the new ZkUtils constructor
b07d63a [Boris Shkolnik] merge JavaJobConfig
bdc953b [Boris Shkolnik] merged
4301372 [Boris Shkolnik] added tests
4d48d83 [Boris Shkolnik] merge
c9bb475 [Boris Shkolnik] added tests for ZkUtils
592e9bb [Boris Shkolnik] added missing functionality for ZkControllerImpl into zkUtils and zkKeyBuilder
b473a6e [Boris Shkolnik] Added the new file
3412ed4 [Boris Shkolnik] Renamed ZkListener to ZkControllerListener
fabddc9 [Boris Shkolnik] merge
ad9108a [Boris Shkolnik] merge with ScheduleAfterDebounceTime
ba583d6 [Boris Shkolnik] merge
3d6b993 [Boris Shkolnik] cleaned up
fe69e70 [Boris Shkolnik] merge
7f8125b [Boris Shkolnik] Merge branch 'master' into ZkTestUtils
eaf04bb [Boris Shkolnik] added more comments
358ae6b [Boris Shkolnik] Added test and addressed review comments
9b22eb6 [Boris Shkolnik] JobModelPublish
0ef90b6 [Boris Shkolnik] Merge branch 'JobModel' into JobModelPublish
9c59048 [Boris Shkolnik] merge
017fe79 [Boris Shkolnik] added tests
5c8aa20 [Boris Shkolnik] JobModel Generation using SimpleGroupByContainerCount
2c841e1 [Boris Shkolnik] added awaitStart
eeb69ca [Boris Shkolnik] Merge branch 'ZkBarrier' into JobCoordinator
dc26bd2 [Boris Shkolnik] added BarrierForVersionUpgrade
cfdb4c7 [Boris Shkolnik] Merge branch 'ZkController' into ZkBarrier
b28ba14 [Boris Shkolnik] ZkBarrier
c8d26ba [Boris Shkolnik] merged
efc4d03 [Boris Shkolnik] ZkController
c9b3fe4 [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime' into ZkController
f0cae7b [Boris Shkolnik] Merge branch 'LeaderElector' into ZkController
3df0def [Boris Shkolnik] ZkControllerImpl
4801613 [Boris Shkolnik] cleanup
d32045b [Boris Shkolnik] Merge branch 'ScheduleAfterDebounceTime1' into JobCoordinator
32f96b4 [Boris Shkolnik] added ScheduleAfterDebounce
ce83409 [Boris Shkolnik] cleanup
d7a7ccb [Boris Shkolnik] Merge branch 'LeaderElector' into ScheduleAfterDebounceTime
9c6b20a [Boris Shkolnik] added ZkListener
5f867c0 [Boris Shkolnik] added Apache license info
d0687b9 [Boris Shkolnik] Merge branch 'LeaderElector' into JobCoordinator
372829f [Boris Shkolnik] Merge branch 'master' into JobCoordinator
14db43d [Boris Shkolnik] added TestZkStreamProcessor - main manual test
b3b27c6 [Boris Shkolnik] Merge branch 'LeaderElector' into TestZkStreamProcessor
6649c80 [navina] Fixing ZkUtils close(). No need to close underlying connection explicitly
7f17e26 [Boris Shkolnik] ZkTestUtils
f904cd3 [Boris Shkolnik] ScheduleAfterDebounceTime
d126b10 [Boris Shkolnik] ScheduleAfterDebounceTime
63d8d60 [Boris Shkolnik] JavaJobConfig
ff15501 [Boris Shkolnik] JavaJobConfig
d20bacf [Boris Shkolnik] ZkTestUtils
737eb2f [Boris Shkolnik] ZkTestUtils
8e2d6c1 [Boris Shkolnik] add main manual test
7a47f84 [Boris Shkolnik] add main manual test
fa2186b [Boris Shkolnik] added ZkController
a0a7409 [Boris Shkolnik] Merge branch 'master' of
edda60d [navina] Removing an unintended change to the grouper
6dd6b8d [navina] Adding tests for ZkLeaderElector
1734f8f [navina] Adding tests for ZkUtils
317cf16 [navina] Adding tests for ZkKeyBuilder
aaaf24e [navina] Adding EmbeddedZookeeper for testing
37c2c8b [navina] Extracting files related to LeaderElection
76b5167 [Boris Shkolnik] added new line at then end

23 months agoSAMZA-1121; StreamAppender should not propagate exceptions to the caller
Prateek Maheshwari [Wed, 8 Mar 2017 08:26:46 +0000 (00:26 -0800)] 
SAMZA-1121; StreamAppender should not propagate exceptions to the caller

StreamAppender#append currently propagates any exceptions while sending messages to the underlying logging system to the calling code. Since users don't expect log statements to throw exceptions, this can cause unexpected failures scenarios. We should catch exceptions and log to stderr instead.

Author: Prateek Maheshwari <>

Reviewers: Jagadish <>

Closes #78 from prateekm/stream-appender-fix

23 months agoSAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner
Xinyu Liu [Tue, 7 Mar 2017 23:02:30 +0000 (15:02 -0800)] 
SAMZA-1122; Rename ExecutionEnvironment to ApplicationRunner

Some refactoring/cleanup:

- rename ExecutionEnvironment to ApplicationRunner, including all the subclasses.
- rename the package to be org.apache.samza.runtime
- rename the StandalondApplicationRunner to be LocalApplicationRunner

Author: Xinyu Liu <>

Reviewers: Prateek Maheshwari <>

Closes #76 from xinyuiscool/SAMZA-1122 and squashes the following commits:

cff5206 [Xinyu Liu] Merge branch 'SAMZA-1122' of into SAMZA-1122
c341d3d [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner
6a71205 [Xinyu Liu] SAMZA-1122: Rename ExecutionEnvironment to ApplicationRunner

23 months agoSAMZA-1096: StreamSpec constructors in the ExecutionEnvironments
Jacob Maes [Mon, 6 Mar 2017 22:52:18 +0000 (14:52 -0800)] 
SAMZA-1096: StreamSpec constructors in the ExecutionEnvironments

Author: Jacob Maes <>

Reviewers: Yi Pan (Data Infrastructure) <>,Xinyu Liu <>,Navina Ramesh <>

Closes #74 from jmakes/samza-1096

23 months agoSAMZA-1107:Job model publish
Boris Shkolnik [Wed, 1 Mar 2017 21:49:29 +0000 (13:49 -0800)] 
SAMZA-1107:Job model publish

add utils for publishing job model and job model version to ZK.

Author: Boris Shkolnik <>
Author: Boris Shkolnik <>
Author: navina <>

Reviewers: Navina Ramesh <>, Fred Ji <>

Closes #67 from sborya/JobModelPublish1

23 months agoSAMZA-1103: ZkBarrier
Boris Shkolnik [Wed, 1 Mar 2017 01:56:50 +0000 (17:56 -0800)] 
SAMZA-1103: ZkBarrier

SAMZA-1103: Barrier for JobModel upgrades. When all the processors got notification about the new JobModel, only after that they can start using the new model.

Author: Boris Shkolnik <>
Author: Boris Shkolnik <>
Author: navina <>

Reviewers: Fred Ji <>, Navina Ramesh <>, Xiliu Liu <>

Closes #61 from sborya/ZkBarrier

23 months agoFix a rendering issue in the Samza security web-page
vjagadish1989 [Tue, 28 Feb 2017 01:38:17 +0000 (17:38 -0800)] 
Fix a rendering issue in the Samza security web-page

23 months agoSAMZA-1104; fix yarn security page link from index.html page
Chen Song [Tue, 28 Feb 2017 01:31:29 +0000 (17:31 -0800)] 
SAMZA-1104; fix yarn security page link from index.html page

Author: Chen Song <>

Reviewers: Jagadish <>

Closes #62 from garlicbulb-puzhuo/SAMZA-1104

23 months agoFixing checkstyle error in StreamGraphImpl causing build failures
navina [Sat, 25 Feb 2017 02:17:29 +0000 (18:17 -0800)] 
Fixing checkstyle error in StreamGraphImpl causing build failures

23 months agoSAMZA-1102: Zk controller
Boris Shkolnik [Thu, 23 Feb 2017 22:02:05 +0000 (14:02 -0800)] 
SAMZA-1102: Zk controller

SAMZA-1102: Added ZKController and ZkControllerImpl

Author: Boris Shkolnik <>
Author: navina <>

Reviewers: Navina Ramesh <>, Fred Ji <>, Xinyu Liu <>

Closes #50 from sborya/ZkController

23 months agoSAMZA-1092: replace stream spec in fluent API
Yi Pan (Data Infrastructure) [Thu, 23 Feb 2017 20:48:56 +0000 (12:48 -0800)] 
SAMZA-1092: replace stream spec in fluent API

Replaced the StreamSpec class w/ the new one from SAMZA-1075.

Author: Yi Pan (Data Infrastructure) <>

Reviewers: Jacob Maes <>

Closes #58 from nickpan47/replace-stream-spec and squashes the following commits:

761ebb5 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class in system package
df953c2 [Yi Pan (Data Infrastructure)] SAMZA-1092: fix unit test
71331d8 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class
2fb19e9 [Yi Pan (Data Infrastructure)] SAMZA-1092: replace StreamSpec w/ new class in fluent API
ed3ad8e [Yi Pan (Data Infrastructure)] WIP: replace stream spec in fluent API

2 years agoSAMZA-1099: Documentation updates for Samza 0.12 release (for master branch)
vjagadish1989 [Wed, 22 Feb 2017 18:53:12 +0000 (10:53 -0800)] 
SAMZA-1099: Documentation updates for Samza 0.12 release (for master  branch)

2 years agoSAMZA-1097: update master branch to use 0.13.0-SNAPSHOT version
Yi Pan (Data Infrastructure) [Wed, 22 Feb 2017 01:16:58 +0000 (17:16 -0800)] 
SAMZA-1097: update master branch to use 0.13.0-SNAPSHOT version

Author: Yi Pan (Data Infrastructure) <>

Reviewers: jagadish <>

Closes #59 from nickpan47/SAMZA-1097

2 years agoSAMZA-1075: fix partitionCount assertion from PR53
Jacob Maes [Wed, 22 Feb 2017 00:49:37 +0000 (16:49 -0800)] 
SAMZA-1075: fix partitionCount assertion from PR53

nickpan47 here's the fix for the issue you found in PR53

Author: Jacob Maes <>

Reviewers: Yi Pan <>

Closes #57 from jmakes/samza-1075-2

2 years agoFix hyphens in url for committer instructions
Jacob Maes [Fri, 17 Feb 2017 23:37:22 +0000 (15:37 -0800)] 
Fix hyphens in url for committer instructions

2 years agoSAMZA-1075: Refactor SystemAdmin Interface to expose a common method to create streams
Jacob Maes [Fri, 17 Feb 2017 20:49:19 +0000 (12:49 -0800)] 
SAMZA-1075: Refactor SystemAdmin Interface to expose a common method to create streams

Author: Jacob Maes <>

Reviewers: Yi Pan (Data Infrastructure) <>

Closes #53 from jmakes/samza-1075

2 years agoSAMZA-1073: moving all operator classes into samza-core
Yi Pan (Data Infrastructure) [Thu, 16 Feb 2017 23:04:01 +0000 (15:04 -0800)] 
SAMZA-1073: moving all operator classes into samza-core

2 years agoSAMZA-1086; New Grouper for ZK based standalone.
Boris Shkolnik [Thu, 16 Feb 2017 19:40:12 +0000 (11:40 -0800)] 
SAMZA-1086; New Grouper for ZK based standalone.

Create new grouper with support for arbitrary container ids.
Add support for this list of container IDs in the JobModelManager.

Author: Boris Shkolnik <>

Reviewers: Xinyu Liu <>, Fred Ji <>

Closes #52 from sborya/JobModel

2 years agoSAMZA-1073: top-level fluent API
Yi Pan (Data Infrastructure) [Thu, 16 Feb 2017 18:18:09 +0000 (10:18 -0800)] 
SAMZA-1073: top-level fluent API

`Initial draft of top-level fluent API for operator DAGs

Author: Yi Pan (Data Infrastructure) <>

Reviewers: Xinyu Liu <>, Jacob Maes <>, Prateek Maheshwari <>

Closes #51 from nickpan47/samza-fluent-api-v1 and squashes the following commits:

001be63 [Yi Pan (Data Infrastructure)] SAMZA-1073: Addressing review feedbacks. Change StreamGraphFactory to StreamGraphBuilder.
373048a [Yi Pan (Data Infrastructure)] SAMZA-1073: top-level fluent API `

2 years agoSAMZA-1087: Schedule after debounce time
Boris Shkolnik [Thu, 16 Feb 2017 01:17:01 +0000 (17:17 -0800)] 
SAMZA-1087: Schedule after debounce time

SAMZA-1087: Allows scheduling an action (a Runnable) after some de-bounce delay.

Author: Boris Shkolnik <>
Author: Boris Shkolnik <>

Reviewers: Navina Ramesh <>, Fred Ji <>

Closes #49 from sborya/ScheduleAfterDebounceTime1

2 years agoSAMZA-1082 : Implement Leader Election using ZK
navina [Mon, 13 Feb 2017 21:32:06 +0000 (13:32 -0800)] 
SAMZA-1082 : Implement Leader Election using ZK

Simple implementation of leader election recipe along with unit tests

Author: navina <>

Reviewers: Xinyu Liu <>, Fred Ji <>

Closes #48 from navina/LeaderElector

2 years agoSAMZA-1083 Do not load task stores which are older than delete tombstones during...
Shanthoosh Venkataraman [Thu, 9 Feb 2017 00:07:27 +0000 (16:07 -0800)] 
SAMZA-1083 Do not load task stores which are older than delete tombstones during container startup

2 years agoFix integration test config issue in 0.12 release candidate release-0.12.0-rc2
vjagadish1989 [Mon, 6 Feb 2017 23:39:56 +0000 (15:39 -0800)] 
Fix integration test config issue in 0.12 release candidate

2 years agoFix javadoc issues introduced by Hdfs consumer release-0.12.0-rc1
Xinyu Liu [Thu, 2 Feb 2017 22:31:23 +0000 (14:31 -0800)] 
Fix javadoc issues introduced by Hdfs consumer

2 years agoUpdate Samza version to prepare for 0.12.0 Samza release
vjagadish1989 [Thu, 2 Feb 2017 22:09:40 +0000 (14:09 -0800)] 
Update Samza version to prepare for 0.12.0 Samza release

2 years agoAdd release PGP keys for Jagadish <> release-0.12.0-rc0
vjagadish1989 [Wed, 1 Feb 2017 21:37:23 +0000 (13:37 -0800)] 
Add release PGP keys for Jagadish <>

2 years agoUpdate to point to the most recent hello-samza release
vjagadish1989 [Tue, 31 Jan 2017 23:36:18 +0000 (15:36 -0800)] 
Update to point to the most recent hello-samza release

2 years agoSAMZA-1080 : Initial Standalone JobCoordinator and StreamProcessor API
navina [Tue, 31 Jan 2017 02:30:02 +0000 (18:30 -0800)] 
SAMZA-1080 : Initial Standalone JobCoordinator and StreamProcessor API

This patch contains changes associated with the Standalone StreamProcessor, where there is no coordination. This will work for load-balanced consumers like new Kafka consumer and statically partitioned cases.

Additionally, we have introduced TaskFactory for StreamTask and AsyncStreamTask.

Author: navina <>

Reviewers: xinyuiscool,fredji97

Closes #44 from navina/Noop-JC

2 years agoSAMZA-1074; Fix builds for hello-samza on various CDH versions
Dongkyu Hwangbo [Mon, 30 Jan 2017 23:56:56 +0000 (15:56 -0800)] 
SAMZA-1074; Fix builds for hello-samza on various CDH versions

This issue is mainly related of hello-samza, but modification of document is needed to help user understand this process well.

Author: Dongkyu Hwangbo <>

Reviewers: jvenkatr

Closes #38 from dkhwangbo/SAMZA-1074

2 years agoSAMZA-1079: Add timeouts for reads from HttpFileSystem. Add tests.
vjagadish1989 [Mon, 30 Jan 2017 23:10:06 +0000 (15:10 -0800)] 
SAMZA-1079: Add timeouts for reads from HttpFileSystem. Add tests.

* Wrote a unit/integration test to simulate a stuck connection when reading binaries for the job.
Other misc. changes:
- Moved some debug log messages to be info for better debugging.

Author: vjagadish1989 <>

Reviewers: jmakes,nickpan47

Closes #42 from vjagadish/http-fs

2 years agoSAMZA-984; Upgraded RocksDB version to 5.0.1 and added configuration for managing...
Prateek Maheshwari [Mon, 30 Jan 2017 22:58:50 +0000 (14:58 -0800)] 
SAMZA-984; Upgraded RocksDB version to 5.0.1 and added configuration for managing RocksDB logging.

Author: Prateek Maheshwari <>

Reviewers: Jake Maes <>, Jagadish <>

Closes #46 from prateekm/rocksdb-upgrade

2 years agoSAMZA-1077: SamzaContainer should catch all Throwables instead of only exceptions
vjagadish1989 [Mon, 30 Jan 2017 22:41:57 +0000 (14:41 -0800)] 
SAMZA-1077: SamzaContainer should catch all Throwables instead of only exceptions

Author: vjagadish1989 <>

Reviewers: Jake Maes <>

Closes #30 from vjagadish1989/samza-1077

2 years agoSpecification of various Window and Trigger APIs in Samza
vjagadish1989 [Mon, 30 Jan 2017 21:50:16 +0000 (13:50 -0800)] 
Specification of various Window and Trigger APIs in Samza

- Defined APIs for specifying different types of windows - sessions, tumbling, global and keyed variants.
- Defined APIs for specifying early and late triggers for a window.
- Standardized all above Window types to be expressed as a combination of default, early and late triggers.
- Defined classes for different types of trigger specifications.
- Hide the WindowState class from programmers and move it from samza-api to samza-operator. We can choose to add it later if need be.
- Removed some implementation classes in Window and Trigger. We can revisit them later when we implement Windows.
- New API for specifying Time durations meaningfully in Samza.
- Unit tests for most of the above changes.
- Misc. Documentation, readability related changes to public APIs.

Author: vjagadish1989 <>

Reviewers: Yi Pan <>, Prateek Maheshwari <>

Closes #30 from vjagadish/samza-operator-v3

2 years agoSAMZA-1025: Documentation for HdfsSystemConsumer
Hai Lu [Mon, 30 Jan 2017 18:58:15 +0000 (10:58 -0800)] 
SAMZA-1025: Documentation for HdfsSystemConsumer

2 years agoSAMZA-1078: Add my gpg key to KEYS
Xinyu Liu [Wed, 11 Jan 2017 18:42:26 +0000 (10:42 -0800)] 
SAMZA-1078: Add my gpg key to KEYS

Author: Xinyu Liu <>

Closes #41 from xinyuiscool/KEYS

2 years agoSAMZA-1076; getKafkaChangelogEnabledStores() should use StorageConfig.getChangelo…
Boris Shkolnik [Fri, 6 Jan 2017 20:22:16 +0000 (12:22 -0800)] 
SAMZA-1076; getKafkaChangelogEnabledStores() should use StorageConfig.getChangelo…

getKafkaChangelogEnabledStores() should use StorageConfig.getChangelogStream to get changelog

Author: Boris Shkolnik <>

Reviewers: xiliu <>

Closes #39 from sborya/KafkConfigForChangelogStream

2 years agoFix checkstyle failure
Yi Pan (Data Infrastructure) [Sat, 24 Dec 2016 00:13:00 +0000 (16:13 -0800)] 
Fix checkstyle failure

2 years agoSAMZA-1065: Change the commit order to support at least once processing when using...
Prateek Maheshwari [Fri, 23 Dec 2016 23:08:55 +0000 (15:08 -0800)] 
SAMZA-1065: Change the commit order to support at least once processing when using local state store for deduping.

Author: Prateek Maheshwari <>

Reviewers: Yi Pan <>, Jagadish <jagadish1989@gmail,com>

Closes #35 from prateekm/commit-order

2 years agoSAMZA-1069: Fix Deadlock between KafkaSystemProducer and KafkaProducer
Xinyu Liu [Fri, 23 Dec 2016 22:32:01 +0000 (14:32 -0800)] 
SAMZA-1069: Fix Deadlock between KafkaSystemProducer and KafkaProducer

Moving the producer.close() and sources.flush() outside the lock so it won't have race condition with the kafka network thread callbacks.

Author: Xinyu Liu <>

Reviewers: Yi Pan <>

Closes #37 from xinyuiscool/SAMZA-1069

2 years agoSAMZA-1066 : JavaStorageConfig handling job.changelog.system
Boris Shkolnik [Fri, 23 Dec 2016 20:29:45 +0000 (12:29 -0800)] 
SAMZA-1066 : JavaStorageConfig handling job.changelog.system


Same as change for
Allows user to set changelog system in job.changelog.system and specify stream only in 'stores.<store>.changelog'

Author: Boris Shkolnik <>

Reviewers: Navina Ramesh <>

Closes #36 from sborya/JavaStorageConfig

2 years agoSAMZA-855; Upgrade Samza's Kafka client version to
Shanthoosh Venkataraman [Fri, 23 Dec 2016 20:22:55 +0000 (12:22 -0800)] 
SAMZA-855; Upgrade Samza's Kafka client version to

This is based out of PR 15, with merge conflicts resolved and version number of zkClient set to 0.8(compatible with Kafka version).

Tested and validated this patch with internal Samza build at LinkedIn. This looks good.

Author: Shanthoosh Venkataraman <>

Reviewers: Yi Pan <>

Closes #33 from shanthoosh/kafka_10_upgrade

2 years agoSAMZA-1031: Update to Java 1.8 source compatibility in Samza
Shanthoosh Venkataraman [Thu, 22 Dec 2016 07:06:58 +0000 (23:06 -0800)] 
SAMZA-1031: Update to Java 1.8 source compatibility in Samza

Author: Shanthoosh Venkataraman <>

Reviewers: Yi Pan <>

Closes #34 from shanthoosh/update_check_all_script

2 years agoSAMZA-469: Update Scala version to 2.11
McIntosh, Craig [Thu, 22 Dec 2016 01:25:31 +0000 (17:25 -0800)] 
SAMZA-469: Update Scala version to 2.11

Author: McIntosh, Craig <>

Reviewers: Yi Pan <>, Shanthoosh Venkataraman <>

Closes #28 from craigtmc/master-scala-2.11 and squashes the following commits:

2705376 [McIntosh, Craig] Fix scala version array in
94f1f7c [McIntosh, Craig] SAMZA-469: Update Scala version to 2.11

2 years agoSAMZA-1062: add docs for the new job.changelog.system config
Boris Shkolnik [Fri, 16 Dec 2016 21:56:21 +0000 (13:56 -0800)] 
SAMZA-1062: add docs for the new job.changelog.system config

Author: Boris Shkolnik <>

Reviewers: Jagadish <>

Closes #32 from sborya/master

2 years agoSAMZA-1060 add new config job.changelog.system
Boris Shkolnik [Thu, 15 Dec 2016 02:24:10 +0000 (18:24 -0800)] 
SAMZA-1060 add new config job.changelog.system

Allow to specify a changelog system separately, so user can only specify stream name for each store.
If user specifies both (system and stream) it overwrites the job.changelog.system setting.

Author: Boris Shkolnik <>

Reviewers: navina

Closes #31 from sborya/master

2 years agoAdding Build Status to
Navina Ramesh [Mon, 12 Dec 2016 19:51:17 +0000 (11:51 -0800)] 
Adding Build Status to

I setup a Jenkins Build Job to  get triggered every time changes are pushed to samza repository.

In case of build failures, it will send an email to We can customize this build as need arises.

Author: Navina Ramesh <>

Reviewers: nickpan47

Closes #27 from navina/build-icon

2 years agoSAMZA-1054: Refactor Operator APIs
Prateek Maheshwari [Thu, 1 Dec 2016 22:50:52 +0000 (14:50 -0800)] 
SAMZA-1054: Refactor Operator APIs

Some suggestions for an Operator API refactor and misc. cleanup. It does contain some implementation changes, mostly due to deleted, extracted or merged classes. (e.g. OperatorFactory + ChainedOperators == OperatorImpls).

Since git marked several moved classes as (delete + new) instead, it's probably best to apply the diff locally and  browse the code in an IDE.

Some of the changes, in no particular order:
* Extracted XFunction interfaces into a .functions package in -api.
* -api's internal.Operators is now the -operators's spec.* package. Extracted interfaces and classes. Factory methods are now in OperatorSpecs.
* -api's MessageStreams is now -api's MessageStream interface and -operators's MessageStreamImpl.
* -api's internal.Windows classes are now in -api's .window package. Extracted interfaces and classes, but no implementation changes.
* OperatorFactory + ChainedOperators is now OperatorImpls, which is used from StreamOperatorAdaptorTask.
* Added a NoOpOperatorImpl, which acts as the root node for the OperatorImpl DAG returned by OperatorImpls.
* Removed usages of reactivestreams APIs since current code looks simpler without them. We can add them back when we need features like backpressure etc.
* Removed the InputSystemMessage interface.
* Made field names consistent (e.g Fn suffix for functions everywhere etc.).
* Some method/class visibility changes due to moved classes.
* General documentation changes, mostly to make public APIs clearer.

There are additional questions/tasks that we can address in future RBs:
* Updating Window and Trigger APIs.
* Merging samza-operator into samza-core.
* Questions about Message timestamp and Offset comparison semantics.
* Questions about OperatorSpec serialization (e.g. ID generation).
* Questions about StateStoreImpl and StoreFunctions.

Author: Prateek Maheshwari <>

Reviewers: Yi Pan <>, Jagadish <>

Closes #25 from prateekm/master

2 years agoSAMZA-1047: testEndOfStreamWithOutOfOrderProcess is flaky
Branislav Cogic [Wed, 30 Nov 2016 22:34:49 +0000 (14:34 -0800)] 
SAMZA-1047: testEndOfStreamWithOutOfOrderProcess is flaky

Chooser always pools end of stream message until final callback is trigered so process-enelopes metrics didn't match. Changed the test methods to return null envelopes after end of stream message.

Deleted unused boolean variable "completed" in AsyncTaskWorker

Author: banecogic <>

Reviewers: Yi Pan <>

Closes #24 from banecogic/SAMZA-1047

2 years agoSAMZA-1055: Disable broken tests in SamzaRest
Shanthoosh Venkataraman [Wed, 30 Nov 2016 22:06:54 +0000 (14:06 -0800)] 
SAMZA-1055: Disable broken tests in SamzaRest

Disables a broken test in SamzaRest due to Jetty version upgrade in Samza. This is a temporary solution just to keep the build green on master. Longer term solution is to mock the Jetty objects properly through Mockito.

Author: Shanthoosh Venkataraman <>

Reviewers: Yi Pan <>

Closes #26 from shanthoosh/master

2 years agoSAMZA-1048 : upgrade jetty dependency to Jetty 9 from Jetty 8
Fred Ji [Wed, 30 Nov 2016 00:17:37 +0000 (16:17 -0800)] 
SAMZA-1048 : upgrade jetty dependency to Jetty 9 from Jetty 8

Jetty 8 is a very old jetty version and the current widely used version is Jetty 9. If a user is using standalone Samza in a Jetty container, and he/she is using Jetty 9, he/she may see some incompatibility issue and it makes a lot of sense for a user to upgrade the Jetty version on his/her side instead of downgrading the Jetty version.

Author: Fred Ji <>

Reviewers: Yi Pan <>

Closes #20 from fredji97/master

2 years agoSAMZA-1051: merge operator APIs to master
Yi Pan (Data Infrastructure) [Tue, 22 Nov 2016 23:47:48 +0000 (15:47 -0800)] 
SAMZA-1051: merge operator APIs to master

2 years agoSAMZA-1046: Docs for checkpointable consumer
Boris Shkolnik [Tue, 22 Nov 2016 22:36:09 +0000 (14:36 -0800)] 
SAMZA-1046: Docs for checkpointable consumer

2 years agoSAMZA-1049 Enable support for reporting metrics from Monitors in SamzaRest. 21/head
Shanthoosh Venkataraman [Tue, 22 Nov 2016 00:55:35 +0000 (16:55 -0800)] 
SAMZA-1049 Enable support for reporting metrics from Monitors in SamzaRest.

2 years agoSAMZA-1042 - Allow offset notifications for input systems
Boris Shkolnik [Mon, 14 Nov 2016 18:30:07 +0000 (10:30 -0800)] 
SAMZA-1042 - Allow offset notifications for input systems

2 years agoSAMZA-1043: Samza performance improvements
Xinyu Liu [Wed, 9 Nov 2016 19:09:32 +0000 (11:09 -0800)] 
SAMZA-1043: Samza performance improvements

2 years agoFix a minor typo in the web-page
vjagadish1989 [Tue, 1 Nov 2016 00:40:24 +0000 (17:40 -0700)] 
Fix a minor typo in the web-page

2 years agoSAMZA-1033: Remove import-control from checkstyle
Navina Ramesh [Tue, 1 Nov 2016 00:14:25 +0000 (17:14 -0700)] 
SAMZA-1033: Remove import-control from checkstyle

Removing import-control after discussion here - [](

Author: Navina Ramesh <>

Reviewers: Jagadish Venkataraman <>

Closes #17 from navina/SAMZA-1033

2 years agoSAMZA-1014: Add property to set YARN AM cpu cores
Maxim Logvinenko [Fri, 21 Oct 2016 01:01:48 +0000 (18:01 -0700)] 
SAMZA-1014: Add property to set YARN AM cpu cores

2 years agoSAMZA-1013: Add node labeling support for Yarn
Maxim Logvinenko [Fri, 21 Oct 2016 00:57:40 +0000 (17:57 -0700)] 
SAMZA-1013: Add node labeling support for Yarn

2 years agoSAMZA-1017 - Added disk quota based throttling to AsyncRunLoop
Prateek Maheshwari [Wed, 19 Oct 2016 19:04:52 +0000 (12:04 -0700)] 
SAMZA-1017 - Added disk quota based throttling to AsyncRunLoop

2 years agoSAMZA-1039 Selective loading of monitors in samza rest
Shanthoosh Venkataraman [Wed, 19 Oct 2016 18:42:36 +0000 (11:42 -0700)] 
SAMZA-1039 Selective loading of monitors in samza rest

2 years agoSAMZA-1040: Revert the ClassLoaderHelper change in SamzaContainer
Xinyu Liu [Wed, 19 Oct 2016 18:27:50 +0000 (11:27 -0700)] 
SAMZA-1040: Revert the ClassLoaderHelper change in SamzaContainer

2 years agoSAMZA-1029: Prepare release candidate for 0.11.0
Xinyu Liu [Mon, 17 Oct 2016 23:27:18 +0000 (16:27 -0700)] 
SAMZA-1029: Prepare release candidate for 0.11.0

2 years agoSAMZA-967: HDFS System Consumer
Hai Lu [Wed, 5 Oct 2016 18:39:17 +0000 (11:39 -0700)] 
SAMZA-967: HDFS System Consumer

2 years agoSAMZA-1012 Generated changelog mappings are not consistent
Tommy Becker [Tue, 4 Oct 2016 15:54:10 +0000 (08:54 -0700)] 
SAMZA-1012 Generated changelog mappings are not consistent

2 years agoSAMZA-974 Support finite data sources that have a notion of end of stream
vjagadish1989 [Sat, 1 Oct 2016 06:07:30 +0000 (23:07 -0700)] 
SAMZA-974 Support finite data sources that have a notion of end of stream

2 years agoSAMZA-1016: Fix support for joint compilation of scala and java sources
vjagadish1989 [Fri, 30 Sep 2016 18:47:53 +0000 (11:47 -0700)] 
SAMZA-1016: Fix support for joint compilation of scala and java sources

2 years agoSAMZA-1015 Add support for streams, Lambdas and other JDK 8 like features in checkstyle
vjagadish1989 [Fri, 30 Sep 2016 18:35:06 +0000 (11:35 -0700)] 
SAMZA-1015 Add support for streams, Lambdas and other JDK 8 like features in checkstyle

2 years agoSAMZA-1030; Add documentation for change in the contribution process
Navina Ramesh [Fri, 30 Sep 2016 01:09:04 +0000 (18:09 -0700)] 
SAMZA-1030; Add documentation for change in the contribution process

I re-organized some of the website files and created a "Contributor's Corner" that collates all info related to new contributors and committers

Also, updated the merge script with constants to match from the website documentation.

Author: Navina Ramesh <>

Reviewers: Jagadish <>

Closes #16 from navina/adding-docs

2 years agoUpdate version for 0.11.0 release release-0.11.0-rc0
Xinyu Liu [Thu, 29 Sep 2016 22:45:17 +0000 (15:45 -0700)] 
Update version for 0.11.0 release

2 years agoSAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for tracking...
Xinyu Liu [Thu, 29 Sep 2016 21:51:23 +0000 (14:51 -0700)] 
SAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for tracking producer exceptions

2 years agoSAMZA-995: Remove deprecated property yarn.container.count from config table.
Xinyu Liu [Thu, 29 Sep 2016 18:50:42 +0000 (11:50 -0700)] 
SAMZA-995: Remove deprecated property yarn.container.count from config table.

2 years agoSAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in the...
vjagadish1989 [Thu, 29 Sep 2016 18:48:22 +0000 (11:48 -0700)] 
SAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in the 0.10 release.

2 years agoSAMZA-1023 ContainerProcessManager spams the job coordinator log with 'TaskManager...
Branislav Cogic [Thu, 29 Sep 2016 13:56:24 +0000 (06:56 -0700)] 
SAMZA-1023 ContainerProcessManager spams the job coordinator log with 'TaskManager state' messages

2 years agoFixed typo in
Anatolie Lupacescu [Wed, 28 Sep 2016 05:05:09 +0000 (22:05 -0700)] 
Fixed typo in

Author: Anatolie Lupacescu <>

Reviewers: Navina Ramesh <>

Closes #11 from anatollupacescu/docs-typo-fix

2 years agoSAMZA-880 - Moving to github/pull-request for code review and check-in
Jagadish [Wed, 28 Sep 2016 02:40:21 +0000 (19:40 -0700)] 
SAMZA-880 - Moving to github/pull-request for code review and check-in

2 years agoSAMZA-1024 Enable support for defining custom configuration for individual monitors
Shanthoosh Venkataraman [Tue, 27 Sep 2016 19:15:56 +0000 (12:15 -0700)] 
SAMZA-1024 Enable support for defining custom configuration for individual monitors

2 years agoSAMZA-927: added docs for split deployment
Boris Shkolnik [Tue, 27 Sep 2016 01:07:10 +0000 (18:07 -0700)] 
SAMZA-927: added docs for split deployment

2 years agoAdd Jagadish and Xinyu to committers
Xinyu Liu [Mon, 26 Sep 2016 23:14:46 +0000 (16:14 -0700)] 
Add Jagadish and Xinyu to committers

2 years agoFix SAMZA-1018.
Tommy Becker [Mon, 19 Sep 2016 12:21:12 +0000 (08:21 -0400)] 
Fix SAMZA-1018.

Check error code from metadata fetch in getSystemStreamPartitionCounts to avoid returning no data for newly created topics.

2 years agoSAMZA-977: User doc for samza multi-threading
Xinyu Liu [Thu, 22 Sep 2016 23:44:30 +0000 (16:44 -0700)] 
SAMZA-977: User doc for samza multi-threading

2 years agoAdd Jake Maes to commiter page
Jacob Maes [Thu, 22 Sep 2016 23:27:06 +0000 (16:27 -0700)] 
Add Jake Maes to commiter page

2 years agoSAMZA-1021 : Remove the redundent poll waiting inside AsyncRunLoop blockIfBusy
Xinyu Liu [Thu, 22 Sep 2016 18:42:13 +0000 (11:42 -0700)] 
SAMZA-1021 : Remove the redundent poll waiting inside AsyncRunLoop blockIfBusy

2 years agoSAMZA-1022 - convert warn() to debug() for 'hostname .. resolves to a loopback address'
Boris Shkolik [Wed, 21 Sep 2016 18:52:54 +0000 (11:52 -0700)] 
SAMZA-1022 - convert warn() to debug() for 'hostname .. resolves to a loopback address'

2 years agoSAMZA 998: Documentation updates for refactored Job Coordinator
Jagadish Venkatraman [Tue, 20 Sep 2016 06:37:10 +0000 (23:37 -0700)] 
SAMZA 998: Documentation updates for refactored Job Coordinator

2 years agoSAMZA-842: Job does not quit when task shutdown is requested using ThreadJobFactory
Tommy Becker [Mon, 12 Sep 2016 22:26:10 +0000 (15:26 -0700)] 
SAMZA-842: Job does not quit when task shutdown is requested using ThreadJobFactory

2 years agoSAMZA-702 - Document the significance of all the different metrics emitted by Samza...
Branislav Cogic [Wed, 14 Sep 2016 01:10:41 +0000 (18:10 -0700)] 
SAMZA-702 - Document the significance of all the different metrics emitted by Samza out of the box

2 years agoSAMZA-1007 - Broadcast streams incompatible with GroupBySystemStreamPartitionFactory
Neil Fordyce [Sun, 11 Sep 2016 00:30:23 +0000 (17:30 -0700)] 
SAMZA-1007 - Broadcast streams incompatible with GroupBySystemStreamPartitionFactory

2 years agoSAMZA-1005 - Refactor class instantiation code to a helper class
Branislav Cogic [Sat, 10 Sep 2016 23:43:30 +0000 (16:43 -0700)] 
SAMZA-1005 - Refactor class instantiation code to a helper class

2 years agoSAMZA-1004: Fix some logging and javadoc issues for AsyncStreamTask
Xinyu Liu [Wed, 7 Sep 2016 22:29:37 +0000 (15:29 -0700)] 
SAMZA-1004: Fix some logging and javadoc issues for AsyncStreamTask

2 years agoSAMZA-976 - Samza REST Documentation
Jacob Maes [Tue, 23 Aug 2016 06:50:10 +0000 (23:50 -0700)] 
SAMZA-976 - Samza REST Documentation

2 years agoSAMZA-975 - Initial Samza REST Implementation
Jacob Maes [Tue, 23 Aug 2016 06:38:13 +0000 (23:38 -0700)] 
SAMZA-975 - Initial Samza REST Implementation

2 years agoSAMZA-1003: Restore lazy init for kafka system producer
Xinyu Liu [Thu, 18 Aug 2016 23:32:13 +0000 (16:32 -0700)] 
SAMZA-1003: Restore lazy init for kafka system producer