samza.git
47 hours agoSAMZA-1595: Fix scalacCompileOptions format to build with zinc scala compiler. master
Shanthoosh Venkataraman [Fri, 23 Feb 2018 00:44:00 +0000 (16:44 -0800)] 
SAMZA-1595: Fix scalacCompileOptions format to build with zinc scala compiler.

The Zinc Scala compiler (part of Gradle versions >= 3.0) expects the Scala compilation arguments as a list, where each compilation argument is one element of the list.

In Samza, the compilation arguments are concatenated into a single string and passed to the compiler.

This causes build failures when Samza is built with the Zinc Scala compiler.

The existing Ant Scala compiler used to build Samza in open source accepts the compilation arguments both as a list and as a string.
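
To illustrate the difference between the two argument shapes, here is a minimal Java sketch (not the actual Gradle build change). The class name, the helper `toArgList`, and the option values are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ScalacArgsSketch {
    // Turns one whitespace-separated option string into a list with one
    // element per option. Options whose values contain spaces are not handled.
    static List<String> toArgList(String concatenated) {
        String trimmed = concatenated.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // Hypothetical option values.
        String concatenated = "-deprecation -unchecked -feature";
        // Shape that a list-based compiler would misread: a single element.
        System.out.println(Collections.singletonList(concatenated).size());
        // Shape it expects: one element per option.
        System.out.println(toArgList(concatenated).size());
    }
}
```

A compiler front end that takes a list treats the single-element form as one unknown option, which is consistent with the build failure described above.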

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #430 from shanthoosh/master

2 days agoSAMZA-1594: Remove ScalaCompileOptions to make samza codebase build with gradle versi...
Shanthoosh Venkataraman [Thu, 22 Feb 2018 23:58:39 +0000 (15:58 -0800)] 
SAMZA-1594: Remove ScalaCompileOptions to make samza codebase build with gradle version > 3.0.

When the samza repository is built with a gradle version greater than 3.0, we see the following build failure:

No such property: useAnt for class: org.gradle.api.tasks.scala.ScalaCompileOptions

This needs to be fixed to build samza with gradle version >= 3.0.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #427 from shanthoosh/fix_scalac_compile_options

2 days agoSAMZA-1593: Upgrade gradle nexus plugin.
Shanthoosh Venkataraman [Thu, 22 Feb 2018 20:39:10 +0000 (12:39 -0800)] 
SAMZA-1593: Upgrade gradle nexus plugin.

The nexus plugin that the samza repository currently uses is not compatible with gradle versions greater than 3.0.

For the ligradle migration at LinkedIn, we need to update the nexus gradle plugin to the latest version.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #426 from shanthoosh/master

3 days agoFixes to get the build working with Scala 2.10 build
Srinivasulu Punuru [Wed, 21 Feb 2018 19:45:23 +0000 (11:45 -0800)] 
Fixes to get the build working with Scala 2.10 build

These are the fixes needed to get the build working with Scala 2.10.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Jacob M<jmakes@apache.org>, Bharath K<bkumaras@linkedin.com>, Xinyu L<xiliu@linkedin.com>

Closes #424 from srinipunuru/rel-fixes.1

8 days agoSAMZA-1555: Move creation of checkpoint and changelog streams to the Job Coordinators
Daniel Nishimura [Fri, 16 Feb 2018 22:01:06 +0000 (14:01 -0800)] 
SAMZA-1555: Move creation of checkpoint and changelog streams to the Job Coordinators

**Overview**
The purpose of this PR is to consolidate the creation of the changelog and checkpoint streams into the JobCoordinators. In the current state, the changelog stream is created from the JobModelManager and the checkpoint stream is created within the OffsetManager. The issue with creating the checkpoint in the OffsetManager is that the first call happens from the first SamzaContainer that runs and each subsequent SamzaContainer run will attempt to create the checkpoint stream.

**Motivations**
There are three driving forces for this refactoring. The first motivation is to assign the creation of the changelog and checkpoint streams to the JobCoordinators, where it is most appropriate. This was discussed in more detail with nickpan47. The second motivation is to have any potential failure in stream creation happen no later than during job coordination. The third motivation is to accommodate future security work to provide a robust way to set ACLs on streams.

Follow on to this PR will be: https://issues.apache.org/jira/browse/SAMZA-1564

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Yi Pan <nickpan47@gmail.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #413 from dnishimura/samza-1555-move-changelog-checkpoint-creation and squashes the following commits:

1102314 [Daniel Nishimura] Fix a comment change that Intellij inadvertently refactored.
2b2eb16 [Daniel Nishimura] Trigger CI again
366b7b7 [Daniel Nishimura] Trigger CI
dce36b6 [Daniel Nishimura] Add isBootstrapped flag back to CoordinatorStreamSystemConsumer.
2efcfd4 [Daniel Nishimura] Trigger CI
6b2d912 [Daniel Nishimura] Changes related to Yi's latest review.
9d02e7a [Daniel Nishimura] Change createChangeLogStream to a static method.
effec24 [Daniel Nishimura] Cleanup from latest code review.
1bdfda7 [Daniel Nishimura] CoordinatorStreamManager lifecycle refactoring.
5178ca5 [Daniel Nishimura] Refactor AbstractCoordinatorStreamManager as a concrete class.
894af9c [Daniel Nishimura] Merge from master branch.
4009a0b [Daniel Nishimura] Changes from code review.
3a12a75 [Daniel Nishimura] Separation of changelog manager and jobmodel manager. Create CoordinatorStream class to encapsulate creation and management of coordinator stream consumer and producer.
c188adb [Daniel Nishimura] Merge from master branch.
971fa91 [Daniel Nishimura] Move the responsibility of changelog and checkpoint stream creation to the job coordinators.

8 days agoSAMZA-1588: Add random jitter to monitor’s scheduling interval.
Shanthoosh Venkataraman [Fri, 16 Feb 2018 18:34:21 +0000 (10:34 -0800)] 
SAMZA-1588: Add random jitter to monitor’s scheduling interval.

We’ve observed in LinkedIn execution environments that all the monitors running on the YARN node-manager machines hit an external service at the same time, based on the configured monitor scheduling interval.

To eliminate the resulting spike in monitor executions and the congestion it causes on the external service, it’s essential to add a random jitter to the monitor scheduling interval.

Random jitter is added to the monitor scheduling interval when a boolean configuration is enabled.
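
A minimal Java sketch of the idea, not the actual monitor code. The method name `nextDelayMs` and the jitter bound of 10% of the interval are assumptions; the commit message does not state the range.

```java
import java.util.Random;

final class MonitorJitterSketch {
    // Returns the delay before the next monitor run. When jitter is enabled,
    // a random amount in [0, max(1, intervalMs / 10)] is added so that
    // monitors on different hosts stop firing at the same instant.
    static long nextDelayMs(long intervalMs, boolean jitterEnabled, Random random) {
        if (!jitterEnabled || intervalMs <= 0) {
            return intervalMs;
        }
        long maxJitterMs = Math.max(1L, intervalMs / 10); // assumed bound
        long jitterMs = (long) (random.nextDouble() * (maxJitterMs + 1));
        return intervalMs + jitterMs;
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(nextDelayMs(60_000L, true, random));
        }
    }
}
```

Keeping the jitter behind a boolean flag means existing deployments keep their exact interval unless they opt in.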

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #422 from shanthoosh/SAMZA-1588

9 days agoSupport source types where the last part of the source is not the streamName
Srinivasulu Punuru [Thu, 15 Feb 2018 23:46:17 +0000 (15:46 -0800)] 
Support source types where the last part of the source is not the streamName

Contains the following fixes:

1. Right now the Samza SQL framework assumes that the last part of the source is the stream name; this change removes that assumption.
2. Made consoleLoggingSystemFactory log formatted JSON so that it's easily readable.
3. Added support in SamzaSqlRelMessage for cases where the key may not be present.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #421 from srinipunuru/four-part.1

10 days agoSAMZA-1568: Handle ZkInterruptedException in zkclient.close.
Shanthoosh Venkataraman [Wed, 14 Feb 2018 03:30:03 +0000 (19:30 -0800)] 
SAMZA-1568: Handle ZkInterruptedException in zkclient.close.

When ZooKeeper session failures occur in a stream processor, it leaves the group (zkClient is closed) and joins the group again.

The last step in that shutdown sequence is zkClient.close(). In some scenarios, it throws the following exception:

    org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
    at org.I0Itec.zkclient.ZkClient.close(ZkClient.java:1278)
    at org.apache.samza.zk.ZkControllerImpl.stop(ZkControllerImpl.java:92)
    at org.apache.samza.zk.ZkJobCoordinator.stop(ZkJobCoordinator.java:141)

In the existing implementation this is not handled, thereby killing the stream processor. The following code path triggers this exception:

`StreamProcessor.stop -> ZkJobCoordinator.stop() -> zkController.stop() -> zkUtils.close`

This exception causes the integration test to fail occasionally and can cause the LocalApplicationRunner.waitForFinish method call to block indefinitely (since the success of this callback event updates the latch state required for waitForFinish to end).
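
A minimal Java sketch of handling the exception at the close call, assuming stand-in types so the example is self-contained. `InterruptedCloseException` and `ZkClientLike` are hypothetical placeholders for `ZkInterruptedException` and the real client; restoring the interrupt flag is a design choice of this sketch, not something the commit message confirms.

```java
final class ZkCloseSketch {
    // Stand-in for org.I0Itec.zkclient.exception.ZkInterruptedException.
    static final class InterruptedCloseException extends RuntimeException {
        InterruptedCloseException(Throwable cause) {
            super(cause);
        }
    }

    // Stand-in for the client being closed.
    interface ZkClientLike {
        void close();
    }

    // Closes the client and reports whether the close finished cleanly.
    // The exception is caught here so it cannot abort the rest of the
    // shutdown sequence in the caller.
    static boolean closeQuietly(ZkClientLike client) {
        try {
            client.close();
            return true;
        } catch (InterruptedCloseException e) {
            // Keep the interrupt visible to the caller instead of losing it.
            Thread.currentThread().interrupt();
            System.err.println("Interrupted while closing client: " + e.getCause());
            return false;
        }
    }

    public static void main(String[] args) {
        boolean clean = closeQuietly(() -> {
            throw new InterruptedCloseException(new InterruptedException());
        });
        // Thread.interrupted() also clears the flag set by closeQuietly.
        System.out.println("clean=" + clean + ", interrupted=" + Thread.interrupted());
    }
}
```

With the exception contained, the remaining shutdown steps (including whatever releases the latch that waitForFinish waits on) still get a chance to run.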

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #416 from shanthoosh/zk_utils_close

11 days agoSAMZA-1572: Add fixed retries on failure in KafkaCheckpointManager.
Shanthoosh Venkataraman [Tue, 13 Feb 2018 19:22:34 +0000 (11:22 -0800)] 
SAMZA-1572: Add fixed retries on failure in KafkaCheckpointManager.

KafkaCheckpointManager.writeCheckpoint currently goes into an infinite loop when an irrecoverable failure happens; this indefinitely blocks the commit phase (thereby preventing processing). Added finite retries (50), which retry for a fixed time in case of failure before giving up.
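
A minimal Java sketch of a bounded retry loop, not the actual KafkaCheckpointManager code. The helper `retry`, its parameters, and the fixed backoff are assumptions; only the limit of 50 attempts comes from the description above.

```java
import java.util.function.Supplier;

final class BoundedRetrySketch {
    // Runs the operation until it succeeds or maxAttempts is reached,
    // sleeping a fixed backoff between attempts. Gives up with an exception
    // instead of looping forever.
    static <T> T retry(Supplier<T> operation, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        int attemptsMade = 0;
        while (attemptsMade < maxAttempts) {
            attemptsMade++;
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attemptsMade < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying when the thread is interrupted
                }
            }
        }
        throw new IllegalStateException("Giving up after " + attemptsMade + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical write that fails twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "checkpoint written";
        }, 50, 0L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

The caller sees a failure after the last attempt, so the commit phase can surface the error rather than block.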

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek M<prateekm@cs.utexas.edu>

Closes #420 from shanthoosh/add_fixed_retries_in_kafka_checkpoint_manager

3 weeks agoSAMZA-1489: TaskInstance should commit offset before it closes() if auto commit is...
Dong Lin [Thu, 1 Feb 2018 20:07:56 +0000 (12:07 -0800)] 
SAMZA-1489: TaskInstance should commit offset before it closes() if auto commit is enabled

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #417 from lindong28/SAMZA-1489

3 weeks agoSAMZA-1552: Host affinity improvements - Improve matching of hosts to allocated resources
Jagadish [Thu, 1 Feb 2018 05:38:14 +0000 (21:38 -0800)] 
SAMZA-1552: Host affinity improvements - Improve matching of hosts to allocated resources

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek <prateekm@cs.utexas.edu>

Closes #401 from vjagadish1989/host-affinity-fix

4 weeks agoSAMZA-1578: Fix watermark bug found by BEAM tests
Xinyu Liu [Fri, 26 Jan 2018 22:02:42 +0000 (14:02 -0800)] 
SAMZA-1578: Fix watermark bug found by BEAM tests

The problem is that getOutputWatermark() does not return the real outputWatermark. This caused problems in user-overridden watermark functions.

Author: xiliu <xiliu@linkedin.com>

Reviewers: Jagadish <vjagadish1989@gmail.com>

Closes #415 from xinyuiscool/SAMZA-1578

4 weeks agoAdded some logging to stdout for easier parsing by tools.
Prateek Maheshwari [Fri, 26 Jan 2018 20:04:29 +0000 (12:04 -0800)] 
Added some logging to stdout for easier parsing by tools.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #414 from prateekm/print-container-info

4 weeks agoSAMZA-1548; Add start() and stop() to SystemAdmin
Dong Lin [Fri, 26 Jan 2018 01:28:13 +0000 (17:28 -0800)] 
SAMZA-1548; Add start() and stop() to SystemAdmin

This patch adds start() and stop() to SystemAdmin interface. This can be useful for e.g. kafka.admin.AdminClient which needs to be started before it can be used.

Since we add these methods to the interface and expect AdminClient to be stateful and probably to have its own thread, there will be a higher cost to instantiate a new SystemAdmin. Thus we probably want to reuse SystemAdmin instances instead of creating a SystemAdmin on demand. Therefore, this patch also adds a SystemAdmins class to help manage a map from system to SystemAdmin, similar to the existing SystemProducers class in Samza.
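
A minimal Java sketch of the holder idea, assuming a simplified `Admin` interface. The names `SystemAdminsSketch`, `Admin`, and `get` are hypothetical and are not the actual Samza signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SystemAdminsSketch {
    // Simplified stand-in for SystemAdmin: default no-op lifecycle methods
    // so stateless admins do not have to implement them.
    interface Admin {
        default void start() { }
        default void stop() { }
    }

    private final Map<String, Admin> adminsBySystem;

    SystemAdminsSketch(Map<String, Admin> adminsBySystem) {
        this.adminsBySystem = new LinkedHashMap<>(adminsBySystem);
    }

    // Starts every managed admin once, up front.
    void start() {
        adminsBySystem.values().forEach(Admin::start);
    }

    // Stops every managed admin.
    void stop() {
        adminsBySystem.values().forEach(Admin::stop);
    }

    // Returns the shared admin for a system instead of creating a new one.
    Admin get(String systemName) {
        Admin admin = adminsBySystem.get(systemName);
        if (admin == null) {
            throw new IllegalArgumentException("No admin for system: " + systemName);
        }
        return admin;
    }

    public static void main(String[] args) {
        Map<String, Admin> admins = new LinkedHashMap<>();
        admins.put("kafka", new Admin() {
            @Override public void start() { System.out.println("kafka admin started"); }
            @Override public void stop() { System.out.println("kafka admin stopped"); }
        });
        SystemAdminsSketch systemAdmins = new SystemAdminsSketch(admins);
        systemAdmins.start();
        systemAdmins.get("kafka"); // reused instance, no new client created
        systemAdmins.stop();
    }
}
```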

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #397 from lindong28/SAMZA-1548

4 weeks agoSAMZA-1557: Broadcast operator
Xinyu Liu [Wed, 24 Jan 2018 22:27:39 +0000 (14:27 -0800)] 
SAMZA-1557: Broadcast operator

This patch adds a Broadcast operator that allows broadcasting messages to all tasks. It's the counterpart of the Samza broadcast stream in the low-level API, and will be used by the BEAM runner to broadcast views as side input to other parts of the pipeline.

Author: xiliu <xiliu@linkedin.com>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #410 from xinyuiscool/SAMZA-1557

4 weeks agofix the bug containsValue method
michaelwong [Mon, 22 Jan 2018 22:11:25 +0000 (14:11 -0800)] 
fix the bug containsValue method

containsValue method should invoke map.containsValue not map.containsKey
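
A minimal Java sketch of the corrected delegation, using a hypothetical wrapper class rather than the actual Samza class that was patched.

```java
import java.util.HashMap;
import java.util.Map;

final class ContainsValueSketch<K, V> {
    private final Map<K, V> map = new HashMap<>();

    void put(K key, V value) {
        map.put(key, value);
    }

    boolean containsKey(Object key) {
        return map.containsKey(key);
    }

    // The bug class described above: delegating to map.containsKey(value)
    // here would answer a question about keys, not values.
    boolean containsValue(Object value) {
        return map.containsValue(value);
    }

    public static void main(String[] args) {
        ContainsValueSketch<String, Integer> wrapper = new ContainsValueSketch<>();
        wrapper.put("a", 1);
        System.out.println(wrapper.containsValue(1));   // true
        System.out.println(wrapper.containsValue("a")); // false
    }
}
```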

Author: michaelwong <michaelwong95@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #12 from jwongo/master

4 weeks agoupTime should be field not method
zhangyijun [Mon, 22 Jan 2018 22:05:51 +0000 (14:05 -0800)] 
upTime should be field not method

Author: zhangyijun <joa.zhang@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #9 from zhangyijun/patch-1

4 weeks agoSAMZA-1561: Fix inconsistency problem in JobModel publish.
Shanthoosh Venkataraman [Mon, 22 Jan 2018 19:31:08 +0000 (11:31 -0800)] 
SAMZA-1561: Fix inconsistency problem in JobModel publish.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish V<jagadish@apache.org>, Xinyu Liu<xinyu@apache.org>

Closes #409 from shanthoosh/master

5 weeks agoSAMZA-1407: upgrade junit version to 4.12
Fred Ji [Thu, 18 Jan 2018 02:35:34 +0000 (18:35 -0800)] 
SAMZA-1407: upgrade junit version to 4.12

"./gradlew clean check" passed

Author: Fred Ji <haifeng.ji@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #406 from fredji97/junit4_12_new

5 weeks agoSAMZA-1560: Handle key-serde errors in KafkaCheckpointManager
Jagadish [Thu, 18 Jan 2018 02:27:40 +0000 (18:27 -0800)] 
SAMZA-1560: Handle key-serde errors in KafkaCheckpointManager

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Xinyu Liu<xinyuiscool@gmail.com>

Closes #408 from vjagadish1989/kcm-fix

5 weeks agoSAMZA-1558: State restore metrics should be duplicated and deprecated…
Jacob Maes [Wed, 17 Jan 2018 22:26:53 +0000 (14:26 -0800)] 
SAMZA-1558: State restore metrics should be duplicated and deprecated…

… to avoid type conflicts

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #407 from jmakes/samza-1558

5 weeks agoSAMZA-1523: Cleanup table entries before shutting down the processor
navina [Wed, 17 Jan 2018 18:11:40 +0000 (10:11 -0800)] 
SAMZA-1523: Cleanup table entries before shutting down the processor

Modified the `TableUtils#deleteProcessorEntity` to provide an option to disable optimistic locking during a call to Azure Table Storage service.

sborya PawasChhokra nickpan47   Review please?

Author: navina <navina@apache.org>

Reviewers: Shanthoosh V<svenkata@linkedin.com>, Boris S<bshkolni@linkedin.com>

Closes #379 from navina/azure-etag-fix

5 weeks agoSAMZA-1556: Adding support for multi level sources in queries
Srinivasulu Punuru [Wed, 17 Jan 2018 18:06:50 +0000 (10:06 -0800)] 
SAMZA-1556: Adding support for multi level sources in queries

Right now Samza SQL supports queries with just two levels, i.e. `select * from foo.bar`. But there can be sources that are identified through multiple levels, e.g. `select * from kafka.clusterName.topicName`.

This change adds the support for sql queries with sources that have more than two levels.
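
A minimal Java sketch of splitting a dotted source name into its levels. The class and method names are hypothetical, and the sketch deliberately assigns no meaning to any individual part; how Samza SQL resolves each level is not described here.

```java
import java.util.Arrays;
import java.util.List;

final class SqlSourceSketch {
    // Splits "a.b.c" into its levels. Requires at least two non-empty parts.
    static List<String> sourceParts(String source) {
        // The -1 limit keeps trailing empty parts so "a.b." is rejected below.
        String[] parts = source.split("\\.", -1);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least two levels: " + source);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty level in source: " + source);
            }
        }
        return Arrays.asList(parts);
    }

    public static void main(String[] args) {
        System.out.println(sourceParts("foo.bar"));
        System.out.println(sourceParts("kafka.clusterName.topicName"));
    }
}
```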

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Miguel S<misanchez@linkedin.com>, Aditya T<atoomula@linkedin.com>

Closes #405 from srinipunuru/multi-level.1

5 weeks agoSAMZA-1500: Added metrics for RocksDB state store memory usage
Prateek Maheshwari [Wed, 17 Jan 2018 17:39:26 +0000 (09:39 -0800)] 
SAMZA-1500: Added metrics for RocksDB state store memory usage

Approximate RocksDB memory usage = Configured Block Cache size + MemTable size + Indexes and Bloom Filters size =
rocksdb.block-cache-size + rocksdb.size-all-mem-tables + rocksdb.estimate-table-readers-mem
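
The approximation above can be sketched as a small Java helper. The class name, parameter names, and sample values are hypothetical; only the three-term sum comes from the description.

```java
final class RocksDbMemorySketch {
    // Approximate memory usage = block cache size + memtable size
    // + index and bloom filter (table reader) size, all in bytes.
    static long approximateMemoryBytes(long blockCacheSize,
                                       long sizeAllMemTables,
                                       long estimateTableReadersMem) {
        // addExact fails loudly on overflow instead of wrapping around.
        return Math.addExact(Math.addExact(blockCacheSize, sizeAllMemTables),
                estimateTableReadersMem);
    }

    public static void main(String[] args) {
        long mib = 1L << 20;
        // Hypothetical values: 64 MiB cache, 32 MiB memtables, 8 MiB readers.
        long total = approximateMemoryBytes(64 * mib, 32 * mib, 8 * mib);
        System.out.println(total / mib + " MiB"); // 104 MiB
    }
}
```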

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #404 from prateekm/rocksdb-memory

6 weeks agoFix for the TestSamzaSqlApplicationConfig.testConfigInit
Srinivasulu Punuru [Thu, 11 Jan 2018 01:14:03 +0000 (17:14 -0800)] 
Fix for the TestSamzaSqlApplicationConfig.testConfigInit

Currently testConfigInit checks for a hardcoded number of udfs. Whenever a new UDF is added, this test is going to fail if it is not updated. Changed the test to validate the number of udfs based on the config that is passed.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #403 from srinipunuru/testfix.1

6 weeks agoSAMZA-1535: Support for UDFs in where clauses
Srinivasulu Punuru [Wed, 10 Jan 2018 19:17:39 +0000 (11:17 -0800)] 
SAMZA-1535: Support for UDFs in where clauses

The existing version of the udf implementation doesn't seem to support udfs in where clauses, because the type of the object returned is "ANY". When you run
`select * from kafka.topic where regexMatch('.*foo', Name)` it fails in query validation, because calcite doesn't know the return type of regexMatch.

To solve the problem, we made the scalarUdf generic with a strongly typed return type.
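
A minimal Java sketch of a UDF interface with a typed return value. The interface shape and the `RegexMatch` implementation (including its full-match semantics) are assumptions for illustration, not the actual Samza SQL code.

```java
import java.util.regex.Pattern;

final class TypedUdfSketch {
    // Generic UDF: the type parameter carries the return type, so a
    // validator can see Boolean instead of an untyped Object.
    interface ScalarUdf<T> {
        T execute(Object... args);
    }

    // Hypothetical regexMatch(pattern, value) that returns a Boolean.
    // Assumes the whole value must match the pattern.
    static final class RegexMatch implements ScalarUdf<Boolean> {
        @Override
        public Boolean execute(Object... args) {
            String pattern = (String) args[0];
            String value = (String) args[1];
            return Pattern.matches(pattern, value);
        }
    }

    public static void main(String[] args) {
        ScalarUdf<Boolean> regexMatch = new RegexMatch();
        // A Boolean result can be used directly as a filter predicate.
        boolean keep = regexMatch.execute(".*foo", "barfoo");
        System.out.println(keep);
    }
}
```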

https://issues.apache.org/jira/browse/SAMZA-1535

This PR can be merged into trunk, but not into 0.14.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #386 from srinipunuru/udf-where.1

6 weeks agoSAMZA-1530; Bump up Kafka dependency to 0.11
Dong Lin [Wed, 10 Jan 2018 18:52:38 +0000 (10:52 -0800)] 
SAMZA-1530; Bump up Kafka dependency to 0.11

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #395 from lindong28/SAMZA-1530

6 weeks agoRevert "SAMZA-1553: Add log4j for latest Kafka build"
xiliu [Wed, 10 Jan 2018 18:33:23 +0000 (10:33 -0800)] 
Revert "SAMZA-1553: Add log4j for latest Kafka build"

This reverts commit 5238aaa6cee81a87079c9d432204422ececea793.

6 weeks agoSAMZA-1550: Update samza 0.14 version in tests
xiliu [Tue, 9 Jan 2018 23:53:40 +0000 (15:53 -0800)] 
SAMZA-1550: Update samza 0.14 version in tests

6 weeks agoSAMZA-1553: Add log4j for latest Kafka build
Xinyu Liu [Tue, 9 Jan 2018 18:48:10 +0000 (10:48 -0800)] 
SAMZA-1553: Add log4j for latest Kafka build

Add it so Samza compiles with the latest kafka.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Boris Shkolnik <sborya@gmail.com>

Closes #402 from xinyuiscool/SAMZA-1553

7 weeks agoFix a link in release-notes.md
xiliu [Thu, 4 Jan 2018 17:51:20 +0000 (09:51 -0800)] 
Fix a link in release-notes.md

7 weeks agoSAMZA-1550: Update master to use 0.14.1-SNAPSHOT version
Xinyu Liu [Thu, 4 Jan 2018 00:10:45 +0000 (16:10 -0800)] 
SAMZA-1550: Update master to use 0.14.1-SNAPSHOT version

Update master to use 0.14.1-SNAPSHOT version.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #400 from xinyuiscool/SAMZA-1550-2

7 weeks agoSAMZA-1550: Doc for 0.14.0 release
Xinyu Liu [Wed, 3 Jan 2018 17:56:23 +0000 (09:56 -0800)] 
SAMZA-1550: Doc for 0.14.0 release

Docs update for both master and 0.14.0 branch.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #396 from xinyuiscool/SAMZA-1550

7 weeks agoSAMZA-1528: Change ClusterResourceManager to use the async NMClient
Jagadish [Tue, 2 Jan 2018 20:16:51 +0000 (12:16 -0800)] 
SAMZA-1528: Change ClusterResourceManager to use the async NMClient

- Rewrite container handling to be asynchronous
- Verified various failure scenarios using Unit tests, and deployments of a local Samza job.

Author: Jagadish <jvenkatraman@linkedin.com>
Author: Fred Ji <haifeng.ji@gmail.com>
Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Jacob Maes<jmakes@linkedin.com>, Xinyu Liu<xinyuiscool@gmail.com>

Closes #380 from vjagadish1989/cluster-mgr-refactor1

2 months agoSAMZA-1547; Parse default value grouper-factory config in KafkaCheckpointMgr
Jagadish [Fri, 22 Dec 2017 21:58:33 +0000 (13:58 -0800)] 
SAMZA-1547; Parse default value grouper-factory config in KafkaCheckpointMgr

- Additionally, updated all unit-tests.

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek M <prmaheshw@linkedin.com>

Closes #394 from vjagadish1989/kcm-fix

2 months agoFix a couple sonarcloud issues with samza-1537
Jacob Maes [Fri, 22 Dec 2017 19:41:27 +0000 (11:41 -0800)] 
Fix a couple sonarcloud issues with samza-1537

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #393 from jmakes/streamappender-sonarcloud

2 months agoSAMZA-1539: KafkaProducer potential hang on close() when task.drop.pr…
Jacob Maes [Fri, 22 Dec 2017 18:36:37 +0000 (10:36 -0800)] 
SAMZA-1539: KafkaProducer potential hang on close() when task.drop.pr…

…oducer.errors==true

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Boris Shkolnik <boryas@apache.org>

Closes #390 from jmakes/samza-1539

2 months agoSAMZA-1537: StreamAppender can deadlock due to locks held by Kafka an…
Jacob Maes [Thu, 21 Dec 2017 19:43:09 +0000 (11:43 -0800)] 
SAMZA-1537: StreamAppender can deadlock due to locks held by Kafka an…

…d Log4j

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Jagadish <jvenkatr@linkedin.com>,Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Closes #388 from jmakes/async-stream-appender

2 months agoMerge script improvement - use colons instead of semicolons
Jacob Maes [Thu, 21 Dec 2017 18:43:47 +0000 (10:43 -0800)] 
Merge script improvement - use colons instead of semicolons

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>,Jagadish <jvenkatr@linkedin.com>,Boris Shkolnik <boryas@apache.org>

Closes #391 from jmakes/merge-script-improvements

2 months agoSAMZA-1356: Improve monitoring for state restore
Jacob Maes [Tue, 19 Dec 2017 16:00:56 +0000 (08:00 -0800)] 
SAMZA-1356: Improve monitoring for state restore

Author: Jacob Maes <jmakes@apache.org>
Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #241 from jmakes/samza-1356

2 months agoSAMZA-1538: Disabled Flaky Tests in TestStreamProcessor
Prateek Maheshwari [Tue, 19 Dec 2017 00:45:06 +0000 (16:45 -0800)] 
SAMZA-1538: Disabled Flaky Tests in TestStreamProcessor

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Shanthoosh Venkataraman <svenkata@linkedin.com>

Closes #389 from prateekm/disable-flaky-test

2 months agoSAMZA-1536; Adding docs for Kinesis consumer
Aditya Toomula [Thu, 14 Dec 2017 23:33:29 +0000 (15:33 -0800)] 
SAMZA-1536; Adding docs for Kinesis consumer

Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Jagadish <jagadish@apache.org>

Closes #384 from atoomula/kinesis-docs

2 months agoAdded document for table API to feature preview
Wei Song [Thu, 14 Dec 2017 20:04:34 +0000 (12:04 -0800)] 
Added document for table API to feature preview

Added document for table API to feature preview
 - Brief description of table
 - sendTo() operator for table
 - join() operator for stream-table-join

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #387 from weisong44/table-api-14

2 months agoSAMZA-1437; Added Eventhub producer and consumer docs
Daniel Chen [Wed, 13 Dec 2017 05:47:38 +0000 (21:47 -0800)] 
SAMZA-1437; Added Eventhub producer and consumer docs

Still need to add tutorials, and configs to configurations table

vjagadish1989  for review

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #382 from dxichen/eventhub-docs

2 months agoSAMZA-1502; Make AllSspToSingleTaskGrouper work with Yarn and ZK JobCoordinator
Aditya Toomula [Wed, 13 Dec 2017 05:41:37 +0000 (21:41 -0800)] 
SAMZA-1502; Make AllSspToSingleTaskGrouper work with Yarn and ZK JobCoordinator

Sending a fresh review as I lost the earlier diffs. This is the new approach that we discussed by adding the processor list in the config and passing it to grouper.

Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>, Shanthoosh V <svenkataraman@linkedin.com>

Closes #383 from atoomula/samza

2 months agoSAMZA-1512: Documentation on the multi-stage batch processing
Xinyu Liu [Wed, 13 Dec 2017 01:07:07 +0000 (17:07 -0800)] 
SAMZA-1512: Documentation on the multi-stage batch processing

Add overview documentation to explain how partitionBy(), checkpoint and state work in batch. Also organized the existing hdfs consumer/producer docs into the same hadoop folder under documentation.

Author: xinyuiscool <xinyuliu.us@gmail.com>

Reviewers: Jake Maes <jmakes@gmail.com>

Closes #381 from xinyuiscool/SAMZA-1512

2 months agoSAMZA-1534: Fix the visualization in job graph with the new PartitionBy Op
Xinyu Liu [Tue, 12 Dec 2017 20:16:39 +0000 (12:16 -0800)] 
SAMZA-1534: Fix the visualization in job graph with the new PartitionBy Op

It seems the stream and the partitionBy op have the same id. So in rendering I added the stream as the id for the node. Also resolved the run.id collision issue.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #385 from xinyuiscool/SAMZA-1534

2 months agoDocumentation for Samza SQL
Srinivasulu Punuru [Tue, 12 Dec 2017 18:46:01 +0000 (10:46 -0800)] 
Documentation for Samza SQL

**Samza tools** :
Contains the following tools that can be used for playing with Samza sql or any other samza job.

1. Generate kafka events : Tool used to generate avro serialized kafka events
2. Event hub consumer : Tool used to consume events from event hubs topic. This can be used if the samza job writes events to event hubs.
3. Samza sql console : Tool used to execute SQL using samza sql.

Adds documentation on how to use Samza SQL on a local machine and on a yarn environment and their associated Samza tooling.

https://issues.apache.org/jira/browse/SAMZA-1526

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Yi Pan<nickpan47@gmail.com>, Jagadish<jagadish@apachehe.org>

Closes #374 from srinipunuru/docs.1

2 months agoSAMZA-1532; Eventhub connector fix
Daniel Chen [Tue, 12 Dec 2017 00:45:44 +0000 (16:45 -0800)] 
SAMZA-1532; Eventhub connector fix

Key fixes vjagadish1989 lhaiesp srinipunuru
- Switched Producer source vs destination assumptions in `send`, `register`
- Check `OME.key` if `OME.partitionId` is null to get the partitionId
- Upcoming offset changed to `END_OF_STREAM` rather than `newestOffset` + 1; eventHub returns an error if the offset does not exist in the system
- Made NewestOffset+1 the upcoming offset; the consumer checks if the offset is valid on startup
- Differentiated between streamNames and streamIds in configs, consumer, producer
- Checkpoint table named after job name
- Checkpoint prints better message for invalid key on write

QOL
- How to ignore integration tests
- Improved logging

EDIT:
- Also added Round Robin producer partitioning
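
A minimal Java sketch of one way the partition choice described above could work: use the partition id when present, fall back to the key, and otherwise rotate round robin. The class, the hash-based mapping, and the fallback order beyond "key if partitionId is null" are assumptions, not the actual Eventhub producer code.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PartitionChoiceSketch {
    private final int numPartitions;
    private final AtomicInteger nextRoundRobin = new AtomicInteger();

    PartitionChoiceSketch(int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        this.numPartitions = numPartitions;
    }

    // Picks a partition index in [0, numPartitions).
    int choose(Object partitionId, Object key) {
        Object source = partitionId != null ? partitionId : key;
        if (source == null) {
            // Neither is set: spread messages evenly across partitions.
            return Math.floorMod(nextRoundRobin.getAndIncrement(), numPartitions);
        }
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(source.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        PartitionChoiceSketch chooser = new PartitionChoiceSketch(4);
        System.out.println(chooser.choose(5, "ignored-key")); // partition id wins
        System.out.println(chooser.choose(null, "some-key")); // key is used
        System.out.println(chooser.choose(null, null));       // round robin
    }
}
```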

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #377 from dxichen/eventhub-connector-fix

2 months agoInitial version of Table API
Wei Song [Tue, 5 Dec 2017 21:23:44 +0000 (13:23 -0800)] 
Initial version of Table API

Initial version of table API, it includes
 - Core table API (Table, TableDescriptor, TableSpec)
 - Local table implementation for in-memory and RocksDb
 - The writeTo() and stream-table join operators

nickpan47 xinyuiscool prateekm could you help review?

Author: Wei Song <wsong@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>, Christopher Pettitt <cpettitt@linkedin.com>

Closes #349 from weisong44/table-api-14

2 months agoFix compile errors with scala 2.12 and update release notes to use check all
Bharath Kumarasubramanian [Tue, 5 Dec 2017 01:11:03 +0000 (17:11 -0800)] 
Fix compile errors with scala 2.12 and update release notes to use check all

Author: Bharath Kumarasubramanian <bkumaras@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #378 from bharathkk/master

2 months agoUpdated Serde related documentation and error messages for High Level API
Prateek Maheshwari [Mon, 4 Dec 2017 22:21:21 +0000 (14:21 -0800)] 
Updated Serde related documentation and error messages for High Level API

Updated and clarified the documentation and error messages related to Serdes for Input/Output/PartitionBy streams.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <vjagadish1989@gmail.com>

Closes #376 from prateekm/documentation-cleanup

2 months agoSAMZA-1506: Fix for robust ContainerHeartbeatMonitor exception handling.
Abhishek Shivanna [Fri, 1 Dec 2017 23:23:12 +0000 (15:23 -0800)] 
SAMZA-1506: Fix for robust ContainerHeartbeatMonitor exception handling.

The Fix includes the following changes:
- Catch all exceptions inside the heartbeat thread and not just
  IOException.
- A time based force kill when the heartbeat is invalid,
  this makes the monitor immune to threads that may keep the
  container stuck in the shutdown sequence. When the timeout
  occurs, a System.exit(1) is called.
- Increasing the number of retries for failed heartbeats from 3 to 6.
  This prevents short intermittent network failures from causing the
  containers to be invalidated.
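
A minimal Java sketch of the retry part of the changes above: every exception type is caught, and the heartbeat only counts as invalid after all attempts fail. The method name and the use of `Callable` are assumptions; the time-based force kill is not shown.

```java
import java.util.concurrent.Callable;

final class HeartbeatRetrySketch {
    // Fetches the heartbeat state, retrying on any exception (not just
    // IOException). Returns false when every attempt failed or when the
    // fetched state says the heartbeat is no longer valid.
    static boolean isHeartbeatValid(Callable<Boolean> fetchState, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Boolean.TRUE.equals(fetchState.call());
            } catch (Exception e) {
                System.err.println("Heartbeat attempt " + attempt + " failed: " + e);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical fetch that fails five times, then succeeds.
        Callable<Boolean> flaky = () -> {
            if (++calls[0] <= 5) {
                throw new IllegalStateException("network blip");
            }
            return true;
        };
        System.out.println(isHeartbeatValid(flaky, 6));
    }
}
```

With three attempts the same flaky fetch would be reported as invalid, which is the kind of short outage the higher limit is meant to ride out.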

Author: Abhishek Shivanna <abhisheks91@gmail.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #375 from abhishekshivanna/container-heartbeat

2 months agoSAMZA-1519: Add release notes to website documentation
navina [Fri, 1 Dec 2017 16:57:06 +0000 (08:57 -0800)] 
SAMZA-1519: Add release notes to website documentation

Adding a versioned page for release/upgrade notes. We can start this process from the next major version release, aka 0.14.0.

Please update this page as and when you add new features/configs/API or deprecate features/configs/API. Basically, anything that can be useful for Samza users trying to upgrade.

Note: `site.version` is not necessarily same as samza release version. For now, I am using it as a placeholder. Hopefully, with the next generation of our website, it will be better defined.

Author: navina <navina@apache.org>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #301 from navina/versioning

2 months agoSAMZA-1518: Add CPU utilization and file descriptor count to JvmMetrics
Jacob Maes [Fri, 1 Dec 2017 01:37:40 +0000 (17:37 -0800)] 
SAMZA-1518: Add CPU utilization and file descriptor count to JvmMetrics

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #372 from jmakes/samza-1518

2 months agoSAMZA-1516: Another round of issues found by BEAM tests
xiliu [Wed, 29 Nov 2017 19:34:28 +0000 (11:34 -0800)] 
SAMZA-1516: Another round of issues found by BEAM tests

A couple more fixes: 1. Fix a bug in identifying input streams for an operator. 2. For partitionBy, set the partitionKey to 0L when the key is null.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #370 from xinyuiscool/SAMZA-1516

2 months agoSAMZA-1514: Prevent changelog stream names with empty strings.
Daniel Nishimura [Wed, 29 Nov 2017 00:19:45 +0000 (16:19 -0800)] 
SAMZA-1514: Prevent changelog stream names with empty strings.

This fix prevents changelog stream names with empty or whitespace-only strings.
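
A minimal Java sketch of the check, with a hypothetical helper name. Treating `null` as unusable is an assumption of this sketch; the fix itself only mentions empty and whitespace-only strings.

```java
final class ChangelogNameSketch {
    // A changelog stream name is usable only if it has at least one
    // non-whitespace character.
    static boolean isUsableName(String name) {
        return name != null && !name.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isUsableName("my-store-changelog")); // true
        System.out.println(isUsableName(""));                   // false
        System.out.println(isUsableName("   "));                // false
    }
}
```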

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #371 from dnishimura/samza-1514-empty-string-changelog

2 months agoSAMZA-1412 replace mockito-all with mockito-core
Fred Ji [Tue, 28 Nov 2017 22:36:24 +0000 (14:36 -0800)] 
SAMZA-1412 replace mockito-all with mockito-core

ran "./gradlew clean check" and all tests passed

Author: Fred Ji <haifeng.ji@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #365 from fredji97/mockito-core

2 months agoSAMZA-1515; Implement a consumer for Kinesis
Aditya Toomula [Tue, 28 Nov 2017 21:12:10 +0000 (13:12 -0800)] 
SAMZA-1515; Implement a consumer for Kinesis

Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Jagadish<jagadish@apache.org>

Closes #368 from atoomula/kinesis

2 months agoSAMZA-1513; Doc updates for persistent windows, joins and serdes.
Jagadish [Tue, 28 Nov 2017 20:30:35 +0000 (12:30 -0800)] 
SAMZA-1513; Doc updates for persistent windows, joins and serdes.

jmakes prateekm nickpan47 for review.

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Jake Maes <jmaes@linkedin.com>

Closes #369 from vjagadish1989/doc-updates

2 months agoFixing broken link between Yarn Host Affinity and Resource Localizati…
navina [Mon, 27 Nov 2017 21:49:35 +0000 (13:49 -0800)] 
Fixing broken link between Yarn Host Affinity and Resource Localizati…

Fixing broken link between Yarn Host Affinity and Resource Localization pages under Documentation

Patch needs to be back-ported to 0.13.0 website!

Author: navina <navina@apache.org>

Reviewers: Jake Maes <jmakes@gmail.com>

Closes #302 from navina/website-link-fix

3 months agoSAMZA-1507; Create changelog streams in Leader(ZkJobCoordinator) for stateful operators.
Shanthoosh Venkataraman [Wed, 22 Nov 2017 19:54:10 +0000 (11:54 -0800)] 
SAMZA-1507; Create changelog streams in Leader(ZkJobCoordinator) for stateful operators.

Verified with a test standalone job. Will add an integration test for this as part of fixing and re-enabling the standalone integration tests.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Boris Shkolnik <boryas@apache.org>

Closes #362 from shanthoosh/master

3 months agoSAMZA-1494; Flush operator state at end of stream
Jagadish [Tue, 21 Nov 2017 20:42:17 +0000 (12:42 -0800)] 
SAMZA-1494; Flush operator state at end of stream

- Propagate operator messages at endOfStream to all down-stream operators.
- Emit all pending windows when endOfStream is reached.
- Flush all state on endOfStream irrespective of auto-commit behavior.

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Xinyu Liu <xiliu@linkedin.com>

Closes #366 from vjagadish1989/eos-flush

3 months agoSAMZA-1505: Fix CheckpointTool writing only one ssp per task
Xinyu Liu [Tue, 21 Nov 2017 19:26:55 +0000 (11:26 -0800)] 
SAMZA-1505: Fix CheckpointTool writing only one ssp per task

Currently, when CheckpointTool is used to write checkpoints, it writes the checkpoint of only a single ssp per task. Debugging shows that the flatMap() on the checkpoint of Optional tuple(taskname -> Map(ssp -> offset)) merges the results by the key taskname. This patch stores the results explicitly in a list and then calls groupBy() on it, which fixes the problem.
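
The "collect into a list, then group by task name" idea can be sketched as follows. This is an illustration in Java rather than the tool's own Scala code; the Entry type and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not the CheckpointTool implementation.
final class CheckpointGrouping {
    // One flattened (task, ssp, offset) triple.
    static final class Entry {
        final String taskName;
        final String ssp;
        final String offset;

        Entry(String taskName, String ssp, String offset) {
            this.taskName = taskName;
            this.ssp = ssp;
            this.offset = offset;
        }
    }

    // Group the explicit list by task name so that every ssp of a task is
    // kept, instead of a later entry replacing an earlier one for that task.
    static Map<String, Map<String, String>> groupByTask(List<Entry> entries) {
        return entries.stream().collect(Collectors.groupingBy(
            e -> e.taskName,
            Collectors.toMap(e -> e.ssp, e -> e.offset)));
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry("task0", "kafka.topicA.0", "10"),
            new Entry("task0", "kafka.topicB.0", "20"),
            new Entry("task1", "kafka.topicA.1", "30"));
        System.out.println(groupByTask(entries));
    }
}
```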

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Jake Maes <jmakes@gmail.com>

Closes #364 from xinyuiscool/SAMZA-1505

3 months agoSAMZA-1482: add config documentation for auto-restart/fail behavior o…
Yi Pan (Data Infrastructure) [Mon, 20 Nov 2017 19:48:14 +0000 (11:48 -0800)] 
SAMZA-1482: add config documentation for auto-restart/fail behavior o…

…n partition count changes

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #363 from nickpan47/partition-change-docsite and squashes the following commits:

5342ea0 [Yi Pan (Data Infrastructure)] SAMZA-1482: fix configuration table in documentation site
7c9f326 [Yi Pan (Data Infrastructure)] SAMZA-1482: add config documentation for auto-restart/fail behavior on partition count changes

3 months agoSAMZA-1479; Refactor KafkaCheckpointManager, KafkaCheckpointLogKey and their tests
Jagadish [Fri, 17 Nov 2017 22:41:05 +0000 (14:41 -0800)] 
SAMZA-1479; Refactor KafkaCheckpointManager, KafkaCheckpointLogKey and their tests

Notable changes:
* Rewrite `KafkaCheckpointLogKey` into two classes - an immutable class, and a SerDe
* Remove dependency on static setters in the `KafkaCheckpointLogKey`
* Change lifecycle of components in KafkaCheckpointManager
  - It's safe to start producers and consumers during `start`, as opposed to lazily loading them during writes and reads.
  - Initialize systemProducer and systemConsumer during construction
* Simplify logic for ignoring checkpoint validations
* Re-write checkpointManager#readLog() to use a simpler API.
* Remove unnecessary complexity after the migration from 0.8
* Remove unnecessary locking in startup and shutdown
* Remove dependencies on SimpleConsumer configs like bufferSize, fetchSize, socketTimeout
* Refactor KafkaCheckpointManagerFactory and remove static getCheckpointSystemNameAndFactory
* Bug-fix : Register the taskName correctly (instead of using a dummy string for the taskName)

* Add unit tests to verify more checkpoint scenarios
* Consolidate unit tests into utils for creating producer, consumer and admin instances
* Convert/consolidate most long-running integration tests into unit tests

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #348 from vjagadish1989/kafka-checkpointmanager-refactor

3 months agoSamza SQL implementation for basic projects, filtering and UDFs
Srinivasulu Punuru [Thu, 16 Nov 2017 19:14:32 +0000 (11:14 -0800)] 
Samza SQL implementation for basic projects, filtering and UDFs

## Samza SQL implementation for basic projects, filtering and UDFs

## Design document:
https://docs.google.com/document/d/1bE-ZuPfTpntm1hT3GwQEShYDiTqU3IkxeP4-3ZcGHgU/edit?usp=sharing

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #295 from srinipunuru/samza-sql.1

3 months agoSAMZA-1504: Allow user to register container-level metrics
Xinyu Liu [Thu, 16 Nov 2017 18:46:05 +0000 (10:46 -0800)] 
SAMZA-1504: Allow user to register container-level metrics

This change allows users to register metrics on a per-container basis.

Tested in the Beam runner; works as expected.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Prateek Maheshwari <prateekm@apache.org>

Closes #361 from xinyuiscool/SAMZA-1504

3 months agoSAMZA-1495: Set intermediate streams as higher priority by default
Prateek Maheshwari [Wed, 15 Nov 2017 18:45:08 +0000 (10:45 -0800)] 
SAMZA-1495: Set intermediate streams as higher priority by default

Most changes in StreamConfig are formatting fixes.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <vjagadish1989@gmail.com>

Closes #358 from prateekm/intermediate-stream-priority

3 months agoSAMZA-1501: Validate operator IDs so that they don't contain special characters and...
Prateek Maheshwari [Wed, 15 Nov 2017 18:27:08 +0000 (10:27 -0800)] 
SAMZA-1501: Validate operator IDs so that they don't contain special characters and spaces
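
A minimal sketch of such a check follows. The allowed character set (letters, digits, underscore, period, hyphen) is an assumption made for illustration, as are the class and method names; the commit only states that special characters and spaces are rejected.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual Samza validation.
final class OperatorIdCheck {
    // Assumption: only letters, digits, '_', '.' and '-' are allowed.
    private static final Pattern VALID_ID = Pattern.compile("[A-Za-z0-9_.-]+");

    static boolean isValidOperatorId(String id) {
        return id != null && VALID_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidOperatorId("join-1"));
        System.out.println(isValidOperatorId("my op"));
    }
}
```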

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>, Jagadish Venkatraman <vjagadish1989@gmail.com>

Closes #359 from prateekm/operator-id-validation

3 months agoSAMZA-1482: Restart or fail Samza jobs in YARN when detecting changes…
Yi Pan (Data Infrastructure) [Tue, 14 Nov 2017 20:45:23 +0000 (12:45 -0800)] 
SAMZA-1482: Restart or fail Samza jobs in YARN when detecting changes…

… in input topic partitions

Some highlights of the changes:
- StreamPartitionCountMonitor is now always instantiated on all input system streams
-- it is debatable whether we want to include systems that do not implement the optimized ExtendedSystemAdmin interface. We may need to configure a long partition monitor interval for this case and for the case where there are tons of input topics. (Pending perf test)
- moved the instantiation of StreamPartitionCountMonitor out of JobModelManager and allowed ClusterBasedJobCoordinator to associate a callback method directly with the monitor
- allow callbacks to set a different application status code before throwing an exception to shut down the job

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>, Jagadish <jagadish@apache.org>

Closes #351 from nickpan47/restart-on-partition-change and squashes the following commits:

8d04cd6 [Yi Pan (Data Infrastructure)] SAMZA-1482: restart or fail the job when input topic partition count changes
ee3fa65 [Yi Pan (Data Infrastructure)] SAMZA-1482: Restart or fail Samza jobs in YARN when detecting changes in input topic partitions

3 months agoSAMZA-1474: Bump up rocksdb version to 5.7.3 to include licensing changes
Bharath Kumarasubramanian [Tue, 14 Nov 2017 19:00:34 +0000 (11:00 -0800)] 
SAMZA-1474: Bump up rocksdb version to 5.7.3 to include licensing changes

You can find more details on the bug fixes and API changes [here](https://github.com/facebook/rocksdb/releases). I upgraded to 5.7.3 since 5.8.0 has a regression [KAFKA-6100](https://issues.apache.org/jira/browse/KAFKA-6100)

All of our tests passed locally. I will monitor travis for failures.

Author: Bharath Kumarasubramanian <bkumaras@linkedin.com>

Reviewers: Boris Shkolnik <boryas@gmail.com>

Closes #344 from bharathkk/master

3 months agoSAMZA-1487: Disable Flaky Zk Integration tests.
Shanthoosh Venkataraman [Tue, 14 Nov 2017 18:54:32 +0000 (10:54 -0800)] 
SAMZA-1487: Disable Flaky Zk Integration tests.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #355 from shanthoosh/master

3 months agoSAMZA-1490: Fix TestRepartitionJoinWindowApp
Dong Lin [Tue, 14 Nov 2017 18:48:50 +0000 (10:48 -0800)] 
SAMZA-1490: Fix TestRepartitionJoinWindowApp

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Prateek Maheshwari <prateekm@apache.org>

Closes #356 from lindong28/SAMZA-1490

3 months agoSAMZA-1492: Add exception counter to LocalStoreMonitor.
Shanthoosh Venkataraman [Tue, 14 Nov 2017 15:56:13 +0000 (07:56 -0800)] 
SAMZA-1492: Add exception counter to LocalStoreMonitor.

Add storeGCExceptionCounter to LocalStoreMonitor (as a part of LocalStoreMonitorMetrics) to enable alert setup in a production environment.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #357 from shanthoosh/local_store_monitor_metrics_1

3 months agoSAMZA-1480: TaskStorageManager improperly initializes changelog consu…
Jacob Maes [Thu, 9 Nov 2017 19:54:32 +0000 (11:54 -0800)] 
SAMZA-1480: TaskStorageManager improperly initializes changelog consu…

…mer position when restoring a store from disk

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #350 from jmakes/samza-1480

3 months agoIgnore java fatal error log files from git and rat
Prateek Maheshwari [Thu, 9 Nov 2017 18:45:04 +0000 (10:45 -0800)] 
Ignore java fatal error log files from git and rat

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #354 from prateekm/rat-exclude-hserr

3 months agoSAMZA-1487: Disable Flaky Zk Integration tests.
Shanthoosh Venkataraman [Thu, 9 Nov 2017 18:24:53 +0000 (10:24 -0800)] 
SAMZA-1487: Disable Flaky Zk Integration tests.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek Maheshwari <prateekm@apache.org>

Closes #353 from shanthoosh/master

3 months agoFix test failures caused by PR345
xiliu [Wed, 8 Nov 2017 02:00:03 +0000 (18:00 -0800)] 
Fix test failures caused by PR345

3 months agoSAMZA-1486; Checkpoint manager implementation with Azure Table
Daniel Chen [Wed, 8 Nov 2017 01:25:58 +0000 (17:25 -0800)] 
SAMZA-1486; Checkpoint manager implementation with Azure Table

vjagadish1989

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #341 from dxichen/azure-checkpoint-manager

3 months agoSAMZA-1437; Added tests, make eventData creation extensible
Daniel Chen [Wed, 8 Nov 2017 01:15:28 +0000 (17:15 -0800)] 
SAMZA-1437; Added tests, make eventData creation extensible

vjagadish1989 Required for internal custom eventData creation

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #352 from dxichen/create-event-data-extensible

3 months agoSAMZA-1477: Fix issues found by BEAM tests
Xinyu Liu [Tue, 7 Nov 2017 20:16:15 +0000 (12:16 -0800)] 
SAMZA-1477: Fix issues found by BEAM tests

A number of issues were found by BEAM tests, including:

1) WatermarkFunction needs to be able to return output after processWatermark()
2) Control messages don't implement equals() and hashCode()
3) Some Kafka system-related code is not Scala 2.10 compatible for tests.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Prateek

Closes #345 from xinyuiscool/SAMZA-1477

3 months agoSAMZA-1476: Disable flaky TestStatefulTask.testShouldStartAndRestore
Jagadish [Tue, 31 Oct 2017 15:50:19 +0000 (08:50 -0700)] 
SAMZA-1476: Disable flaky TestStatefulTask.testShouldStartAndRestore

3 months agoSAMZA-1473; Fix handling of initial values for aggregating windows
Jagadish [Fri, 27 Oct 2017 06:14:41 +0000 (23:14 -0700)] 
SAMZA-1473; Fix handling of initial values for aggregating windows

- Handle initial values for windows with foldLeftFunctions configured
- Improve trace level logging

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek M <pmaheshw@linkedin.com>

Closes #343 from vjagadish1989/aggregating-window-fix

3 months agoSAMZA-1438; SystemProducer, Consumer and Admin interfaces for EventHubs
Daniel Chen [Fri, 27 Oct 2017 05:26:29 +0000 (22:26 -0700)] 
SAMZA-1438; SystemProducer, Consumer and Admin interfaces for EventHubs

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>, Prateek M <pmaheshw@linkedin.com>

Closes #308 from dxichen/eventHub-Connector

3 months agoSAMZA-1471: SystemConsumers should not poll ssp that hit end of stream
Hai Lu [Fri, 27 Oct 2017 00:48:47 +0000 (17:48 -0700)] 
SAMZA-1471: SystemConsumers should not poll ssp that hit end of stream

When SystemConsumers polls SSPs that have hit end of stream, no data is returned and the poll waits out the full timeout. This causes a performance issue.
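
The idea can be sketched as a simple filter applied before each poll. SSPs are represented as plain strings here, and the class and method names are hypothetical rather than taken from SystemConsumers.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the SystemConsumers implementation.
final class EndOfStreamFilter {
    // Return the registered SSPs that have not yet reached end of stream;
    // only these are worth polling.
    static Set<String> sspsToPoll(Set<String> registered, Set<String> endOfStream) {
        Set<String> result = new LinkedHashSet<>(registered);
        result.removeAll(endOfStream);
        return result;
    }

    public static void main(String[] args) {
        Set<String> registered = new LinkedHashSet<>(Arrays.asList("hdfs.in.0", "hdfs.in.1"));
        Set<String> finished = new LinkedHashSet<>(Arrays.asList("hdfs.in.0"));
        System.out.println(sspsToPoll(registered, finished));
    }
}
```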

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #342 from lhaiesp/master

4 months agoSAMZA-1472: YarnRestJobStatusProvider constructor must be public
Jacob Maes [Thu, 26 Oct 2017 01:03:11 +0000 (18:03 -0700)] 
SAMZA-1472: YarnRestJobStatusProvider constructor must be public

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Closes #340 from jmakes/samza-1472

4 months agoSAMZA-1454: Globally unique and user settable IDs for stateful operators
Prateek Maheshwari [Thu, 26 Oct 2017 00:18:36 +0000 (17:18 -0700)] 
SAMZA-1454: Globally unique and user settable IDs for stateful operators

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <jagadish@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #324 from prateekm/operator-id-uniqueness

4 months agoSAMZA-1464: Flushing a closed RocksDB store causes SIGSEGVs
Prateek Maheshwari [Wed, 25 Oct 2017 23:27:52 +0000 (16:27 -0700)] 
SAMZA-1464: Flushing a closed RocksDB store causes SIGSEGVs

Made RocksDB operations check if DB is still open to avoid segfaults.
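
A rough sketch of the guard pattern is given below. It uses an in-memory map as a stand-in for the native RocksDB handle, so no RocksDB API is shown, and throwing IllegalStateException on use after close is an assumption; the actual patch may react differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an "is the store still open?" guard.
final class GuardedStore {
    // Stand-in for the native DB handle (assumption for illustration).
    private final Map<String, String> db = new HashMap<>();
    private boolean open = true;

    synchronized void put(String key, String value) {
        ensureOpen();
        db.put(key, value);
    }

    synchronized String get(String key) {
        ensureOpen();
        return db.get(key);
    }

    synchronized void flush() {
        ensureOpen();
        // A real store would flush the native DB here.
    }

    synchronized void close() {
        open = false;
    }

    // Fail in Java instead of touching a handle that was already released.
    private void ensureOpen() {
        if (!open) {
            throw new IllegalStateException("Store is closed");
        }
    }

    public static void main(String[] args) {
        GuardedStore store = new GuardedStore();
        store.put("k", "v");
        store.flush();
        store.close();
    }
}
```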

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <jagadish@apache.org>, Xinyu Liu <xinyuiscool@gmail.com>

Closes #334 from prateekm/segfault-fix

4 months agoSAMZA-1470: Wrong job status returned by YarnRestJobStatusProvider wh…
Jacob Maes [Wed, 25 Oct 2017 22:18:11 +0000 (15:18 -0700)] 
SAMZA-1470: Wrong job status returned by YarnRestJobStatusProvider wh…

…en there are multiple app

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #339 from jmakes/samza-1470-4

4 months agoTestExecutionPlanner compilation error
Boris S [Wed, 25 Oct 2017 01:43:51 +0000 (18:43 -0700)] 
TestExecutionPlanner compilation error

Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #337 from sborya/TestExecutionPlannerCompilationError

4 months agoSAMZA-1457: Set retention for internal streams for Batch application
Xinyu Liu [Wed, 25 Oct 2017 01:20:32 +0000 (18:20 -0700)] 
SAMZA-1457: Set retention for internal streams for Batch application

For intermediate streams, checkpoint and changelog, we need to set a short retention period for batch.

Author: Xinyu Liu <xiliu@xiliu-ld1.linkedin.biz>
Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #328 from xinyuiscool/SAMZA-1457

4 months agoStreamOperatorTask does not need to be final.
Boris S [Wed, 25 Oct 2017 00:44:21 +0000 (17:44 -0700)] 
StreamOperatorTask does not need to be final.

Author: Boris S <boryas@apache.org>
Author: Boris Shkolnik <bshkolni@linkedin.com>

Reviewers: Xinyu Liu <xinyu@apache.org>

Closes #336 from sborya/UnmakeStreamOperatorTaskFinal and squashes the following commits:

c44ee58 [Boris S] StreamOperatorTask does not need to be final
d4620d6 [Boris S] Merge branch 'master' of https://github.com/apache/samza
410ce78 [Boris S] Merge branch 'master' of https://github.com/apache/samza
a31a7aa [Boris Shkolnik] reduce debugging from info to debug in KafkaCheckpointManager.java

4 months agoMinor fixes to KeyValueStore and RocksDBKeyValueStore
Prateek Maheshwari [Thu, 19 Oct 2017 18:36:02 +0000 (11:36 -0700)] 
Minor fixes to KeyValueStore and RocksDBKeyValueStore

1. Replaced extension class in KeyValueStore with default methods.
2. Fixed formatting in RocksDBKeyValueStore#openDB.
3. Now logs original RocksDBException on errors opening the db. Other minor log message cleanup.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #332 from prateekm/store-fixes

4 months agoSAMZA-1466: Flaky test: TestRocksDbKeyValueStore suite
Prateek Maheshwari [Thu, 19 Oct 2017 18:01:56 +0000 (11:01 -0700)] 
SAMZA-1466: Flaky test: TestRocksDbKeyValueStore suite

TestRocksDbKeyValueStore tests are failing intermittently with errors like:
```
testFlush FAILED
    org.apache.samza.SamzaException: Error opening RocksDB store dbStore at location /tmp, received the following exception from RocksDB org.rocksdb.RocksDBException: "
        at org.apache.samza.storage.kv.RocksDbKeyValueStore$.openDB(RocksDbKeyValueStore.scala:99)
        at org.apache.samza.storage.kv.TestRocksDbKeyValueStore.testFlush(TestRocksDbKeyValueStore.scala:73)
```

These happen intermittently because:
1. RocksDB throws an exception if open is called on a store that's already open.
2. A new unit test was added in PR #327 that didn't close the store.
3. Other tests in this class use the same store directory (System.getProperty("java.io.tmpdir")) and run concurrently. Any test that runs after the one in 2 above fails.
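
One way to avoid this kind of interference is to give every test its own store directory. The sketch below shows that approach under the assumption that a fresh temporary directory per test is acceptable; it is not necessarily what this patch does, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical test helper, not part of the Samza test suite.
final class IsolatedStoreDir {
    // Create a fresh, uniquely named directory under java.io.tmpdir so that
    // concurrently running tests never open the same store location.
    static Path newStoreDir(String testName) {
        try {
            return Files.createTempDirectory("rocksdb-" + testName + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newStoreDir("testFlush"));
    }
}
```

Closing the store at the end of each test is still needed; a unique directory only keeps an unclosed store from breaking unrelated tests.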

Even when we log the RocksDBException (fixed in PR #332), the exception message is malformed due to a bug in RocksDB: https://github.com/facebook/rocksdb/issues/1688. This is fixed in the latest RocksDB version (verified with 5.8.0), so the messages should be more meaningful after an upgrade. The fixed stack trace will say something like:
```
Caused by: org.rocksdb.RocksDBException: While lock file: /var/folders/1b/4nqqvf4s27sby0frjr0q5t_h0004hp/T/LOCK: No locks available
at org.rocksdb.RocksDB.open(Native Method)
at org.rocksdb.RocksDB.open(RocksDB.java:231)
at org.apache.samza.storage.kv.RocksDbKeyValueStore$.openDB(RocksDbKeyValueStore.scala:70)
```

Fixing just the test in the meantime.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #333 from prateekm/store-test-fix

4 months agoSAMZA-1465: Performance regression for KafkaCheckpointManager
Jacob Maes [Wed, 18 Oct 2017 22:30:34 +0000 (15:30 -0700)] 
SAMZA-1465: Performance regression for KafkaCheckpointManager

Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>, Boris Shkolnik <boryas@apache.org>, Fred Ji <fredji97@yahoo.com>

Closes #331 from jmakes/master

4 months agoSAMZA-1461; Expose RocksDB properties as metrics
Janek Lasocki-Biczysko [Mon, 16 Oct 2017 23:06:28 +0000 (16:06 -0700)] 
SAMZA-1461; Expose RocksDB properties as metrics

Automatically build gauges for RocksDB properties via configuration:
`stores.<storename>.rocksdb.telemetry.list=<rocksDbProperty1>, <rocksDbProperty2>`
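
As an illustration, reading such a comma-separated list from a configuration map could look like the sketch below. Only the config key format comes from the line above; the class name, method name, and example property values are assumptions, and this is not the actual Samza parsing code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of parsing the telemetry list.
final class TelemetryListSketch {
    // Split the configured value on commas, trim each entry, and drop
    // empty entries. Returns an empty list when the key is absent.
    static List<String> telemetryProperties(Map<String, String> config, String storeName) {
        String raw = config.get("stores." + storeName + ".rocksdb.telemetry.list");
        List<String> properties = new ArrayList<>();
        if (raw == null) {
            return properties;
        }
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                properties.add(trimmed);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        // Example values only; any RocksDB property name could be listed.
        config.put("stores.my-store.rocksdb.telemetry.list",
            "rocksdb.estimate-num-keys, rocksdb.cur-size-all-mem-tables");
        System.out.println(telemetryProperties(config, "my-store"));
    }
}
```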

Author: Janek Lasocki-Biczysko <janek.lasocki-biczysko@skyscanner.net>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #327 from janeklb/jlb_rocksDBPropertiesAsMetrics

4 months agoSAMZA-1463 disable some flaky tests on hdfs system
Hai Lu [Mon, 16 Oct 2017 18:26:45 +0000 (11:26 -0700)] 
SAMZA-1463 disable some flaky tests on hdfs system

Disable some flaky tests on the HDFS system pending further investigation.

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #329 from lhaiesp/master

4 months agoChangeLog should not require KafkaSystemAdmin
Boris S [Fri, 13 Oct 2017 22:04:22 +0000 (15:04 -0700)] 
ChangeLog should not require KafkaSystemAdmin

Author: Boris S <boryas@apache.org>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #325 from sborya/ChangeLogRequireKafkaSystemAdmin

4 months agoSAMZA-1424 SAMZA-1425 SAMZA-1426; Support serdes and persistent state for windows
Jagadish [Fri, 13 Oct 2017 20:22:56 +0000 (13:22 -0700)] 
SAMZA-1424 SAMZA-1425 SAMZA-1426; Support serdes and persistent state for windows

Notable changes:
* Made windows durable with support for persistent recoverable stores
* New storage format to support multiple messages in windows (the previous storage format of storing the entire message-list as a value in the store incurs significant serde overhead)
* Wire `TimeSeriesStore` with the WindowOperator implementation.

Testing:
* Existing unit tests and integration tests pass with serdes wired-up
* Will follow up with a PR for hello-samza soon.

Note: The majority of the changes are in `samza-core/src/main/java/org/apache/samza/operators/impl/WindowOperatorImpl.java`, and GitHub collapsed the diff.

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #321 from vjagadish1989/window-operator-serde