samza.git
20 hours agoSAMZA-1478: Delete unneeded data from intermediate Kafka topic on offset commit master
Dong Lin [Wed, 18 Apr 2018 22:31:06 +0000 (15:31 -0700)] 
SAMZA-1478: Delete unneeded data from intermediate Kafka topic on offset commit

Author: Dong Lin <lindong28@gmail.com>
Author: Dong Lin <dolin@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>, Jacob Maes <jmakes@apache.org>, Yi Pan <nickpan47@gmail.com>

Closes #347 from lindong28/SAMZA-1478

22 hours agoMisc. Util cleanup
Prateek Maheshwari [Wed, 18 Apr 2018 20:32:15 +0000 (13:32 -0700)] 
Misc. Util cleanup

Major changes:
1. Broke up 'Util' class into multiple classes: 'FileUtil', 'HttpUtil', 'CoordinatorStreamUtil'.
2. Consolidated some Util classes: MathUtil, ScalaJavaUtil
3. Removed redundant Util classes: ClassloaderUtil, ScalaToJavaUtil
4. Renamed some Util classes for consistency: TimerUtils -> TimerUtil.
5. Inlined some util classes and methods to where they're used: LexicographicComparator to RocksDBKeyValueStore, defaultSerdeFactoryFromSerdeName to SerdeManager, etc.

Rest of the changes are updates to use the new classes and methods.

Testing: Local build and test works. Tested with a locally deployed Samza job.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>, Jagadish Venkatraman <vjagadish1989@gmail.com>

Closes #455 from prateekm/util-cleanup

26 hours agoSAMZA-1640: JobModel Json deserialization error in ZkJobCoordinator.
Shanthoosh Venkataraman [Wed, 18 Apr 2018 17:00:56 +0000 (10:00 -0700)] 
SAMZA-1640: JobModel Json deserialization error in ZkJobCoordinator.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>
Author: Shanthoosh Venkataraman <svenkata@LM-LSNSCDW5132.linkedin.biz>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #466 from shanthoosh/SAMZA-1640

44 hours agoSAMZA-1651: Samza-sql - Implement GROUP BY SQL operator
Aditya Toomula [Tue, 17 Apr 2018 23:02:49 +0000 (16:02 -0700)] 
SAMZA-1651: Samza-sql - Implement GROUP BY SQL operator

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Srini P<spunuru@linkedin.com>

Closes #478 from atoomula/groupby1

44 hours agoSAMZA-1664: ZkJobCoordinator stability fixes.
Shanthoosh Venkataraman [Tue, 17 Apr 2018 23:02:16 +0000 (16:02 -0700)] 
SAMZA-1664: ZkJobCoordinator stability fixes.

Issues fixed:
* Handle job coordinator shutdown gracefully in case of unclean container shutdowns.
* Fix the zookeeper session handling logic.
* Fix the forever retry timeout in ZkClient re-connect.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #476 from shanthoosh/pullInK2Changes

44 hours agoSAMZA-1666: Benchmark SystemProducer and SystemConsumer
Srinivasulu Punuru [Tue, 17 Apr 2018 22:40:59 +0000 (15:40 -0700)] 
SAMZA-1666: Benchmark SystemProducer and SystemConsumer

* Tests to benchmark the performance of the system consumers and producers.
* Config to test the benchmark for the event hub system producer and consumer.

SystemConsumerBench and SystemProducerBench provide a generic base implementation for benchmarking system consumers and producers. Any new system that needs a benchmark test needs a properties file.

The benchmark test itself is single threaded in the way it consumes and produces events. Scaling the benchmark tests right now involves running multiple processes of these tests in parallel.

Right now we just calculate the event rate, but in the future we could create a logging metrics registry to hook up other metrics and log them to the console along with the event rate while the benchmark tests are running.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Wei Song<wsong@linkedin.com>

Closes #473 from srinipunuru/benchmark.1

2 days agoRevert "SAMZA-1645: A few issues found by BEAM stress test"
xiliu [Tue, 17 Apr 2018 18:50:44 +0000 (11:50 -0700)] 
Revert "SAMZA-1645: A few issues found by BEAM stress test"

This reverts commit 26294151642283c1cfb51590b51a86d2eaedd11f.

2 days agoInitial version for in memory system.
Bharath Kumarasubramanian [Tue, 17 Apr 2018 16:09:01 +0000 (09:09 -0700)] 
Initial version for in memory system.

Getting the PR out to unblock sanil.
Pending tasks
 1. Add documentation
 2. Add more tests
 3. Currently initialization of the stream happens using createStream on admin. We need to find a way to expose the same functionality to low level users.
 4. To populate the initial stream, clients need to get a handle on the producer and produce the messages. Alternatively, we can support serialization of source data and pass it as config to the system to initialize. We have a hook in place for this but have not implemented it completely.
 5. Clean up consumed data on the buffer based on the lowest offsets of the consumers.

I will create JIRAs for all the pending tasks and fix them iteratively.

xinyuiscool ^^

Author: Bharath Kumarasubramanian <bkumaras@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #459 from bharathkk/single-node-testing

3 days agoSAMZA-1619: Samza-sql: Support serialization of nested samza sql relational message
Aditya Toomula [Mon, 16 Apr 2018 17:26:51 +0000 (10:26 -0700)] 
SAMZA-1619: Samza-sql: Support serialization of nested samza sql relational message

Added support for serialization of nested samza sql rel message and accordingly fixed the conversion of nested avro records to rel message. Please note that we still do not have support for the sql queries that point to fields in nested records (beyond the top level record).

Author: Aditya Toomula <atoomula@linkedin.com>
Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Srinivasulu P<spunuru@linkedin.com>

Closes #464 from atoomula/serde

6 days agoSAMZA-1643: StreamPartitionCountMonitor should only restart/shut down the job if...
Prateek Maheshwari [Fri, 13 Apr 2018 15:37:20 +0000 (08:37 -0700)] 
SAMZA-1643: StreamPartitionCountMonitor should only restart/shut down the job if partition count increases

As an aside, also update the gauge to report current number of partitions instead of the change, since that's what its name indicates.
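
The decision rule described above can be sketched as follows. This is a minimal illustration, not the actual StreamPartitionCountMonitor code: the class and method names are hypothetical, and it only assumes that the monitor compares a previously seen partition count with the current one.

```java
// Hypothetical sketch; class and method names are illustrative, not Samza's.
class PartitionCountCheck {
  /** Returns true only when the partition count grew, i.e. when the job should react. */
  static boolean shouldReact(int previousCount, int currentCount) {
    return currentCount > previousCount;
  }

  /** Value the gauge would report: the current partition count, not the delta. */
  static int gaugeValue(int previousCount, int currentCount) {
    return currentCount;
  }

  public static void main(String[] args) {
    System.out.println(shouldReact(4, 8)); // count increased
    System.out.println(shouldReact(8, 8)); // count unchanged
    System.out.println(gaugeValue(4, 8));  // gauge reports the current count
  }
}
```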

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish V <jagadish@apache.org>

Closes #468 from prateekm/partition-monitor-fixes

6 days agoSAMZA-1649: Improve host-aware allocation to account for strict locality
Jagadish [Fri, 13 Apr 2018 02:23:44 +0000 (19:23 -0700)] 
SAMZA-1649: Improve host-aware allocation to account for strict locality

Currently working on a doc for the behavior of the CapacityScheduler and further testing is needed on an actual cluster - but here's a summary of why we should set relax-locality = false:
 - Node-local requests are honored only when relax-locality = false
 - With relax-locality = true, the scheduler biases interactivity over data-locality for requests that ask for few resources relative to the size of the cluster.

In addition to the above change, this PR also modifies the allocator algorithm to fall back to "ANY_HOST" requests so that we make progress when the node is unavailable.
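
The fallback idea can be illustrated with a small sketch. It is an assumption-laden simplification, not the allocator from this PR: `HostRequestSketch`, `chooseHost`, and the host names are hypothetical, and the real allocator works with YARN resource requests rather than a set of host names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of "prefer the requested host, otherwise fall back to ANY_HOST".
class HostRequestSketch {
  static final String ANY_HOST = "ANY_HOST";

  /** Returns the preferred host if it is available, otherwise ANY_HOST. */
  static String chooseHost(String preferredHost, Set<String> availableHosts) {
    if (preferredHost != null && availableHosts.contains(preferredHost)) {
      return preferredHost;
    }
    // The preferred node is unavailable: fall back so the request can still make progress.
    return ANY_HOST;
  }

  public static void main(String[] args) {
    Set<String> available = new HashSet<>(Arrays.asList("host-a", "host-b"));
    System.out.println(chooseHost("host-a", available));
    System.out.println(chooseHost("host-z", available));
  }
}
```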

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #471 from vjagadish/relax-locality-fix

6 days agoSAMZA-1650: Fix for firing trigger at the end of trigger interval for tumbling window
Aditya Toomula [Thu, 12 Apr 2018 23:50:30 +0000 (16:50 -0700)] 
SAMZA-1650: Fix for firing trigger at the end of trigger interval for tumbling window

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #472 from atoomula/window

6 days agoSAMZA-1584: Improve logging in StreamProcessor.
Shanthoosh Venkataraman [Thu, 12 Apr 2018 21:34:45 +0000 (14:34 -0700)] 
SAMZA-1584: Improve logging in StreamProcessor.

Add the processorID to the log lines wherever necessary (since we support running multiple stream applications in a JVM) and improve logging in StreamProcessor in general.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>
Author: Shanthoosh Venkataraman <svenkata@lm-lsnscdw5132.linkedin.biz>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #441 from shanthoosh/SAMZA-1584

6 days agominor fix on eventhubs size limit for event body and partition key
Hai Lu [Thu, 12 Apr 2018 20:31:39 +0000 (13:31 -0700)] 
minor fix on eventhubs size limit for event body and partition key

Author: Hai Lu <halu@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Srinivasulu Punuru<spunuru@linkedin.com>

Closes #470 from lhaiesp/master

7 days agoSAMZA-1183: Fix TestAsyncRunLoop flaky tests
Shanthoosh Venkataraman [Wed, 11 Apr 2018 22:34:43 +0000 (15:34 -0700)] 
SAMZA-1183: Fix TestAsyncRunLoop flaky tests

A. Fix TestProcessInOrder and TestProcessOutOfOrder failures.

Both tests have two tasks (T1, T2) with the following task-to-message assignments:
- Task T1: [M1, M2] (Task T1 is expected to process the messages M1 and M2).
- Task T2: [M3] (Task T2 is expected to process the message M3 and stop the runloop).

In some cases, before task T1 finishes processing all its messages, T2 gets scheduled and stops the runloop (stopping all tasks).

Both tests wait on a CountDownLatch, expecting the tasks to process all the messages. In the above scenario the latch never reaches zero, thereby causing the tests to wait indefinitely.

Fix: Remove the manual shutdown invocation from Task T2 and use EOS messages to stop the runloop.

B.  Fix TestCommitSingleTask and TestCommitMultipleTask failures.

Both tests had two tasks (T1, T2).

Task T2 is expected to stop the runloop and task T1 marks the commit flag through taskCoordinator.commit(requestScope).

Failure occurs in some scenarios when task T2 stops the runloop before the runloop commits both tasks.

Fix: Trigger the runloop shutdown sequence after we’ve committed the tasks.

C. Combine the duplicate tests and unify their assertions.

D. Verification: Ran the following script to verify the fixes.

`root88cf6be1c11a:~/samza# i=0`
`root88cf6be1c11a:~/samza# while [ $i -lt 100 ]; do`
`    i=$(expr $i + 1)`
`    ./gradlew clean :samza-core:test -Dtest.single="TestAsyncRunLoop" --debug --stacktrace >> test-logs`
`done;`

There were no failures.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #463 from shanthoosh/re_enable_async_run_loop_tests_2

7 days agoSAMZA-1645: A few issues found by BEAM stress test
xinyuiscool [Wed, 11 Apr 2018 20:21:36 +0000 (13:21 -0700)] 
SAMZA-1645: A few issues found by BEAM stress test

1. Revert the priority set to intermediate streams.
2. Fix a watermark propagation condition

Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Prateek M <prateekm@apache.org>

Closes #469 from xinyuiscool/SAMZA-1645

8 days agoUpgrade to latest event hub version (1.0.1)
Srinivasulu Punuru [Tue, 10 Apr 2018 22:39:21 +0000 (15:39 -0700)] 
Upgrade to latest event hub version (1.0.1)

* Upgrade to the latest event hub version 1.0.1
* Adding configs for prefetchCount and maxEventPerPoll
* Fix the high cpu usage issue in SamzaHistogram
* Fixing a race condition in event hub system producer where the future was getting removed while it was being checked for completion resulting in NPE.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #467 from srinipunuru/upgrade-eh-1.0.1

2 weeks agoRefactoring the EventHub System Producer into reusable AsyncSystemProducer
Srinivasulu Punuru [Sat, 31 Mar 2018 00:45:41 +0000 (17:45 -0700)] 
Refactoring the EventHub System Producer into reusable AsyncSystemProducer

Refactoring eventhub system producer into common reusable components
1. AsyncSystemProducer: All the system producers that have a real async send API with callbacks can use this.
2. NoFlushAsyncSystemProducer: All the system producers that have a real async callback-based send API and don't provide flush semantics can use this.

The system producers that implement AsyncSystemProducer can be used across Samza and Brooklin.

TODO: AsyncSystemProducer needs to be moved to the api layer so that it can be used across different system producers (i.e. eventhub and kinesis)

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Boris S <boryas@apache.org>

Closes #458 from srinipunuru/async-prod.3

2 weeks agoSAMZA-1630: Move thread dump from stdout to logs
Prateek Maheshwari [Sat, 31 Mar 2018 00:11:49 +0000 (17:11 -0700)] 
SAMZA-1630: Move thread dump from stdout to logs

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <jvenkatr@linkedin.com>

Closes #462 from prateekm/thread-dump-on-timeout

2 weeks agoSAMZA-1632: KinesisConfig: Making getProxyHost and getProxyPort APIs protected
Aditya Toomula [Thu, 29 Mar 2018 22:57:58 +0000 (15:57 -0700)] 
SAMZA-1632: KinesisConfig: Making getProxyHost and getProxyPort APIs protected

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #457 from atoomula/kconfig

2 weeks agoSD-1599: Improve the efficiency of the AsynRunLoop when some partitio…
James Lent [Thu, 29 Mar 2018 22:26:28 +0000 (15:26 -0700)] 
SD-1599: Improve the efficiency of the AsynRunLoop when some partitio…

…ns are empty.

Author: James Lent <jlent@nc.rr.com>

Reviewers: Yi Pan <nickpan47@gmail.com>, Xinyu Liu <xiliu@linkedin.com>

Closes #436 from jwlent55/SD-1599-improve-async-run-loop-efficiency and squashes the following commits:

ef0e53bb [James Lent] SD-1599: Remove incorrect call to containerMetrics left by previous update.
3899216e [James Lent] SD-1599: Combine the blockIfBusy and blockIfNoWork logic inside one method and a common latch.
e3837809 [James Lent] SD-1599: Mark runLoopResumedSinceLastChecked as volatile.
a7c0ac4c [James Lent] SD-1599: Explicitly set the timeout value to either 'noNewMessagesTimeout' or 0.
b2dd61e2 [James Lent] SD-1599: Address the first set of code inspection comments.
cc34518a [James Lent] SD-1599: Improve the efficiency of the AsynRunLoop when some partitions are empty.

3 weeks agoSAMZA-1630: Log a thread dump on timeouts
Prateek Maheshwari [Thu, 29 Mar 2018 01:14:34 +0000 (18:14 -0700)] 
SAMZA-1630: Log a thread dump on timeouts

Would be useful to get a thread dump on timeouts, e.g. for AsyncStreamTask callback timeout, container shutdown timeout, heartbeat monitor graceful shutdown timeout etc.

Author: Prateek Maheshwari <prateekm@utexas.edu>
Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #460 from prateekm/thread-dump-on-timeout

3 weeks agoSAMZA-1631: Improve logging on Task callback timeout
Prateek Maheshwari [Thu, 29 Mar 2018 00:52:58 +0000 (17:52 -0700)] 
SAMZA-1631: Improve logging on Task callback timeout

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jacob Maes <jmakes@apache.org>

Closes #461 from prateekm/task-callback-logging

3 weeks agoSAMZA-1627: Watermark broadcast enhancements
xinyuiscool [Wed, 28 Mar 2018 17:25:15 +0000 (10:25 -0700)] 
SAMZA-1627: Watermark broadcast enhancements

Currently each upstream task needs to broadcast to every single partition of the intermediate streams in order to aggregate watermarks in the consumers. A better way to do this is to have only one downstream consumer do the aggregation and then broadcast to all the partitions. This is safe because we can prove that the broadcast watermark message comes after all the upstream tasks have finished producing the events whose event times are before the watermark. This reduces the full message count from O(n^2) to O(n).
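
To make the O(n^2) versus O(n) claim concrete, here is a back-of-the-envelope sketch. The counting model is an assumption for illustration: with U upstream tasks and P partitions, the old scheme sends roughly U * P watermark messages, while the new scheme sends roughly U messages to the single aggregator plus P messages for its broadcast. The class and method names are hypothetical.

```java
// Hypothetical counting model for watermark messages per watermark round.
class WatermarkMessageCount {
  /** Old scheme: every upstream task broadcasts to every partition. */
  static long before(long upstreamTasks, long partitions) {
    return upstreamTasks * partitions;
  }

  /** New scheme: each task reports to one aggregator, which then broadcasts once. */
  static long after(long upstreamTasks, long partitions) {
    return upstreamTasks + partitions;
  }

  public static void main(String[] args) {
    long n = 100;
    System.out.println("before: " + before(n, n)); // 100 * 100 messages
    System.out.println("after:  " + after(n, n));  // 100 + 100 messages
  }
}
```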

Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Boris S <sborya@gmail.com>

Closes #456 from xinyuiscool/SAMZA-1627

3 weeks agoSkipping large messages in the EventHub system producer
Srinivasulu Punuru [Tue, 27 Mar 2018 18:39:52 +0000 (11:39 -0700)] 
Skipping large messages in the EventHub system producer

EventHubs restricts the maximum message size. This adds a `systems.%s.eventhubs.skipMessagesLargerThanBytes` config that makes the event hub system producer skip messages larger than the given number of bytes, so that we don't even try to send large messages that EventHubs would reject anyway.
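
A size check of this kind might look like the sketch below. Only the config key format comes from the commit message; the class name, the method, and the convention that a negative threshold disables skipping are assumptions made for illustration and may differ from the real producer.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a "skip messages larger than N bytes" check.
class LargeMessageFilter {
  // Config key format from the commit message; %s is the system name.
  static final String SKIP_CONFIG_FORMAT = "systems.%s.eventhubs.skipMessagesLargerThanBytes";

  /** Assumption: a negative threshold means "never skip". */
  static boolean shouldSkip(byte[] payload, int maxBytes) {
    return maxBytes >= 0 && payload.length > maxBytes;
  }

  public static void main(String[] args) {
    byte[] payload = "0123456789".getBytes(StandardCharsets.UTF_8); // 10 bytes
    System.out.println(String.format(SKIP_CONFIG_FORMAT, "my-eventhub-system"));
    System.out.println(shouldSkip(payload, 5));  // larger than the limit
    System.out.println(shouldSkip(payload, 64)); // within the limit
  }
}
```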

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #454 from srinipunuru/skip.2

3 weeks agoScheduleAfterDebounceTime should catch all Throwable to avoid losing unhandled Errors
thunderstumpges [Mon, 26 Mar 2018 23:08:00 +0000 (16:08 -0700)] 
ScheduleAfterDebounceTime should catch all Throwable to avoid losing unhandled Errors

If an Action executed in the scheduler throws an Error (or other Throwable?) besides Exception, it is silently lost since the Action/Runnable wrapper only catches Exception, not Throwable. This made my troubleshooting of an issue very difficult. Made it seem like the code was "hung" when really it had thrown a "NoSuchMethodError" (Error instead of Exception) due to a simple dependency issue on my side.

Catching Throwable instead ensures this is handled and propagated properly.
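
The difference can be shown with a small self-contained sketch. It is not the ScheduleAfterDebounceTime code itself: `CatchThrowableSketch`, `wrap`, and the error handler are hypothetical names used only to show why `catch (Exception e)` would miss a `NoSuchMethodError` while `catch (Throwable t)` does not.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch of a Runnable wrapper that reports Errors as well as Exceptions.
class CatchThrowableSketch {
  /** Wraps an action so that any Throwable it raises is passed to the handler. */
  static Runnable wrap(Runnable action, Consumer<Throwable> onError) {
    return () -> {
      try {
        action.run();
      } catch (Throwable t) { // catch (Exception e) would let NoSuchMethodError escape
        onError.accept(t);
      }
    };
  }

  public static void main(String[] args) {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    Runnable failing = () -> { throw new NoSuchMethodError("simulated dependency problem"); };
    wrap(failing, seen::set).run();
    System.out.println("captured: " + seen.get());
  }
}
```

When a task like this runs on a scheduled executor, an uncaught Throwable is typically stored in the returned future rather than logged, which is consistent with the "hung" symptom described above.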

Author: thunderstumpges <tstumpges@ntent.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #450 from thunderstumpges/scheduler-catch-throwable

3 weeks agoMaking event hub configs Samza compliant
Srinivasulu Punuru [Fri, 23 Mar 2018 00:08:11 +0000 (17:08 -0700)] 
Making event hub configs Samza compliant

Fixes for Bugs

- SAMZA-1571 Make Eventhubs system configs compatible with Samza standalone.
- SAMZA-1624 EventHub system should prefix the configs with sensitive for SasKey and SasToken
- SAMZA-1625 EventHub systemAdmin is swallowing exceptions
- SAMZA-1626 EventHub system admin is not returning the metadata for all the ssps requested for

Description

1. Right now event hub doesn't follow Samza's config convention of naming the secrets as "sensitive" so that they are masked before they are logged.
2. Event hub configs use the old system.<systemName>.streams.<streamName>, which is blacklisted in Samza standalone. So moving these configs to the newer <streams>.<streamid>
3. Wrapping the underlying exception properly in the SamzaException in EventHubSystemAdmin
4. Porting Bharat's fix to return the metadata for all the ssps requested for in EventHubSystemAdmin

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #453 from srinipunuru/eh.1

4 weeks agoAdd stream-table join support for samza sql
Aditya Toomula [Thu, 22 Mar 2018 00:58:07 +0000 (17:58 -0700)] 
Add stream-table join support for samza sql

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #425 from atoomula/join

4 weeks agoSAMZA-1623: include avro as the file suffix for hdfs producer
Hai Lu [Wed, 21 Mar 2018 20:13:16 +0000 (13:13 -0700)] 
SAMZA-1623: include avro as the file suffix for hdfs producer

AvroDataFileHdfsWriter should include avro as the file suffix as some pig jobs couldn't read the avro files if they don't come with the proper suffix

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #452 from lhaiesp/master

4 weeks agoSAMZA-1608 : Add hidden config to enable explicit stream creation in StreamAppender...
Daniel Nishimura [Fri, 16 Mar 2018 20:10:56 +0000 (13:10 -0700)] 
SAMZA-1608 : Add hidden config to enable explicit stream creation in StreamAppender due to bug.

Due to an intermittent bug that causes the explicit stream creation in `StreamAppender` to hang, a hidden config is added to enable/disable explicit stream creation. By default this is disabled, which reverts to the previous behavior.

When the intermittent hang bug is fixed, the config will either be removed or made public.

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #442 from dnishimura/samza-1608-disable-streamappender-create-stream

4 weeks agoSAMZA-1622: avro writer to support generic record
Hai Lu [Fri, 16 Mar 2018 16:22:01 +0000 (09:22 -0700)] 
SAMZA-1622: avro writer to support generic record

avro writer in HDFS system producer to support generic record

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #449 from lhaiesp/master

4 weeks agoSAMZA-1615: Fix a couple of issues in ControlMessageSender
Xinyu Liu [Thu, 15 Mar 2018 23:47:22 +0000 (16:47 -0700)] 
SAMZA-1615: Fix a couple of issues in ControlMessageSender

Two issues I found during testing: 1) metadataCache.getSystemStreamMetadata(): if we pass in partitionOnly as true, it does not actually store the metadata in the cache, so every call results in another query to Kafka. I turned off partitionOnly so that the metadata is cached. 2) Change the log level from info to debug.

Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Boris S <sborya@gmail.com>

Closes #444 from xinyuiscool/SAMZA-1615

5 weeks agoSAMZA-1611: BootstrappingChooser should use systemAdmin offsetComparator API to compa...
Aditya Toomula [Wed, 14 Mar 2018 00:43:45 +0000 (17:43 -0700)] 
SAMZA-1611: BootstrappingChooser should use systemAdmin offsetComparator API to compare the offsets

Author: Aditya Toomula <atoomula@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #443 from atoomula/bootstrap

5 weeks agoSAMZA-1618: fix HdfsFileSystemAdapter to get files recursively
Hai Lu [Tue, 13 Mar 2018 19:28:11 +0000 (12:28 -0700)] 
SAMZA-1618: fix HdfsFileSystemAdapter to get files recursively

fix HdfsFileSystemAdapter to get files recursively

Author: Hai Lu <halu@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #447 from lhaiesp/master

5 weeks agoSAMZA-1610: Implementation of remote table provider
Peng Du [Fri, 9 Mar 2018 22:57:56 +0000 (14:57 -0800)] 
SAMZA-1610: Implementation of remote table provider

Please see commit messages for detailed descriptions.

Author: Peng Du <pdu@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Wei Song <wsong@linkedin.com>

Closes #432 from pdu-mn1/remote-table-0222

5 weeks agoInfinite loop when trying to use SamzaSqlApplicationRunner in yarn mode
Srinivasulu Punuru [Fri, 9 Mar 2018 19:01:07 +0000 (11:01 -0800)] 
Infinite loop when trying to use SamzaSqlApplicationRunner in yarn mode

Right now when we try to use the SamzaSqlApplicationRunner in yarn mode, it goes into an infinite loop because the appRunnerConfig that we try to set is overwritten by the appRunnerConfig passed by the job, and SamzaSqlApplicationRunner keeps creating it again and again.

Fix: User config for app.runner.class should not override the computed one. Added a test case to validate this.
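
The precedence rule behind this fix can be sketched as a simple map merge. This is an illustration under stated assumptions, not the SamzaSqlApplicationRunner code: `RunnerConfigMerge`, `merge`, and the runner class names are hypothetical; only the `app.runner.class` key comes from the commit message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the computed runner class must win over the user-supplied one.
class RunnerConfigMerge {
  static final String APP_RUNNER_CLASS = "app.runner.class";

  /** Copies the user config, then applies the computed value last so it takes precedence. */
  static Map<String, String> merge(Map<String, String> userConfig, String computedRunnerClass) {
    Map<String, String> merged = new HashMap<>(userConfig);
    merged.put(APP_RUNNER_CLASS, computedRunnerClass);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> user = new HashMap<>();
    user.put(APP_RUNNER_CLASS, "com.example.SqlRunner"); // hypothetical user value
    user.put("job.name", "my-sql-job");
    Map<String, String> merged = merge(user, "com.example.RemoteRunner"); // hypothetical computed value
    System.out.println(merged.get(APP_RUNNER_CLASS));
    System.out.println(merged.get("job.name"));
  }
}
```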

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #440 from srinipunuru/bug-fix.1

5 weeks agoSAMZA-1589: Reduce failure retry duration in KafkaCheckpointManager.writeCheckpoint
Shanthoosh Venkataraman [Fri, 9 Mar 2018 02:00:15 +0000 (18:00 -0800)] 
SAMZA-1589: Reduce failure retry duration in KafkaCheckpointManager.writeCheckpoint

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #438 from shanthoosh/SAMZA-1589

6 weeks agoSAMZA-1607: Handle ZkNoNodeExistsException in zkUtils.readProcessorData
Shanthoosh Venkataraman [Wed, 7 Mar 2018 21:24:54 +0000 (13:24 -0800)] 
SAMZA-1607: Handle ZkNoNodeExistsException in zkUtils.readProcessorData

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #437 from shanthoosh/fix_zkutils_get_processor_data

6 weeks agoSAMZA-1602: Moving class StreamAssert from src/test to src/main
sanil15 [Wed, 7 Mar 2018 21:22:40 +0000 (13:22 -0800)] 
SAMZA-1602: Moving class StreamAssert from src/test to src/main

Tested with ./gradlew clean build, which is passing

Author: sanil15 <sanil.jain15@gmail.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #439 from Sanil15/SAMZA-1602

6 weeks agoSAMZA-1498: Support arbitrary system clock timer in operators
xiliu [Wed, 7 Mar 2018 02:20:15 +0000 (18:20 -0800)] 
SAMZA-1498: Support arbitrary system clock timer in operators

This patch adds the capability to register arbitrary timers for both the high-level and low-level APIs.
For the high-level API, InitableFunction will pass the TimerRegistry to the user through the new OpContext, and the user will implement the TimerFunction to get timer notifications.
For the low-level API, the user can register a timer in the TaskContext and then implement the TimerCallback for specific timer actions.

Author: xiliu <xiliu@linkedin.com>
Author: xinyuiscool <xinyuliu.us@gmail.com>
Author: xinyuiscool <xiliu@linkedin.com>

Reviewers: Prateek M <prateekm@gmail.com>

Closes #419 from xinyuiscool/SAMZA-1498

6 weeks agoSAMZA-1600: remove the combination of cleanup policy "compact,delete"…
Yi Pan (Data Infrastructure) [Tue, 6 Mar 2018 03:40:45 +0000 (19:40 -0800)] 
SAMZA-1600: remove the combination of cleanup policy "compact,delete"…

… in changelog topic properties

Author: Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #435 from nickpan47/SAMZA-1600

6 weeks agoSAMZA-1460: StreamAppender does not explicitly create logging topic
Daniel Nishimura [Tue, 6 Mar 2018 00:02:49 +0000 (16:02 -0800)] 
SAMZA-1460: StreamAppender does not explicitly create logging topic

Creates the StreamAppender stream explicitly instead of relying on auto stream creation.

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Prateek M <pmaheshw@linkedin.com>

Closes #423 from dnishimura/samza-1460-streamappender-create-logging-topic

6 weeks agoMisc. minor cleanup.
Prateek Maheshwari [Sun, 4 Mar 2018 00:30:59 +0000 (16:30 -0800)] 
Misc. minor cleanup.

1. Added a meaningful name for the container thread pool threads.
2. Made the thread names for framework threads consistent.
3. Made a couple of monitoring/metrics threads daemon.
4. Fixed a few checkstyle warning about missing param/throws documentation.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Jacob M <jmakes@apache.org>

Closes #433 from prateekm/container-thread-pool-name

6 weeks agoSAMZA-1598: Cleanup README.md file
Daniel Nishimura [Sun, 4 Mar 2018 00:29:44 +0000 (16:29 -0800)] 
SAMZA-1598: Cleanup README.md file

Update old links. Remove Jenkins build badge, in favor of Travis-CI.

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #434 from dnishimura/samza-1598-clean-readme

7 weeks agoSAMZA-1596: Staging directory name has to be formatted in config
Akim Akimov [Thu, 1 Mar 2018 18:58:06 +0000 (10:58 -0800)] 
SAMZA-1596: Staging directory name has to be formatted in config

When we instantiate an HDFS config staging directory, we are missing a formatter for getStagingDirectory, so systems.hdfs-system-name.stagingDirectory is not parsed from the config; only the literal key systems.%s.stagingDirectory is parsed instead.

The solution is to add a formatter to the getStagingDirectory method.
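
The bug and its fix come down to whether the `%s` placeholder is substituted before the key is looked up. The sketch below illustrates this with a hypothetical class and method (`StagingDirectoryKey`, `keyFor`); only the key pattern and the example system name come from the commit message.

```java
// Hypothetical sketch of formatting a per-system config key before lookup.
class StagingDirectoryKey {
  static final String STAGING_DIRECTORY = "systems.%s.stagingDirectory";

  /** Substitutes the system name into the key pattern. */
  static String keyFor(String systemName) {
    return String.format(STAGING_DIRECTORY, systemName);
  }

  public static void main(String[] args) {
    // Without formatting, the lookup would use the literal pattern below.
    System.out.println(STAGING_DIRECTORY);
    // With formatting, the lookup uses the key that users actually configure.
    System.out.println(keyFor("hdfs-system-name"));
  }
}
```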

Author: Akim Akimov <zedmor@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>, Hai L<halu@linkedin.com>

Closes #431 from Zedmor/FixStagingDirHDFS

7 weeks agoSAMZA-1597: Expose an interface for throttling
Wei Song [Tue, 27 Feb 2018 19:47:51 +0000 (11:47 -0800)] 
SAMZA-1597: Expose an interface for throttling

7 weeks agoSAMZA-1595: Fix scalacCompileOptions format to build with zinc scala compiler.
Shanthoosh Venkataraman [Fri, 23 Feb 2018 00:44:00 +0000 (16:44 -0800)] 
SAMZA-1595: Fix scalacCompileOptions format to build with zinc scala compiler.

The Zinc scala compiler (part of gradle versions >= 3.0) expects the scala compilation arguments as a list (where each compilation argument is an element of the list).

In samza, the compilation arguments are concatenated into a single string and passed to the compiler.

This causes build failures when samza is built with the Zinc scala compiler.

The existing ant scala compiler used to build samza in open source accepts the compilation arguments both as a list and as a string.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #430 from shanthoosh/master

7 weeks agoSAMZA-1594: Remove ScalaCompileOptions to make samza codebase build with gradle versi...
Shanthoosh Venkataraman [Thu, 22 Feb 2018 23:58:39 +0000 (15:58 -0800)] 
SAMZA-1594: Remove ScalaCompileOptions to make samza codebase build with gradle version > 3.0.

When samza repository is built with the gradle version greater than 3.0, we notice the following build failure.

No such property: useAnt for class: org.gradle.api.tasks.scala.ScalaCompileOptions

This needs to be fixed to build samza with gradle version >= 3.0.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #427 from shanthoosh/fix_scalac_compile_options

7 weeks agoSAMZA-1593: Upgrade gradle nexus plugin.
Shanthoosh Venkataraman [Thu, 22 Feb 2018 20:39:10 +0000 (12:39 -0800)] 
SAMZA-1593: Upgrade gradle nexus plugin.

The nexus plugin that the samza repository currently uses is not compatible with gradle versions greater than 3.0.

For doing the ligradle migration at linkedin, we need to update the nexus gradle plugin to the latest version.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #426 from shanthoosh/master

8 weeks agoFixes to get the build working with Scala 2.10 build
Srinivasulu Punuru [Wed, 21 Feb 2018 19:45:23 +0000 (11:45 -0800)] 
Fixes to get the build working with Scala 2.10 build

The fixes needed to get the build working with the Scala 2.10.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>, Jacob M<jmakes@apache.org>, Bharath K<bkumaras@linkedin.com>, Xinyu L<xiliu@linkedin.com>

Closes #424 from srinipunuru/rel-fixes.1

2 months agoSAMZA-1555: Move creation of checkpoint and changelog streams to the Job Coordinators
Daniel Nishimura [Fri, 16 Feb 2018 22:01:06 +0000 (14:01 -0800)] 
SAMZA-1555: Move creation of checkpoint and changelog streams to the Job Coordinators

**Overview**
The purpose of this PR is to consolidate the creation of the changelog and checkpoint streams into the JobCoordinators. In the current state, the changelog stream is created from the JobModelManager and the checkpoint stream is created within the OffsetManager. The issue with creating the checkpoint in the OffsetManager is that the first call happens from the first SamzaContainer that runs and each subsequent SamzaContainer run will attempt to create the checkpoint stream.

**Motivations**
There are three driving forces for this refactoring. The first motivation is to assign the creation of the changelog and checkpoint streams to the JobCoordinators where it is most appropriate. This was discussed in more detail with nickpan47. The second motivation is to have any potential failure in stream creation happen no later than during job coordination. The third motivation is to accommodate future security work to provide a robust way to set ACLs on streams.

Follow on to this PR will be: https://issues.apache.org/jira/browse/SAMZA-1564

Author: Daniel Nishimura <dnishimura@gmail.com>

Reviewers: Yi Pan <nickpan47@gmail.com>, Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #413 from dnishimura/samza-1555-move-changelog-checkpoint-creation and squashes the following commits:

1102314 [Daniel Nishimura] Fix a comment change that Intellij inadvertently refactored.
2b2eb16 [Daniel Nishimura] Trigger CI again
366b7b7 [Daniel Nishimura] Trigger CI
dce36b6 [Daniel Nishimura] Add isBootstrapped flag back to CoordinatorStreamSystemConsumer.
2efcfd4 [Daniel Nishimura] Trigger CI
6b2d912 [Daniel Nishimura] Changes related to Yi's latest review.
9d02e7a [Daniel Nishimura] Change createChangeLogStream to a static method.
effec24 [Daniel Nishimura] Cleanup from latest code review.
1bdfda7 [Daniel Nishimura] CoordinatorStreamManager lifecycle refactoring.
5178ca5 [Daniel Nishimura] Refactor AbstractCoordinatorStreamManager as a concrete class.
894af9c [Daniel Nishimura] Merge from master branch.
4009a0b [Daniel Nishimura] Changes from code review.
3a12a75 [Daniel Nishimura] Separation of changelog manager and jobmodel manager. Create CoordinatorStream class to encapsulate creation and management of coordinator stream consumer and producer.
c188adb [Daniel Nishimura] Merge from master branch.
971fa91 [Daniel Nishimura] Move the responsibility of changelog and checkpoint stream creation to the job coordinators.

2 months agoSAMZA-1588: Add random jitter to monitor’s scheduling interval.
Shanthoosh Venkataraman [Fri, 16 Feb 2018 18:34:21 +0000 (10:34 -0800)] 
SAMZA-1588: Add random jitter to monitor’s scheduling interval.

We’ve observed in LinkedIn execution environments that all the monitors running on the YARN node-manager machines hit an external service at the same time, based on the configured monitor scheduling interval.

To eliminate the resulting spike in monitor executions and the congestion it causes on the external service, it’s essential to add a random jitter to the monitor scheduling interval.

Random jitter will be added to the monitor scheduling interval based on a boolean configuration.
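
As a rough illustration of the idea (not the code from this patch), jitter can be applied by adding a random offset to the base interval when a boolean flag is set. The class name, method name, and the `maxJitterMs` bound below are assumptions made for the sketch:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; names and the jitter bound are illustrative assumptions.
final class JitteredInterval {
    // Returns the base interval, plus a random delay in [0, maxJitterMs] when jitter is enabled.
    static long nextDelayMs(long baseIntervalMs, long maxJitterMs, boolean jitterEnabled) {
        if (!jitterEnabled || maxJitterMs <= 0) {
            return baseIntervalMs;
        }
        // nextLong(bound) is exclusive of the bound, hence the + 1.
        return baseIntervalMs + ThreadLocalRandom.current().nextLong(maxJitterMs + 1);
    }
}
```

With the flag off, every monitor keeps the configured interval; with it on, monitors on different machines drift apart instead of firing together.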

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #422 from shanthoosh/SAMZA-1588

2 months agoSupport source types where the last part of the source is not the streamName
Srinivasulu Punuru [Thu, 15 Feb 2018 23:46:17 +0000 (15:46 -0800)] 
Support source types where the last part of the source is not the streamName

Contains following fixes

1. Right now Samza SQL framework assumes that the last part of the source is the stream Name, removed the assumption
2. Made consoleLoggingSystemFactory to log formatted json so that it's easily readable.
3. Added support in SamzaSqlRelMessage where the key may not be present.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #421 from srinipunuru/four-part.1

2 months agoSAMZA-1568: Handle ZkInterruptedException in zkclient.close.
Shanthoosh Venkataraman [Wed, 14 Feb 2018 03:30:03 +0000 (19:30 -0800)] 
SAMZA-1568: Handle ZkInterruptedException in zkclient.close.

When zookeeper session failures occur in a stream processor, the processor leaves the group (zkClient is closed) and joins the group again.

The last step in that shutdown sequence is zkClient.close(). In some scenarios, it throws the following exception:

    org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
    at org.I0Itec.zkclient.ZkClient.close(ZkClient.java:1278)
    at org.apache.samza.zk.ZkControllerImpl.stop(ZkControllerImpl.java:92)
    at org.apache.samza.zk.ZkJobCoordinator.stop(ZkJobCoordinator.java:141)

The existing implementation does not handle this exception, thereby killing the stream processor. The following code path triggers it:

`StreamProcessor.stop -> ZkJobCoordinator.stop() ->  zkController.stop() -> zkUtils.close`

This exception causes the integration test to fail occasionally and can cause the LocalApplicationRunner.waitForFinish method call to block indefinitely (the latch state that waitForFinish needs in order to return is updated only when this callback event succeeds).
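
One way to handle such an exception is sketched below. This is not the patch itself: the nested `ZkInterruptedException` is a local stand-in for the zkclient class (assumed here to be unchecked), and `closeQuietly` is a hypothetical helper name.

```java
// Sketch only; the exception type is a stand-in, not the real zkclient class.
final class SafeClose {
    static class ZkInterruptedException extends RuntimeException {
        ZkInterruptedException(InterruptedException cause) {
            super(cause);
        }
    }

    // Runs the close action; an interrupt raised during close is swallowed
    // so that the rest of the shutdown sequence can still complete.
    static boolean closeQuietly(Runnable closeAction) {
        try {
            closeAction.run();
            return true;
        } catch (ZkInterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Catching the exception at the close call lets the stop callback finish, which is what the latch used by waitForFinish depends on.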

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #416 from shanthoosh/zk_utils_close

2 months agoSAMZA-1572: Add fixed retries on failure in KafkaCheckpointManager.
Shanthoosh Venkataraman [Tue, 13 Feb 2018 19:22:34 +0000 (11:22 -0800)] 
SAMZA-1572: Add fixed retries on failure in KafkaCheckpointManager.

KafkaCheckpointManager.writeCheckpoint currently goes into an infinite loop when an irrecoverable failure happens. This blocks the commit phase indefinitely (thereby preventing processing). Added finite retries (50), so that a failed write is retried for a fixed time before giving up.
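
The general shape of a bounded retry loop looks roughly like the following. It is a generic sketch, not the KafkaCheckpointManager code; `withRetries`, its parameters, and the fixed backoff are assumptions.

```java
import java.util.function.Supplier;

// Generic bounded-retry sketch; names and backoff policy are illustrative.
final class BoundedRetry {
    // Calls op up to maxAttempts times, sleeping backoffMs between attempts,
    // and rethrows the last failure once the attempts are exhausted.
    static <T> T withRetries(Supplier<T> op, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread is interrupted
                }
            }
        }
        throw last;
    }
}
```

A call such as `BoundedRetry.withRetries(writeOp, 50, 100L)` would give up after 50 failed attempts instead of looping forever.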

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Prateek M<prateekm@cs.utexas.edu>

Closes #420 from shanthoosh/add_fixed_retries_in_kafka_checkpoint_manager

2 months agoSAMZA-1489: TaskInstance should commit offset before it closes() if auto commit is...
Dong Lin [Thu, 1 Feb 2018 20:07:56 +0000 (12:07 -0800)] 
SAMZA-1489: TaskInstance should commit offset before it closes() if auto commit is enabled

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #417 from lindong28/SAMZA-1489

2 months agoSAMZA-1552: Host affinity improvements - Improve matching of hosts to allocated resources
Jagadish [Thu, 1 Feb 2018 05:38:14 +0000 (21:38 -0800)] 
SAMZA-1552: Host affinity improvements - Improve matching of hosts to allocated resources

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek <prateekm@cs.utexas.edu>

Closes #401 from vjagadish1989/host-affinity-fix

2 months agoSAMZA-1578: Fix watermark bug found by BEAM tests
Xinyu Liu [Fri, 26 Jan 2018 22:02:42 +0000 (14:02 -0800)] 
SAMZA-1578: Fix watermark bug found by BEAM tests

The problem is that getOutputWatermark() does not return the real outputWatermark. This caused a problem when the user overrides the watermark function.

Author: xiliu <xiliu@linkedin.com>

Reviewers: Jagadish <vjagadish1989@gmail.com>

Closes #415 from xinyuiscool/SAMZA-1578

2 months agoAdded some logging to stdout for easier parsing by tools.
Prateek Maheshwari [Fri, 26 Jan 2018 20:04:29 +0000 (12:04 -0800)] 
Added some logging to stdout for easier parsing by tools.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #414 from prateekm/print-container-info

2 months agoSAMZA-1548; Add start() and stop() to SystemAdmin
Dong Lin [Fri, 26 Jan 2018 01:28:13 +0000 (17:28 -0800)] 
SAMZA-1548; Add start() and stop() to SystemAdmin

This patch adds start() and stop() to SystemAdmin interface. This can be useful for e.g. kafka.admin.AdminClient which needs to be started before it can be used.

Because these methods are added to the interface, and AdminClient is expected to be stateful and probably to have its own thread, instantiating a new SystemAdmin will cost more. We therefore want to re-use the SystemAdmin instances instead of creating a SystemAdmin on demand. For that reason, this patch also adds a SystemAdmins class to help manage a map from system to SystemAdmin, similar to the existing SystemProducers class in Samza.
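
The pattern described here, a holder that owns one long-lived admin per system and starts and stops them together, can be sketched as follows. The `Admin` interface and `AdminRegistry` class are hypothetical stand-ins, not the Samza `SystemAdmin` or `SystemAdmins` types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the Samza classes.
final class AdminRegistry {
    interface Admin {
        void start();
        void stop();
    }

    private final Map<String, Admin> adminsBySystem = new LinkedHashMap<>();

    AdminRegistry(Map<String, Admin> adminsBySystem) {
        this.adminsBySystem.putAll(adminsBySystem);
    }

    // Starts every registered admin once, before first use.
    void start() {
        adminsBySystem.values().forEach(Admin::start);
    }

    // Stops every registered admin on shutdown.
    void stop() {
        adminsBySystem.values().forEach(Admin::stop);
    }

    // Returns the shared instance for a system instead of creating a new one.
    Admin get(String systemName) {
        Admin admin = adminsBySystem.get(systemName);
        if (admin == null) {
            throw new IllegalArgumentException("No admin registered for system: " + systemName);
        }
        return admin;
    }
}
```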

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #397 from lindong28/SAMZA-1548

2 months agoSAMZA-1557: Broadcast operator
Xinyu Liu [Wed, 24 Jan 2018 22:27:39 +0000 (14:27 -0800)] 
SAMZA-1557: Broadcast operator

This patch adds a Broadcast operator that allows broadcasting messages to all tasks. It's the counterpart of the Samza broadcast stream in the low level API, and will be used by the BEAM runner to broadcast views as side input to other parts of the pipeline.

Author: xiliu <xiliu@linkedin.com>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #410 from xinyuiscool/SAMZA-1557

2 months agofix the bug containsValue method
michaelwong [Mon, 22 Jan 2018 22:11:25 +0000 (14:11 -0800)] 
fix the bug containsValue method

containsValue method should invoke map.containsValue not map.containsKey

Author: michaelwong <michaelwong95@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #12 from jwongo/master

2 months agoupTime should be field not method
zhangyijun [Mon, 22 Jan 2018 22:05:51 +0000 (14:05 -0800)] 
upTime should be field not method

Author: zhangyijun <joa.zhang@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #9 from zhangyijun/patch-1

2 months agoSAMZA-1561: Fix inconsistency problem in JobModel publish.
Shanthoosh Venkataraman [Mon, 22 Jan 2018 19:31:08 +0000 (11:31 -0800)] 
SAMZA-1561: Fix inconsistency problem in JobModel publish.

Author: Shanthoosh Venkataraman <svenkataraman@linkedin.com>

Reviewers: Jagadish V<jagadish@apache.org>, Xinyu Liu<xinyu@apache.org>

Closes #409 from shanthoosh/master

3 months agoSAMZA-1407: upgrade junit version to 4.12
Fred Ji [Thu, 18 Jan 2018 02:35:34 +0000 (18:35 -0800)] 
SAMZA-1407: upgrade junit version to 4.12

"./gradlew clean check" passed

Author: Fred Ji <haifeng.ji@gmail.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #406 from fredji97/junit4_12_new

3 months agoSAMZA-1560: Handle key-serde errors in KafkaCheckpointManager
Jagadish [Thu, 18 Jan 2018 02:27:40 +0000 (18:27 -0800)] 
SAMZA-1560: Handle key-serde errors in KafkaCheckpointManager

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Xinyu Liu<xinyuiscool@gmail.com>

Closes #408 from vjagadish1989/kcm-fix

3 months agoSAMZA-1558: State restore metrics should be duplicated and deprecated…
Jacob Maes [Wed, 17 Jan 2018 22:26:53 +0000 (14:26 -0800)] 
SAMZA-1558: State restore metrics should be duplicated and deprecated…

… to avoid type conflicts

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Prateek Maheshwari <pmaheshw@linkedin.com>

Closes #407 from jmakes/samza-1558

3 months agoSAMZA-1523: Cleanup table entries before shutting down the processor
navina [Wed, 17 Jan 2018 18:11:40 +0000 (10:11 -0800)] 
SAMZA-1523: Cleanup table entries before shutting down the processor

Modified the `TableUtils#deleteProcessorEntity` to provide an option to disable optimistic locking during a call to Azure Table Storage service.

sborya PawasChhokra nickpan47   Review please?

Author: navina <navina@apache.org>

Reviewers: Shanthoosh V<svenkata@linkedin.com>, Boris S<bshkolni@linkedin.com>

Closes #379 from navina/azure-etag-fix

3 months agoSAMZA-1556: Adding support for multi level sources in queries
Srinivasulu Punuru [Wed, 17 Jan 2018 18:06:50 +0000 (10:06 -0800)] 
SAMZA-1556: Adding support for multi level sources in queries

Right now Samza SQL supports queries with just two levels, i.e. `select * from foo.bar`. But there can be sources that are identified through multiple levels, e.g. `select * from kafka.clusterName.topicName`.

This change adds support for SQL queries with sources that have more than two levels.
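
To make the idea concrete, a dotted source name can be split into an arbitrary number of parts rather than exactly two. This is only an illustrative sketch; `SourceParts.parse` is a hypothetical helper, not the parser used by Samza SQL.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class SourceParts {
    // Splits "a.b.c" into its parts and rejects names with fewer than two
    // levels or with empty segments.
    static List<String> parse(String source) {
        // The -1 limit keeps trailing empty segments so "a.b." is rejected.
        List<String> parts = Arrays.asList(source.split("\\.", -1));
        if (parts.size() < 2 || parts.contains("")) {
            throw new IllegalArgumentException("Invalid source: " + source);
        }
        return parts;
    }
}
```

Both `foo.bar` and `kafka.clusterName.topicName` are accepted by this sketch, yielding two and three parts respectively.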

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Miguel S<misanchez@linkedin.com>, Aditya T<atoomula@linkedin.com>

Closes #405 from srinipunuru/multi-level.1

3 months agoSAMZA-1500: Added metrics for RocksDB state store memory usage
Prateek Maheshwari [Wed, 17 Jan 2018 17:39:26 +0000 (09:39 -0800)] 
SAMZA-1500: Added metrics for RocksDB state store memory usage

Approximate RocksDB memory usage = Configured Block Cache size + MemTable size + Indexes and Bloom Filters size =
rocksdb.block-cache-size + rocksdb.size-all-mem-tables + rocksdb.estimate-table-readers-mem
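
The formula above is a plain sum of three values. A small worked sketch follows; the class name, parameter names, and the sample byte counts in the usage note are assumptions for illustration.

```java
// Illustrative arithmetic only; the inputs would come from the three
// metrics named in the formula above.
final class RocksDbMemoryEstimate {
    // Sums block cache size, memtable size, and table-reader (index and
    // bloom filter) memory, failing loudly on overflow.
    static long approximateBytes(long blockCacheSize, long sizeAllMemTables, long estimateTableReadersMem) {
        return Math.addExact(Math.addExact(blockCacheSize, sizeAllMemTables), estimateTableReadersMem);
    }
}
```

For example, a 64 MiB block cache, 32 MiB of memtables, and 4 MiB of table-reader memory would give an estimate of 100 MiB.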

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #404 from prateekm/rocksdb-memory

3 months agoFix for the TestSamzaSqlApplicationConfig.testConfigInit
Srinivasulu Punuru [Thu, 11 Jan 2018 01:14:03 +0000 (17:14 -0800)] 
Fix for the TestSamzaSqlApplicationConfig.testConfigInit

Currently testConfigInit checks for a hardcoded number of udfs. Whenever a new UDF is added, this test fails unless it is updated. Changed the test to validate the number of udfs based on the config that is passed.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #403 from srinipunuru/testfix.1

3 months agoSAMZA-1535: Support for UDFs in where clauses
Srinivasulu Punuru [Wed, 10 Jan 2018 19:17:39 +0000 (11:17 -0800)] 
SAMZA-1535: Support for UDFs in where clauses

The existing version of the udf implementation doesn't seem to support udfs in where clauses, because the type of the object returned is "ANY". A query such as
`select * from kafka.topic where regexMatch('.*foo', Name)` fails in query validation, because calcite doesn't know the type of regexMatch.

To solve the problem, we made the scalarUdf generic with a strongly typed return type.
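
A generic UDF interface with a typed return value might look roughly like this. The `TypedScalarUdf` interface and `RegexMatch` class are hypothetical illustrations, not the Samza SQL API; the argument order follows the `regexMatch('.*foo', Name)` example above.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual Samza SQL interfaces.
final class TypedUdfSketch {
    // The type parameter exposes the return type, so a validator can see
    // that a UDF yields a Boolean rather than an untyped object.
    interface TypedScalarUdf<T> {
        T execute(Object... args);
    }

    static final class RegexMatch implements TypedScalarUdf<Boolean> {
        @Override
        public Boolean execute(Object... args) {
            String regex = (String) args[0];
            String value = (String) args[1];
            // Pattern.matches requires the whole input to match the regex.
            return Pattern.matches(regex, value);
        }
    }
}
```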

https://issues.apache.org/jira/browse/SAMZA-1535

This PR can be merged into trunk, but not into 0.14.

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #386 from srinipunuru/udf-where.1

3 months agoSAMZA-1530; Bump up Kafka dependency to 0.11
Dong Lin [Wed, 10 Jan 2018 18:52:38 +0000 (10:52 -0800)] 
SAMZA-1530; Bump up Kafka dependency to 0.11

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #395 from lindong28/SAMZA-1530

3 months agoRevert "SAMZA-1553: Add log4j for latest Kafka build"
xiliu [Wed, 10 Jan 2018 18:33:23 +0000 (10:33 -0800)] 
Revert "SAMZA-1553: Add log4j for latest Kafka build"

This reverts commit 5238aaa6cee81a87079c9d432204422ececea793.

3 months agoSAMZA-1550: Update samza 0.14 version in tests
xiliu [Tue, 9 Jan 2018 23:53:40 +0000 (15:53 -0800)] 
SAMZA-1550: Update samza 0.14 version in tests

3 months agoSAMZA-1553: Add log4j for latest Kafka build
Xinyu Liu [Tue, 9 Jan 2018 18:48:10 +0000 (10:48 -0800)] 
SAMZA-1553: Add log4j for latest Kafka build

Add it so Samza compiles with the latest kafka.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Boris Shkolnik <sborya@gmail.com>

Closes #402 from xinyuiscool/SAMZA-1553

3 months agoFix a link in release-notes.md
xiliu [Thu, 4 Jan 2018 17:51:20 +0000 (09:51 -0800)] 
Fix a link in release-notes.md

3 months agoSAMZA-1550: Update master to use 0.14.1-SNAPSHOT version
Xinyu Liu [Thu, 4 Jan 2018 00:10:45 +0000 (16:10 -0800)] 
SAMZA-1550: Update master to use 0.14.1-SNAPSHOT version

Update master to use 0.14.1-SNAPSHOT version.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #400 from xinyuiscool/SAMZA-1550-2

3 months agoSAMZA-1550: Doc for 0.14.0 release
Xinyu Liu [Wed, 3 Jan 2018 17:56:23 +0000 (09:56 -0800)] 
SAMZA-1550: Doc for 0.14.0 release

Docs update for both master and 0.14.0 branch.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>

Closes #396 from xinyuiscool/SAMZA-1550

3 months agoSAMZA-1528: Change ClusterResourceManager to use the async NMClient
Jagadish [Tue, 2 Jan 2018 20:16:51 +0000 (12:16 -0800)] 
SAMZA-1528: Change ClusterResourceManager to use the async NMClient

- Rewrite container handling to be asynchronous
- Verified various failure scenarios using Unit tests, and deployments of a local Samza job.

Author: Jagadish <jvenkatraman@linkedin.com>
Author: Fred Ji <haifeng.ji@gmail.com>
Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Jacob Maes<jmakes@linkedin.com>, Xinyu Liu<xinyuiscool@gmail.com>

Closes #380 from vjagadish1989/cluster-mgr-refactor1

3 months agoSAMZA-1547; Parse default value grouper-factory config in KafkaCheckpointMgr
Jagadish [Fri, 22 Dec 2017 21:58:33 +0000 (13:58 -0800)] 
SAMZA-1547; Parse default value grouper-factory config in KafkaCheckpointMgr

- Additionally, updated all unit-tests.

Author: Jagadish <jvenkatraman@linkedin.com>

Reviewers: Prateek M <prmaheshw@linkedin.com>

Closes #394 from vjagadish1989/kcm-fix

3 months agoFix a couple sonarcloud issues with samza-1537
Jacob Maes [Fri, 22 Dec 2017 19:41:27 +0000 (11:41 -0800)] 
Fix a couple sonarcloud issues with samza-1537

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #393 from jmakes/streamappender-sonarcloud

3 months agoSAMZA-1539: KafkaProducer potential hang on close() when task.drop.pr…
Jacob Maes [Fri, 22 Dec 2017 18:36:37 +0000 (10:36 -0800)] 
SAMZA-1539: KafkaProducer potential hang on close() when task.drop.pr…

…oducer.errors==true

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Boris Shkolnik <boryas@apache.org>

Closes #390 from jmakes/samza-1539

3 months agoSAMZA-1537: StreamAppender can deadlock due to locks held by Kafka an…
Jacob Maes [Thu, 21 Dec 2017 19:43:09 +0000 (11:43 -0800)] 
SAMZA-1537: StreamAppender can deadlock due to locks held by Kafka an…

…d Log4j

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Jagadish <jvenkatr@linkedin.com>,Yi Pan (Data Infrastructure) <nickpan47@gmail.com>

Closes #388 from jmakes/async-stream-appender

3 months agoMerge script improvement - use colons instead of semicolons
Jacob Maes [Thu, 21 Dec 2017 18:43:47 +0000 (10:43 -0800)] 
Merge script improvement - use colons instead of semicolons

Author: Jacob Maes <jmakes@apache.org>

Reviewers: Xinyu Liu <xiliu@linkedin.com>,Jagadish <jvenkatr@linkedin.com>,Boris Shkolnik <boryas@apache.org>

Closes #391 from jmakes/merge-script-improvements

3 months agoSAMZA-1356: Improve monitoring for state restore
Jacob Maes [Tue, 19 Dec 2017 16:00:56 +0000 (08:00 -0800)] 
SAMZA-1356: Improve monitoring for state restore

Author: Jacob Maes <jmakes@apache.org>
Author: Jacob Maes <jmaes@linkedin.com>

Reviewers: Jagadish <jvenkatr@linkedin.com>

Closes #241 from jmakes/samza-1356

4 months agoSAMZA-1538: Disabled Flaky Tests in TestStreamProcessor
Prateek Maheshwari [Tue, 19 Dec 2017 00:45:06 +0000 (16:45 -0800)] 
SAMZA-1538: Disabled Flaky Tests in TestStreamProcessor

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Shanthoosh Venkataraman <svenkata@linkedin.com>

Closes #389 from prateekm/disable-flaky-test

4 months agoSAMZA-1536; Adding docs for Kinesis consumer
Aditya Toomula [Thu, 14 Dec 2017 23:33:29 +0000 (15:33 -0800)] 
SAMZA-1536; Adding docs for Kinesis consumer

Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Jagadish <jagadish@apache.org>

Closes #384 from atoomula/kinesis-docs

4 months agoAdded document for table API to feature preview
Wei Song [Thu, 14 Dec 2017 20:04:34 +0000 (12:04 -0800)] 
Added document for table API to feature preview

Added document for table API to feature preview
 - Brief description of table
 - sendTo() operator for table
 - join() operator for stream-table-join

Author: Wei Song <wsong@linkedin.com>

Reviewers: Prateek Maheshwari <pmaheshwari@apache.org>

Closes #387 from weisong44/table-api-14

4 months agoSAMZA-1437; Added Eventhub producer and consumer docs
Daniel Chen [Wed, 13 Dec 2017 05:47:38 +0000 (21:47 -0800)] 
SAMZA-1437; Added Eventhub producer and consumer docs

Still need to add tutorials, and configs to configurations table

vjagadish1989  for review

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish <jagadish@apache.org>

Closes #382 from dxichen/eventhub-docs

4 months agoSAMZA-1502; Make AllSspToSingleTaskGrouper work with Yarn and ZK JobCoordinator
Aditya Toomula [Wed, 13 Dec 2017 05:41:37 +0000 (21:41 -0800)] 
SAMZA-1502; Make AllSspToSingleTaskGrouper work with Yarn and ZK JobCoordinator

Sending a fresh review as I lost the earlier diffs. This is the new approach that we discussed by adding the processor list in the config and passing it to grouper.

Author: Aditya Toomula <atoomula@atoomula-ld1.linkedin.biz>

Reviewers: Yi Pan <nickpan47@gmail.com>, Shanthoosh V <svenkataraman@linkedin.com>

Closes #383 from atoomula/samza

4 months agoSAMZA-1512: Documentation on the multi-stage batch processing
Xinyu Liu [Wed, 13 Dec 2017 01:07:07 +0000 (17:07 -0800)] 
SAMZA-1512: Documentation on the multi-stage batch processing

Add overview documentation to explain how partitionBy(), checkpoint and state work in batch. Also organized the existing hdfs consumer/producer docs into the same hadoop folder under documentation.

Author: xinyuiscool <xinyuliu.us@gmail.com>

Reviewers: Jake Maes <jmakes@gmail.com>

Closes #381 from xinyuiscool/SAMZA-1512

4 months agoSAMZA-1534: Fix the visualization in job graph with the new PartitionBy Op
Xinyu Liu [Tue, 12 Dec 2017 20:16:39 +0000 (12:16 -0800)] 
SAMZA-1534: Fix the visualization in job graph with the new PartitionBy Op

It seems the stream and the partitionBy op have the same id, so in rendering I added the stream as the id for the node. Also resolved the run.id collision issue.

Author: xiliu <xiliu@xiliu-ld1.linkedin.biz>

Reviewers: Jagadish V <vjagadish1989@gmail.com>

Closes #385 from xinyuiscool/SAMZA-1534

4 months agoDocumentation for Samza SQL
Srinivasulu Punuru [Tue, 12 Dec 2017 18:46:01 +0000 (10:46 -0800)] 
Documentation for Samza SQL

**Samza tools** :
Contains the following tools that can be used for playing with Samza sql or any other samza job.

1. Generate kafka events : Tool used to generate avro serialized kafka events
2. Event hub consumer : Tool used to consume events from event hubs topic. This can be used if the samza job writes events to event hubs.
3. Samza sql console : Tool used to execute SQL using samza sql.

Adds documentation on how to use Samza SQL on a local machine and on a yarn environment and their associated Samza tooling.

https://issues.apache.org/jira/browse/SAMZA-1526

Author: Srinivasulu Punuru <spunuru@linkedin.com>

Reviewers: Yi Pan<nickpan47@gmail.com>, Jagadish<jagadish@apachehe.org>

Closes #374 from srinipunuru/docs.1

4 months agoSAMZA-1532; Eventhub connector fix
Daniel Chen [Tue, 12 Dec 2017 00:45:44 +0000 (16:45 -0800)] 
SAMZA-1532; Eventhub connector fix

Key fixes vjagadish1989 lhaiesp srinipunuru
- Switched Producer source vs destination assumptions in `send`, `register`
- Check `OME.key` if `OME.partitionId` is null to get the partitionId
- Upcoming offset changed to `END_OF_STREAM` rather than `newestOffset` + 1; eventHub returns an error if the offset does not exist in the system
- Made the NewestOffset+1 as upcoming offset, consumer checks if the offset is valid on startup
- Differentiated between streamNames and streamIds in configs, consumer, producer
- Checkpoint table named after job name
- Checkpoint prints better message for invalid key on write

QOL
- How to ignore integration tests
- Improved logging

EDIT:
- Also added Round Robin producer partitioning

Author: Daniel Chen <29577458+dxichen@users.noreply.github.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes #377 from dxichen/eventhub-connector-fix

4 months agoInitial version of Table API
Wei Song [Tue, 5 Dec 2017 21:23:44 +0000 (13:23 -0800)] 
Initial version of Table API

Initial version of table API, it includes
 - Core table API (Table, TableDescriptor, TableSpec)
 - Local table implementation for in-memory and RocksDb
 - The writeTo() and stream-table join operators

nickpan47 xinyuiscool prateekm could you help review?

Author: Wei Song <wsong@linkedin.com>

Reviewers: Yi Pan <nickpan47@gmail.com>, Christopher Pettitt <cpettitt@linkedin.com>

Closes #349 from weisong44/table-api-14

4 months agoFix compile errors with scala 2.12 and update release notes to use check all
Bharath Kumarasubramanian [Tue, 5 Dec 2017 01:11:03 +0000 (17:11 -0800)] 
Fix compile errors with scala 2.12 and update release notes to use check all

Author: Bharath Kumarasubramanian <bkumaras@linkedin.com>

Reviewers: Xinyu Liu <xinyuliu.us@gmail.com>

Closes #378 from bharathkk/master

4 months agoUpdated Serde related documentation and error messages for High Level API
Prateek Maheshwari [Mon, 4 Dec 2017 22:21:21 +0000 (14:21 -0800)] 
Updated Serde related documentation and error messages for High Level API

Updated and clarified the documentation and error messages related to Serdes for Input/Output/PartitionBy streams.

Author: Prateek Maheshwari <pmaheshw@linkedin.com>

Reviewers: Jagadish Venkatraman <vjagadish1989@gmail.com>

Closes #376 from prateekm/documentation-cleanup

4 months agoSAMZA-1506: Fix for robust ContainerHeartbeatMonitor exception handling.
Abhishek Shivanna [Fri, 1 Dec 2017 23:23:12 +0000 (15:23 -0800)] 
SAMZA-1506: Fix for robust ContainerHeartbeatMonitor exception handling.

The Fix includes the following changes:
- Catch all exceptions inside the heartbeat thread and not just
  IOException.
- A time based force kill when the heartbeat is invalid,
  this makes the monitor immune to threads that may keep the
  container stuck in the shutdown sequence. When the timeout
  occurs, a System.exit(1) is called.
- Increasing number of retries for failed heartbeats from 3 to 6.
  This prevents short intermittent network failures from causing the
  containers to be invalidated.
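
The retry and catch-all behaviour in the list above can be sketched as follows. This is not the ContainerHeartbeatMonitor code: `checkHeartbeat` and its parameters are hypothetical, and the timed force kill is left out of the sketch.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the retry behaviour; not the Samza implementation.
final class HeartbeatCheckSketch {
    // Returns the validity reported by the first request that completes.
    // Any exception, not only IOException, counts as a failed attempt;
    // after maxRetries failed attempts the heartbeat is treated as invalid.
    static boolean checkHeartbeat(BooleanSupplier request, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                return request.getAsBoolean();
            } catch (Exception e) {
                // Request failed (for example a short network blip); try again.
            }
        }
        return false;
    }
}
```

With `maxRetries` set to 6, up to five consecutive request failures are tolerated before the container is considered invalid.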

Author: Abhishek Shivanna <abhisheks91@gmail.com>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #375 from abhishekshivanna/container-heartbeat

4 months agoSAMZA-1519: Add release notes to website documentation
navina [Fri, 1 Dec 2017 16:57:06 +0000 (08:57 -0800)] 
SAMZA-1519: Add release notes to website documentation

Adding a versioned page for release/upgrade notes. We can start this process from the next major version release, aka 0.14.0.

Please update this page as and when you add new features/configs/API or deprecate features/configs/API. Basically, anything that can be useful for Samza users trying to upgrade.

Note: `site.version` is not necessarily same as samza release version. For now, I am using it as a placeholder. Hopefully, with the next generation of our website, it will be better defined.

Author: navina <navina@apache.org>

Reviewers: Jacob Maes <jmaes@linkedin.com>

Closes #301 from navina/versioning