aurora.git
18 months agoUpdating .auroraversion to release version 0.12.0. rel/0.12.0
John Sirois [Mon, 8 Feb 2016 22:42:44 +0000 (15:42 -0700)] 
Updating .auroraversion to release version 0.12.0.

18 months agoFixup release script tag step.
John Sirois [Mon, 8 Feb 2016 22:42:22 +0000 (15:42 -0700)] 
Fixup release script tag step.

18 months agoUpdating .auroraversion to 0.12.0-rc4.
John Sirois [Fri, 5 Feb 2016 22:12:18 +0000 (15:12 -0700)] 
Updating .auroraversion to 0.12.0-rc4.

18 months agoIncrementing snapshot version to 0.13.0-SNAPSHOT.
John Sirois [Fri, 5 Feb 2016 22:12:18 +0000 (15:12 -0700)] 
Incrementing snapshot version to 0.13.0-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.12.0 release.
John Sirois [Fri, 5 Feb 2016 22:12:18 +0000 (15:12 -0700)] 
Updating CHANGELOG for 0.12.0 release.

18 months agoBackfilling JobConfiguration.Identity
Maxim Khutornenko [Fri, 5 Feb 2016 21:55:26 +0000 (13:55 -0800)] 
Backfilling JobConfiguration.Identity

Bugs closed: AURORA-1610

Reviewed at https://reviews.apache.org/r/43262/

18 months agoReset .auroraversion and CHANGELOG in prep for 0.12.0-rc4.
John Sirois [Fri, 5 Feb 2016 20:00:46 +0000 (13:00 -0700)] 
Reset .auroraversion and CHANGELOG in prep for 0.12.0-rc4.

18 months agoIncrementing snapshot version to 0.13.0-SNAPSHOT.
John Sirois [Fri, 5 Feb 2016 19:06:23 +0000 (12:06 -0700)] 
Incrementing snapshot version to 0.13.0-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.12.0 release.
John Sirois [Fri, 5 Feb 2016 19:06:23 +0000 (12:06 -0700)] 
Updating CHANGELOG for 0.12.0 release.

18 months agoManual prep for 0.12.0-rc3 release.
John Sirois [Fri, 5 Feb 2016 18:53:29 +0000 (11:53 -0700)] 
Manual prep for 0.12.0-rc3 release.

Reset .auroraversion to 0.12.0-SNAPSHOT and revert to 'Aurora 0.11.0'
CHANGELOG tip.

18 months agoAdd failed result email protocol.
John Sirois [Fri, 5 Feb 2016 15:58:10 +0000 (08:58 -0700)] 
Add failed result email protocol.

Hints of this protocol exist down in step 6 when a release succeeds,
but this places the failure action in-line in the step process to make
it more likely the reader does the right thing.

Also kill an incorrect instruction to send the successful release vote
result email to the private@ list.

Testing Done:
I have no clue if the instructions and provided example link are correct.
I did find variation when reading past [RESULT][VOTE] failures; so
guidance on what is required vs what is personal flair is appreciated.

Rendered here: https://github.com/jsirois/aurora/blob/jsirois/release-docs/more-fixes/docs/committers.md

Reviewed at https://reviews.apache.org/r/42984/

18 months agoAdd deprecated field storage backfill
Maxim Khutornenko [Thu, 4 Feb 2016 23:18:38 +0000 (15:18 -0800)] 
Add deprecated field storage backfill

Bugs closed: AURORA-1603

Reviewed at https://reviews.apache.org/r/43172/

18 months agoRemove unused <result> entry in TaskMapper.
Zameer Manji [Thu, 4 Feb 2016 18:37:07 +0000 (10:37 -0800)] 
Remove unused <result> entry in TaskMapper.

The property `taskConfigRowId` doesn't exist on `DbScheduledTask` so this line
has no use.

Testing Done:
./gradlew test

Reviewed at https://reviews.apache.org/r/43178/

18 months agoExpose MyBatis PoolState via stats.
Zameer Manji [Wed, 3 Feb 2016 22:28:12 +0000 (14:28 -0800)] 
Expose MyBatis PoolState via stats.

To better understand the MyBatis connection pool this patch exposes the pool
state via stats.

Reviewed at https://reviews.apache.org/r/43150/

18 months agoAdd header to allow bypassing the LeaderRedirectFilter.
Joshua Cohen [Wed, 3 Feb 2016 01:40:26 +0000 (19:40 -0600)] 
Add header to allow bypassing the LeaderRedirectFilter.

Bugs closed: AURORA-1601

Reviewed at https://reviews.apache.org/r/42964/

18 months agoMake --announcer-enable optional no-op instead of removing it completely.
Zhitao Li [Tue, 2 Feb 2016 23:28:25 +0000 (15:28 -0800)] 
Make --announcer-enable optional no-op instead of removing it completely.

Reviewed at https://reviews.apache.org/r/43112/

18 months agoReorganize NEWS into updates and deprecations
Stephan Erb [Tue, 2 Feb 2016 21:34:20 +0000 (14:34 -0700)] 
Reorganize NEWS into updates and deprecations

I've split all releases with additions and deprecations into two sections. This should make it much easier to track past deprecations.

Reviewed at https://reviews.apache.org/r/43109/

18 months agoMap Aurora task metadata to Mesos task labels.
Stephan Erb [Tue, 2 Feb 2016 20:55:39 +0000 (12:55 -0800)] 
Map Aurora task metadata to Mesos task labels.

Bugs closed: AURORA-1052

Reviewed at https://reviews.apache.org/r/35990/

18 months agoUpgrade to pants 0.0.70.
John Sirois [Tue, 2 Feb 2016 19:30:40 +0000 (12:30 -0700)] 
Upgrade to pants 0.0.70.

This bumps us to last week's regular weekly release.
The changelog is here:
  http://pantsbuild.github.io/changelog.html

No changes of note directly impacting Aurora, just keeping up
with the release train.

Testing Done:
Locally green:
```
./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```

Reviewed at https://reviews.apache.org/r/43098/

18 months agoReverting deprecated field removal patches.
Maxim Khutornenko [Tue, 2 Feb 2016 19:07:08 +0000 (11:07 -0800)] 
Reverting deprecated field removal patches.

This reverts commit e1b55fa544765c12251ce6c1736e6352da3f7edb.

This reverts commit 89fad5a8895482b6c3fa45356137aa250d766dfe.

Bugs closed: AURORA-1603

Reviewed at https://reviews.apache.org/r/43104/

18 months agoFixing duplicate instances in the UI.
Maxim Khutornenko [Tue, 2 Feb 2016 04:31:28 +0000 (20:31 -0800)] 
Fixing duplicate instances in the UI.

Bugs closed: AURORA-1604

Reviewed at https://reviews.apache.org/r/43080/

18 months agoAdd a flag to configure H2 LOCK_TIMEOUT.
Zameer Manji [Mon, 1 Feb 2016 22:48:51 +0000 (14:48 -0800)] 
Add a flag to configure H2 LOCK_TIMEOUT.

Bugs closed: AURORA-1596

Reviewed at https://reviews.apache.org/r/42985/

18 months agoImprove --read-json to handle multi-job files
Benjamin Staffin [Mon, 1 Feb 2016 22:24:13 +0000 (14:24 -0800)] 
Improve --read-json to handle multi-job files

Still handles the old --read-json behavior of expecting a single job,
but adds the ability to read files with a {"jobs": [job1, job2, ...]}
schema like the pystachio format.

Also adds --read-json to the `aurora config load` command, as it is
now useful there.

Json configs are now loaded in a way that is much closer to the
pystachio one, so the config loader will no longer ignore unknown
fields.
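As a rough illustration of the two layouts described above, the following sketch accepts either a single job object or a `{"jobs": [...]}` wrapper. The function name `load_jobs` and the `name`/`role` fields are hypothetical and are not taken from the Aurora client source.

```python
import json


def load_jobs(text):
    """Return a list of job dicts from either JSON layout (hypothetical helper)."""
    data = json.loads(text)
    # Multi-job layout: a top-level object holding a "jobs" list.
    if isinstance(data, dict) and isinstance(data.get("jobs"), list):
        return data["jobs"]
    # Old layout: the document itself is a single job.
    return [data]


# Illustrative inputs; the field names are assumptions.
single = '{"name": "hello", "role": "www-data"}'
multi = '{"jobs": [{"name": "a"}, {"name": "b"}]}'
print(len(load_jobs(single)), len(load_jobs(multi)))
```

Normalizing both layouts to a list lets the rest of a loader treat the single-job case as a list of one.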

Bugs closed: AURORA-1577

Reviewed at https://reviews.apache.org/r/42953/

18 months agoAllow dots and hyphens in metric names.
Stephan Erb [Mon, 1 Feb 2016 22:16:07 +0000 (14:16 -0800)] 
Allow dots and hyphens in metric names.

This will make sure we won't warn about invalid stat names for valid job identifiers.
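A minimal sketch of the kind of check involved, assuming a validator that accepts letters, digits, underscores, dots and hyphens. The pattern and the name `is_valid_stat_name` are assumptions for illustration; the scheduler's actual rule is not shown in this commit message.

```python
import re

# Hypothetical pattern: letters, digits, underscores, dots and hyphens only.
VALID_STAT_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_stat_name(name):
    """Return True when the whole name matches the allowed character set."""
    return VALID_STAT_NAME.fullmatch(name) is not None


# A name embedding a job identifier with a hyphen and dots passes.
print(is_valid_stat_name("tasks_FAILED_www-data.prod.hello"))
# A name containing a space does not.
print(is_valid_stat_name("bad name"))
```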

Bugs closed: AURORA-1282

Reviewed at https://reviews.apache.org/r/42879/

18 months agoBump virtualenv version for in repo tools.
Zameer Manji [Mon, 1 Feb 2016 22:08:16 +0000 (14:08 -0800)] 
Bump virtualenv version for in repo tools.

Reviewed at https://reviews.apache.org/r/43066/

18 months agoEnable ping query to prevent use of invalid pooled connections.
Zameer Manji [Mon, 1 Feb 2016 21:59:46 +0000 (13:59 -0800)] 
Enable ping query to prevent use of invalid pooled connections.

Bugs closed: AURORA-1596

Reviewed at https://reviews.apache.org/r/42979/

18 months agoAdd Fitbit to the Aurora adopters list
Benjamin Staffin [Sat, 30 Jan 2016 02:35:41 +0000 (18:35 -0800)] 
Add Fitbit to the Aurora adopters list

Reviewed at https://reviews.apache.org/r/43005/

18 months agoIncrementing snapshot version to 0.12.1-SNAPSHOT.
John Sirois [Thu, 28 Jan 2016 23:51:05 +0000 (16:51 -0700)] 
Incrementing snapshot version to 0.12.1-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.12.0 release.
John Sirois [Thu, 28 Jan 2016 23:51:05 +0000 (16:51 -0700)] 
Updating CHANGELOG for 0.12.0 release.

18 months agoRevert "Updating CHANGELOG for 0.12.0 release."
John Sirois [Thu, 28 Jan 2016 23:45:18 +0000 (16:45 -0700)] 
Revert "Updating CHANGELOG for 0.12.0 release."

This reverts commit 309ed9968d0aa10d63c66a173c16d7e4f9c552ca.

18 months agoRevert "Incrementing snapshot version to 0.13.0-SNAPSHOT."
John Sirois [Thu, 28 Jan 2016 23:44:30 +0000 (16:44 -0700)] 
Revert "Incrementing snapshot version to 0.13.0-SNAPSHOT."

This reverts commit 81722a9e700641f5b435e97c1cbb38d7eed4e98c.

18 months agoRevert "Updating CHANGELOG for 0.12.0 release."
John Sirois [Thu, 28 Jan 2016 23:42:11 +0000 (16:42 -0700)] 
Revert "Updating CHANGELOG for 0.12.0 release."

This reverts commit d34609a2e24b434701347542b9328581acfd829b.

18 months agoRevert "Incrementing snapshot version to 0.12.1-SNAPSHOT."
John Sirois [Thu, 28 Jan 2016 23:42:08 +0000 (16:42 -0700)] 
Revert "Incrementing snapshot version to 0.12.1-SNAPSHOT."

This reverts commit 131771c1b1517c0290739d389ad0d504da1dd12e.

18 months agoIncrementing snapshot version to 0.12.1-SNAPSHOT.
John Sirois [Thu, 28 Jan 2016 23:34:56 +0000 (16:34 -0700)] 
Incrementing snapshot version to 0.12.1-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.12.0 release.
John Sirois [Thu, 28 Jan 2016 23:34:56 +0000 (16:34 -0700)] 
Updating CHANGELOG for 0.12.0 release.

18 months agoRevert "Updating CHANGELOG for 0.13.0 release."
John Sirois [Thu, 28 Jan 2016 23:26:00 +0000 (16:26 -0700)] 
Revert "Updating CHANGELOG for 0.13.0 release."

This reverts commit 38e8237fe91e4fa74cf563a88330571eaf359424.

18 months agoRevert "Incrementing snapshot version to 0.14.0-SNAPSHOT."
John Sirois [Thu, 28 Jan 2016 23:25:59 +0000 (16:25 -0700)] 
Revert "Incrementing snapshot version to 0.14.0-SNAPSHOT."

This reverts commit bfca7ae7e0138fe4facd256217e1166b605f97ce.

18 months agoRevert "Updating CHANGELOG for 0.14.0 release."
John Sirois [Thu, 28 Jan 2016 23:25:57 +0000 (16:25 -0700)] 
Revert "Updating CHANGELOG for 0.14.0 release."

This reverts commit 34c676d96fd909283dcb1be79424887b428fe73e.

18 months agoRevert "Incrementing snapshot version to 0.14.1-SNAPSHOT."
John Sirois [Thu, 28 Jan 2016 23:25:53 +0000 (16:25 -0700)] 
Revert "Incrementing snapshot version to 0.14.1-SNAPSHOT."

This reverts commit 2365083e1d340f51724caf500380c71a2b0104b3.

18 months agoIncrementing snapshot version to 0.14.1-SNAPSHOT.
John Sirois [Thu, 28 Jan 2016 22:28:22 +0000 (15:28 -0700)] 
Incrementing snapshot version to 0.14.1-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.14.0 release.
John Sirois [Thu, 28 Jan 2016 22:28:22 +0000 (15:28 -0700)] 
Updating CHANGELOG for 0.14.0 release.

18 months agoIncrementing snapshot version to 0.14.0-SNAPSHOT.
John Sirois [Thu, 28 Jan 2016 22:25:21 +0000 (15:25 -0700)] 
Incrementing snapshot version to 0.14.0-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.13.0 release.
John Sirois [Thu, 28 Jan 2016 22:25:21 +0000 (15:25 -0700)] 
Updating CHANGELOG for 0.13.0 release.

18 months agoRevert "Improving job update query performance."
Joshua Cohen [Thu, 28 Jan 2016 22:19:14 +0000 (16:19 -0600)] 
Revert "Improving job update query performance."

This reverts commit fee5943a95c4f08e148dc5f1366486a8c23d5773.

We discovered a bug when deploying this commit that caused corruption of the update store.

Reviewed at https://reviews.apache.org/r/42922/

18 months agoFixup RC VOTE email instructions.
John Sirois [Thu, 28 Jan 2016 20:29:01 +0000 (13:29 -0700)] 
Fixup RC VOTE email instructions.

Testing Done:
None

Reviewed at https://reviews.apache.org/r/42919/

18 months agoFixup RC email template tag URL.
John Sirois [Thu, 28 Jan 2016 13:50:34 +0000 (06:50 -0700)] 
Fixup RC email template tag URL.

18 months agoIncrementing snapshot version to 0.13.0-SNAPSHOT.
John Sirois [Thu, 28 Jan 2016 05:29:51 +0000 (22:29 -0700)] 
Incrementing snapshot version to 0.13.0-SNAPSHOT.

18 months agoUpdating CHANGELOG for 0.12.0 release.
John Sirois [Thu, 28 Jan 2016 05:29:51 +0000 (22:29 -0700)] 
Updating CHANGELOG for 0.12.0 release.

18 months agoFixup release-candidate script.
John Sirois [Thu, 28 Jan 2016 05:29:34 +0000 (22:29 -0700)] 
Fixup release-candidate script.

Previously the svn add of the dist artifacts failed with:
```
Publishing release candidate to https://dist.apache.org/repos/dist/dev/aurora/0.12.0-rc0
Committing transaction...
Committed revision 12061.
Checked out revision 12061.
svn: E155007: /home/jsirois/dev/3rdparty/aurora-origin is not a working copy
ERROR: Looks like something has failed while creating the release candidate.
```

Testing Done:
Tested with a slightly different diff - this successfully published to svn (since removed):
```diff
$ git diff
diff --git a/build-support/release/release-candidate b/build-support/release/release-candidate
index 78e9a4f..91261c0 100755
--- a/build-support/release/release-candidate
+++ b/build-support/release/release-candidate
@@ -93,7 +93,7 @@ git fetch --tags -q
 # Verify that this is a clean repository
 if [[ -n "`git status --porcelain`" ]]; then
   echo "ERROR: Please run from a clean git repository."
-  exit 1
+#  exit 1
 elif [[ "`git rev-parse --abbrev-ref HEAD`" != "master" ]]; then
   echo "ERROR: This script must be run from master."
   exit 1
@@ -219,8 +219,11 @@ if [[ $publish == 1 ]]; then
   echo "Publishing release candidate to ${aurora_svn_rc_url}"
   svn mkdir ${aurora_svn_rc_url} -m "aurora-${current_version} release candidate ${rc_version_tag}"
   svn co --depth=empty ${aurora_svn_rc_url} ${dist_dir}
+  pushd ${dist_dir}
   svn add ${dist_name}*
   svn ci -m "aurora-${current_version} release candidate ${rc_version_tag}"
+  popd
+  exit 0

   echo "Creating tag ${rc_version_tag}"
   git tag -s ${rc_version_tag} \
```

Reviewed at https://reviews.apache.org/r/42898/

18 months agoRemove deprecated fields made redundant by JobKey.
Bill Farner [Thu, 28 Jan 2016 02:23:20 +0000 (18:23 -0800)] 
Remove deprecated fields made redundant by JobKey.

Bugs closed: AURORA-1598

Reviewed at https://reviews.apache.org/r/42811/

18 months agoImproving job update query performance.
Maxim Khutornenko [Thu, 28 Jan 2016 01:19:30 +0000 (17:19 -0800)] 
Improving job update query performance.

Bugs closed: AURORA-1600

Reviewed at https://reviews.apache.org/r/42882/

18 months agoFix stray printf style log replacement token when logging triggered cron jobs.
Joshua Cohen [Wed, 27 Jan 2016 22:49:38 +0000 (16:49 -0600)] 
Fix stray printf style log replacement token when logging triggered cron jobs.

Reviewed at https://reviews.apache.org/r/42869/

18 months agoEnable H2 logging to slf4j.
Zameer Manji [Wed, 27 Jan 2016 21:52:02 +0000 (13:52 -0800)] 
Enable H2 logging to slf4j.

On a test cluster with DbTaskStore enabled there are several lines in the log
that look like:
````
2016-01-26 13:07:14 jdbc[15]: exception
````
There is no other information with these lines. This is a result of setting
`TRACE_LEVEL_SYSTEM_OUT` to `1` for H2. This will print out the error message
but not the associated throwable:
https://github.com/h2database/h2database/blob/a932268843ac84c7a665e427167ff2eb291d6b8e/h2/src/main/org/h2/message/TraceSystem.java#L228

The SLF4J implementation of tracing in H2 does not suffer from this restriction.

Reviewed at https://reviews.apache.org/r/42845/

18 months agoRemove deprecated `HealthCheckConfig` fields.
John Sirois [Wed, 27 Jan 2016 20:18:00 +0000 (13:18 -0700)] 
Remove deprecated `HealthCheckConfig` fields.

Remove `endpoint`, `expected_response` and `expected_response_code`
which were all deprecated in Aurora 0.11.0 in favor of the same-named
fields in `HttpHealthChecker`.

This also removes health check validation in the client in favor of
leveraging the pystachio schema.  The one difference this allows for is
an empty string for the `ShellHealthChecker.shell_command`.  Since an
empty string is a valid shell command (equivalent to `true`), this
simplification seems justified.
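The claim about the empty command can be checked with a short sketch, assuming a POSIX shell is available. It uses only the Python standard library and is not part of the Aurora client.

```python
import subprocess

# Run an empty command string and `true` through the shell (POSIX assumed).
empty = subprocess.run("", shell=True)
true_cmd = subprocess.run("true", shell=True)

# Both exit statuses should agree when the empty string behaves like `true`.
print(empty.returncode, true_cmd.returncode)
```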

Testing Done:
Locally green:
```
./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```

Bugs closed: AURORA-1552, AURORA-1563

Reviewed at https://reviews.apache.org/r/42816/

18 months agoRe-purposing addInstances RPC to act as scaleOut
Maxim Khutornenko [Wed, 27 Jan 2016 08:07:31 +0000 (00:07 -0800)] 
Re-purposing addInstances RPC to act as scaleOut

Bugs closed: AURORA-1258

Reviewed at https://reviews.apache.org/r/42759/

18 months agoRemove the --announcer-enable executor flag.
Bill Farner [Tue, 26 Jan 2016 19:43:20 +0000 (11:43 -0800)] 
Remove the --announcer-enable executor flag.

Reviewed at https://reviews.apache.org/r/42727/

18 months agoRemove job update `maxWaitToInstanceRunningMs` field.
John Sirois [Tue, 26 Jan 2016 18:30:09 +0000 (11:30 -0700)] 
Remove job update `maxWaitToInstanceRunningMs` field.

This field in the thrift api `JobUpdateSettings` struct and its sibling
in `UpdateConfig.restart_threshold` on the client side were deprecated
in Aurora 0.11.0.

Testing Done:
Locally green:
```
./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```

Bugs closed: AURORA-1254

Reviewed at https://reviews.apache.org/r/42804/

18 months ago`TaskHistoryPruner` controls Lifecycle directly.
John Sirois [Tue, 26 Jan 2016 18:29:48 +0000 (11:29 -0700)] 
`TaskHistoryPruner` controls Lifecycle directly.

This was the original idea in https://reviews.apache.org/r/42332.

Mixing the active scheduler `Service` lifecycle with the `EventBus`
lifecycle proves tricky - prune events are fired before scheduler active
services are started.  Instead of queueing up prune events to wait for
service start or re-engineering service / event bus interaction, this returns
to the original behavior, manipulating the `Lifecycle` directly.

Also kill a confusing unused EventSink discovered during analysis of all
pub-sub event sourcing that might interact with the `TaskHistoryPruner`.

Testing Done:
Locally green:
```
./gradlew -Pq build
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```
It's the latter - e2e (krb part) - that was the only automated testing
revealing the problem previously.

Bugs closed: AURORA-1593

Reviewed at https://reviews.apache.org/r/42801/

18 months agoFixing reference in table of contents to Docker Object(s).
Dmitriy Shirchenko [Mon, 25 Jan 2016 21:02:46 +0000 (13:02 -0800)] 
Fixing reference in table of contents to Docker Object(s).

Reviewed at https://reviews.apache.org/r/42737/

18 months agoUpgrade pants to 0.0.69.
John Sirois [Mon, 25 Jan 2016 18:53:08 +0000 (11:53 -0700)] 
Upgrade pants to 0.0.69.

This is the regular weekly release/upgrade.
The CHANGELOG can be read here:
  http://pantsbuild.github.io/changelog.html

No changes of note for Aurora, this just keeps up with latest to make future
upgrades as small and smooth as possible.

Testing Done:
Locally green:
```
./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```

Reviewed at https://reviews.apache.org/r/42699/

18 months agoRemove most direct uses of deprecated TaskConfig fields.
Bill Farner [Sat, 23 Jan 2016 01:14:28 +0000 (17:14 -0800)] 
Remove most direct uses of deprecated TaskConfig fields.

Reviewed at https://reviews.apache.org/r/42668/

18 months agoDeprecating TaskQuery in killTasks.
Maxim Khutornenko [Fri, 22 Jan 2016 22:40:27 +0000 (14:40 -0800)] 
Deprecating TaskQuery in killTasks.

Bugs closed: AURORA-1583

Reviewed at https://reviews.apache.org/r/42666/

18 months agoRemove storage backfill and TaskStore mutateTasks.
Bill Farner [Fri, 22 Jan 2016 22:15:55 +0000 (14:15 -0800)] 
Remove storage backfill and TaskStore mutateTasks.

Reviewed at https://reviews.apache.org/r/42646/

18 months agoSimplify TaskHistoryPruner tie-in to Lifecycle.
John Sirois [Fri, 22 Jan 2016 21:50:54 +0000 (14:50 -0700)] 
Simplify TaskHistoryPruner tie-in to Lifecycle.

This eliminates processing all futures to find the 1st failed one in
favor of directly signalling a Service failure when a unit of async work
fails.
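The idea can be sketched in Python with `concurrent.futures`, as an analogy rather than the Java code under review: each unit of async work gets a completion callback that signals failure immediately, so nothing has to scan all futures afterwards. `PrunerService`, `notify_failed` and `submit_work` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PrunerService:
    """Hypothetical stand-in for a service that can be marked as failed."""

    def __init__(self):
        self.failed = threading.Event()
        self.failure = None

    def notify_failed(self, exc):
        self.failure = exc
        self.failed.set()


def submit_work(executor, service, fn):
    """Submit fn and signal the service directly if it raises."""
    future = executor.submit(fn)

    def on_done(done):
        exc = done.exception()
        if exc is not None:
            service.notify_failed(exc)

    future.add_done_callback(on_done)
    return future


def prune_ok():
    return "pruned"


def prune_broken():
    raise RuntimeError("prune failed")


service = PrunerService()
# Leaving the with-block waits for all submitted work to finish.
with ThreadPoolExecutor(max_workers=2) as executor:
    submit_work(executor, service, prune_ok)
    submit_work(executor, service, prune_broken)

print(service.failed.is_set())
```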

Testing Done:
Locally green: `./gradlew -P build`.

Bugs closed: AURORA-1582

Reviewed at https://reviews.apache.org/r/42639/

18 months agoRemove scheduler flag -extra_modules.
Bill Farner [Fri, 22 Jan 2016 19:30:04 +0000 (11:30 -0800)] 
Remove scheduler flag -extra_modules.

Reviewed at https://reviews.apache.org/r/42645/

18 months agoAdd storage API methods for fetching and mutating a task by ID.
Bill Farner [Fri, 22 Jan 2016 07:04:35 +0000 (23:04 -0800)] 
Add storage API methods for fetching and mutating a task by ID.

Reviewed at https://reviews.apache.org/r/42628/

18 months agoTurn TaskHistoryPruner into a service and trigger shutdown on pruning failure.
Zameer Manji [Fri, 22 Jan 2016 01:38:25 +0000 (17:38 -0800)] 
Turn TaskHistoryPruner into a service and trigger shutdown on pruning failure.

Task pruning is key to operating a large cluster and failure to prune should
trigger shutdown to prevent unbounded growth of storage. This patch turns
`TaskHistoryPruner` into a service which propagates failure from failed pruning
attempts towards the `ServiceManager`. Also completing a TODO which removes a
test for behaviour that is very awkward to test for.

Bugs closed: AURORA-1582

Reviewed at https://reviews.apache.org/r/42332/

18 months agoAllowing dual authorizing params to account for thrift API deprecations.
Maxim Khutornenko [Thu, 21 Jan 2016 23:06:07 +0000 (15:06 -0800)] 
Allowing dual authorizing params to account for thrift API deprecations.

Also, added missing test coverage.

Reviewed at https://reviews.apache.org/r/42614/

18 months agoEnable READ COMMITTED transaction isolation.
Bill Farner [Thu, 21 Jan 2016 22:30:28 +0000 (14:30 -0800)] 
Enable READ COMMITTED transaction isolation.

Bugs closed: AURORA-1580

Reviewed at https://reviews.apache.org/r/42613/

18 months agoFix broken Thrift benchmark.
George Sirois [Wed, 20 Jan 2016 19:08:11 +0000 (13:08 -0600)] 
Fix broken Thrift benchmark.

Issue introduced with: https://reviews.apache.org/r/42077/.

Reviewed at https://reviews.apache.org/r/42567/

18 months agoIntroduces -default_docker_parameters scheduler flag.
George Sirois [Wed, 20 Jan 2016 18:18:32 +0000 (12:18 -0600)] 
Introduces -default_docker_parameters scheduler flag.

This flag allows cluster administrators to set arbitrary
Docker parameters which will apply to all jobs.

Also cleans up some of the existing unit tests around task config.

Bugs closed: AURORA-1575

Reviewed at https://reviews.apache.org/r/42077/

18 months agoRevert "Shim interfaces to preface args system overhaul."
Bill Farner [Wed, 20 Jan 2016 01:47:32 +0000 (17:47 -0800)] 
Revert "Shim interfaces to preface args system overhaul."

This reverts commit fe13e4ed52d4dc0a35f9e50b5e49c6e705f64579.

Reviewed at https://reviews.apache.org/r/42532/

18 months agoShim interfaces to preface args system overhaul.
Bill Farner [Tue, 19 Jan 2016 22:05:48 +0000 (14:05 -0800)] 
Shim interfaces to preface args system overhaul.

Reviewed at https://reviews.apache.org/r/41804/

18 months agoVagrant change to reserve part of the dev cluster's resources to 'aurora-role'.
Zhitao Li [Tue, 19 Jan 2016 21:24:19 +0000 (13:24 -0800)] 
Vagrant change to reserve part of the dev cluster's resources to 'aurora-role'.

Bugs closed: AURORA-1109

Reviewed at https://reviews.apache.org/r/42177/

18 months agoUpgrade pants to 0.0.68.
John Sirois [Tue, 19 Jan 2016 21:07:18 +0000 (14:07 -0700)] 
Upgrade pants to 0.0.68.

This is the regular weekly release/upgrade.
The CHANGELOG can be read here:
  http://pantsbuild.github.io/changelog.html

Of interest for Aurora is graceful error handling when running
py.test with `--coverage` enabled.

Testing Done:
Locally green:
```
./build-support/jenkins/build.sh
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
```

Reviewed at https://reviews.apache.org/r/42445/

18 months agoFixup broken jmh benchmarks.
John Sirois [Tue, 19 Jan 2016 20:14:57 +0000 (13:14 -0700)] 
Fixup broken jmh benchmarks.

The `TierConfig` binding fix for the `SchedulingBenchmarks` in
https://reviews.apache.org/r/42073 was also needed for the
`StatusUpdateBenchmarks` and a binding for `ConfigurationManager` was
missing for the `ThriftApiBenchmarks` as-of
https://reviews.apache.org/r/41711/.

While debugging these failures I noticed a new version of jmh had been
released recently, so also upgraded to that.  No changes of import, but
the changelog can be read here:
  http://hg.openjdk.java.net/code-tools/jmh/

Testing Done:
Both of these now work on master:
`./gradlew jmh -Pbenchmarks='StatusUpdateBenchmark.*'`
`./gradlew jmh -Pbenchmarks='ThriftApiBenchmark.*'`

I also checked that a full run of `./gradlew jmh` was now green for
all benchmarks.

Reviewed at https://reviews.apache.org/r/42475/

18 months agoMake required mesos log args required.
John Sirois [Tue, 19 Jan 2016 16:32:10 +0000 (09:32 -0700)] 
Make required mesos log args required.

Both -native_log_file_path and -native_log_zk_group_path are required
but they were not validated (-native_log_file_path) and validated too
late in a provider (-native_log_zk_group_path) to provide useful
failure messages.  Correct this and make the arguments required in
the arg parsing phase.
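The same principle, failing during argument parsing rather than later in a provider, can be sketched with Python's `argparse`. This is an analogy only: the scheduler uses its own Java argument system, and `build_parser` and the program name are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that rejects missing flags at parse time (sketch)."""
    parser = argparse.ArgumentParser(prog="scheduler-sketch")
    # required=True makes argparse report the missing flag up front.
    parser.add_argument("-native_log_file_path", required=True)
    parser.add_argument("-native_log_zk_group_path", required=True)
    return parser


# Illustrative values; the paths are assumptions.
args = build_parser().parse_args(
    ["-native_log_file_path", "/tmp/log",
     "-native_log_zk_group_path", "/aurora/log"]
)
print(args.native_log_file_path)
```

Calling `build_parser().parse_args([])` exits with a usage error naming the missing flags, which is the early, descriptive failure the change aims for.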

Testing Done:
```
./gradlew clean distZip
unzip -qd /tmp/ dist/distributions/aurora-scheduler-0.12.0-SNAPSHOT.zip
/tmp/aurora-scheduler-0.12.0-SNAPSHOT/bin/aurora-scheduler \
  -mesos_master_address=localhost:5050 \
  -backup_dir=/tmp \
  -serverset_path=/aurora \
  -cluster_name=test -zk_endpoints=localhost:2181
...
I0115 20:18:37.890 [main, ArgScanner:443] zk_in_proc (org.apache.aurora.scheduler.zookeeper.guice.client.flagged.FlaggedClientConfig.zk_in_proc): false
I0115 20:18:37.890 [main, ArgScanner:443] zk_session_timeout (org.apache.aurora.scheduler.zookeeper.guice.client.flagged.FlaggedClientConfig.zk_session_timeout): (4, secs)
I0115 20:18:37.890 [main, ArgScanner:445] -------------------------------------------------------------------------
Exception in thread "main" java.lang.IllegalStateException: A value for the -native_log_file_path flag must be supplied
at org.apache.aurora.scheduler.log.mesos.MesosLogStreamModule.getRequiredArg(MesosLogStreamModule.java:99)
at org.apache.aurora.scheduler.log.mesos.MesosLogStreamModule.<init>(MesosLogStreamModule.java:110)
at org.apache.aurora.scheduler.app.SchedulerMain.main(SchedulerMain.java:209)
```

Bugs closed: AURORA-1587

Reviewed at https://reviews.apache.org/r/42375/

19 months agoAdd metric for counting uncaught exceptions in async executor.
Zameer Manji [Fri, 15 Jan 2016 18:30:54 +0000 (10:30 -0800)] 
Add metric for counting uncaught exceptions in async executor.

Add metric "async_executor_uncaught_exceptions" for tracking uncaught exceptions
in async executor.

Bugs closed: AURORA-1582

Reviewed at https://reviews.apache.org/r/42328/

19 months agoAllow for plugging in cli-configurable filters that are invoked post shiro filters.
Amol Deshmukh [Thu, 14 Jan 2016 22:26:46 +0000 (16:26 -0600)] 
Allow for plugging in cli-configurable filters that are invoked post shiro filters.

Bugs closed: AURORA-1576

Reviewed at https://reviews.apache.org/r/42046/

19 months agoFix typo in the user guide about Task Updates
Anant Vyas [Thu, 14 Jan 2016 21:25:10 +0000 (14:25 -0700)] 
Fix typo in the user guide about Task Updates

Testing Done:
Fix a minor typo in the user guide and add a missing "and"

Reviewed at https://reviews.apache.org/r/42321/

19 months agoAccept resource offers from multiple framework roles.
Zhitao Li [Thu, 14 Jan 2016 18:43:32 +0000 (10:43 -0800)] 
Accept resource offers from multiple framework roles.

Bugs closed: AURORA-1109

Reviewed at https://reviews.apache.org/r/42126/

19 months agoAdd `--show-error` to curl when bootstrapping thrift.
Zameer Manji [Wed, 13 Jan 2016 00:07:24 +0000 (16:07 -0800)] 
Add `--show-error` to curl when bootstrapping thrift.

From the curl documentation:
````
-S, --show-error

When used with -s it makes curl show an error message if it fails.
````

It's possible for curl to fail when grabbing the tarball or patch and this will
show users why it failed.

Testing Done:
Ran `make` in the `build-support/thrift` directory.

Reviewed at https://reviews.apache.org/r/42225/

19 months agoReplace scheduler log scaffolding with logback.
Bill Farner [Tue, 12 Jan 2016 04:57:49 +0000 (23:57 -0500)] 
Replace scheduler log scaffolding with logback.

Reviewed at https://reviews.apache.org/r/41785/

19 months agoUse tags instead of branches for release candidates.
Bill Farner [Tue, 12 Jan 2016 04:06:08 +0000 (23:06 -0500)] 
Use tags instead of branches for release candidates.

Reviewed at https://reviews.apache.org/r/42145/

19 months agoEnable H2 query statistics collection.
Zameer Manji [Mon, 11 Jan 2016 18:23:06 +0000 (10:23 -0800)] 
Enable H2 query statistics collection.

With this enabled operators can visit the H2 console at /h2console and run
queries like `SELECT * FROM INFORMATION_SCHEMA.QUERY_STATISTICS ORDER BY
MAX_EXECUTION_TIME DESC;` to diagnose slow schedulers.

Testing Done:
Ran `SELECT * FROM INFORMATION_SCHEMA.QUERY_STATISTICS ORDER BY
MAX_EXECUTION_TIME DESC;` within vagrant and saw query statistics.

Benchmarks

Master (c595228):
Benchmark                                                                     (numPendingTasks)   Mode  Cnt      Score      Error  Units
SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark                           N/A  thrpt   10  64138.084 ± 6732.130  ops/s
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark                  N/A  thrpt   10  23863.861 ± 2101.622  ops/s
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark                N/A  thrpt   10   2228.883 ±  311.434  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                                1  thrpt   10     50.914 ±    2.488  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                               10  thrpt   10     43.729 ±    3.038  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                              100  thrpt   10     44.409 ±    4.426  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                             1000  thrpt   10     40.429 ±    7.526  ops/s
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark                N/A  thrpt   10  22942.538 ± 1281.331  ops/s

This change:
Benchmark                                                                     (numPendingTasks)   Mode  Cnt      Score      Error  Units
SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark                           N/A  thrpt   10  65285.628 ± 2422.816  ops/s
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark                  N/A  thrpt   10  24573.332 ± 1332.474  ops/s
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark                N/A  thrpt   10   2430.402 ±  258.860  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                                1  thrpt   10     43.810 ±    2.669  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                               10  thrpt   10     37.378 ±   14.637  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                              100  thrpt   10     40.180 ±    9.738  ops/s
SchedulingBenchmarks.PreemptorSlotSearchBenchmark.runBenchmark                             1000  thrpt   10     24.130 ±   15.746  ops/s
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark                N/A  thrpt   10  18429.830 ± 3077.426  ops/s

Reviewed at https://reviews.apache.org/r/42041/

19 months agoChange release script to use rel/ tag prefix.
Bill Farner [Sun, 10 Jan 2016 19:59:23 +0000 (11:59 -0800)] 
Change release script to use rel/ tag prefix.

Reviewed at https://reviews.apache.org/r/42117/

19 months agoFix flaky `ServerSetImplTest` test.
John Sirois [Sat, 9 Jan 2016 18:10:38 +0000 (11:10 -0700)] 
Fix flaky `ServerSetImplTest` test.

The `testUnwatchOnException` test method uses a forced
InterruptedException to test that watches are un-registered.  In doing
so, the test inadvertently set the interrupt bit for the test runner
thread, poisoning subsequent tests that invoked blocking code.  The
poisoning was only evident when the test methods were not run in lexical
order, which is the case in the vagrant vm.  This fix explicitly clears
the interrupt bit for the test thread with an explanation of why this is
done.
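
The poisoning described above can be reproduced outside the test suite. The following is a minimal sketch, not the actual `ServerSetImplTest` change: the class and method names are hypothetical, and `Thread.sleep` stands in for the blocking ZooKeeper calls. It shows that a leftover interrupt bit makes the next blocking call fail, and that `Thread.interrupted()` clears it.

```java
public class InterruptFlagDemo {
  /** Returns true if a blocking call on the current thread fails with InterruptedException. */
  static boolean blockingCallFails() {
    try {
      Thread.sleep(1); // stand-in for any blocking call made by a later test
      return false;
    } catch (InterruptedException e) {
      // sleep() clears the interrupt bit when it throws.
      return true;
    }
  }

  /** A "test" leaves the interrupt bit set, so the next blocking call fails. */
  static boolean poisonedRun() {
    Thread.currentThread().interrupt();
    return blockingCallFails();
  }

  /** Same as above, but the interrupt bit is explicitly cleared first. */
  static boolean cleanedRun() {
    Thread.currentThread().interrupt();
    Thread.interrupted(); // returns the current bit and clears it
    return blockingCallFails();
  }

  public static void main(String[] args) {
    System.out.println("poisoned run fails: " + poisonedRun());
    System.out.println("cleaned run fails: " + cleanedRun());
  }
}
```

Because the bit belongs to the thread rather than to a test, whether a later test is affected depends on which test runs next on the same thread, which is consistent with the order dependence noted above.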

Testing Done:
Before the fix, this error occurred consistently in the vagrant VM:
```
vagrant@aurora:~/aurora$ ./gradlew --rerun-tasks commons:test --tests org.apache.aurora.common.zookeeper.ServerSetImplTest
...
org.apache.aurora.common.zookeeper.ServerSetImplTest > testOrdering FAILED
    org.apache.aurora.common.net.pool.DynamicHostSet$MonitorException at ServerSetImplTest.java:155
        Caused by: org.apache.aurora.common.zookeeper.Group$WatchException at ServerSetImplTest.java:155
            Caused by: org.apache.aurora.common.zookeeper.Group$JoinException at ServerSetImplTest.java:155
                Caused by: java.lang.InterruptedException at ServerSetImplTest.java:155
```

Green after the fix in the vm and when run normally on my machine.

Bugs closed: AURORA-1574

Reviewed at https://reviews.apache.org/r/42102/

19 months agoUpgrade to pants 0.0.67.
John Sirois [Fri, 8 Jan 2016 19:42:16 +0000 (12:42 -0700)] 
Upgrade to pants 0.0.67.

The CHANGELOG can be read here:
  http://pantsbuild.github.io/changelog.html

Of note for aurora is an upgrade to pex 1.1.2 which
improves artifact resolution times.

Testing Done:
Locally green: `./build-support/jenkins/build.sh`

Reviewed at https://reviews.apache.org/r/42080/

19 months agoBump JMH to 1.11.2.
Zameer Manji [Fri, 8 Jan 2016 19:16:42 +0000 (11:16 -0800)] 
Bump JMH to 1.11.2.

Bump JMH to the latest available release which is 1.11.2. There isn't a
CHANGELOG but the commit history shows several bug fixes:
http://hg.openjdk.java.net/code-tools/jmh/

Testing Done:
./gradlew jmh -Pbenchmarks='UpdateStoreBenchmarks.*'

Reviewed at https://reviews.apache.org/r/42078/

19 months agoFix exception thrown in SchedulingBenchmarks set up.
Zameer Manji [Fri, 8 Jan 2016 18:27:50 +0000 (10:27 -0800)] 
Fix exception thrown in SchedulingBenchmarks set up.

SchedulingBenchmarks were broken because of a missing binding to `TierConfig`
and an invalid parameter to `PreemptorModule`.

Testing Done:
./gradlew jmh -Pbenchmarks='SchedulingBenchmarks.*'

Reviewed at https://reviews.apache.org/r/42073/

19 months agoThermos: Add ability to specify process outputs destination
Martin Hrabovcin [Fri, 8 Jan 2016 16:18:11 +0000 (09:18 -0700)] 
Thermos: Add ability to specify process outputs destination

This patch provides a way to **optionally** specify the destination of a running process's output. The implementation builds on https://reviews.apache.org/r/30695/

**What was changed:**

A new `destination` parameter is available at the global cluster level and on each `Process`. The options are `file` (the default), `stream` (send output to the parent process's stdout/stderr), `mixed` (write output to files and also stream it), and `none` (discard any logs produced by the running process).

Testing Done:
Unit test coverage is provided for new functionality.

I also tested manually with mesos/docker and made sure that logs are written to the expected files and that the same output reaches the docker daemon.

Bugs closed: AURORA-1548

Reviewed at https://reviews.apache.org/r/40922/

19 months agoAdding gpg key for jsirois@apache.org.
John Sirois [Thu, 7 Jan 2016 21:15:38 +0000 (14:15 -0700)] 
Adding gpg key for jsirois@apache.org.

Reviewed at https://reviews.apache.org/r/42034/

19 months agoAmend install instructions to cover dependency missing from mesos deb.
Bill Farner [Thu, 7 Jan 2016 05:54:19 +0000 (21:54 -0800)] 
Amend install instructions to cover dependency missing from mesos deb.

Reviewed at https://reviews.apache.org/r/42017/

19 months agoUpdate and slightly extend the beginner tutorial
Stephan Erb [Sun, 3 Jan 2016 20:37:28 +0000 (21:37 +0100)] 
Update and slightly extend the beginner tutorial

Reviewed at https://reviews.apache.org/r/41844/

19 months agoAdd NEWS entry for "Allow custom announce path."
Bill Farner [Thu, 7 Jan 2016 05:15:01 +0000 (21:15 -0800)] 
Add NEWS entry for "Allow custom announce path."

19 months agoAllow custom announce path.
Kunal Thakar [Thu, 7 Jan 2016 05:07:42 +0000 (21:07 -0800)] 
Allow custom announce path.

Bugs closed: AURORA-1569

Reviewed at https://reviews.apache.org/r/41809/

19 months agoPopulating and validating task config in getJobUpdateDiff RPC.
Maxim Khutornenko [Wed, 6 Jan 2016 17:39:15 +0000 (09:39 -0800)] 
Populating and validating task config in getJobUpdateDiff RPC.

Bugs closed: AURORA-1571

Reviewed at https://reviews.apache.org/r/41966/

19 months agoKill flaky TaskObserverTest.
John Sirois [Tue, 5 Jan 2016 17:54:16 +0000 (09:54 -0800)] 
Kill flaky TaskObserverTest.

Previously, a mock threading.Event was waited on in one thread
and the count of waits was read in another thread.  Most thread
memory models do not guarantee reads are fresh in this scenario
unless there is a memory barrier of some sort forcing per-cpu
caches to be flushed.
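
As an illustration of the visibility concern, here is a minimal Java analogue. It is not the removed Python test: the class and method names are hypothetical. One thread increments a counter and another reads it, with an `AtomicInteger` and `Thread.join()` providing the ordering that a plain unsynchronized field would lack.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VisibleCounterDemo {
  /** Counts simulated waits on a worker thread and reads the total from the caller. */
  static int countWaits(int waits) {
    AtomicInteger waitCount = new AtomicInteger();
    Thread worker = new Thread(() -> {
      for (int i = 0; i < waits; i++) {
        waitCount.incrementAndGet(); // atomic update, visible to other threads
      }
    });
    worker.start();
    try {
      // join() establishes a happens-before edge: everything the worker
      // wrote is visible to this thread once join() returns.
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      throw new IllegalStateException("interrupted while waiting for worker", e);
    }
    return waitCount.get();
  }

  public static void main(String[] args) {
    System.out.println("waits observed: " + countWaits(5));
  }
}
```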

Since the test really only verified correct conversion of a poll
interval to fractional seconds - kill the test as not pulling its
weight.

Bugs closed: AURORA-1570

Reviewed at https://reviews.apache.org/r/41915/

19 months agoAvoid zk 3.4.7 to fix test hangs.
John Sirois [Tue, 5 Jan 2016 17:52:34 +0000 (09:52 -0800)] 
Avoid zk 3.4.7 to fix test hangs.

The commons tests hang under CI after bumping from zk 3.4.2 to 3.4.7.
Although not root-caused, this zk bug introduced in 3.4.7 seems like a
match for this sort of hang:
  https://issues.apache.org/jira/browse/ZOOKEEPER-2347

Downgrade to 3.4.6 with a note about why 3.4.7 should be skipped.

Reviewed at https://reviews.apache.org/r/41917/

19 months agoUpgrade to pants 0.0.66.
John Sirois [Tue, 5 Jan 2016 16:55:13 +0000 (08:55 -0800)] 
Upgrade to pants 0.0.66.

The changelog can be read here:
  http://pantsbuild.github.io/changelog.html

Of note for aurora is the ability to kill BUILD.tools.

Additionally, add the pants-generated .pids/ dir to .gitignore and
normalize all directory ignores to omit the redundant *.

Reviewed at https://reviews.apache.org/r/41899/