GIRAPH-1036: Allow mappers to fail early on exceptions
author Maja Kabiljo <majakabiljo@fb.com>
Wed, 21 Oct 2015 01:19:36 +0000 (18:19 -0700)
committer Maja Kabiljo <majakabiljo@fb.com>
Thu, 22 Oct 2015 03:06:04 +0000 (20:06 -0700)
commit 81d5badf7b76e9f1efde1cebe2150bee70e4cf58
tree 17e7336996fbafebaaf21263a47a52f058858538
parent b735f02bde1ed684eb0d0db2ec125d26d103aced

Summary:
Often when something fails in a mapper we see it stuck until its timeout passes. Digging through this issue I found two root causes:
- Many of the threads we create were not daemon threads, which prevented the process from exiting. Only the main thread should be non-daemon.
- When calling submit on an ExecutorService, exceptions are not propagated back to the caller unless get is called on the returned future. In ProgressableUtils.getResultsWithNCallables we were calling get on the futures one at a time, in order, so we had to wait for earlier futures to finish before seeing an exception that happened in a later one.
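
To illustrate both root causes, here is a minimal standalone sketch. It is not the code from this change: the class FailEarlySketch and the helpers daemonFactory and runAll are hypothetical names, and it uses java.util.concurrent.ExecutorCompletionService to consume futures in completion order, which may differ from what ProgressableUtils actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class FailEarlySketch {
  /** Hypothetical helper: builds daemon threads, which never block JVM exit. */
  static ThreadFactory daemonFactory(final String prefix) {
    return new ThreadFactory() {
      private int count = 0;

      @Override
      public synchronized Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + count++);
        t.setDaemon(true);
        return t;
      }
    };
  }

  /**
   * Hypothetical helper: runs all callables and fails on the first exception
   * in completion order. Results are returned in completion order too, not
   * in submission order.
   */
  static <R> List<R> runAll(List<Callable<R>> callables)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(
        Math.max(1, callables.size()), daemonFactory("sketch"));
    ExecutorCompletionService<R> service =
        new ExecutorCompletionService<>(executor);
    try {
      for (Callable<R> callable : callables) {
        service.submit(callable);
      }
      List<R> results = new ArrayList<>();
      for (int i = 0; i < callables.size(); i++) {
        // take() hands back whichever future finished next, so a fast
        // failure is seen without waiting for slower tasks submitted earlier.
        results.add(service.take().get());
      }
      return results;
    } finally {
      // Interrupt whatever is still running; daemon threads would not
      // keep the process alive in any case.
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    List<Callable<Integer>> tasks = new ArrayList<>();
    tasks.add(() -> { Thread.sleep(5000); return 1; });          // slow task
    tasks.add(() -> { throw new IllegalStateException("boom"); }); // fails fast
    try {
      runAll(tasks);
    } catch (ExecutionException e) {
      // The original exception is available as the cause.
      System.out.println("failed early: " + e.getCause().getMessage());
    }
  }
}
```

With a plain loop calling get on each future in submission order, the failure of the second task above would only surface after the first task's sleep finished.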

Test Plan: Ran jobs in which I simulated exceptions on some partitions in the loading, compute and storing phases. For each, verified that we exit quickly with the exception clearly shown, whereas without this change we would wait for the timeout and for the other threads from the same ProgressableUtils.getResultsWithNCallables call to finish. Ran a normal job successfully. mvn clean verify

Differential Revision: https://reviews.facebook.net/D49143
giraph-core/src/main/java/org/apache/giraph/comm/messages/queue/AsyncMessageStoreWrapper.java
giraph-core/src/main/java/org/apache/giraph/ooc/AdaptiveOutOfCoreEngine.java
giraph-core/src/main/java/org/apache/giraph/utils/JMapHistoDumper.java
giraph-core/src/main/java/org/apache/giraph/utils/ProgressableUtils.java
giraph-core/src/main/java/org/apache/giraph/utils/ReactiveJMapHistoDumper.java
giraph-core/src/main/java/org/apache/giraph/utils/ThreadUtils.java
giraph-core/src/main/java/org/apache/giraph/worker/WorkerProgressWriter.java