Overview
-
Oozie is a workflow scheduler system for managing Apache Hadoop jobs
-
Oozie workflow jobs are directed acyclic graphs (DAGs) of actions
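
To make the DAG idea concrete, here is a minimal Python sketch. It is not Oozie's API (Oozie workflows are defined in XML); the action names and the `execution_order` helper are hypothetical, chosen only to illustrate how dependencies between actions constrain the order in which they can run.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each action maps to the set of actions
# that must finish before it can start.
workflow = {
    "sqoop-import": set(),
    "pig-clean": {"sqoop-import"},
    "hive-stats": {"sqoop-import"},
    "distcp-export": {"pig-clean", "hive-stats"},
}


def execution_order(dag):
    """Return one valid run order; raises CycleError if the graph is not acyclic."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

Because the graph is acyclic, a valid order always exists. A graph containing a cycle would raise `CycleError`, which is why a workflow has to be a DAG.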
-
Oozie coordinator jobs are recurrent Oozie workflow jobs triggered by time (frequency) and data availability
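
The two triggers can be sketched in Python as follows. This is only an illustration under assumed names (`nominal_times`, `ready_to_run`, and the dataset labels are hypothetical), not how Oozie implements or exposes coordinators: a run becomes eligible once its scheduled time has arrived and every input dataset it needs is available.

```python
from datetime import datetime, timedelta


def nominal_times(start, end, frequency):
    """Yield each scheduled time from start up to, but not including, end."""
    current = start
    while current < end:
        yield current
        current += frequency


def ready_to_run(nominal_time, now, required_datasets, available_datasets):
    """True once the scheduled time has arrived and all required data exists."""
    time_reached = nominal_time <= now
    data_present = set(required_datasets) <= set(available_datasets)
    return time_reached and data_present


# Hypothetical schedule: one run every 6 hours over a single day.
schedule = list(nominal_times(datetime(2024, 1, 1), datetime(2024, 1, 2), timedelta(hours=6)))
```

In this sketch a run whose time has passed still waits until its data shows up, which mirrors the "time and data availability" wording above.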
-
Oozie is integrated with the rest of the Hadoop stack
-
Hadoop jobs
* Java MapReduce
* Streaming MapReduce
* Pig
* Hive
* Sqoop
* DistCp
-
System-specific jobs
* Java programs
* Shell scripts
-