19 May 2015

overview

  1. oozie is a workflow scheduler system to manage apache hadoop jobs

  2. oozie workflow jobs are directed acyclical graphs (dags) of actions

  3. oozie coordinator jobs are recurrent oozie workflow jobs triggered by time (frequency) and data availablity

  4. oozie is integrated with the rest of hadoop stack

    1. hadoop jobs

       * java map-reduce
       * streaming map-reduce
       * pig
       * hive
       * sqoop
       * dictcp
      
    2. system specific jobs

       * java programs
       * shell scripts
      

site



blog comments powered by Disqus