Apache Oozie is a Java web application that schedules and manages Apache Hadoop jobs. As a workflow scheduler, it combines multiple jobs sequentially into one logical unit of work.
Oozie is integrated with the rest of the Hadoop stack, with YARN as its architectural center. It supports several types of Hadoop jobs out of the box (such as Java MapReduce, Streaming MapReduce, Pig, Hive, Sqoop, and DistCp) as well as system-specific jobs (such as Java programs and shell scripts).
Oozie is a scalable, reliable and extensible system.
There are two basic types of Oozie jobs:
1) Oozie Workflow jobs are Directed Acyclic Graphs (DAGs), specifying a sequence of actions to execute.
2) Oozie Coordinator jobs are recurrent Oozie Workflow jobs that are triggered by time and data availability.
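The two job types above can be illustrated with a small conceptual sketch in Python. This is not Oozie's API (real workflows and coordinators are defined in XML files submitted to the Oozie server). The action names, the `execution_order` and `coordinator_should_run` functions, and their parameters are all hypothetical, chosen only to show how a DAG fixes the order of actions and how a time-plus-data trigger might be evaluated.

```python
from collections import deque
from datetime import datetime, timedelta


def execution_order(dag):
    """Return one valid run order for a workflow given as {action: [next actions]}."""
    # Count how many predecessors each action has.
    indegree = {node: 0 for node in dag}
    for successors in dag.values():
        for nxt in successors:
            indegree[nxt] = indegree.get(nxt, 0) + 1

    # Actions with no predecessors can start immediately.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in dag.get(node, []):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any action left unscheduled means the graph has a cycle.
    if len(order) != len(indegree):
        raise ValueError("workflow contains a cycle, so it is not a DAG")
    return order


def coordinator_should_run(now, last_run, frequency, required_inputs, available_inputs):
    """A run is due when the interval has elapsed and every required input exists."""
    time_ready = now - last_run >= frequency
    data_ready = set(required_inputs) <= set(available_inputs)
    return time_ready and data_ready


# Hypothetical workflow: import data, process it two ways, then export.
workflow = {
    "sqoop-import": ["pig-clean", "hive-aggregate"],
    "pig-clean": ["sqoop-export"],
    "hive-aggregate": ["sqoop-export"],
    "sqoop-export": [],
}
order = execution_order(workflow)
print(order)

# Hypothetical daily coordinator that waits for one input dataset.
due = coordinator_should_run(
    now=datetime(2024, 1, 2, 0, 0),
    last_run=datetime(2024, 1, 1, 0, 0),
    frequency=timedelta(days=1),
    required_inputs=["/data/logs/2024-01-01"],
    available_inputs=["/data/logs/2024-01-01"],
)
print(due)
```

In this sketch the export action runs only after both processing actions finish, and the coordinator check fails if either the interval has not elapsed or an input dataset is missing.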