Beginners Guide for Oozie Installation
(https://acadgild.com/blog/beginners-guide-for-oozie-installation/)
STEP 1 :-
First we need to download the oozie-4.1.0 tar file from the below link:
Oozie-4.1.0 tar file (https://drive.google.com/file/d/0ByJLBTmJojjzNVcyMzhhQVg0ak0/view)
STEP 2 :-
We need to move into the Downloads folder using the below command:
cd Downloads
STEP 3 :-
Extract the tar file; you will get the oozie-4.1.0 folder.
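The extraction command appeared only as a screenshot in the original post; a typical command, assuming the downloaded archive is named oozie-4.1.0.tar.gz, would be:

```shell
cd ~/Downloads
# Extract the Oozie source archive (archive name assumed; adjust to match your download)
tar -xvf oozie-4.1.0.tar.gz
```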
STEP 4 :-
Maven Installation
Before setting things up for Oozie, install Maven on your system.
If you are using CentOS, type the below command to install Maven:
If you are using Ubuntu, type the below command to install Maven:
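The install commands were shown as screenshots in the original post; the usual package-manager commands, assuming your configured repositories provide a Maven package, would be:

```shell
# CentOS (assumes a repository such as EPEL provides the maven package)
sudo yum install maven

# Ubuntu
sudo apt-get update
sudo apt-get install maven
```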
After the installation, check the installed Maven version using the below command:
mvn -version
You should get output as shown in the below screenshot.
STEP 6 :-
Now open the untarred oozie-4.1.0 folder and open the pom.xml file.
In the pom.xml file, update the target version of Java to match your Java version.
Here we are using Java 7, so we have updated the target version to 1.7.
If you are using Hadoop 2.x, update the Hadoop version to 2.3 so that, through Maven, Oozie will pull in the dependencies required to run on a Hadoop 2.x cluster; the Hadoop 2.3 dependencies are the latest ones that Oozie has added.
Now comment out the Codehaus repository, because Codehaus has recently stopped its services, so dependencies won't be downloaded from this repository.
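The edits described above would look roughly like this inside pom.xml (element names are based on the Oozie 4.1.0 build; treat this as a sketch and match it against your own copy of the file):

```xml
<!-- Target your installed Java version (Java 7 in this guide) -->
<targetJavaVersion>1.7</targetJavaVersion>

<!-- Hadoop 2.x: build against the Hadoop 2.3 dependencies -->
<hadoop.version>2.3.0</hadoop.version>

<!-- Codehaus has stopped its services, so comment out its repository -->
<!--
<repository>
    <id>Codehaus repository</id>
    <url>http://repository.codehaus.org/</url>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
</repository>
-->
```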
After making the above specified changes, save and close the file.
STEP 7 :-
Now move the pom.xml into the untarred oozie-4.1.0 bin folder.
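The build command itself was shown only as a screenshot; Oozie's standard distro build script, run from the bin folder with tests skipped, is:

```shell
cd oozie-4.1.0/bin
# Build the Oozie distribution, skipping the unit tests
./mkdistro.sh -DskipTests
```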
The above command will run the distro build and prepare a distro package, skipping the tests.
Note: The distro command will download from Maven the dependencies that Oozie requires for a Hadoop 2.x cluster.
The process will take some time, it will download all the dependencies required for
your project.
While the distro file is being built you will see some dots in the output, as shown below; don't panic at that point.
Finally you will get a success message as shown in the below figure.
STEP 8 :-
A target folder will be created in the distro folder of your Oozie directory.
Inside the target folder you can see the oozie-4.1.0-distro folder
Open the oozie-4.1.0-distro folder; inside you will find the oozie-4.1.0 folder.
This oozie-4.1.0 folder contains all the dependencies required to run Oozie on a Hadoop cluster.
Copy this oozie-4.1.0 folder into your Hadoop user's home; in our case we make an oozie directory in the home folder ($HOME) and then paste the obtained oozie-4.1.0 folder at the path $HOME/oozie.
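As a sketch, assuming the build tree and destination paths used in this guide, the copy could be done as:

```shell
# Create the Oozie directory in the home folder and copy the built distro into it
mkdir -p $HOME/oozie
cp -r oozie-4.1.0/distro/target/oozie-4.1.0-distro/oozie-4.1.0 $HOME/oozie/
```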
STEP 9 :-
Now change into the newly obtained oozie-4.1.0 directory and create a directory named libext (library extensions) using the command mkdir libext.
In the below screenshot we can see that the libext directory has been created in the path $HOME/oozie/oozie-4.1.0.
Now copy the jar files of Hadoop-2.3.0 into the newly created libext folder.
You can find the Hadoop-2.3.0 libraries in the following path:
oozie-4.1.0/hadooplibs/hadoop-2/target/hadooplibs/hadooplib-2.3.0.oozie-4.1.0/
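Assuming the paths above ($HOME/oozie for the installed copy, and the untarred source tree in ~/Downloads for the jars), the commands would be roughly:

```shell
cd $HOME/oozie/oozie-4.1.0
mkdir libext
# Copy the Hadoop 2.3.0 jars from the build tree into libext
# (the source-tree location is an assumption; adjust it to where you untarred Oozie)
cp ~/Downloads/oozie-4.1.0/hadooplibs/hadoop-2/target/hadooplibs/hadooplib-2.3.0.oozie-4.1.0/*.jar libext/
```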
Now download the ext-2.2 zip file from the below link:
ext-2.2.zip
( https://drive.google.com/file/d/0ByJLBTmJojjzcDhxQUsyNEFSQm8/view)
Copy this downloaded ext-2.2.zip file into the newly created libext folder.
Now move into the oozie-4.1.0/bin folder and prepare the Oozie war file.
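The war-preparation command was shown only as a screenshot in the original post; Oozie's standard setup script provides it:

```shell
cd $HOME/oozie/oozie-4.1.0/bin
# Bundle the libext jars and ext-2.2.zip into the Oozie web application
./oozie-setup.sh prepare-war
```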
STEP 11 :-
After the successful preparation of the war file, you will get the output as shown in the below image.
STEP 12 :-
Now, open the core-site.xml file in your Hadoop's etc folder and add the below properties.
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop_user_name.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop_user_name.groups</name>
<value>*</value>
</property>
After doing the changes, save and close the file.
STEP 13 :-
Now open the oozie-site.xml file present in the conf directory of the newly obtained oozie-4.1.0.
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/home/kiran/hadoop-2.7.1/etc/hadoop</value>
<description>
Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is used when there is no exact match for an authority. The HADOOP_CONF_DIR contains the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within the Oozie configuration directory; though the path can be absolute (i.e. to point to Hadoop client conf/ directories in the local filesystem).
</description>
</property>
<property>
<name>oozie.service.WorkflowAppService.system.libpath</name>
<value>hdfs://localhost:9000/user/${user.name}/share/lib</value>
<description>
System library path to use for workflow applications. This path is added to workflow applications if their job properties set the property 'oozie.use.system.libpath' to true.
</description>
</property>
Now give ownership of the oozie folder to your Hadoop user by using the below command:
sudo chown -R hadoop_user_name oozie_folder_path (in our case it is $HOME/oozie)
Note: Make sure that all your hadoop daemons are started properly.
Now create a directory in HDFS, named sharelib, for storing the Oozie contents, using the below command:
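The command itself was shown only as a screenshot; Oozie's setup script creates the sharelib, assuming HDFS is running at the NameNode address configured in core-site.xml:

```shell
cd $HOME/oozie/oozie-4.1.0/bin
# Upload the Oozie share library into HDFS
./oozie-setup.sh sharelib create -fs hdfs://localhost:9000
```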
The above command will create a folder with name sharelib in HDFS.
Creating Oozie DB
Before creating an Oozie DB, make sure that you have installed MySQL server on your system.
After the installation of the MySQL server, move into the bin folder of the newly obtained oozie-4.1.0 and then type the below command.
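The command was shown only as a screenshot; Oozie ships a DB-setup script whose output matches the transcript below:

```shell
# Create the Oozie database schema
./ooziedb.sh create -sqlfile oozie.sql -run
```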
After running this command successfully, you will get the below output
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
Check DB schema does not exist
DONE
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE
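Starting the server itself was shown only as a screenshot; Oozie's standard daemon script is:

```shell
cd $HOME/oozie/oozie-4.1.0/bin
# Start the Oozie server (use "./oozied.sh stop" to stop it)
./oozied.sh start
```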
Now your Oozie is successfully started; you can also check the same with the web UI.
Open your browser and go to localhost:11000 (11000 is the default port for Oozie).
All the Active and suspended jobs can be seen in the web UI.
Hope this blog helped you in installing Oozie on your Hadoop cluster. Keep visiting our website AcadGild for more updates on Big Data and other technologies.
Click here to learn Big Data Hadoop Development.