Ansible: Setting up a Hadoop cluster using an Ansible playbook

Yashraj Oswal
Dec 16, 2020



Ever thought setting up a Hadoop cluster could be this simple? You just run an Ansible playbook, and that's it! Today we are going to learn how to create a Hadoop cluster within a few seconds, right from installation through configuring the hdfs-site.xml and core-site.xml files on both the master and slave nodes.

So let's get started!

Setting up the inventory file: add the IP addresses of the master node and the slave node:

Below is the image where we have created a .txt file containing the host groups [Master] and [Slave].

In this file, under each group, we need to add the IP address of the system being configured, the username and password for that system, and the connection protocol. As we are configuring Linux systems, the connection protocol is ssh.
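As a rough sketch of what such an inventory might look like: the file name, IP addresses, username and password below are placeholders and not the values from the original setup. Note that password-based SSH from Ansible needs the sshpass program on the control node.

```ini
# inventory.txt (hypothetical name); replace every value with your own
[Master]
192.0.2.10 ansible_user=root ansible_password=CHANGE_ME ansible_connection=ssh

[Slave]
192.0.2.11 ansible_user=root ansible_password=CHANGE_ME ansible_connection=ssh
```

Keeping plain-text passwords in an inventory is fine for a quick lab, but for anything longer-lived SSH keys or Ansible Vault are the safer choice.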

To get more details about how we can configure ansible, read my blog, Ansible: Configuring httpd inside docker container. In this blog I have explained the pre-requisites.

Let us check whether JDK and Hadoop are present on the managed nodes:
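One possible way to do this check from the control node is a small play like the one below. It is only a sketch: it assumes java and hadoop would be on the PATH of the remote user once installed, and the file name check.yml is made up for illustration.

```yaml
# check.yml (hypothetical name): reports whether java and hadoop respond
- name: Check whether JDK and Hadoop are present
  hosts: all
  gather_facts: false
  tasks:
    - name: Query the Java version
      command: java -version
      register: java_check
      changed_when: false
      failed_when: false   # a missing binary should not abort the play

    - name: Query the Hadoop version
      command: hadoop version
      register: hadoop_check
      changed_when: false
      failed_when: false

    - name: Report what was found
      debug:
        msg:
          - "java present: {{ java_check.rc | default(1) == 0 }}"
          - "hadoop present: {{ hadoop_check.rc | default(1) == 0 }}"
```

The same play can be run again after the main playbook to confirm that the installation went through.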

Now let us design the Ansible playbook:

Master code: Explanation..

Slave Node : Explanation…

***Note that the two pictures above are only for easy illustration; both the master and slave plays are included in one single playbook.***
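For readers who cannot see the screenshots, here is a minimal sketch of how the configuration part of such a playbook could be laid out. It is not the original code. It assumes a Hadoop 1.x style layout (config under /etc/hadoop, hadoop-daemon.sh on the PATH, the older property names dfs.name.dir, dfs.data.dir and fs.default.name), port 9001, storage directories /nn and /dn, and that JDK and Hadoop have already been installed by earlier tasks. It also assumes the hosts are listed by IP address in the inventory, so that groups['Master'][0] resolves to the master's IP.

```yaml
# hadoop.yml (sketch): paths, port and directory names are assumptions
- name: Configure and start the name node
  hosts: Master
  vars:
    nn_dir: /nn
    nn_url: "hdfs://{{ groups['Master'][0] }}:9001"
  tasks:
    - name: Create the name node directory
      file:
        path: "{{ nn_dir }}"
        state: directory

    - name: Write hdfs-site.xml
      copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>{{ nn_dir }}</value>
            </property>
          </configuration>

    - name: Write core-site.xml
      copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>{{ nn_url }}</value>
            </property>
          </configuration>

    - name: Format the name node only on the first run
      shell: echo Y | hadoop namenode -format
      args:
        creates: "{{ nn_dir }}/current"

    - name: Start the name node daemon
      command: hadoop-daemon.sh start namenode

- name: Configure and start the data node
  hosts: Slave
  vars:
    dn_dir: /dn
    nn_url: "hdfs://{{ groups['Master'][0] }}:9001"
  tasks:
    - name: Create the data node directory
      file:
        path: "{{ dn_dir }}"
        state: directory

    - name: Write hdfs-site.xml
      copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <configuration>
            <property>
              <name>dfs.data.dir</name>
              <value>{{ dn_dir }}</value>
            </property>
          </configuration>

    - name: Write core-site.xml pointing at the name node
      copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>{{ nn_url }}</value>
            </property>
          </configuration>

    - name: Start the data node daemon
      command: hadoop-daemon.sh start datanode
```

The two start tasks are not idempotent, so rerunning the playbook while a daemon is already up will likely report a failure. Treat this as a starting point and adjust the paths and property names to your Hadoop version.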

Now let us run the playbook and see what happens:

Command: ansible-playbook hadoop.yml

Here, the master has been configured successfully.

Here, the playbook run has completed successfully!

Now let us check whether JDK and Hadoop are installed:

As you can see, both JDK and Hadoop have been installed successfully!

Now let us check whether the master node (name node) and the slave node (data node) have started:

First, we will check whether the master node (name node) has started:

Let's check the data node as well:
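If you prefer to run this check from the control node instead of logging in to each machine, a small play along these lines could do it. This is a sketch that assumes the JDK's jps tool is on the PATH of the remote user.

```yaml
# jps lists the running Java processes on each node
- name: Check the Hadoop daemons with jps
  hosts: all
  gather_facts: false
  tasks:
    - name: List running Java processes
      command: jps
      register: jps_out
      changed_when: false

    - name: Show the process list
      debug:
        var: jps_out.stdout_lines
```

On a healthy cluster we would expect a NameNode entry on the master and a DataNode entry on the slave.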

Great! The next step is to check the connection:
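A common way to check the connection is the dfsadmin report, which lists the data nodes currently registered with the name node. The play below is a sketch that assumes the hadoop command is on the PATH of the master. On newer Hadoop releases the preferred form is hdfs dfsadmin -report.

```yaml
# Ask the name node which data nodes it can see
- name: Check that the data node has joined the cluster
  hosts: Master
  gather_facts: false
  tasks:
    - name: Fetch the HDFS report
      command: hadoop dfsadmin -report
      register: dfs_report
      changed_when: false

    - name: Show the report
      debug:
        var: dfs_report.stdout_lines
```

The data node's IP address should appear in the report once it has connected.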

So here we have successfully connected the data node to the name node!