Yashraj Oswal

Nov 8, 2020

Automation of Technologies: A step towards working smartly

So, here is some code that will make your life much easier.

As this was a team task, here are my team members:

1: Prithviraj Singh: Hadoop, Linux commands, and writing the blog for those technologies. GitHub link for the whole integrated code: Prithviraj-Singh

2: Yashraj Oswal: AWS CLI and writing the blog for that technology. GitHub link for the aws.py code: Yashraj-oswal

3: Abhishek Biswas: Researching the requirements of the technologies automated with Python, information gathering, and blog writing

4: Akshay Baluapuri: Debugging and troubleshooting

So let's start with the introduction…

What is Hadoop?

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

Prerequisites for installing Hadoop:

· We need to install Java on our system because Hadoop is built on top of Java. Before installing Hadoop, install one of the latest versions of the JDK:


· Next we need to download one of the stable versions of Hadoop say:



If you downloaded these packages on Windows but Hadoop will run in a Linux VM on the same machine, the VM will not be able to see the files. In that case, use WinSCP to transfer them to your Linux VM.

· Download winscp-5.9.6-Portable.zip from the internet, extract it, and open WinSCP.

In the red mark, enter the VM's IP address.

In the yellow mark, enter the user name you want to log in with.

In the blue mark, enter the password of the user you entered above.

Now you will see your Windows files on the left and the VM's files on the right. You can simply drag the required files from Windows to the VM.

The red arrow shows the Windows side.

The blue arrow shows the VM's side.

Hadoop installation process:

With this version, Hadoop must be installed forcefully using the command rpm -i hadoop-1.2.1-1.x86_64.rpm --force, otherwise it will show a conflict error:

Now Hadoop has been successfully installed.

The steps above are the manual setup. Now let's see the power of the automated one…!

On opening the program, we see the main menu:

As a test, I tried running option 1. As you can see, it runs properly…

The code starts by checking whether your system has the EPEL repository configured. If it does not find one, the program configures it automatically. Once this is done, the program takes us to the menu page, where you can select which tool you would like to work on…
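The EPEL check described above could look roughly like the following. This is only a minimal sketch, not the team's actual code: it assumes a CentOS-style system where the epel-release package can be installed through yum, and the function names and the injectable run parameter are made up for illustration.

```python
import subprocess


def is_rpm_installed(package, run=subprocess.run):
    """Return True if `rpm -q <package>` exits with status 0."""
    result = run(["rpm", "-q", package], capture_output=True, text=True)
    return result.returncode == 0


def ensure_epel(run=subprocess.run):
    """Install epel-release with yum only when it is missing (assumed package name)."""
    if is_rpm_installed("epel-release", run):
        return "already configured"
    run(["yum", "install", "-y", "epel-release"], check=True)
    return "installed"
```

Calling ensure_epel() on a real machine needs root privileges. Passing your own run function makes it possible to try the logic without touching the system.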

The code uses Python's scripting ability to automate Hadoop. It allows the user to automate the configuration of:

Now let's start with Hadoop to see the program's capabilities… The program begins by configuring Hadoop and the JDK. It first checks whether the JDK is installed, installs it if it is missing, and skips the installation if it is already present. Hadoop is then installed, and any previously configured files are cleared so that new configurations can take place. On completion, the program takes us to the Hadoop menu:
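As a rough illustration of the "install the JDK only if it is missing" step, here is a small sketch. It assumes that finding a java executable on the PATH means the JDK is present, and the function name and the RPM file names passed in are placeholders, not taken from the real script.

```python
import shutil


def plan_hadoop_setup(jdk_rpm, hadoop_rpm, which=shutil.which):
    """Return the rpm commands to run, skipping the JDK when `java` is already on PATH."""
    commands = []
    if which("java") is None:
        # JDK not found, so install it first
        commands.append(["rpm", "-i", jdk_rpm])
    # This Hadoop version needs --force to get past the conflict error
    commands.append(["rpm", "-i", hadoop_rpm, "--force"])
    return commands
```

Each returned list can be handed to subprocess.run(command, check=True). Clearing the old configuration files is left out of this sketch.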

  1. Hadoop Name Node (Master node) (Option 1).

It also has options to work on the Hadoop Distributed File System of your own cluster, like:

  1. Option 8 to list all the files and folders in the file system.

And the program also has two exit options:

  1. Option 6 to exit to the main-menu.

So let's start configuring…

First, let's try configuring a Name Node for IP. The program begins by stopping all the node services running on the system. It then makes a new directory for the Name Node. Next, the program asks whether you would like to format the directory: if you press 'y', it formats the directory, and if you press any other key, it skips formatting. From here the program proceeds to start the Name Node services.
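The Name Node steps above can be sketched as a list of shell commands. This is an assumption about how such a script might be built, using the Hadoop 1.x commands hadoop-daemon.sh and hadoop namenode -format. The function name is hypothetical, and writing core-site.xml and hdfs-site.xml is left out.

```python
def namenode_commands(directory, format_answer):
    """Build the commands for a Name Node setup; format only when the answer is 'y'."""
    commands = [
        ["hadoop-daemon.sh", "stop", "namenode"],
        ["hadoop-daemon.sh", "stop", "datanode"],
        ["mkdir", "-p", directory],
    ]
    if format_answer.strip().lower() == "y":
        commands.append(["hadoop", "namenode", "-format"])
    commands.append(["hadoop-daemon.sh", "start", "namenode"])
    return commands


# Example: print the plan for a hypothetical /nn directory without running anything
for command in namenode_commands("/nn", "y"):
    print(" ".join(command))
```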

Now, let's verify whether the node is working using the 'jps' command.

As you can see, the Name Node is running with process ID '13850'.
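If you want to do this check from Python instead of reading the screen, the output of jps is easy to parse, because each line is a process ID followed by a process name. The helper below is a sketch with a made-up name, and the sample text is illustrative apart from the Name Node ID shown above.

```python
def parse_jps(output):
    """Map each Java process name to its process ID from `jps` output."""
    processes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            processes[parts[1]] = int(parts[0])
    return processes


sample = "13850 NameNode\n14012 Jps\n"  # illustrative output
print("NameNode" in parse_jps(sample))
```

On a real node you could feed it subprocess.run(["jps"], capture_output=True, text=True).stdout.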

Now, let's configure a master on the system with IP and give it a slave for its configuration…

Now let's check with 'jps' whether it shows the Data Node service…

As you can see, the Data Node is running with process ID '14253'. Let's also look at the admin report to see whether the node shows up there…

As you can see, the Data Node has contributed its storage space to the master.

We can also configure a slave node on this system using Option 3.

Let's also try options 4 and 5… Option 4 has been used as shown above.

And here we used option 5… Let's see what changes have been made to the hdfs-site file…

As we can see, the intended changes have been made to hdfs-site.xml.
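The article does not show how the script edits hdfs-site.xml, so the following is only a guess at one way to do it: build the configuration element from a dictionary of properties with Python's standard XML library. The function name is invented, and dfs.name.dir and dfs.replication are used purely as example Hadoop 1.x property names.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Return a Hadoop-style <configuration> document for the given properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Example values only; the real script may write different properties
print(build_site_xml({"dfs.name.dir": "/nn", "dfs.replication": 3}))
```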

So far so good… Now let's use options 8 through 10 for file system usage. Let's start by uploading a file to the file system.

Here the program asks us for the path of the file we would like to upload and the location on the file system where it should go. Let's use Option 8 to see whether the file was uploaded.

Here we can see the intended file was uploaded to the file system… Now let's try reading it using Option 9.

We can see that the file is being read on the client… (The file we uploaded was the menu program itself.)
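Behind these menu options, a script like this most likely wraps the hadoop fs command. Here is a small sketch under that assumption. The action names upload, list and read are my own labels, mapped to the -put, -ls and -cat subcommands.

```python
def hdfs_command(action, *args):
    """Build a `hadoop fs` command for an upload, list or read action."""
    flags = {"upload": "-put", "list": "-ls", "read": "-cat"}
    if action not in flags:
        raise ValueError(f"unknown action: {action}")
    return ["hadoop", "fs", flags[action], *args]


# Example with hypothetical paths
print(hdfs_command("upload", "/root/menu.py", "/"))
print(hdfs_command("list", "/"))
print(hdfs_command("read", "/menu.py"))
```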

This is where setting up a Hadoop cluster with the automated Python script ends….

Now let's get started with our next technology, i.e. Amazon Web Services.

What is AWS CLI?

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.


Before you can install or update the AWS CLI version 2 on Windows, be sure you have the following:

· A 64-bit version of Windows XP or later.

· Admin rights to install software

Steps to install AWS CLI on Windows

Step 1:

Download the AWS CLI MSI installer for Windows (64-bit):

For the latest version of the AWS CLI: https://awscli.amazonaws.com/AWSCLIV2.msi

For a specific version of the AWS CLI: Append a hyphen and the version number to the filename. For this example the filename for version 2.0.30 would be AWSCLIV2-2.0.30.msi resulting in the following link https://awscli.amazonaws.com/AWSCLIV2-2.0.30.msi. For a list of versions, see the AWS CLI version 2 changelog on GitHub.

To update your current installation of AWS CLI version 2 on Windows, download a new installer each time you update to overwrite previous versions. AWS CLI is updated regularly. To see when the latest version was released, see the AWS CLI version 2 changelog on GitHub.

Step 2:

Run the downloaded MSI installer and follow the on-screen instructions. By default, the AWS CLI installs to C:\Program Files\Amazon\AWSCLIV2.

Step 3:

To confirm the installation, open the Start menu, search for cmd to open a command prompt window, and at the command prompt use the aws --version command.

Don’t include the prompt symbol (C:\>) when you type a command. These are included in program listings to differentiate commands that you type from output returned by the AWS CLI. The rest of this guide uses the generic prompt symbol ($), except in cases where a command is Windows-specific. For more information about how we format code examples, see Using the AWS CLI examples.

C:\> aws --version

aws-cli/2.0.47 Python/3.7.4 Windows/10 botocore/2.0.0

If Windows is unable to find the program, you might need to close and reopen the command prompt window to refresh the path, or add the installation directory to your PATH environment variable manually.
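A script can run the same check and read the version number out of the output. The sketch below only parses text in the format shown above. The function name is hypothetical.

```python
import re


def parse_aws_version(output):
    """Return the AWS CLI version as a tuple of ints, or None if the text does not match."""
    match = re.match(r"aws-cli/(\d+)\.(\d+)\.(\d+)", output.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


print(parse_aws_version("aws-cli/2.0.47 Python/3.7.4 Windows/10 botocore/2.0.0"))
```

A return value of None is a reasonable signal that the CLI is missing and the installation step should run.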

For configuration of AWS CLI

For general use, the aws configure command is the fastest way to set up your AWS CLI installation.

The AWS CLI stores this information in a profile (a collection of settings) named default in the credentials file. By default, the information in this profile is used when you run an AWS CLI command that doesn’t explicitly specify a profile to use. For more information on the credentials file, see Configuration and credential file settings

The following example shows sample values. Replace them with your own values as described in the following sections.

$ aws configure


AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Default region name [None]: us-west-2

Default output format [None]: json

Above you have seen the manual procedure. Now let's see the magic of the automated one..!

Automation of AWS-CLI:

Here, the whole process is replaced by simply entering your choices at the command prompt. The AWS-CLI commands run in the back end, and in the front end you see a menu where, based on your choice, the program accesses specific AWS services for you.

Let's start with the program…!

  1. When you execute the program using the python program_name.py command, it first asks which technology you want to use. Based on your choice, the program checks whether the required software is installed. If it is not, the program does the job for you: it downloads the software from the web, installs it, and sets it up. Following are some demonstrations.

On the page above, you can see that an installation menu has appeared. You can either check the version, which is shown if the software is already present, or go for installation.

The page above appears when you enter 1 at the command prompt. It asks you to choose your operating system. Choose the appropriate one and the program starts the download and then installs the software for you.

Above picture shows the downloading and installation wizard.
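Choosing the right installer for the operating system can be sketched as a simple lookup. The Windows link is the one given earlier in this post. The Linux and macOS links are the ones I know from the AWS documentation, not taken from the team's script, so treat them as assumptions and verify them before use. The function name is made up.

```python
import platform

# Installer download links per operating system (Linux/macOS entries are assumptions)
INSTALLERS = {
    "Windows": "https://awscli.amazonaws.com/AWSCLIV2.msi",
    "Linux": "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip",
    "Darwin": "https://awscli.amazonaws.com/AWSCLIV2.pkg",
}


def installer_url(system=None):
    """Return the AWS CLI v2 installer link for the given or detected operating system."""
    system = system or platform.system()
    if system not in INSTALLERS:
        raise ValueError(f"unsupported operating system: {system}")
    return INSTALLERS[system]
```

The download itself could then be done with urllib.request.urlretrieve(installer_url(), "installer_file"), followed by the installer for that platform.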

2. Starting with accessing AWS Services:

After AWS-CLI is installed successfully, the menu in the picture above appears. The first step is to configure AWS-CLI with your AWS account. For that, you first need to create an IAM user from the AWS GUI and give the user power-user access. AWS will provide you with an access key and a secret key. Download them and paste them here to complete the setup.
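One way a script can do this configuration without the interactive prompts is the aws configure set command, which writes one setting at a time. The sketch below only builds the commands. The function name, the default profile argument and the placeholder key values are assumptions for illustration.

```python
def configure_commands(access_key, secret_key, region, output_format="json", profile="default"):
    """Build one `aws configure set` command per setting."""
    settings = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region": region,
        "output": output_format,
    }
    return [
        ["aws", "configure", "set", key, value, "--profile", profile]
        for key, value in settings.items()
    ]


# Placeholder values; never hard-code real keys in a script
for command in configure_commands("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY", "us-west-2"):
    print(command[:4])
```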

3. Using the AWS service: Amazon EC2

Here, the menu takes you through the steps to create a new instance on Amazon EC2. Follow the steps and you will end up with a new instance.

3.1 Creation of Key-pair:

In the picture above, when you choose option one, the program prompts you to enter a key-pair name. The key is essential for logging in to your cloud instance. Once you provide a name, it creates a key for you.

3.2 Creation of Security-Group:

When you go for creation of security group, it will ask you to enter a security-group name, once done it will create a security group for you.

3.3 Creation of new instance

When you enter 3 at the command prompt, the program asks for the key name, security group ID, image ID, instance type, instance count, and subnet ID. After you provide all the information, it launches an instance for you.
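The three EC2 steps map onto three AWS CLI commands: create-key-pair, create-security-group and run-instances. The builders below are a sketch with made-up function names. Note that create-security-group also requires a description, so I assume the script supplies one even though the menu only asks for a name.

```python
def create_key_pair_cmd(key_name):
    """Command that creates a key pair and prints the private key text."""
    return ["aws", "ec2", "create-key-pair", "--key-name", key_name,
            "--query", "KeyMaterial", "--output", "text"]


def create_security_group_cmd(group_name, description):
    """Command that creates a security group (a description is mandatory)."""
    return ["aws", "ec2", "create-security-group",
            "--group-name", group_name, "--description", description]


def run_instances_cmd(image_id, instance_type, count, key_name, security_group_id, subnet_id):
    """Command that launches `count` instances with the given settings."""
    return ["aws", "ec2", "run-instances",
            "--image-id", image_id,
            "--instance-type", instance_type,
            "--count", str(count),
            "--key-name", key_name,
            "--security-group-ids", security_group_id,
            "--subnet-id", subnet_id]
```

Each list can be passed to subprocess.run(command, check=True). The output of the key-pair command should be saved to a .pem file.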

We can check the same on GUI:

You can see that the instance has been created from the information provided to the program.

In this program, once you are done with a sub-menu and choose to go back, the terminal clears itself and shows a clean main menu every time you exit the sub-menu.

4. Start/Stop AWS Instances:

Now you can see a fresh terminal with the main menu on the screen. In the last step we created a new instance, and that instance is in the running state. To stop it:

When you choose the stop instance option, the program asks for the instance ID. The terminal first describes all the instances in EC2, so it is easy to copy the ID from the list and paste it here to stop the running instance.
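Listing the instances and then stopping one can be sketched with describe-instances and stop-instances. The function names and the --query expression used to shorten the listing are my own choices, not necessarily what the program uses.

```python
def describe_instances_cmd():
    """Command that lists every instance ID together with its state."""
    return ["aws", "ec2", "describe-instances",
            "--query", "Reservations[].Instances[].[InstanceId,State.Name]",
            "--output", "table"]


def instance_state_cmd(action, instance_id):
    """Command that starts or stops a single instance."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return ["aws", "ec2", f"{action}-instances", "--instance-ids", instance_id]
```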

We can check on GUI as well:

So you can see the status of the instance, which is stopped.

5. Creation of EBS Volume:

When you choose to create an EBS volume, the program prompts you for the availability zone and the size of the volume, and then creates an EBS volume for you.

To check from GUI:

5.1 Attaching the volume with the instance:

When you choose to attach the volume, the program describes the list of created instances. From there you need to copy the instance ID and paste it at the prompt.

Next, when you hit Enter, it displays the list of existing volumes, and from there we need to copy the appropriate volume ID.

So we chose the ID of the volume we created earlier and pasted it below. This attaches the volume to the instance ID you provided.

You can see that the volume has been attached successfully.

5.3 Detaching the EBS- Volume:

Detaching a volume from an instance requires the volume ID, so the program first displays the volume list with descriptions. Choose one, copy its volume ID, and paste it at the prompt.

Once you provide the ID and hit Enter, the program detaches the volume from the instance. Before doing so, it warns you that you must first unmount the volume and only then detach it.

5.4 Deleting the volume:

Next, when you choose to delete a volume, the program first describes all the available volumes. Pick the appropriate one and copy and paste its volume ID. Once done, it deletes the volume for you.
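The four volume operations in this section correspond to the create-volume, attach-volume, detach-volume and delete-volume commands. Here is a sketch of command builders for them. The function names are hypothetical, and because attach-volume needs a device name that the menu does not ask for, /dev/sdf is assumed as a default.

```python
def create_volume_cmd(availability_zone, size_gib):
    """Command that creates an EBS volume of the given size in the given zone."""
    return ["aws", "ec2", "create-volume",
            "--availability-zone", availability_zone, "--size", str(size_gib)]


def attach_volume_cmd(volume_id, instance_id, device="/dev/sdf"):
    """Command that attaches a volume to an instance (device name is an assumption)."""
    return ["aws", "ec2", "attach-volume",
            "--volume-id", volume_id, "--instance-id", instance_id, "--device", device]


def detach_volume_cmd(volume_id):
    """Command that detaches a volume; unmount it inside the instance first."""
    return ["aws", "ec2", "detach-volume", "--volume-id", volume_id]


def delete_volume_cmd(volume_id):
    """Command that deletes a detached volume."""
    return ["aws", "ec2", "delete-volume", "--volume-id", volume_id]
```

The listings shown before each prompt would come from aws ec2 describe-volumes and aws ec2 describe-instances.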

6. Creation of S3, i.e. Simple Storage Service:

When you select the 5th option from the main menu, the program opens the sub-menu for creating an S3 bucket. It prompts you for a unique bucket name, a region name, and a location constraint. After you provide this information, it creates an S3 bucket for you.

You can check in the GUI as well that the bucket has been created.

6.2 Uploading Objects into the bucket:

When you choose the option called Upload file into the bucket, the program first asks for the absolute location of the file. Once you provide it, the program shows the bucket list. Choose a bucket and enter the name of the bucket you want to upload the file to. Once that is provided, it shows you the status of the upload.

6.3 Delete the S3-Bucket:

A bucket must be emptied before it can be deleted. When you choose to delete an S3 bucket, the program asks for the bucket name. Once you provide the name, it first empties the bucket and then deletes it for you.
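The bucket operations can be sketched with the s3api create-bucket, s3 cp and s3 rb commands. The function names are hypothetical. Two details worth knowing: the location constraint must be left out for the us-east-1 region, and rb with --force empties the bucket before removing it, which matches the behaviour described above.

```python
def create_bucket_cmd(bucket, region):
    """Command that creates a bucket; us-east-1 takes no location constraint."""
    command = ["aws", "s3api", "create-bucket", "--bucket", bucket, "--region", region]
    if region != "us-east-1":
        command += ["--create-bucket-configuration", f"LocationConstraint={region}"]
    return command


def upload_file_cmd(local_path, bucket):
    """Command that copies a local file into the bucket."""
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/"]


def delete_bucket_cmd(bucket):
    """Command that empties the bucket and then removes it."""
    return ["aws", "s3", "rb", f"s3://{bucket}", "--force"]
```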

7. Creation of CloudFront:

To create a CloudFront distribution, we need an S3 bucket to attach it to. So we provide the link of the bucket we created earlier, and the program creates the CloudFront distribution for you.

Here it has successfully created a CloudFront.
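For completeness, the simplest AWS CLI form for this step is create-distribution with an origin domain name. The sketch below assumes the bucket is reachable at the usual bucket-name.s3.amazonaws.com address, and the function name is made up. The real script may build the origin differently.

```python
def create_distribution_cmd(bucket):
    """Command that creates a CloudFront distribution with the S3 bucket as origin."""
    origin = f"{bucket}.s3.amazonaws.com"  # assumed bucket address format
    return ["aws", "cloudfront", "create-distribution", "--origin-domain-name", origin]


print(create_distribution_cmd("my-demo-bucket"))  # hypothetical bucket name
```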

So, finally, the automated AWS-CLI part ends here.

Following are the Linux commands we can perform through this automated Python script:

**Commands need not be memorized: you can either search for them or, best of all, use our program..!**

Showing every command here is not practical, so why not try it out on your own system? You will surely be amazed..!

Thank you for reading our blog..!