Automation of Hadoop HDFS through Ansible


In this project I aim to automate the setup of a Hadoop HDFS cluster (the NameNode and the DataNodes) through Ansible, a DevOps tool for automation.

Project status: Under Development

Networking

Intel Technologies
DAAL

Overview / Usage

This project aims to automate the setup of HDFS (Hadoop Distributed File System) through Ansible. Because a single node cannot provide a large enough high-end hard disk on its own, HDFS spreads data across many nodes (virtual machines, laptops, or other systems). Setting up an HDFS cluster manually, i.e. configuring the NameNode (master) and the DataNodes, takes a lot of time, so with Ansible we can create a single playbook which, when run against the systems, performs the needed steps and sets up the HDFS cluster for us.
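As a sketch of what that automation could look like, a minimal Ansible inventory might group the cluster's machines by role. All host names, IPs, and the SSH user below are hypothetical, not the project's actual configuration:

```yaml
# inventory.yml -- hypothetical hosts; adjust addresses and users to your cluster
all:
  children:
    namenode:
      hosts:
        hdfs-master:
          ansible_host: 192.168.1.10
    datanodes:
      hosts:
        hdfs-worker1:
          ansible_host: 192.168.1.11
        hdfs-worker2:
          ansible_host: 192.168.1.12
  vars:
    ansible_user: hadoop   # assumed remote user on every node
```

With roles separated like this, a playbook can target the `namenode` group for the master configuration and the `datanodes` group for the workers.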

By combining various modules we can create a playbook in Ansible. This is very helpful because we no longer need to set the cluster up manually: we just run the playbook and the task is done. Industrial clusters can involve thousands of systems, so rather than typing all the commands on one system after another, which is quite tedious, we can automate the process through Ansible, which is efficient and saves a lot of time.

Methodology / Approach

First, one should have set up an HDFS cluster manually at least once, with a minimal number of nodes (one NameNode as master and two DataNodes). The developer must also have working knowledge of the DevOps tool Ansible.

The next step is to start creating the playbook by first writing the tasks for the master node, and then testing the playbook against that node.
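A sketch of such a master-node play is below, assuming an inventory group named `namenode`, a Hadoop 3.x installation under `/opt/hadoop`, and Jinja2 configuration templates stored next to the playbook. All paths and file names here are assumptions for illustration, not the project's actual files:

```yaml
# namenode.yml -- hypothetical play to configure and start the HDFS NameNode
- hosts: namenode
  become: yes
  vars:
    hadoop_home: /opt/hadoop        # assumed install location
    namenode_dir: /data/hdfs/name   # assumed metadata directory
  tasks:
    - name: Create the NameNode metadata directory
      file:
        path: "{{ namenode_dir }}"
        state: directory
        owner: hadoop
        group: hadoop

    - name: Deploy core-site.xml and hdfs-site.xml from templates
      template:
        src: "{{ item }}.j2"
        dest: "{{ hadoop_home }}/etc/hadoop/{{ item }}"
      loop:
        - core-site.xml
        - hdfs-site.xml

    - name: Format the NameNode (first setup only)
      command: "{{ hadoop_home }}/bin/hdfs namenode -format -nonInteractive"
      args:
        creates: "{{ namenode_dir }}/current"   # skip if already formatted

    - name: Start the NameNode daemon
      command: "{{ hadoop_home }}/bin/hdfs --daemon start namenode"
```

The `creates` guard on the format step keeps the play idempotent: rerunning the playbook will not reformat (and wipe) an existing NameNode.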

Following that, further tasks for the DataNodes can be written, after which the playbook is ready to set up the whole HDFS cluster.
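The DataNode play could mirror the master play, pointing each worker at the shared configuration. Again, the `datanodes` group name, the paths, and the Hadoop 3.x daemon command are assumptions made for this sketch:

```yaml
# datanodes.yml -- hypothetical play for the worker nodes
- hosts: datanodes
  become: yes
  vars:
    hadoop_home: /opt/hadoop
    datanode_dir: /data/hdfs/data   # assumed block-storage directory
  tasks:
    - name: Create the DataNode storage directory
      file:
        path: "{{ datanode_dir }}"
        state: directory
        owner: hadoop
        group: hadoop

    - name: Deploy the shared HDFS configuration
      template:
        src: hdfs-site.xml.j2
        dest: "{{ hadoop_home }}/etc/hadoop/hdfs-site.xml"

    - name: Start the DataNode daemon
      command: "{{ hadoop_home }}/bin/hdfs --daemon start datanode"
```

Running the two plays in sequence brings the whole cluster up without typing a single command on any node by hand.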

Technologies Used

Apache Hadoop, Ansible, Intel DAAL
