- Hands-on installation and configuration of Hortonworks Data Platform (HDP) 2.3.4.
- Worked on installing the production cluster, commissioning and decommissioning
of DataNodes, NameNode recovery, capacity planning, and slot configuration.
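DataNode decommissioning of the kind described above is typically driven by an exclude file that HDFS is pointed at; a minimal sketch (the file path is illustrative, not from the original):

```xml
<!-- hdfs-site.xml: point HDFS at a decommissioning exclude file -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

Adding a hostname to the exclude file and running `hdfs dfsadmin -refreshNodes` starts decommissioning; the node is safe to remove once its blocks have been re-replicated.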
- Worked on Hadoop administration; responsibilities included software
installation, configuration, software upgrades, backup and recovery, cluster
setup, daily cluster performance monitoring, and keeping the cluster healthy
and available.
- Implemented the security requirements for Hadoop and integrated it with the
Kerberos authentication and authorization infrastructure.
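Enabling Kerberos on a Hadoop cluster centers on switching the core-site security settings; a minimal sketch of the relevant properties (service principals and keytabs per daemon are configured separately):

```xml
<!-- core-site.xml: turn on Kerberos authentication and service-level authorization -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```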
- Designed, developed and implemented connectivity products that allow
efficient exchange of data between the core database engine and
the Hadoop ecosystem.
- Involved in defining job flows using Oozie to schedule and manage Apache
Hadoop jobs.
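An Oozie job flow of this kind is defined in a workflow XML; a minimal sketch with a single shell action (the application and script names are illustrative):

```xml
<workflow-app name="daily-etl" xmlns="uri:oozie:workflow:0.5">
  <start to="run-etl"/>
  <action name="run-etl">
    <shell xmlns="uri:oozie:shell-action:0.3">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>etl.sh</exec>
      <file>etl.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>ETL failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```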
- Implemented NameNode High Availability on the Hadoop cluster to eliminate
the single point of failure.
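NameNode HA is configured as a nameservice with two NameNodes and automatic failover; an abbreviated hdfs-site.xml sketch (the nameservice and NameNode IDs are illustrative):

```xml
<!-- hdfs-site.xml: one logical nameservice backed by two NameNodes -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```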
- Worked on the YARN Capacity Scheduler, creating queues to guarantee
resources to specific groups.
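Queues with guaranteed shares are declared in capacity-scheduler.xml; a minimal two-queue sketch (queue names and percentages are illustrative):

```xml
<!-- capacity-scheduler.xml: two queues splitting the cluster 60/40 -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>analytics,etl</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.analytics.capacity</name>
  <value>60</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.etl.capacity</name>
  <value>40</value>
</property>
```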
- Worked on importing and exporting data between Oracle databases and
HDFS/Hive using Sqoop.
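An Oracle-to-Hive import of this kind typically looks like the following; the script only assembles and prints the Sqoop command so the invocation can be reviewed before running it on a cluster (the JDBC URL, table, and user are placeholders, not from the original):

```shell
#!/bin/sh
# Assemble a Sqoop import command for review; all values are placeholders.
JDBC_URL="jdbc:oracle:thin:@db-host:1521:ORCL"
TABLE="SALES"

SQOOP_CMD="sqoop import \
  --connect $JDBC_URL \
  --username etl_user -P \
  --table $TABLE \
  --hive-import --hive-table default.$TABLE \
  --num-mappers 4"

# Print the command so it can be inspected before execution.
echo "$SQOOP_CMD"
```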
- Monitored and analyzed MapReduce job executions on the cluster at the task
level.
- Extensively involved in cluster capacity planning, hardware planning, and
performance tuning of the Hadoop cluster.
- Wrote automation scripts and set up crontab jobs to maintain cluster
stability and health.
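The crontab-driven maintenance above can be sketched as a small disk-usage check; the threshold, schedule, and log path are illustrative, not from the original:

```shell
#!/bin/sh
# Disk-usage health check, intended to be scheduled from cron, e.g.:
#   */15 * * * * /opt/scripts/disk_check.sh >> /var/log/disk_check.log 2>&1
THRESHOLD=90

check_usage() {
    # $1: usage percentage as a bare integer (no % sign)
    if [ "$1" -ge "$THRESHOLD" ]; then
        echo "ALERT: disk usage ${1}% >= ${THRESHOLD}%"
    else
        echo "OK: disk usage ${1}%"
    fi
}

# Check every local filesystem reported by df.
df -P | awk 'NR>1 {print $5}' | tr -d '%' | while read -r pct; do
    check_usage "$pct"
done
```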
- Installed Ambari on an already existing Hadoop cluster.
- Implemented Rack Awareness for data locality optimization.
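Rack awareness is wired in via a topology script that Hadoop invokes with host addresses and that prints one rack path per argument; a minimal sketch (the subnets and rack names are illustrative):

```shell
#!/bin/sh
# Topology script: print one rack per host/IP argument.
# Referenced from core-site.xml via net.topology.script.file.name.
rack_for() {
    case "$1" in
        10.0.1.*) echo "/dc1/rack1" ;;
        10.0.2.*) echo "/dc1/rack2" ;;
        *)        echo "/default-rack" ;;
    esac
}

for host in "$@"; do
    rack_for "$host"
done
```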
- Optimized and tuned the Hadoop environments to meet performance
requirements.
- Hands-on experience with AWS cloud services, including EC2 and S3.
- Collaborated with the offshore team.
- Documented existing processes and recommended improvements.
- Shared knowledge and assisted other team members as needed.
- Assisted with maintenance and troubleshooting of scheduled processes.
- Participated in development of system test plans and acceptance
criteria.
- Collaborated with offshore developers to monitor ETL jobs and troubleshoot
issues.
Environment: Hortonworks HDP 2.3.x, Ambari, Oozie 4.2, Sqoop 1.4.6,
MapReduce2, SQL Developer, Teradata, SSH, Eclipse, JDK 1.7, CDH 3.x/4.x/5.x,
Cloudera Manager 4 & 5, Ganglia, Tableau, Shell Scripting, Pig, Hive, Flume,
Kafka, Impala, CentOS