
Hadoop Developer resume example with 6+ years of experience

Jessica Claire
609 Johnson Ave., Tulsa, OK 49204
Home: (555) 432-1000 - resumesample@example.com
Professional Summary

  • Over 5 years of IT experience as a Developer, Designer and QA Test Engineer, with cross-platform integration experience using the Hadoop ecosystem, Java and software functional testing.
  • Hands-on experience installing, configuring and using Hadoop ecosystem components: HDFS, MapReduce, Pig, Hive, Oozie, Flume, HBase, Spark and Sqoop.
  • Strong understanding of the various Hadoop services and of MapReduce and YARN architecture; responsible for writing MapReduce programs.
  • Experienced in importing and exporting data into HDFS using Sqoop; experienced in loading data into Hive partitions and creating buckets in Hive.
  • Developed MapReduce jobs to automate data transfer from HBase.
  • Expertise in analysis using Pig, Hive and MapReduce; experienced in developing UDFs for Hive and Pig using Java.
  • Strong understanding of NoSQL databases such as HBase, MongoDB and Cassandra.
  • Scheduled all Hadoop/Hive/Sqoop/HBase jobs using Oozie.
  • Experience setting up clusters on Amazon EC2 and S3, including automating cluster setup and extension in the AWS cloud.
  • Good understanding of Scrum methodology, test-driven development and continuous integration.
  • Major strengths: familiarity with multiple software systems and the ability to learn new technologies quickly and adapt to new environments; a self-motivated, focused and adaptive team player with excellent interpersonal, technical and communication skills.
  • Experience defining detailed application software test plans, including organization, participants, schedule, and test and application coverage scope.
  • Experience gathering and defining functional and user-interface requirements for software applications.
  • Experience in real-time analytics with Apache Spark (RDDs, DataFrames and the Streaming API); used the Spark DataFrames API on the Cloudera platform to perform analytics on Hive data (a brief sketch follows this summary).
  • Experience integrating Hadoop with Kafka, including uploading click-stream data from Kafka to HDFS; expert in using Kafka as a publish-subscribe messaging system.
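The Spark-on-Hive analytics mentioned in the summary can be illustrated with a minimal PySpark sketch. This is an illustration only, not the original code: it assumes a Hive-enabled SparkSession, and the table name web_events and its columns are hypothetical placeholders.

    # Minimal PySpark sketch: analytics on Hive data via the DataFrames API.
    # Assumptions: a reachable Hive metastore; "web_events" and its columns
    # are hypothetical placeholders, not from the original project.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("hive-analytics-sketch")
        .enableHiveSupport()          # exposes Hive tables to Spark SQL
        .getOrCreate()
    )

    events = spark.table("web_events")        # load a Hive table as a DataFrame
    daily_counts = (
        events
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count"))
        .orderBy("event_date")
    )
    daily_counts.show(truncate=False)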

Skills
  • Hadoop/Big Data: Hadoop, MapReduce, HDFS, Zookeeper, Kafka, Hive, Pig, Sqoop, Airflow, YARN, HBase
  • NoSQL Databases: HBase, Cassandra, MongoDB
  • Languages: Python 3.7.2 and previous versions (NumPy, Pandas, Matplotlib libraries), Scala, Apache Spark 2.4.3, Java, UNIX shell scripts
  • Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL
  • Frameworks: MVC, Struts, Spring, Hibernate
  • Operating Systems: Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8
  • Web Technologies: HTML, DHTML, XML
  • Web/Application Servers: Apache Tomcat, WebLogic, JBoss
  • Databases: SQL Server, MySQL
  • Tools and IDEs: Anaconda, PyCharm, Jupyter, Eclipse, IntelliJ
Work History
10/2019 to Current
Hadoop Developer, Cognizant Technology Solutions - Berkeley, CA
  • Installed and configured Hive in the Hadoop cluster and helped business users and application teams fine-tune their HiveQL for performance and efficient use of cluster resources.
  • Conducted performance tuning of the Hadoop cluster, MapReduce jobs and real-time applications, applying best practices to fix design flaws.
  • Implemented Oozie workflows for the ETL process for critical data feeds across the platform.
  • Configured Ethernet bonding for all nodes to double network bandwidth.
  • Implemented the Kerberos security authentication protocol for the existing cluster.
  • Built high availability for the major production cluster and designed automatic failover control using ZooKeeper Failover Controller (ZKFC) and Quorum Journal Nodes.
  • Worked on analyzing Hadoop clusters and different big data analytic tools including Sqoop, Pig, and Hive.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms.
  • Gathered security requirements for Hadoop and integrated with Kerberos authentication infrastructure, including KDC server setup.
  • Managed and supported Hadoop services including HDFS, Impala and Spark.
  • Installed, upgraded and managed the Hadoop cluster on Cloudera.
  • Worked on sequence files, RC files, map-side joins, bucketing and partitioning for Hive performance enhancement and storage improvement.
  • Created Sqoop jobs and Hive queries for data ingestion from relational databases to analyze historical data.
  • Wrote Hive/SQL queries and performed Spark transformations using Spark RDDs and Python.
  • Created a serverless data ingestion pipeline on AWS using Lambda functions (see the sketch at the end of this section).
  • Configured Spark Streaming to receive real-time data from Apache Kafka and stored the streamed data in DynamoDB using Scala.
  • Developed Spark applications in Scala and Python and implemented an Apache Spark data processing module to handle data from various RDBMS and streaming sources.
  • Developed and scheduled various Spark streaming and batch jobs using Python and Scala.
  • Achieved high-throughput, scalable, fault-tolerant stream processing of live data streams using Apache Spark Streaming.
  • Used various Python libraries with Spark to create DataFrames and store them in Hive.
  • Worked with Elastic MapReduce (EMR) and set up environments on Amazon AWS EC2 instances.
  • Handled Hive queries using SQL integrated with the Spark environment.
  • Executed Hadoop jobs on AWS EMR using programs stored in S3 buckets.
  • Created user-defined functions (UDFs) in Hive.
  • Worked with different file formats such as Avro and Parquet for Hive querying and processing based on business logic.
  • Implemented Hive UDFs to encode business logic and performed extensive data validation using Hive.
  • Developed code to generate various DataFrames based on business requirements and created temporary tables in Hive.
  • Utilized AWS CloudWatch to monitor environment instances for operational and performance metrics during load testing.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Maintained chef-infra, including backups and security fixes on the Chef server.
  • Deployed application updates using Jenkins; installed, configured and managed Jenkins.
  • Triggered SIT environment builds of the client remotely through Jenkins.
  • Deployed and configured Git repositories with branching, forks, tagging, and notifications.
  • Experienced and proficient in deploying and administering GitHub.
  • Deployed builds to production and worked with teams to identify and troubleshoot any issues.
  • Worked on MongoDB database concepts such as locking, transactions, indexes, sharding, replication and schema design.
  • Consulted with the operations team on deploying, migrating data, monitoring, analyzing and tuning MongoDB applications.
  • Reviewed selected issues through the SonarQube web interface.
  • Developed a fully functional login page for the company's user-facing website with complete UI and validations.
  • Installed, configured and utilized AppDynamics (an application performance management tool) across the whole JBoss environment (prod and non-prod).
  • Responsible for upgrading SonarQube using the Update Center.
  • Resolved tickets submitted by users and P1 issues, troubleshooting, documenting and resolving errors.
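As a rough illustration of the serverless ingestion pipeline noted above, the following is a hedged Python sketch of an S3-triggered Lambda handler. The DynamoDB table name ingested_events, the one-JSON-document-per-line file format, and the event wiring are assumptions, not the original design; IAM permissions for s3:GetObject and dynamodb:PutItem are assumed to be configured.

    # Sketch of an S3-triggered Lambda ingestion handler (assumptions noted above).
    import json

    import boto3

    s3 = boto3.client("s3")
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("ingested_events")   # hypothetical table name

    def handler(event, context):
        # Each record in the event describes one newly created S3 object.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            # Assumed format: one JSON document per line of the uploaded file.
            for line in body.decode("utf-8").splitlines():
                if line.strip():
                    table.put_item(Item=json.loads(line))
        return {"statusCode": 200}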
01/2017 to 10/2019
Hadoop Developer, Cognizant Technology Solutions - Bethlehem, PA
  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, Hive and Sqoop.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms.
  • Gathered security requirements for Hadoop and integrated with Kerberos authentication infrastructure, including KDC server setup and management.
  • Managed and supported Hadoop services including HDFS, Hive, Impala and Spark.
  • Installed, upgraded and managed the Hadoop cluster on Cloudera.
  • Troubleshot many cluster-related issues such as DataNodes going down, network failures, login issues and missing data blocks.
  • Worked as Hadoop admin responsible for clusters totaling 100 nodes, ranging from POC (proof-of-concept) to production clusters on the Cloudera (CDH 5.5.2) distribution.
  • Responsible for cluster maintenance; monitoring, commissioning and decommissioning DataNodes; troubleshooting; managing and reviewing data backups; and managing and reviewing log files.
  • Day-to-day responsibilities included solving developer issues, deploying code from one environment to another, providing access to new users, and providing instant solutions to reduce impact while documenting them to prevent future issues.
  • Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades.
  • Gained strong experience and knowledge of real-time data analytics using Spark Streaming, Kafka and Flume.
  • Migrated from Flume to Spark for real-time data and developed a Spark Streaming application in Java to consume data from Kafka and push it into Hive (illustrated in the sketch at the end of this section).
  • Configured Kafka to efficiently collect, aggregate and move large amounts of click-stream data from many different sources to HDFS.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Analyzed system failures, identified root causes and recommended courses of action.
  • Interacted with Cloudera support, logged issues in the Cloudera portal and fixed them per recommendations.
  • Imported logs from web servers with Flume to ingest data into HDFS.
  • Loaded data from the local system to HDFS using Flume and a spool directory.
  • Exported data from HDFS to relational databases with Sqoop.
  • Parsed, cleansed and mined useful, meaningful data in HDFS using MapReduce for further analysis, and fine-tuned Hive jobs for optimized performance.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Maintained chef-infra, including backups and security fixes on the Chef server.
  • Deployed application updates using Jenkins.
  • Installed, configured and managed Jenkins.
  • Triggered SIT environment builds of the client remotely through Jenkins.
  • Deployed and configured Git repositories with branching, forks, tagging and notifications.
  • Experienced and proficient in deploying and administering GitHub.
  • Deployed builds to production and worked with teams to identify and troubleshoot any issues.
  • Worked on MongoDB database concepts such as locking, transactions, indexes, sharding, replication and schema design.
  • Consulted with the operations team on deploying, migrating data, monitoring, analyzing and tuning MongoDB applications.
  • Reviewed selected issues through the SonarQube web interface.
  • Developed a fully functional login page for the company's user-facing website with complete UI and validations.
  • Installed, configured and utilized AppDynamics (an application performance management tool) across the whole JBoss environment (prod and non-prod).
  • Reviewed OpenShift PaaS product architecture and suggested improved features after conducting research on competitors' products.
  • Migrated data source passwords to encrypted passwords using the Vault tool in all JBoss application servers.
  • Participated in migrations from JBoss 4 to WebLogic or from JBoss 4 to JBoss 6, and in the respective POCs.
  • Responsible for upgrading SonarQube using the Update Center.
  • Resolved tickets submitted by users and P1 issues, troubleshooting, documenting and resolving errors.
  • Installed and configured Hive in the Hadoop cluster and helped business users and application teams fine-tune their HiveQL for performance and efficient use of cluster resources.
  • Conducted performance tuning of the Hadoop cluster, MapReduce jobs and real-time applications, applying best practices to fix design flaws.
  • Implemented Oozie workflows for the ETL process for critical data feeds across the platform.
  • Configured Ethernet bonding for all nodes to double network bandwidth.
  • Implemented the Kerberos security authentication protocol for the existing cluster.
  • Built high availability for the major production cluster and designed automatic failover control using ZooKeeper Failover Controller (ZKFC) and Quorum Journal Nodes.
  • Environment: HDFS, MapReduce, Hive 1.1.0, Kafka, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, Apache Hadoop 2.6, Spark, SOLR, Storm, Cloudera Manager, Red Hat, MySQL, Prometheus, Docker, Puppet.
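The Kafka-to-Hive streaming path in this role was built in Java; the sketch below shows the same idea in Python with Spark Structured Streaming, purely as an illustration. The broker address, topic name and HDFS paths are hypothetical, and the spark-sql-kafka connector package is assumed to be on the classpath.

    # Illustration only: consume a Kafka topic and land it on HDFS as Parquet,
    # where a Hive external table can pick it up. Names and paths are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("kafka-to-hive-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    clicks = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
        .option("subscribe", "clickstream")                 # hypothetical topic
        .load()
        .select(F.col("value").cast("string").alias("raw_event"))
    )

    query = (
        clicks.writeStream
        .format("parquet")
        .option("path", "/warehouse/clickstream")           # hypothetical HDFS path
        .option("checkpointLocation", "/checkpoints/clickstream")
        .start()
    )
    query.awaitTermination()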
06/2016 to 12/2016
Hadoop Developer, Cognizant Technology Solutions - Bettendorf, IA
  • Performed data cleaning on unstructured information using various Hadoop tools (see the sketch following this section).
  • Designed, developed, modified and debugged programs.
  • Worked closely with clients to establish specifications and system designs.
  • Built, tested and deployed scalable, highly available and modular software products.
  • Corrected, modified and upgraded software to improve performance.
  • Created proofs of concept for innovative new solutions.
  • Tested troubleshooting methods, devised innovative solutions and documented resolutions for inclusion in the knowledge base for support team use.
  • Analyzed work to generate logic for new systems, procedures and tests.
  • Acted as expert technical resource to programming staff in program development, testing and implementation process.
  • Quickly learned new skills and applied them to daily tasks, improving efficiency and productivity.
  • Inspected and analyzed existing Hadoop environments for proposed product launches, producing cost/benefit analyses for use of included legacy assets.
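A minimal sketch of the kind of unstructured-data cleaning described above, assuming tab-delimited log lines in HDFS with three expected fields; the paths and schema are hypothetical, not from the original project.

    # Sketch: drop malformed log lines and write the clean records as Parquet.
    # The input path, tab delimiter and three-field schema are assumptions.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("log-cleaning-sketch").getOrCreate()

    raw = spark.sparkContext.textFile("hdfs:///raw/logs/")   # hypothetical path

    def parse(line):
        parts = line.strip().split("\t")
        # Keep only well-formed records with the expected three fields.
        return tuple(parts) if len(parts) == 3 else None

    cleaned = (
        raw.map(parse)
        .filter(lambda rec: rec is not None)
        .toDF(["event_time", "user_id", "action"])           # hypothetical schema
    )
    cleaned.write.mode("overwrite").parquet("hdfs:///clean/logs/")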
Education
Bachelor of Science: Electrical Engineering
College of Staten Island of The City University of New York - Staten Island, NY
