Spark Developer resume example with 13+ years of experience

Jessica Claire
  • Montgomery Street, San Francisco, CA 94105 609 Johnson Ave., 49204, Tulsa, OK
  • H: (555) 432-1000
  • India:
  • single:
Professional Summary
Hadoop/Spark Developer with 8+ years of overall IT experience in a variety of industries, including hands-on experience in Big Data analytics and development.
  • Good experience with Big Data technologies such as Hadoop frameworks, MapReduce, Hive, HBase, Pig, Sqoop, Spark, Kafka, Flume, ZooKeeper, Oozie, and Storm.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
  • Experienced in writing complex MapReduce programs that work with different file formats such as text, sequence, XML, JSON, and Avro.
  • Working experience with the Cloudera Data Platform using VMware Player in a CentOS 6 Linux environment. Strong experience with Hadoop distributions such as Cloudera and Hortonworks.
  • Good knowledge of the NoSQL databases Cassandra, MongoDB, and HBase.
  • Expertise in database design, creation and management of schemas, and writing stored procedures, functions, and DDL and DML SQL queries.
  • Worked on HBase to load and retrieve data for real-time processing using the REST API.
  • Very good experience with partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Good working experience using Sqoop to import data from RDBMS into HDFS or Hive and to export data from HDFS or Hive back to RDBMS.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDF), User Defined Table-Generating Functions (UDTF), and User Defined Aggregating Functions (UDAF).
  • Worked with Big Data distributions such as Cloudera (CDH 3 and 4) with Cloudera Manager.
  • Worked with ETL tools such as Talend to simplify MapReduce jobs from the front end; also familiar with Pentaho and Informatica as ETL tools for Big Data.
  • Worked with BI tools such as Tableau for report creation and further analysis from the front end.
  • Extensive knowledge of using SQL queries for backend database analysis.
  • Involved in unit testing of MapReduce programs using Apache MRUnit.
  • Worked on Amazon Web Services and EC2.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, Spring, Hibernate, and JDBC.
  • Experience creating reusable transformations (Joiner, Sorter, Aggregator, Expression, Lookup, Router, Filter, Update Strategy, Sequence Generator, Normalizer, and Rank) and mappings using Informatica Designer, and processing tasks using Workflow Manager to move data from multiple sources into targets.
  • Implemented SOAP-based web services.
  • Used curl scripts to test RESTful web services.
  • Experience in database design using PL/SQL to write stored procedures, functions, and triggers; strong experience writing complex queries for Oracle.
  • Experience working with build tools such as Maven and Ant.
  • Experienced in both Waterfall and Agile (Scrum) development methodologies.
  • Strong problem-solving and analytical skills, with the ability to make balanced and independent decisions.
  • Experience developing service components using JDBC.
Authorized to work in the US for any employer.
  • Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)
  • Technologies: HDFS, YARN, MapReduce, Hive, Pig, Sqoop, Flume, Spark, Kafka, ZooKeeper, and Oozie
  • Java/J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts
  • NoSQL Databases: HBase, Cassandra
  • Programming Languages: Java, Scala, SQL, PL/SQL, Pig Latin, HiveQL, Unix, JavaScript, Shell Scripting
  • Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlet, JSP, DOM, XML
  • Application Servers: WebLogic, WebSphere, JBoss, Tomcat
  • Cloud Computing Tools: Amazon AWS
  • Build Tools: Jenkins, Maven, Ant
  • Databases: MySQL, Oracle, DB2
  • Business Intelligence Tools: Tableau, Splunk
  • Development Methodologies: Agile/Scrum, Waterfall
  • Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans
  • Operating Systems: Windows 95/98/2000/XP, Mac OS, UNIX, Linux



Work History
Spark Developer, 2016 - Current
Infosys Ltd, Hillsboro
  • Responsible for the design and development of Spark SQL scripts based on functional specifications.
  • Responsible for Spark Streaming configuration based on the type of input source.
  • Wrote MapReduce jobs to parse web logs stored in HDFS.
  • Developed services to run the MapReduce jobs as required.
  • Imported and exported data to and from HDFS, Hive, and Pig using Sqoop.
  • Managed data coming from different sources.
  • Monitored the MapReduce programs running on the cluster.
  • Loaded data from UNIX file systems into HDFS.
  • Installed and configured Hive and wrote Pig and Hive UDFs.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run as MapReduce jobs in the backend.
  • Wrote MapReduce programs to convert text files into Avro and load them into Hive tables.
  • Implemented workflows using the Apache Oozie framework to automate tasks.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Developed design documents that considered all possible approaches and identified the best of them.
  • Loaded data into HBase using both bulk load and non-bulk load.
  • Developed scripts that automated end-to-end data management and synchronization between all the clusters.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python.
  • Involved in requirements gathering, design, development, and testing.
  • Followed agile methodology for the entire project.
  • Prepared technical design documents and detailed design documents.
  • Environment: Hive, HBase, Flume, Java, Maven, Impala, Splunk, Pig, Spark, Oozie, Oracle, YARN, GitHub, JUnit, Tableau, Unix, Cloudera, Sqoop, HDFS, Tomcat, Scala, Python.

Hadoop Developer, 11/2014 - 2016
Bank of America Corporation, Dearborn Heights
  • Consumed data from a Kafka queue using Spark.
  • Configured different topologies for the Spark cluster and deployed them on a regular basis.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Loaded data from the Linux file system into HDFS.
  • Imported and exported data to and from HDFS and Hive using Sqoop.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on job requirements.
  • Performed linear regression using the Scala API and Spark.
  • Installed and configured MapReduce, Hive, and HDFS; implemented a CDH5 Hadoop cluster on CentOS.
  • Assisted with performance tuning, monitoring, and troubleshooting.
  • Created MapReduce programs for some refined queries on big data.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Set up HBase to use HDFS.
  • Used Hive partitioning and bucketing to optimize the performance of Hive tables, creating around 20,000 partitions.
  • Created RDDs in Spark and extracted data from the data warehouse into them.
  • Used Spark with Scala.

Hadoop Developer, 11/2013 - 10/2014
Bank of America Corporation, Deland
  • Analyzed the Hadoop cluster using different big data analytic tools, including Pig, Hive, and MapReduce.
  • Explored Hadoop, MapReduce programming, and its ecosystem.
  • Implemented MapReduce programs and algorithms for organizing data, performing aggregation, joining data sets, filtering, classification, and partitioning.
  • Imported and exported data to and from HDFS and Hive using Sqoop.
  • Wrote UDFs (User Defined Functions) in Pig and Hive when needed.
  • Developed Pig scripts for processing data.
  • Managed workflow and scheduling for complex MapReduce jobs using Apache Oozie.
  • Created Hive tables, loaded data, and wrote Hive queries.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Created HBase tables to store various formats of incoming data from different portfolios.
  • Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data.
  • Automated the History and Purge Process.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Validated data using the MD5 algorithm.
  • Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs.
  • Configured core-site.xml and mapred-site.xml for the multi-node cluster environment.
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
  • Used Avro and Parquet file formats for data serialization.

Sr. Java Developer, 07/2011 - 06/2013
Ally Remote, NE, Ind
  • Developed the web module using Spring MVC and JSP.
  • Developed model logic using the Hibernate ORM framework.
  • Handled server-side validations.
  • Fixed bugs.
  • Performed unit testing using JUnit.
  • Wrote the technical design document.
  • Gathered specifications from the requirements.
  • Developed the application using Spring MVC architecture.
  • Developed JSP custom tags to support custom user interfaces.
  • Developed front-end pages using JSP, HTML, and CSS.
  • Developed core Java classes for utility classes, business logic, and test cases.
  • Developed SQL queries using MySQL and established connectivity.
  • Used stored procedures for performing different database operations.
  • Used Hibernate for interacting with the database.
  • Developed control classes for processing requests.
  • Implemented exception handling.
  • Designed sequence diagrams and use case diagrams for proper implementation.
  • Used Rational Rose for design and implementation.

Java Developer, 07/2007 - 06/2011
Bhilwara Infotech Pvt City

Bachelor of Science, Expected in 2007
Acharya Nagarjuna University - Guntur