Jessica Claire
  • 100 Montgomery St., 10th Floor
  • Home: (555) 432-1000
Professional Summary
  • 3+ years of experience in IT, including big data technologies and the Hadoop ecosystem; currently working extensively with the Spark framework, using PySpark as the main programming language.
  • Good understanding of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, Hive, and Oozie.
  • Developed Spark applications with Python (PySpark) and Spark SQL for data cleaning and processing.
  • Good knowledge of loading structured data from Oracle and MySQL databases into HDFS using Sqoop.
  • Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries.
  • Experience with AWS components such as IAM roles, Glue, Athena, EMR, EC2 instances, and S3 buckets.
  • Strong UNIX/Linux knowledge, including the ability to understand the interaction between applications and the operating system.
  • Hands-on experience in application development using Python, Hive, and shell scripting; developed UNIX scripts to automate various processes.
  • Scheduled jobs and workflows using crontab and Oozie.
  • Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies, including Agile (Scrum) and Waterfall.

Apache Spark

  • Created DataFrames and performed analysis using Spark SQL.
  • Hands-on expertise in writing RDD (Resilient Distributed Dataset) transformations and actions using Scala, Python, and Java.
  • Excellent understanding of the Spark architecture and framework: SparkContext, APIs, RDDs, Spark SQL, DataFrames, Streaming, and MLlib.
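The RDD transformation/action pattern above can be sketched as follows; the record layout, names, and sample data are illustrative assumptions, not from a real pipeline, and the Spark session only spins up where PySpark is installed:

```python
# Hedged sketch of an RDD pipeline: a pure per-record transform applied via map().

def normalize_event(line):
    """Pure transform: parse 'user_id, Event' into a (user_id, event) pair."""
    user_id, event = line.split(",", 1)
    return (user_id.strip(), event.strip().lower())

try:
    from pyspark.sql import SparkSession  # exercised only where PySpark is installed

    spark = SparkSession.builder.master("local[1]").appName("rdd-sketch").getOrCreate()
    pairs = spark.sparkContext.parallelize(["u1, Click", "u2, View"]).map(normalize_event)  # lazy transformation
    print(pairs.collect())  # the action triggers execution
    spark.stop()
except ImportError:
    pass  # PySpark unavailable; the pure transform above still shows the map logic
```

Keeping the per-record logic in a plain function makes it unit-testable without a Spark cluster.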

Apache Sqoop

  • Used Sqoop to import data from relational databases (RDBMS) into HDFS and Hive, storing it in formats such as Text, Avro, Parquet, SequenceFile, and ORC, with compression codecs like Snappy and Gzip.
  • Performed transformations on the imported data and exported it back to the RDBMS.

Apache Hive

  • Experience in writing queries in HQL (Hive Query Language) to perform data analysis.
  • Created Hive external and managed tables.
  • Implemented partitioning and bucketing on Hive tables for query optimization.
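The partitioning and bucketing above might look like the following DDL, issued through `spark.sql`; the table name, columns, bucket count, and location are all hypothetical:

```python
# Hedged sketch: external Hive table with a date partition and user-id buckets.
# Every identifier here is illustrative, not from the actual project.
CREATE_EVENTS = """
CREATE EXTERNAL TABLE IF NOT EXISTS events (
    user_id STRING,
    event   STRING
)
PARTITIONED BY (load_date STRING)
CLUSTERED BY (user_id) INTO 8 BUCKETS
STORED AS ORC
LOCATION '/data/events'
"""

try:
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder.master("local[1]")
             .enableHiveSupport().getOrCreate())
    spark.sql(CREATE_EVENTS)  # queries filtering on load_date can then prune partitions
    spark.stop()
except Exception:
    pass  # Spark/Hive not available locally; the DDL itself is the point
```

Partitioning on a load date lets Hive skip whole directories at query time, while bucketing on the join key can speed up joins and sampling.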

Big Data Technologies: HDFS, MapReduce, Hive, Sqoop, Oozie, PySpark, Spark SQL.

Hadoop Distribution: Cloudera, Apache, AWS.

Languages: Python, SQL, HiveQL, Shell Scripting.

Operating Systems: Windows, UNIX, LINUX.

Version Control: Git, Git Bash.

IDE & Build Tools: Visual Studio, IntelliJ, Notepad++.

Databases (RDBMS): MySQL, SQL Server.

Work History
Big Data Developer, 08/2020 to Current
Cognizant Technology Solutions - Dayton, NJ
  • Responsible for ingesting large volumes of user behavioral data and customer profile data to Analytics Data store.
  • Used Python with PySpark to build data pipelines and wrote Python scripts to automate them.
  • Developed many Spark applications for data cleansing, event enrichment, aggregation, de-normalization, and data preparation for machine learning workloads.
  • Developed various Spark applications using PySpark to enrich user behavioral (clickstream) data merged with user profile data.
  • Utilized the PySpark API to implement batch processing of jobs.
  • Worked on fine-tuning Spark applications to improve overall pipeline processing time.
  • Designed and developed ETL processes in AWS Glue to migrate customer data from external sources such as S3 and ORC/Parquet/Text files into AWS.
  • Performed data extraction, aggregation, and consolidation of customer data within AWS Glue using PySpark.
  • Created external tables with partitions using Hive, AWS Athena, and Redshift.
  • Designed external and managed tables in Hive and processed data into HDFS using Sqoop.
  • Created user-defined functions (UDFs) in HiveQL and PySpark.
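A PySpark UDF of the kind described above can be sketched like this; the masking logic and column names are hypothetical examples, not the actual business rules:

```python
def mask_email(addr):
    """Pure UDF logic: mask the local part of an email, keep the domain (illustrative)."""
    local, sep, domain = addr.partition("@")
    return "*" * len(local) + "@" + domain if sep else addr

try:
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    mask_udf = udf(mask_email, StringType())  # wrap the plain function as a Spark UDF
    df = spark.createDataFrame([("jo@example.com",)], ["email"])
    df.select(mask_udf("email").alias("masked")).show()
    spark.stop()
except ImportError:
    pass  # without PySpark, mask_email alone shows the UDF's core logic
```

Separating the pure function from its `udf()` registration keeps the business logic testable outside Spark.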

Environment: AWS Glue, Spark, Spark SQL, Python, PySpark, Hive, Sqoop, Oozie, AWS Simple Workflow, Linux, HDFS.

Data Engineer, 05/2016 to 12/2018
Verizon - Dallas, TX
  • Involved in loading data into HDFS from sources such as Oracle and DB2 using Sqoop and loading it into Hive tables.
  • Involved in creating Hive tables and loading data from different data sources, HDFS locations, and other Hive tables.
  • Created Sqoop jobs and scheduled them to handle incremental loads from RDBMS into HDFS, then applied Spark transformations.
  • Created Hive external tables to perform ETL on data generated on a daily basis.
  • Involved in developing shell scripts for logging and accessing data.
  • Worked on scheduling workflows and jobs using cron and Oozie.
  • Worked in monitoring, managing, and troubleshooting the Hadoop Log files.
  • Involved in Agile methodologies, daily Scrum meetings, Sprint planning.
Master of Science: Computer Science, Expected in 08/2020
University of Central Missouri - Warrensburg, MO
Bachelor of Science: Electrical, Electronics And Communications Engineering, Expected in 05/2016
Satyabhama University - Chennai
