M101J: MongoDB for Java Developers.
Project: Metazoo - Metadata-driven Data Ingestion and ETL (OCT 2022 - MAR 2023)
Client: Advent Health
Role: Cloud Data Engineer
Description: This project migrates various on-prem data sources (Oracle, MySQL, Salesforce, etc.) to Azure cloud and Snowflake. It involves building an automated, metadata-driven framework and pipelines in Azure Data Factory, creating a data lake in ADLS, and loading the data into Snowflake for downstream reporting and analytics.
Environment: Azure, Salesforce, SQL Server, Snowflake, Python, Azure Data Factory, Azure DevOps
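For illustration, a minimal sketch of the metadata-driven trigger pattern described above, using the azure-identity and azure-mgmt-datafactory SDKs; the subscription, resource group, factory, pipeline, and table names are hypothetical placeholders.

# Hypothetical sketch: trigger one parameterized ADF pipeline run per metadata entry.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "rg-dataplatform"      # hypothetical resource group
FACTORY_NAME = "adf-metazoo"            # hypothetical data factory
PIPELINE_NAME = "pl_generic_ingest"     # hypothetical parameterized pipeline

# Metadata entries (normally read from a control/config table) describing each source.
metadata = [
    {"source_type": "Oracle", "source_table": "ORDERS", "target_table": "RAW.ORDERS"},
    {"source_type": "MySQL", "source_table": "customers", "target_table": "RAW.CUSTOMERS"},
]

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

for entry in metadata:
    # Each run copies one source table to ADLS and loads it into Snowflake
    # via the activities defined inside the parameterized pipeline.
    run = adf_client.pipelines.create_run(
        RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME, parameters=entry
    )
    print(f"Started run {run.run_id} for {entry['source_table']}")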
Key Responsibilities:
Project: BDP (Business Data Intelligence Platform) (FEB 2022 - OCT 2022)
Client: MyFitnessPal
Role: Sr. Data Engineer
Project Description: MyFitnessPal is a leading weight-loss and fitness app, helping nearly 1 million members reach their nutrition and fitness goals every year. This project migrates their application data to a Snowflake data warehouse for BI needs, implements ETL and data warehousing in Snowflake, and orchestrates and automates the complete end-to-end flow with Airflow jobs.
Environment: AWS S3, Amazon Managed Workflows for Apache Airflow (MWAA), Snowflake
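For illustration, a minimal sketch of an end-to-end Airflow job of the kind described above, using the Snowflake provider's SnowflakeOperator to run a COPY INTO from a hypothetical external stage over S3; the DAG, stage, and table names are illustrative.

# Hypothetical sketch: daily Airflow DAG copying S3 data into Snowflake via an external stage.
from datetime import datetime
from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="mfp_s3_to_snowflake",        # hypothetical DAG name
    start_date=datetime(2022, 2, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_events = SnowflakeOperator(
        task_id="load_events",
        snowflake_conn_id="snowflake_default",
        sql="""
            COPY INTO RAW.APP_EVENTS                    -- hypothetical target table
            FROM @RAW.S3_APP_STAGE/events/{{ ds }}/     -- hypothetical external stage over S3
            FILE_FORMAT = (TYPE = JSON);
        """,
    )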
Key Responsibilities:
Project: Batch Ingestion & On-Prem to Cloud Data Migration (MAR 2021 - JAN 2022)
Client: HSBC
Role: Sr. Data Engineer
Project Description: This project migrates data from on-prem to Google Cloud and implements data ingestion strategies from Google Cloud Storage to BigQuery, using Airflow as the orchestration tool.
Environment: Python, GCP, GCS, Google BigQuery, Airflow, Juniper
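For illustration, a minimal sketch of the GCS-to-BigQuery ingestion pattern described above, using the Google provider's GCSToBigQueryOperator; the bucket, dataset, and table names are hypothetical.

# Hypothetical sketch: Airflow task loading files from GCS into a BigQuery table.
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="gcs_to_bigquery_batch",      # hypothetical DAG name
    start_date=datetime(2021, 3, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_to_bq = GCSToBigQueryOperator(
        task_id="load_to_bq",
        bucket="onprem-landing-bucket",                                   # hypothetical GCS bucket
        source_objects=["transactions/{{ ds }}/*.csv"],                   # daily partitioned files
        destination_project_dataset_table="analytics.raw_transactions",  # hypothetical table
        source_format="CSV",
        skip_leading_rows=1,
        write_disposition="WRITE_APPEND",
        autodetect=True,
    )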
Key Responsibilities:
Client: Exeliq Consulting Inc. / Trustmark
Project: Cloud Governance
Description: The “Cloud Governance” tool streamlines the overall governance of the client's cloud environment after migration to the cloud. It ensures ease of compliance, enhanced security, optimal resource utilization, cost optimization, and standardization of processes for seamless scaling of the environment.
Environment: Azure platform, Python, Azure Databricks, Azure AD, Azure Storage, GitHub
GitHub Link: https://github.com/hitesh09p/TSPL/tree/master/cloudgovernance-master
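For illustration, a minimal sketch of one governance check in the spirit of the tool described above, using azure-identity and azure-mgmt-storage to flag storage accounts that do not enforce HTTPS-only traffic; the subscription ID is a placeholder and this specific rule is an assumed example, not necessarily one the tool implements.

# Hypothetical sketch: flag Azure Storage accounts that allow non-HTTPS traffic.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder

storage_client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

non_compliant = [
    account.name
    for account in storage_client.storage_accounts.list()
    if not account.enable_https_traffic_only   # assumed rule: HTTPS-only must be enabled
]

if non_compliant:
    print("Non-compliant storage accounts:", ", ".join(non_compliant))
else:
    print("All storage accounts enforce HTTPS-only traffic.")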
Key Responsibilities:
Project: HCA (Harbour Capital Advisors) (JAN 2020 - FEB 2021)
Role: Sr. Data Engineer
Project Description: This project migrates Informatica ETL to Databricks PySpark and implements a Spark test automation framework.
Environment: Databricks, Spark, AWS S3, PostgreSQL, Vagrant, Informatica
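For illustration, a minimal sketch of an Informatica-style mapping re-implemented in Databricks PySpark as described above, reading from S3 and writing to PostgreSQL over JDBC; the paths, columns, and connection details are hypothetical.

# Hypothetical sketch: PySpark job replacing an Informatica mapping (filter + aggregate + load).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("hca_orders_etl").getOrCreate()

# Extract: raw files landed in S3 (hypothetical path and schema).
orders = spark.read.parquet("s3a://hca-raw/orders/")

# Transform: keep settled orders and aggregate daily totals per account.
daily_totals = (
    orders.filter(F.col("status") == "SETTLED")
          .groupBy("account_id", "trade_date")
          .agg(F.sum("amount").alias("total_amount"))
)

# Load: write results to PostgreSQL over JDBC (connection details are placeholders).
(daily_totals.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/analytics")
    .option("dbtable", "public.daily_order_totals")
    .option("user", "etl_user")
    .option("password", "***")
    .mode("append")
    .save())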
Key Responsibilities:
Project: Data Xform (FEB 2018 - DEC 2019)
Client: Ingredion
Role: Sr. Data Engineer
Description: “Data Xform” provides a seamless path for data migration and transformation from a wide range of legacy databases to the cloud environment. It covers database discovery, assessment, and migration using an industry-specific architecture, ensuring minimal downtime and data loss while switching over to cloud-hosted providers. The tool also ensures that data integration across the various databases is done efficiently and effectively.
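For illustration, a minimal sketch of the discovery/assessment step described above, assuming SQLAlchemy is used to inventory tables and row counts in a legacy source database; the connection string and dialect are placeholders, not necessarily what the tool uses.

# Hypothetical sketch: inventory tables and row counts in a legacy database for migration assessment.
from sqlalchemy import create_engine, inspect, text

# Placeholder connection string for an Oracle source.
engine = create_engine("oracle+cx_oracle://user:***@legacy-host:1521/?service_name=ORCL")

inventory = []
with engine.connect() as conn:
    for table in inspect(engine).get_table_names():
        row_count = conn.execute(text(f"SELECT COUNT(*) FROM {table}")).scalar()
        inventory.append((table, row_count))

# The inventory feeds the assessment report used to plan cutover order and downtime windows.
for table, row_count in sorted(inventory, key=lambda t: t[1], reverse=True):
    print(f"{table}: {row_count} rows")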
Key Responsibilities:
Project: CAT-BOMBOD (AUG 2017 - JAN 2018)
Client: Caterpillar
Description: This project automates and orchestrates the complete BOM and BOD pre- and post-validation process.
Environment: Google Cloud Composer, Google Dataproc, Google Cloud Storage, Google Cloud Functions, Google Compute Engine
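For illustration, a minimal sketch of the pre-validation, processing, and post-validation orchestration described above as a Cloud Composer (Airflow) DAG submitting Dataproc PySpark jobs; the project, region, cluster, and script URIs are hypothetical.

# Hypothetical sketch: Composer DAG chaining BOM/BOD pre-validation, processing, and post-validation.
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

PROJECT_ID = "cat-bombod-project"    # hypothetical project
REGION = "us-central1"
CLUSTER = "bombod-cluster"           # hypothetical Dataproc cluster

def pyspark_job(script):
    # Build a Dataproc PySpark job spec for a given validation/processing script.
    return {
        "reference": {"project_id": PROJECT_ID},
        "placement": {"cluster_name": CLUSTER},
        "pyspark_job": {"main_python_file_uri": f"gs://bombod-scripts/{script}"},
    }

with DAG(
    dag_id="bom_bod_validation",
    start_date=datetime(2017, 8, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    pre_validate = DataprocSubmitJobOperator(
        task_id="pre_validate", project_id=PROJECT_ID, region=REGION,
        job=pyspark_job("pre_validation.py"),
    )
    process = DataprocSubmitJobOperator(
        task_id="process_bom_bod", project_id=PROJECT_ID, region=REGION,
        job=pyspark_job("process_bom_bod.py"),
    )
    post_validate = DataprocSubmitJobOperator(
        task_id="post_validate", project_id=PROJECT_ID, region=REGION,
        job=pyspark_job("post_validation.py"),
    )
    pre_validate >> process >> post_validate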
Key Responsibilities:
Project: Numerator (2016 - JULY 2017)
Description: This project implements a data warehouse using Pentaho and Snowflake and migrates Pentaho jobs to Airflow for distributed processing and automation.
Environment: Airflow, Snowflake, Pentaho, Python, Shell scripts, R scripts, GitHub
Key Responsibilities:
● Designed and implemented a POC using a Databricks Spark cluster.
● Migrated Pentaho ETL jobs to Airflow DAGs for orchestration and automation.
● Migrated Oracle stored procedures to Snowflake scripts.
● Managed data loads into Snowflake using Airflow.
● Implemented Airflow data pipelines, creating DAGs in Python to load data into Snowflake with Docker (a sketch follows this list).
● Created and executed parameterized task workflows in Pentaho as per business requirements.
● Scheduled tasks using the Airflow scheduler.
● Uploaded data to AWS S3 for data archival.
● Applied performance tuning techniques to the Snowflake data model.
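For illustration, a minimal sketch of an Airflow DAG of the kind referenced in the bullets above, loading a daily extract into Snowflake with the snowflake-connector-python library; the DAG, stage, table, and connection details are hypothetical placeholders.

# Hypothetical sketch: Airflow DAG loading a daily extract into Snowflake with the Python connector.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator
import snowflake.connector

def load_daily_extract(ds, **_):
    # Connection details are placeholders; in practice they come from Airflow connections/secrets.
    conn = snowflake.connector.connect(
        account="xy12345", user="etl_user", password="***",
        warehouse="ETL_WH", database="ANALYTICS", schema="RAW",
    )
    try:
        cur = conn.cursor()
        # Copy the day's files from a hypothetical external stage into the target table.
        cur.execute(
            "COPY INTO RAW.RECEIPTS "
            f"FROM @RAW.S3_RECEIPTS_STAGE/{ds}/ "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
        )
    finally:
        conn.close()

with DAG(
    dag_id="numerator_daily_load",       # hypothetical DAG name
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load = PythonOperator(task_id="load_daily_extract", python_callable=load_daily_extract)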
Big Data Professional Trainer & Mentor