Data Scientist with a strong math background and 7+ years of experience using predictive modeling, data processing, and data mining algorithms to solve challenging business problems.
Strong hands-on experience in Data Extraction, Data Modeling, Data Wrangling, Statistical Modeling, Data Mining, Machine Learning, and Data Visualization.
Experience in all phases of the SDLC, including Requirement Analysis, Implementation, and Maintenance, and good experience with Agile and Waterfall methodologies.
Experience in using various packages in Python and R like Pandas, NumPy, SciPy, Matplotlib, Seaborn, TensorFlow, Scikit-Learn, and ggplot2.
Knowledge of Machine Learning algorithms and Predictive Modeling, including Regression, K-means Clustering, Random Forest, NLP, Decision Trees, Time Series, and hypothesis testing.
Experience in performing data analysis in IDEs such as Jupyter Notebook and PyCharm.
Experience in the entire data science project life cycle and actively involved in all the phases, including data extraction, data cleaning, statistical modeling, and data visualization with large data sets of structured and unstructured data.
Good understanding of Data Warehousing principles (Fact Tables, Dimension Tables, and Dimensional Data Modeling with Star Schema and Snowflake Schema).
Good knowledge in creating visualizations, interactive dashboards, reports, and data stories using Tableau and Power BI.
Experience in normalization techniques for OLTP systems and in creating database objects such as Tables, Constraints (Primary Key, Foreign Key, Unique, Default), and Indexes.
Knowledge of KPI reporting to keep track of close competitors in the industry.
Knowledge of data analysis and data profiling using complex SQL on various sources, including SQL Server and Teradata.
Good knowledge of RDBMS implementation and development using MySQL and SQL Server, including SQL and PL/SQL stored procedures and query optimization.
Experience using pivot tables and the VLOOKUP function to analyze data and generate reports.
Experience in Data Mining, Text Mining, Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
Experience in tracking defects and managing code with bug-tracking and version-control tools such as Jira and Git.
Strong experience in interacting with stakeholders and customers; gathering requirements through interviews, workshops, and existing system documentation or procedures; defining business processes; and identifying and analyzing risks using appropriate templates and analysis tools.
Considerable experience working on different operating systems, including Windows, Mac, and Linux.
Excellent communication, interpersonal, analytical, and leadership skills; a quick starter with the ability to master and apply new concepts.
Machine Learning (6+ years)
Natural Language Processing (5+ years)
Data Mining (5+ years)
Statistics (7+ years)
A/B Testing (5+ years)
Big Data (5+ years)
SQL (7+ years)
NoSQL (7+ years)
Spark (6 years)
Python (7+ years)
Scala (7+ years)
R (7+ years)
Pandas (7+ years)
NumPy (7+ years)
Scikit-learn (7+ years)
Seaborn (7+ years)
Logistic Regression (7+ years)
SVM (7+ years)
SAS Programming (3+ years)
K-means clustering (6+ years)
NLP - Naïve Bayes (4+ years)
Tableau (7+ years)
Power BI (7+ years)
Microsoft Excel (7+ years)
DATA SCIENTIST/ MACHINE LEARNING | 02/2019 to Current | Johnson & Johnson - NEW BRUNSWICK, NJ
Johnson & Johnson is an American multinational corporation founded in 1886 that develops medical devices, pharmaceuticals, and consumer packaged goods.
Text reviews are among the best ways to learn exactly which features of a product fall short of customer satisfaction.
If the user can get a breakdown of reviews for each feature of the product, the product's quality can be better assessed.
The manufacturer of the product can also benefit from such feature-based reviews.
A manufacturer can improve the product by fixing the features that lack customer satisfaction.
Collaborated with data engineers and the operations team to implement the ETL process; wrote and optimized SQL queries to extract data that fit the analytical requirements.
Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from Redshift.
Explored and analyzed the customer-specific features by using Spark SQL.
Performed univariate and multivariate analysis on the data to identify any underlying pattern in the data and associations between the variables.
Performed data imputation using the Scikit-learn package in Python.
Participated in feature engineering, including feature intersection generation, feature normalization, and label encoding with Scikit-learn preprocessing.
Used Python 3.X (NumPy, SciPy, pandas, scikit-learn, seaborn) and Spark 2.0 (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes.
Developed and implemented predictive models using machine learning algorithms such as linear regression, classification, multivariate regression, Naive Bayes, Random Forests, K-means clustering, KNN, PCA, and regularization for data analysis.
Analyzed customer consumption behavior and assessed customer value with RFM analysis; applied customer segmentation with clustering algorithms such as K-Means Clustering and Hierarchical Clustering.
Built regression models, including Lasso, Ridge, SVR, and XGBoost, to predict Customer Lifetime Value.
Built classification models, including Logistic Regression, SVM, Decision Tree, and Random Forest, to predict Customer Churn Rate.
Used F-Score, AUC/ROC, Confusion Matrix, MAE, and RMSE to evaluate the performance of different models.
Designed and implemented recommender systems that used collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster.
Utilized natural language processing (NLP) techniques to optimize customer satisfaction.
Designed rich data visualizations to present data in human-readable form with Tableau and Matplotlib.
DATA SCIENTIST/ MACHINE LEARNING | 12/2017 to 01/2019 | Capital One Financial Corp - NEW YORK, NY
Capital One Financial Corporation is an American bank holding company specializing in credit cards, auto loans, banking, and savings accounts, headquartered in McLean, Virginia, with operations primarily in the United States.
At Capital One, there were multiple projects to identify which clients were and were not willing to subscribe to a term-based deposit.
We worked independently and collaboratively throughout the complete analytics project lifecycle, including data extraction/preparation, design, and implementation of scalable machine learning analysis and solutions, and documentation of results.
We determined that multiple factors were involved, such as a client's age and marital status; to address this, we came up with a benefits program and a cashback program for clients who open a savings account.
Moreover, we performed statistical analysis to determine peak and off-peak periods for rate-making purposes.
Conducted analysis of customer data to design rates.
Identified root causes of problems and facilitated the implementation of cost-effective solutions with all levels of management.
Worked with different data formats such as JSON and XML, and applied machine learning algorithms in Python.
Involved in transforming data from legacy tables to HDFS and HBase tables using Sqoop.
Researched reinforcement learning and control (TensorFlow, Torch) and machine learning models (Scikit-learn).
Hands-on experience in implementing Naive Bayes and skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, and Principal Component Analysis.
Partnered with technical and non-technical resources across the business to leverage their support and integrate our efforts.
Worked on Text Analytics and Naive Bayes, creating word clouds and retrieving data from social networking platforms.
Supported various business partners on a wide range of analytics projects, from ad-hoc requests to large-scale cross-functional engagements.
Approached analytical problems with an appropriate blend of statistical and mathematical rigor and practical business intuition.
Held a point of view on the strengths and limitations of statistical models and analyses in various business contexts, and evaluated and effectively communicated the uncertainty in the results.
Applied various machine learning algorithms and statistical modeling techniques, such as decision trees, regression models, SVM, and clustering, to identify volume using the Scikit-learn package in Python.
DATA SCIENTIST/ MACHINE LEARNING | 05/2013 to 11/2017 | TARGET CORPORATION - MINNEAPOLIS, MN
Target is a general merchandise retailer with stores in all 50 U.S. states and the District of Columbia.
Target is headquartered in Minneapolis, Minnesota.
At Target, we were looking for a specific set of vendors that could help us support many guests; we focused primarily on food and beverage items, as these products can go out of stock very easily.
We used Spark in R to speed up execution and to keep memory use small on large datasets, and PySpark to work in Python syntax.
For Target's in-store products, we needed to determine OSA (on-shelf availability), the number of products not on display, and the number of products in storage.
Worked independently or collaboratively to select a specific set of vendors that can support multiple guests.
Used PySpark to speed up the execution process.
Forecasted demand and pricing for multiple products, predicted inventory needs, and identified OSA based on the products available in an outlet.
Used Python for supply and demand analysis to inform vendors about products that were running low and required immediate attention.
Provided a Python algorithm to minimize the lead time of inventory in the storeroom using SPU-SS (sales presentation unit, safety stock), demand SSP (store-ship pack), and LT (lead time).
Created DSA and DTA to improve accuracy on GFPA, and documented a clear comparison of feature selection options to recommend a feature set for decision trees.
Built a backroom root cause analysis; some backroom units (products) are expected and planned.
Predicted future outcomes with the current DMO (Data Management Order).
Developed Python programs to read data from various Teradata sources, manipulate it, and convert it into a single CSV file.
Performed statistical data analysis and data visualization using Python and R.
Worked on creating filters, parameters, and calculated sets for preparing dashboards and worksheets in Tableau.
Created data models in Splunk using pivot tables, analyzing vast amounts of data and extracting key information to suit various business requirements.
Implemented data refreshes on Tableau Server for biweekly and monthly increments based on business change to ensure that the views and dashboards were displaying the changed data accurately.
Maintained large data sets, combining data from various sources using Excel, SAS Grid, Enterprise, Access, and SQL queries.
Analyzed data sets with SAS programming, R, and Excel.
Published interactive dashboards and scheduled automatic data refreshes.
Performed Tableau administration using Tableau admin commands.
Education and Training
Northern Illinois University - DeKalb, IL | MASTER OF SCIENCE | Management Information Systems
Acropolis Institute of Technology and Research - Indore, IN | Bachelor of Science | Information Technology
Worked as a social media analyst at CAAEL (Chicago Area Alternative Education League)
Represented university in various competitions.
Represented state in Cricket, Football, and Basketball.
Facilitated as an event coordinator in National Entrepreneurship Network (NEN).
Participated in various cultural and sports activities.
Hobbies
Keen interest in following current business news and information.
Surfing the web to learn about the latest gadgets and technological updates.