My passion for data was kindled when I first read about John Snow's (not the one from GOT!) pioneering work in collecting and visualizing data to identify the source of the 1854 cholera outbreak. I thoroughly enjoy working with data, particularly building data pipelines, managing data infrastructure, and creating data visualizations.
With over 4 years of professional experience and a Databricks Data Engineer certification, I have developed the ability to solve complex problems and provide teams with valuable insights using data. I am proficient in various tools and technologies, including Python, Snowflake, DBT, AWS, ETL, and SQL. What sets me apart is my willingness to take on challenges that involve learning new technologies and tools.
Some of my achievements at work are listed below:
- Cleaned and ingested bruise data to enable researchers to study bruises, which can help bring justice to domestic violence victims.
- Identified and resolved missing transactions between the Data Lake and MySQL database, resulting in a 10% improvement in data quality within the Data Lake.
If you're in search of a professional who can extract value from data or maintain data infrastructure, let's connect and discuss how I can leverage my expertise to provide valuable insights for your organization.
In my free time, I enjoy learning new things, watching documentaries, swimming, hiking, and cycling.
Skills: Python, SQL, AWS, Azure, Databricks, Hadoop, Amazon Redshift, AWS Glue, Docker, ETL, Data Warehousing, Data Modeling, Apache Spark, PySpark, Machine Learning, Agile, Data Visualization, DBT, Snowflake, NoSQL, R, SageMaker, MongoDB, NLTK, Natural Language Processing.
Experience
Nov 2023 - Present
- Migrated data pipelines from AWS EMR to Databricks.
- Built PySpark scripts in Databricks notebooks to compare data between production and pre-production.
Jul 2023 - Jan 2024
- Transformed and cleaned bruise data collected as part of research and uploaded it to a database. This helped researchers study bruises and devise a better system for documenting them, thereby helping bring justice to domestic violence victims.
- Built pipelines to extract metadata from images and store it in a database.
Oct 2023 - Dec 2023
- Instructor for the course HAP780 - Data Mining in Health Care.
Jun 2022 - Aug 2022
- Gained hands-on experience with data lakes and data visualization tools such as AWS QuickSight.
- Analyzed how data moves between tools in a data pipeline and how data modeling concepts are applied to a database that stores financial transactions.
- Regularly interacted with various teams and translated their requirements into business solutions.
Oct 2018 - 2021
- Worked with the Break Through Tech team to clean student survey data and transform it into a format accepted by Salesforce.
- Analyzed the data using visualizations and Salesforce dashboards to identify ways to improve the program.
Oct 2018 - Aug 2021
- Collaborated with support teams to identify ways to reduce their manual workload, and developed scripts and tools to save them time.
- Worked on data pipelines to integrate data from servers into an application.
Certifications
Databricks Certified Data Engineer Associate
Education
Master's in Data Analytics
George Mason University
Bachelor's in Electronics and Communication Engineering
Gitam University
Projects
Weather Analysis
Built a data pipeline using PySpark and Hadoop to analyze real-time weather data from Florida.
Elizachatbot
Rule-based chatbot built using Natural Language Processing and deployed using Django.
Mywordslist
Website built with the Flask web framework for learning new words, with a built-in dictionary.
Tic-Tac-Toe
Built using Python's Tkinter package, with an AI player based on the minimax algorithm.
Minesweeper
Implemented the classic Minesweeper game using Python's Tkinter package.
Mouse Mover
Application that moves the mouse cursor at a set time interval.