About

👋    Hi, I'm Matt

💼    Analytics Engineer @ Headspace

🗽    Based in New York City

💫    Aspiring Data-Driven Leader

My professional goal is to act as the liaison between Technology and Business domains, in an effort to unify their objectives into one efficient, cross-functional strategy. I pride myself on my business ethics, extreme curiosity, and ambition to always be at the forefront of cutting-edge technology. These personal attributes drive my passion for providing thoughtful and impactful analysis using big data.

For more detail on my personal and professional background, please feel free to check out my projects and contact me directly to discuss my professional experience!

Skills

Languages

  • Python (Pandas, NumPy, Scikit-learn, TensorFlow, Keras, etc.)
  • dbt
  • PySpark
  • SQL

Visualization Tools

  • Looker
  • Power BI
  • Tableau
  • Python Libraries - Seaborn, Matplotlib, and Plotly

Other Analytical Tools

  • Databricks
  • Jupyter Notebook
  • Snowflake

Coursework

Machine Learning Engineering

Machine Learning Engineer

Requirements:
    - 16 week program (~20 hours a week)
    - Weekly assignments and coding sessions
    - Developing and deploying a final capstone project



Data Science

Data Science with Python

Requirements:
    - 23 courses
    - 6 projects
    - 3 skill assessments

EY Data Science Badge - Bronze

Requirements:
    - 20+ hours of training
    - Used new skills for internal recruitment project
    - Joined Data Science groups within EY to share/contribute knowledge

EY Data Visualization Badge - Bronze

Requirements:
    - 40+ hours of training
    - Demonstrated skills on external project
    - Led Power BI trainings within EY for 400+ coworkers (Staff-Partners)



Machine Learning

Machine Learning for Business

Summary:
    - ML in production best practices and common mistakes
    - ML data pyramid
    - Business objectives and ML performance metrics

Supervised Learning with scikit-learn

Summary:
    - Regression and Logistic Regression
    - Bias / Variance Tradeoff and ROC curve
    - Preprocessing data and hyperparameter tuning

Unsupervised Learning in Python

Summary:
    - Hierarchical clustering and k-means clustering
    - PCA and t-SNE
    - NLP and tf-idf



Programming

Writing Efficient Python Code

Summary:
    - Pythonic practices
    - Time and Space complexity
    - Set Theory and efficient looping alternatives

Personal Projects


Movie Recommendation Engine

Fig.1: Movie Recommender - Click for more detail!

As part of my FourthBrain course, I developed and deployed (now only available locally) a movie recommendation engine that utilizes content-based and collaborative filtering techniques to personalize movie recommendations for users. The main objective of the microservice is to address the cold start problem for new users on a streaming application by providing tailored recommendations with minimal information about the user.
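To illustrate the content-based side of the idea, here is a minimal sketch rather than the deployed service's code. It assumes each movie is described by a hypothetical genre vector and that a new user has named a single movie they like, which is about the least information a cold-start user can give. The titles, genre columns, and the `recommend` helper are all made up for illustration.

```python
from math import sqrt

# Hypothetical genre vectors: [action, comedy, drama, sci-fi].
MOVIES = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 0, 0],
    "Movie C": [0, 1, 1, 0],
    "Movie D": [0, 0, 1, 0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def recommend(liked_title, movies, top_n=2):
    """Rank every other movie by similarity to the one the user liked."""
    liked = movies[liked_title]
    scored = [
        (title, cosine_similarity(liked, vector))
        for title, vector in movies.items()
        if title != liked_title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in scored[:top_n]]


print(recommend("Movie A", MOVIES, top_n=1))
```

Collaborative filtering would then refine these suggestions once the user has a rating history to compare against other users.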

Click the image above to explore the recommender system and the overall structure of the application. This is a great base for a recommender system that will have further iterations and enhancements to benefit its users!


Customer Segmentation

Fig.2: Customer Segmentation - Click for more detail!

Ecom Co. (a fictional e-commerce company) wanted to better understand its customers and uncover any hidden patterns in its consumer transaction data. This initiative allows the company to unlock cross-functional value for business lines such as marketing, sales, and planning.

Click the image above to explore the results and the definitions for each customer segment. These customer segments could assist in better understanding customer tendencies!


Spotify Artist Analyzer

Fig.3: Spotify Artist Analyzer - Click for more detail!

In this project, I used the Spotify API (via Spotipy) and several Python data analysis/visualization libraries to create an "Artist Analyzer". Based on the user's choice of artists, the analyzer creates different slices of artist information and several visualizations for more comprehensive analysis.

The analysis is based on the Top 10 Spotify Tracks for each artist, and the respective feature metrics for each track. Track features include: danceability, energy, loudness, speechiness, acousticness, instrumentalness, liveness, valence, tempo. These features allow for an in-depth analysis of any song within the Spotify database.
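As a rough sketch of how per-track features can be rolled up into an artist-level profile, the snippet below averages a few features across each artist's tracks. It does not call the Spotify API: the artists, the feature values, and the `artist_profile` helper are hypothetical stand-ins for data that would come from the real service.

```python
from statistics import mean

# A subset of the track features listed above.
FEATURES = ["danceability", "energy", "valence"]

# Hypothetical feature values, two tracks per artist for brevity.
top_tracks = {
    "Artist A": [
        {"danceability": 0.80, "energy": 0.60, "valence": 0.50},
        {"danceability": 0.60, "energy": 0.80, "valence": 0.70},
    ],
    "Artist B": [
        {"danceability": 0.30, "energy": 0.90, "valence": 0.20},
        {"danceability": 0.50, "energy": 0.70, "valence": 0.40},
    ],
}


def artist_profile(tracks, features=FEATURES):
    """Average each feature across an artist's tracks."""
    return {name: mean(track[name] for track in tracks) for name in features}


profiles = {artist: artist_profile(tracks) for artist, tracks in top_tracks.items()}
for artist, profile in profiles.items():
    print(artist, {name: round(value, 2) for name, value in profile.items()})
```

Comparing these averaged profiles side by side is one simple way to see which artists sit close together.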

Click the image above to explore the Analyzer and the definitions for each feature. Some artists may be more similar than you think!


Forecasting the Stock Market

Fig.4: Forecasting the Stock Market - Click for more detail!

In this project, I used two machine learning algorithms, Support Vector Regression and Dynamic Time Warping, in an effort to gain a competitive advantage in the stock market. These models were benchmarked against basic intraday and overnight buying strategies to assess their effectiveness and credibility over a certain period of time.

Although forecasting the stock market is nearly impossible, I had a great opportunity to experiment with two completely different machine learning models and learn their similarities and differences in a real-world example.
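For readers unfamiliar with Dynamic Time Warping, here is a minimal textbook sketch of the distance computation, not the code used in the project. It assumes two plain lists of numbers (for example, two short price series) and uses the absolute difference as the point-to-point cost. The function name `dtw_distance` is illustrative.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# The same shape at a different pace aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment can stretch or compress time, two series that follow the same pattern at different speeds still score as similar, which is what sets DTW apart from a point-by-point comparison.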

Click the image above for a deeper dive into the technical aspects of the project and the final results!

Acknowledgements


PPP Loan Forgiveness Reporting

Fig.1: PPP Reporting Dashboard - Click for more detail!
"Additionally, interoperability with Power BI provides the bank with dashboards displaying real-time metrics, such as the number of loans processed, time to process, quality check logs, and performance reports for EY’s VOC team...

...Using the tool, EY has been able to process over 50,000 loans totaling more than $10 billion on behalf of its banking customers. What’s more, as the number of PPP loan forgiveness applications has soared, the value of an automated process has also grown."

In response to the Covid-19 pandemic, EY was tasked with an unprecedented assignment to quickly develop a flexible PPP loan forgiveness solution for multiple banks that adhered to the rapidly changing guidelines from the SBA. I had the opportunity to support the reporting team in the overall effort to complete this project in a timely and efficient manner.

Click the image above to read more about the project and how EY partnered with Microsoft to provide an innovative solution.

For more details about my professional experience, please contact me at matt.castelli2@gmail.com.