Developing Machine Learning Solutions for NHL Sports Betting

The intersection of data science and sports betting presents unique challenges and opportunities. This project explored the development of machine learning models to identify positive expected value (EV) bets in NHL anytime goalscorer markets, aiming to find mispriced opportunities in sportsbook odds and enhance profitability.

Tackling the Biases in Sportsbook Odds

Sportsbooks inherently tilt odds in their favor, making it difficult for bettors to achieve consistent profitability. This project sought to address these biases by leveraging player statistics and game-related variables to build a data-driven solution capable of detecting profitable betting opportunities.
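
To make the built-in margin concrete, here is a minimal sketch of how a posted price converts to an implied probability. The two-way market at -110 on each side is a hypothetical example, not a price from the project, and the function names are illustrative assumptions.

```python
def american_to_decimal(american: int) -> float:
    """Convert American odds to decimal odds (total return per 1 unit staked)."""
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def implied_probability(american: int) -> float:
    """Probability implied by the price, before removing the sportsbook margin."""
    return 1 / american_to_decimal(american)


# Hypothetical two-way market priced at -110 on each side.
p_yes = implied_probability(-110)
p_no = implied_probability(-110)

# The implied probabilities sum to more than 1; the excess is the margin.
overround = p_yes + p_no - 1

print(f"implied probability per side: {p_yes:.4f}")
print(f"overround: {overround:.4f}")
```

Because the implied probabilities add up to more than 100%, a bettor needs a probability estimate that beats the price by more than the margin before a bet has positive expected value.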

Key Highlights:

The project had five main components:

  • Objective: The goal was to predict the probability of NHL players scoring goals in upcoming games, using these probabilities to calculate the expected value (EV) of corresponding bets.
  • Data Challenges: Data was sourced through extensive web scraping from platforms such as DraftKings, Hockey Reference, and Rotowire. A custom MySQL database was engineered to store and process this data efficiently, ensuring integrity and scalability.
  • Feature Engineering: Over 250 features were created to quantify player performance, including rolling window statistics (e.g., shots per 60 minutes), game-related variables (e.g., rest days, home/away status), and advanced metrics like point streaks.
  • Machine Learning Models: Various algorithms were tested, including random forests, XGBoost, and neural networks. The models were evaluated on predictive accuracy (binary cross-entropy loss) and on profitability, measured as average profit per bet at different EV thresholds.
  • Results: While none of the models achieved profitability against DraftKings, XGBoost emerged as the best-performing model, offering the highest average profit per bet.
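
The link between a predicted probability, the EV of a bet, and average profit per bet can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: the `Bet` record, the function names, the one-unit flat stake, and the sample data are all hypothetical, and EV is defined in the common per-unit-stake form.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    model_prob: float    # model's estimated probability that the player scores
    decimal_odds: float  # sportsbook price as decimal odds
    scored: bool         # observed outcome


def expected_value(model_prob: float, decimal_odds: float) -> float:
    """EV per 1 unit staked: win (odds - 1) with probability p, lose 1 otherwise."""
    return model_prob * (decimal_odds - 1) - (1 - model_prob)


def average_profit_per_bet(bets, ev_threshold=0.0):
    """Mean realized profit per unit stake over bets whose EV exceeds the threshold.

    Returns None when no bet clears the threshold.
    """
    placed = [
        b for b in bets
        if expected_value(b.model_prob, b.decimal_odds) > ev_threshold
    ]
    if not placed:
        return None
    profits = [(b.decimal_odds - 1) if b.scored else -1.0 for b in placed]
    return sum(profits) / len(placed)


# Made-up records for illustration only.
history = [
    Bet(model_prob=0.30, decimal_odds=4.0, scored=True),
    Bet(model_prob=0.20, decimal_odds=4.0, scored=False),
    Bet(model_prob=0.40, decimal_odds=3.0, scored=False),
]

print(average_profit_per_bet(history, ev_threshold=0.0))
```

Raising `ev_threshold` places fewer bets but demands a larger estimated edge on each one, which is one way to trade volume against confidence when comparing models.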

Insights and Impact:

  • The project highlighted the inherent difficulty of overcoming sportsbook biases, emphasizing the need for innovative features and robust modeling techniques.
  • Player clustering using k-means revealed distinct playing styles (e.g., skill, defensive, grinder) and their impact on model predictions, uncovering patterns in overvaluation and underperformance.
  • Future avenues for improvement include incorporating advanced hockey statistics, injury data, and graph neural networks to model team dynamics and player chemistry.
  • The importance of “odds shopping” across multiple sportsbooks was also identified as a key strategy to improve profitability and capitalize on discrepancies in odds.
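
As a rough illustration of why odds shopping matters, the sketch below picks the best available price for one bet and compares its EV across books. The sportsbook labels, prices, and model probability are invented for the example, and the helper names are assumptions.

```python
def best_price(prices):
    """Return the (sportsbook, decimal odds) pair with the highest payout."""
    book = max(prices, key=prices.get)
    return book, prices[book]


def expected_value(model_prob, decimal_odds):
    """EV per 1 unit staked; equal to p * (odds - 1) - (1 - p)."""
    return model_prob * decimal_odds - 1


# Hypothetical decimal odds for the same player at three books.
quotes = {"book_a": 3.60, "book_b": 3.90, "book_c": 3.75}
model_prob = 0.27  # hypothetical model estimate

book, odds = best_price(quotes)
for name, price in quotes.items():
    print(f"{name}: odds {price:.2f}, EV {expected_value(model_prob, price):+.3f}")
print(f"best price: {book} at {odds:.2f}")
```

With these made-up numbers the same prediction has negative EV at the lowest price and positive EV at the highest, so the choice of sportsbook alone can decide whether a bet is worth placing.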

This project demonstrated the potential of machine learning in analyzing sports betting markets while uncovering the challenges of competing with sophisticated sportsbook models. By exploring new features and modeling strategies, there remains hope for developing a profitable solution in the future.
