Our technology

How our AI model works

Three independent prediction models. One ensemble confidence score. Every tip backed by data from 12,700+ historical matches.

Every tip on Daily Sport Pick is generated by an automated system that runs three separate prediction models on each match of the day. The models work independently, then their outputs are combined into a single ensemble confidence score between 0 and 100%.

This approach reduces the risk of any single model’s blind spots affecting the final prediction. When all three models agree, the confidence score is higher. When they disagree, the score is lower — and we are more cautious about publishing that tip.

How the confidence score is calculated: The ensemble score combines the weighted probability of the predicted outcome (60% weight for GradientBoosting, 20% each for Poisson and Dixon-Coles), a consensus bonus when multiple models agree, and a spread bonus when the winning probability is clearly ahead of the alternatives.

Model 1 — GradientBoosting (60% weight)

Our primary model is a GradientBoosting classifier trained on 12,700+ historical football matches across 13 competitions including the Premier League, Champions League, La Liga, Bundesliga, Serie A and Eredivisie.

Current accuracy: 66.5% on a held-out test set of 2,189 matches — more than double the random baseline of 33%.

The model uses 20 features to make its prediction:

  • ELO ratings — both a global rating and separate home/away ELO ratings per team. The home/away split is the strongest predictor in the model, with an importance score of 0.58 combined.
  • Recent form — wins, draws and losses in the last 5 matches for both teams
  • Goals averages — average goals scored per game over the last 5 matches
  • League tier — competitiveness level of the competition

The model outputs a probability for each of three outcomes: home win, draw or away win. The most probable outcome becomes the prediction.

Why ELO ratings matter: ELO is a dynamic rating system that updates after every match based on result and opponent strength. A team that beats a strong opponent gains more ELO than one that beats a weak side. The home/away split means we track separately how teams perform at home versus away — a crucial distinction in football.
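As a concrete illustration, the standard ELO update works like this. The K-factor of 32 and the 400-point scale below are the common defaults, not necessarily the constants our model uses:

```python
# Minimal sketch of a standard ELO update. K and the 400-point scale are
# assumed defaults; the model's actual constants are not published here.

def expected_score(rating_a, rating_b):
    """Expected score of team A against team B under the ELO logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a, rating_b, score_a, k=32):
    """Return both teams' updated ratings after a match.

    score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    Beating a stronger opponent (low expected score) yields a larger gain.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta

# An underdog (1400) beating a favourite (1600) gains roughly 24 points;
# the favourite winning instead would only gain about 8.
new_a, new_b = update_elo(1400, 1600, 1.0)
```

Because the underdog's expected score here is only about 0.24, the upset win moves its rating by roughly 24 points, while the reverse result would move the favourite's by only about 8.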

Model 2 — Poisson Distribution (20% weight)

The Poisson model takes a purely mathematical approach. Rather than learning from labelled outcomes, it calculates the expected number of goals each team is likely to score, then uses the Poisson probability distribution to generate a full score matrix.

For each possible scoreline (e.g. 1-0, 2-1, 0-0), the model calculates an exact probability. From this matrix we derive:

  • 1X2 probabilities (home win / draw / away win)
  • Over/Under 2.5 goal probabilities
  • The three most likely exact scores

The model calculates attack and defence ratings per team per competition, corrected for home advantage and normalised to the league average. A team with a high attack rating facing an opponent with a weak defence will have higher expected goals.

Strength: The Poisson model is particularly useful for Over/Under predictions. By modelling goals independently for each team, it gives a clear view of whether a match is likely to be high- or low-scoring — independent of which team wins.
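To make the score-matrix idea concrete, here is a minimal sketch under independent Poisson goals. The λ values below are placeholders; in the real model they come from the attack/defence ratings described above:

```python
# Sketch of a Poisson score matrix. The lambda values are placeholder
# expected goals, not output from the actual ratings pipeline.
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals given an expected-goals value lam."""
    return lam ** k * exp(-lam) / factorial(k)

def score_matrix(lam_home, lam_away, max_goals=10):
    """P(home scores i, away scores j), assuming independent Poisson goals."""
    return [[poisson_pmf(i, lam_home) * poisson_pmf(j, lam_away)
             for j in range(max_goals + 1)]
            for i in range(max_goals + 1)]

def outcome_probs(matrix):
    """Collapse the matrix into 1X2 probabilities (home win, draw, away win)."""
    n = len(matrix)
    home = sum(matrix[i][j] for i in range(n) for j in range(n) if i > j)
    draw = sum(matrix[i][i] for i in range(n))
    away = sum(matrix[i][j] for i in range(n) for j in range(n) if i < j)
    return home, draw, away

def over_2_5(matrix):
    """Probability of three or more total goals."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(n) if i + j > 2)

m = score_matrix(1.6, 1.1)  # e.g. home expects 1.6 goals, away 1.1
home, draw, away = outcome_probs(m)
over = over_2_5(m)
```

Summing cells above, on, and below the diagonal yields the 1X2 probabilities, and summing cells with three or more total goals yields the Over 2.5 probability, exactly the quantities listed above.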

Model 3 — Dixon-Coles MLE (20% weight)

The Dixon-Coles model (Dixon & Coles, 1997) is an extension of Poisson that corrects a known weakness: with independently modelled goals, standard Poisson mis-prices the four lowest-scoring results; in particular, it underestimates the frequency of the 0-0 and 1-1 draws.

Dixon-Coles adds a correction factor called tau (τ) specifically for the four scorelines 0-0, 1-0, 0-1 and 1-1. It also uses Maximum Likelihood Estimation (MLE) via scipy's L-BFGS-B optimiser to fit the attack, defence and home-advantage parameters simultaneously — rather than calculating them as simple averages.
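The τ factor itself is short enough to write out. This follows the published Dixon & Coles (1997) form, where λ and μ are the home and away expected goals and ρ is the fitted dependence parameter (the numeric values in the example call are illustrative):

```python
def dixon_coles_tau(x, y, lam, mu, rho):
    """Dixon-Coles low-score correction factor tau (Dixon & Coles, 1997).

    Multiplies the independent-Poisson probability of the four low-scoring
    results; every other scoreline is left unchanged (tau = 1).
    """
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0

# With a negative fitted rho, 0-0 and 1-1 become more likely and
# 1-0 / 0-1 less likely than plain Poisson predicts.
boost_00 = dixon_coles_tau(0, 0, 1.5, 1.0, -0.1)
```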

An additional refinement is time-weighting: matches from 180 days ago carry half the weight of a match played today. This means the model adapts faster to form changes mid-season.
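A 180-day half-life corresponds to a simple exponential decay. The exact decay shape is our reading of the description above, expressed as a sketch:

```python
# Time-weighting sketch: exponential decay with a 180-day half-life,
# matching the description that a 180-day-old match carries half the weight.
from math import log, exp

HALF_LIFE_DAYS = 180

def match_weight(days_ago):
    """Weight of a historical match; halves every HALF_LIFE_DAYS days."""
    xi = log(2) / HALF_LIFE_DAYS
    return exp(-xi * days_ago)

# today -> 1.0, 180 days ago -> 0.5, 360 days ago -> 0.25
```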

The Dixon-Coles model is trained separately per competition and currently has full ratings for 10 leagues: Champions League, Europa League, Conference League, Premier League, La Liga, Bundesliga, Serie A, Ligue 1, Eredivisie and Primeira Liga.

Score heatmap: The Dixon-Coles model powers the score probability heatmap shown on each tip. The 5×5 grid shows every scoreline from 0-0 to 4-4, coloured by probability. Blue cells are home win scenarios, grey are draws, orange are away wins — darker means more likely.

The ensemble: combining all three

After each model generates its prediction for a given match, the ensemble layer combines them using fixed weights:

| Model | Weight | 1X2 accuracy | Best at |
| --- | --- | --- | --- |
| GradientBoosting | 60% | 66.5% | Overall match outcome |
| Poisson Distribution | 20% | 46.1% | Over/Under goals markets |
| Dixon-Coles MLE | 20% | 45.7% | Score prediction, low-score correction |

The ensemble score is calculated as follows:

  1. Base score (0–75): the weighted average probability of the predicted outcome, scaled up from the random baseline of 33%
  2. Consensus bonus (0–15): +15 if all 3 models agree, +7 if 2 of 3 agree, 0 if they disagree
  3. Spread bonus (0–10): how clearly the winning outcome stands above the alternatives
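Putting the three components together, the scoring logic can be sketched as follows. The weights and bonus values come from the description above, but the exact scaling of the base and spread components is not published, so those formulas are assumptions:

```python
# Illustrative ensemble-score sketch. The base and spread scalings below
# are assumed; only the weights and bonus values come from the description.

WEIGHTS = {"gb": 0.60, "poisson": 0.20, "dixon_coles": 0.20}
BASELINE = 1 / 3  # random 1X2 baseline (~33%)

def ensemble_score(probs_by_model):
    """probs_by_model maps model name -> {'H': p, 'D': p, 'A': p}."""
    outcomes = ("H", "D", "A")
    # Weighted average probability per outcome.
    avg = {o: sum(WEIGHTS[m] * probs_by_model[m][o] for m in WEIGHTS)
           for o in outcomes}
    ranked = sorted(avg, key=avg.get, reverse=True)
    top, second = ranked[0], ranked[1]

    # 1. Base score (0-75): top probability scaled up from the 33% baseline.
    base = 75 * max(0.0, (avg[top] - BASELINE) / (1 - BASELINE))

    # 2. Consensus bonus (0-15): +15 if all three models pick the same
    #    outcome, +7 if two of three do, 0 otherwise.
    picks = [max(probs_by_model[m], key=probs_by_model[m].get) for m in WEIGHTS]
    agreeing = picks.count(top)
    consensus = 15 if agreeing == 3 else (7 if agreeing == 2 else 0)

    # 3. Spread bonus (0-10): reward a clear gap to the runner-up outcome.
    spread = min(10.0, 10 * (avg[top] - avg[second]) / 0.25)

    return min(100.0, base + consensus + spread), top

probs = {
    "gb":          {"H": 0.60, "D": 0.25, "A": 0.15},
    "poisson":     {"H": 0.50, "D": 0.30, "A": 0.20},
    "dixon_coles": {"H": 0.55, "D": 0.25, "A": 0.20},
}
score, pick = ensemble_score(probs)
```

In this example all three models favour the home side, so the full +15 consensus bonus applies on top of the base and spread components.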

A score of 70%+ indicates high confidence with strong model consensus. We aim to publish only tips that clear a minimum score threshold.

See the model in action

Check today’s tips and their confidence scores

Frequently asked questions

What prediction models does Daily Sport Pick use?

We use three independent models: a GradientBoosting machine learning model trained on 12,700+ historical matches (66.5% accuracy), a Poisson Distribution model that calculates score probabilities mathematically, and a Dixon-Coles MLE model that corrects for low-score bias.

What is the ensemble confidence score?

The ensemble score (0–100%) combines all three models using weighted averaging. GradientBoosting carries 60% weight, Poisson 20% and Dixon-Coles 20%. A score above 65% indicates high consensus across models.

What does the score heatmap show?

The score heatmap shows the probability of every possible scoreline (0-0 through 4-4) based on the Dixon-Coles model. Blue cells indicate home win scenarios, grey cells indicate draws, and orange cells indicate away wins. Darker colours mean higher probability.

How accurate is the AI model?

Our GradientBoosting model correctly predicts the match outcome in 66.5% of cases on a test set of 2,189 historical matches. This is significantly above the random baseline of 33%.

Why do the Poisson and Dixon-Coles models have lower accuracy than GradientBoosting?

Statistical models like Poisson and Dixon-Coles are not designed primarily for 1X2 prediction — they are built to model score distributions. Their value in the ensemble is complementary: they add information about expected goals and scoreline probabilities that the ML model does not capture directly.
