How our AI model works
Three independent prediction models. One ensemble confidence score. Every tip backed by data from 12,700+ historical matches.
Every tip on Daily Sport Pick is generated by an automated system that runs three separate prediction models on each match of the day. The models work independently, then their outputs are combined into a single ensemble confidence score between 0 and 100%.
This approach reduces the risk of any single model’s blind spots affecting the final prediction. When all three models agree, the confidence score is higher. When they disagree, the score is lower — and we are more cautious about publishing that tip.
Model 1 — GradientBoosting (60% weight)
Our primary model is a GradientBoosting classifier trained on 12,700+ historical football matches across 13 competitions including the Premier League, Champions League, La Liga, Bundesliga, Serie A and Eredivisie.
Current accuracy: 66.5% on a held-out test set of 2,189 matches, roughly double the 33% expected from guessing one of three outcomes at random.
The model uses 20 features to make its prediction:
- ELO ratings — both a global rating and separate home/away ELO ratings per team. The home/away split is the strongest predictor in the model, with an importance score of 0.58 combined.
- Recent form — wins, draws and losses in the last 5 matches for both teams
- Goals averages — average goals scored per game over the last 5 matches
- League tier — competitiveness level of the competition
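The recent-form and goals-average features can be illustrated with a short sketch. The `Result` type, the `form_features` function and the sample history are hypothetical names and data chosen for illustration; only the 5-match window comes from the description above.

```python
from dataclasses import dataclass

@dataclass
class Result:
    goals_for: int      # goals scored by the team being described
    goals_against: int  # goals conceded by that team

def form_features(results, window=5):
    """Summarise the last `window` results (list is ordered oldest first)."""
    recent = results[-window:]
    wins = sum(r.goals_for > r.goals_against for r in recent)
    draws = sum(r.goals_for == r.goals_against for r in recent)
    losses = sum(r.goals_for < r.goals_against for r in recent)
    avg_goals = sum(r.goals_for for r in recent) / len(recent) if recent else 0.0
    return {"wins": wins, "draws": draws, "losses": losses,
            "avg_goals_scored": avg_goals}

# Hypothetical six-match history; only the last five are used.
history = [Result(0, 1), Result(2, 0), Result(1, 1),
           Result(3, 1), Result(0, 2), Result(2, 2)]
features = form_features(history)
```

Computing the same dictionary for both teams gives one block of inputs per side, which would then sit alongside the ELO and league-tier features.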
The model outputs a probability for each of three outcomes: home win, draw or away win. The most probable outcome becomes the prediction.
Model 2 — Poisson Distribution (20% weight)
The Poisson model takes a purely mathematical approach. Rather than learning from labelled outcomes, it calculates the expected number of goals each team is likely to score, then uses the Poisson probability distribution to generate a full score matrix.
For each possible scoreline (e.g. 1-0, 2-1, 0-0), the model calculates an exact probability. From this matrix we derive:
- 1X2 probabilities (home win / draw / away win)
- Over/Under 2.5 goal probabilities
- The three most likely exact scores
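As a rough illustration of how these three outputs can be read off a score matrix, here is a minimal sketch. It assumes the two teams' goal counts are independent Poisson variables and truncates the grid at 10 goals per side; the function names and the expected-goals values 1.6 and 1.1 are made up for the example.

```python
import math

def poisson_pmf(goals, expected_goals):
    """Probability of scoring exactly `goals` given a Poisson mean."""
    return math.exp(-expected_goals) * expected_goals ** goals / math.factorial(goals)

def score_matrix(home_xg, away_xg, max_goals=10):
    """matrix[h][a] = P(home scores h and away scores a), assuming independence."""
    return [[poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
             for a in range(max_goals + 1)]
            for h in range(max_goals + 1)]

def summarise(matrix):
    home = draw = away = over_25 = 0.0
    scorelines = []
    for h, row in enumerate(matrix):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
            if h + a >= 3:          # "over 2.5" means three or more goals
                over_25 += p
            scorelines.append((p, f"{h}-{a}"))
    total = home + draw + away      # slightly below 1 because the grid is truncated
    top_three = [score for _, score in sorted(scorelines, reverse=True)[:3]]
    return {
        "home": home / total, "draw": draw / total, "away": away / total,
        "over_2_5": over_25 / total, "under_2_5": 1 - over_25 / total,
        "top_scores": top_three,
    }

# Hypothetical expected goals for one match.
summary = summarise(score_matrix(home_xg=1.6, away_xg=1.1))
```

Dividing by `total` renormalises the probabilities so the truncated grid still sums to 1.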
The model calculates attack and defense ratings per team per competition, corrected for home advantage and normalised to the league average. A team with a high attack rating facing an opponent with a weak defense will have higher expected goals.
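One common way to turn such ratings into expected goals is a simple product, sketched below. This is an assumption about the general technique, not a description of the exact formula used here. Ratings are taken to be normalised so that 1.0 is league average, a defense rating above 1.0 means the team concedes more than average, and home advantage is represented by separate league averages for home and away goals. The default averages of 1.5 and 1.2 are placeholder values.

```python
def expected_goals(home_attack, home_defense, away_attack, away_defense,
                   league_home_avg=1.5, league_away_avg=1.2):
    """Expected goals for each side from normalised ratings (1.0 = league average).

    A defense rating above 1.0 means the team concedes more than average.
    The league averages are illustrative placeholders, not real data.
    """
    home_xg = league_home_avg * home_attack * away_defense
    away_xg = league_away_avg * away_attack * home_defense
    return home_xg, away_xg

# Two perfectly average teams reproduce the league averages.
average_match = expected_goals(1.0, 1.0, 1.0, 1.0)

# A strong home attack (1.3) against a weak away defense (1.2) raises home expected goals.
strong_vs_weak = expected_goals(1.3, 0.9, 1.0, 1.2)
```

The resulting pair of expected-goals values is the input a Poisson score matrix needs.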
Model 3 — Dixon-Coles MLE (20% weight)
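The Dixon-Coles model (Dixon & Coles, 1997) is an extension of Poisson that corrects a known weakness: a standard Poisson model, which treats the two teams’ goal counts as independent, misjudges the frequency of the four lowest scorelines (0-0, 1-0, 0-1 and 1-1). In particular, it tends to underestimate low-scoring draws such as 0-0 and 1-1.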
Dixon-Coles adds a correction factor called tau (τ) specifically for these four scorelines. It also uses Maximum Likelihood Estimation (MLE) via scipy L-BFGS-B optimisation to fit the attack, defense and home advantage parameters simultaneously — rather than calculating them as simple averages.
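The tau correction from the 1997 paper can be written in a few lines. The sketch below shows only the correction itself, not the MLE fit; the dependence parameter `rho` is normally estimated along with the other parameters, and the value -0.1 used here is purely illustrative, as are the expected-goals values.

```python
import math

def poisson_pmf(goals, rate):
    return math.exp(-rate) * rate ** goals / math.factorial(goals)

def tau(home_goals, away_goals, home_rate, away_rate, rho):
    """Dixon-Coles adjustment; equals 1 for every scoreline except the four lowest."""
    if home_goals == 0 and away_goals == 0:
        return 1.0 - home_rate * away_rate * rho
    if home_goals == 0 and away_goals == 1:
        return 1.0 + home_rate * rho
    if home_goals == 1 and away_goals == 0:
        return 1.0 + away_rate * rho
    if home_goals == 1 and away_goals == 1:
        return 1.0 - rho
    return 1.0

def dixon_coles_prob(home_goals, away_goals, home_rate, away_rate, rho):
    """Independent Poisson probability multiplied by the tau correction."""
    return (tau(home_goals, away_goals, home_rate, away_rate, rho)
            * poisson_pmf(home_goals, home_rate)
            * poisson_pmf(away_goals, away_rate))

# Illustrative values only: rho is a fitted parameter in a real model.
HOME_RATE, AWAY_RATE, RHO = 1.6, 1.1, -0.1
p_nil_nil = dixon_coles_prob(0, 0, HOME_RATE, AWAY_RATE, RHO)
grid_total = sum(dixon_coles_prob(h, a, HOME_RATE, AWAY_RATE, RHO)
                 for h in range(16) for a in range(16))
```

With a negative `rho`, the 0-0 and 1-1 probabilities go up while 1-0 and 0-1 go down, and the four adjustments cancel out so the full distribution still sums to 1.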
An additional refinement is time-weighting: a match played 180 days ago carries half the weight of a match played today. This means the model adapts faster to form changes mid-season.
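A 180-day half-life corresponds to a simple exponential decay, sketched here. The exponential form is an assumption (it is the usual choice for this kind of weighting), and `time_weight` is a hypothetical function name.

```python
import math

def time_weight(days_ago, half_life_days=180.0):
    """Weight of a match in the likelihood; halves every `half_life_days`."""
    return 0.5 ** (days_ago / half_life_days)

# Equivalent decay rate if the weight is written as exp(-xi * days_ago).
XI = math.log(2) / 180.0

weights = [time_weight(d) for d in (0, 90, 180, 360)]
```

In a weighted likelihood fit, each match's log-likelihood term would be multiplied by its weight before summing.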
The Dixon-Coles model is trained separately per competition and currently has full ratings for 10 competitions: Champions League, Europa League, Conference League, Premier League, La Liga, Bundesliga, Serie A, Ligue 1, Eredivisie and Primeira Liga.
The ensemble: combining all three
After each model generates its prediction for a given match, the ensemble layer combines them using fixed weights:
| Model | Weight | 1X2 accuracy | Best at |
|---|---|---|---|
| GradientBoosting | 60% | 66.5% | Overall match outcome |
| Poisson Distribution | 20% | 46.1% | Over/Under goals markets |
| Dixon-Coles MLE | 20% | 45.7% | Score prediction, low-score correction |
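The fixed weighting in the table amounts to a weighted average of the three models' 1X2 probabilities. A minimal sketch follows; the dictionary keys, the `combine` function and the example probabilities are hypothetical.

```python
WEIGHTS = {"gradient_boosting": 0.6, "poisson": 0.2, "dixon_coles": 0.2}
OUTCOMES = ("home", "draw", "away")

def combine(predictions):
    """Weighted average of each model's probability for every outcome."""
    return {outcome: sum(WEIGHTS[model] * predictions[model][outcome]
                         for model in WEIGHTS)
            for outcome in OUTCOMES}

# Made-up probabilities for a single match.
example = {
    "gradient_boosting": {"home": 0.60, "draw": 0.25, "away": 0.15},
    "poisson":           {"home": 0.48, "draw": 0.27, "away": 0.25},
    "dixon_coles":       {"home": 0.46, "draw": 0.29, "away": 0.25},
}
combined = combine(example)
predicted = max(combined, key=combined.get)
```

Because the weights sum to 1 and each model's probabilities sum to 1, the combined probabilities also sum to 1.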
The ensemble score is calculated as follows:
- Base score (0–75): the weighted average probability of the predicted outcome, scaled up from the random baseline of 33%
- Consensus bonus (0–15): +15 if all 3 models agree, +7 if 2 of 3 agree, 0 if all three disagree
- Spread bonus (0–10): how clearly the winning outcome stands above the alternatives
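The three components can be sketched as follows. The consensus values come from the list above, but the exact scaling of the base score and the spread bonus is not stated, so the linear formulas here are assumptions made for illustration, as are the function names and the example inputs.

```python
from collections import Counter

def consensus_bonus(model_picks):
    """+15 if all three models pick the same outcome, +7 if two do, else 0."""
    largest_group = Counter(model_picks).most_common(1)[0][1]
    return {3: 15, 2: 7}.get(largest_group, 0)

def ensemble_score(combined, model_picks):
    """Confidence score on a 0-100 scale.

    `combined` maps each outcome to its weighted-average probability.
    The base and spread scalings below are assumed, not documented.
    """
    ranked = sorted(combined.values(), reverse=True)
    top, second = ranked[0], ranked[1]
    # Assumed: scale linearly from the 1/3 random baseline (0) up to certainty (75).
    base = 75 * max(0.0, min(1.0, (top - 1 / 3) / (2 / 3)))
    # Assumed: 10 points times the gap between the top two outcomes.
    spread = 10 * max(0.0, min(1.0, top - second))
    return base + consensus_bonus(model_picks) + spread

# Made-up inputs for one match.
score = ensemble_score({"home": 0.548, "draw": 0.262, "away": 0.190},
                       ["home", "home", "home"])
```

Under these assumed scalings the score can only approach 100 when the combined probability of the predicted outcome is close to 1 and all three models agree.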
A score of 70%+ indicates high confidence with strong model consensus. We aim to only publish tips that score above a minimum threshold.
Frequently asked questions
What prediction models does Daily Sport Pick use?
We use three independent models: a GradientBoosting machine learning model trained on 12,700+ historical matches (66.5% accuracy), a Poisson Distribution model that calculates score probabilities mathematically, and a Dixon-Coles MLE model that corrects for low-score bias.
What is the ensemble confidence score?
The ensemble score (0–100%) combines all three models using weighted averaging. GradientBoosting carries 60% weight, Poisson 20% and Dixon-Coles 20%. A score above 65% indicates high consensus across models.
What does the score heatmap show?
The score heatmap shows the probability of every possible scoreline (0-0 through 4-4) based on the Dixon-Coles model. Blue cells indicate home win scenarios, grey cells indicate draws, and orange cells indicate away wins. Darker colours mean higher probability.
How accurate is the AI model?
Our GradientBoosting model correctly predicts the match outcome in 66.5% of cases on a test set of 2,189 historical matches. This is significantly above the random baseline of 33%.
Why do the Poisson and Dixon-Coles models have lower accuracy than GradientBoosting?
Statistical models like Poisson and Dixon-Coles are not designed primarily for 1X2 prediction — they are built to model score distributions. Their value in the ensemble is complementary: they add information about expected goals and scoreline probabilities that the ML model does not capture directly.
