
FBref vs Understat vs Opta: Why Different xG Numbers Exist for the Same Shot

18 min read · Updated May 2026

You've looked up a match on FBref, then checked Understat, and found different xG values. Both sites are reputable. Both are using expected goals. So why don't the numbers match? This guide explains everything — from the technical reasons for disagreement to how you should handle conflicting sources in your analysis.


Introduction: The Confusing Reality of Multiple xG Providers

Imagine watching a Premier League match. The broadcast graphic shows 1.8 xG for the home team. You check FBref afterward and it says 2.1. You open Understat and it shows 1.6. You see a tweet from a data journalist citing Opta's figure of 1.9.

All four numbers are for the same match. None of them are wrong. And understanding why they disagree — genuinely understanding it, not just accepting it — will make you a far more sophisticated consumer of football analytics data.

The short answer: different xG providers use different data, different methodologies, and different model architectures. Like four people measuring the same room with rulers of slightly different length, they'll all get sensible answers that don't perfectly agree.

The long answer fills this entire guide.


The Major xG Providers: Who Are They?

Before diving into why they disagree, it helps to know who the main players are:

StatsBomb (via FBref.com)

StatsBomb is a football data and analytics company that collects detailed event-level data and maintains a sophisticated, multi-feature xG model. Its outputs are available to the public through FBref.com for many competitions.

StatsBomb is known for collecting what they call "360 data" — detailed information about player and goalkeeper positioning at the moment of each event, which most competitors don't capture. This enables more context-rich xG calculations. StatsBomb is generally considered one of the most comprehensive public data providers.

Understat.com

Understat is a free website that publishes xG data for the top five European leagues and a few others. Its model is built independently and uses a different feature set from StatsBomb's.

Understat's data is popular among fans and analysts because it's free, covers multiple seasons, and has clean visualisations. Their xG values often differ from StatsBomb's, sometimes noticeably.

Opta (via various platforms: SofaScore, WhoScored, Sky Sports, ESPN)

Opta (now part of StatsPerform) is one of the largest sports data companies in the world, supplying data to broadcasters, clubs, and media organisations globally. Their xG model powers broadcast graphics on Sky Sports and ESPN, among others.

Opta's proprietary model is not fully disclosed, but they collect extensive event data across a huge range of competitions. Their xG figures are what most casual fans see via broadcast graphics and major sports websites.

Other Providers

  • Wyscout (now part of Hudl): Used extensively by European football clubs for scouting and analysis
  • InStat: Popular in Eastern Europe and for lower-league data
  • Ateam / Metrica Sports: Provide tracking-data-enhanced models in some contexts
  • Club-built models: Many top clubs (Manchester City, Liverpool, Barcelona) have built proprietary internal xG models whose outputs never become public

Why the Numbers Differ: Five Core Reasons

Reason 1: Different Input Features

This is the most fundamental source of disagreement. Different models include different variables in their xG calculations.

All major models include the basics: distance to goal, angle to goal, and body part used. But beyond these, providers diverge significantly.

StatsBomb's enhanced features:

  • Goalkeeper position at the moment of the shot (their 360 data captures this)
  • Defensive pressure from nearby players (with detailed positional information)
  • Distance and position of the closest defender
  • Whether the goalkeeper was "unsighted" by a screen

These additional features, particularly goalkeeper positioning, can significantly alter xG values for certain shots. A chip, or any shot taken while the goalkeeper is off their line, will receive a much higher xG from StatsBomb than from a provider whose model doesn't capture goalkeeper position.

The practical result: the same shot can receive meaningfully different xG values because one model "sees" factors the other doesn't.
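
To make this concrete, here is a minimal sketch (in Python) of two toy logistic models scoring the identical shot, where only one of them sees how far the goalkeeper is off the line. Every coefficient is invented for illustration; neither function represents any real provider's model.

```python
import math

def xg_basic(dist_m: float, angle_rad: float) -> float:
    """Toy model: distance and shot angle only (coefficients invented)."""
    z = -0.6 - 0.10 * dist_m + 1.5 * angle_rad
    return 1 / (1 + math.exp(-z))

def xg_keeper_aware(dist_m: float, angle_rad: float, gk_off_line_m: float) -> float:
    """Same toy model plus one extra feature: how far the keeper is off the line."""
    z = -0.6 - 0.10 * dist_m + 1.5 * angle_rad + 0.30 * gk_off_line_m
    return 1 / (1 + math.exp(-z))

# One identical shot: 18 m out, moderate angle, keeper caught 4 m off the line.
print(f"model without keeper data: {xg_basic(18.0, 0.35):.2f}")   # ~0.13
print(f"model with keeper data:    {xg_keeper_aware(18.0, 0.35, 4.0):.2f}")  # ~0.34
```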

Reason 2: Different Training Data

Machine learning models learn from historical shot data. The models' outputs depend heavily on what data they were trained on.

  • Dataset size: A model trained on 2 million shots will generally have better statistical calibration than one trained on 200,000 shots. Larger datasets smooth out noise and allow more reliable probability estimates for rare situations.
  • Competition coverage: A model trained primarily on the English Premier League will have learned patterns specific to that competition. Applied to La Liga or the Bundesliga, it may systematically over- or underestimate xG because playing styles, defensive structures, and goalkeeping tendencies differ between leagues.
  • Historical period: Shot patterns have changed over time as teams have become more sophisticated in set-piece routines, high-pressing tactics, and chance creation from specific zones. A model trained on data from 2005–2015 may produce different values than one trained on 2015–2024 data.
  • Data collection methodology: How a shot's characteristics are recorded affects model training. Was "defensive pressure" collected as a binary (yes/no) or a continuous measure? Was the shot location marked as a precise coordinate or assigned to a zone? These coding choices affect what the model learns.
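
As a rough illustration of the training-data effect, the sketch below fits the same scikit-learn logistic regression to two synthetic "leagues" whose true scoring rates decay differently with distance, then scores an identical shot with both. The data is simulated; the point is only that identical modelling code inherits the quirks of whatever it was trained on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def simulate_league(n_shots: int, dist_coef: float):
    """Synthetic shots whose true P(goal) decays with distance; dist_coef is a
    stand-in for league-specific defending and goalkeeping tendencies."""
    dist = rng.uniform(5, 30, n_shots)
    p_true = 1 / (1 + np.exp(-(-0.4 + dist_coef * dist)))
    goals = (rng.random(n_shots) < p_true).astype(int)
    return dist.reshape(-1, 1), goals

X_a, y_a = simulate_league(200_000, dist_coef=-0.10)  # "league A" training set
X_b, y_b = simulate_league(200_000, dist_coef=-0.13)  # "league B" training set

model_a = LogisticRegression().fit(X_a, y_a)
model_b = LogisticRegression().fit(X_b, y_b)

same_shot = np.array([[16.0]])  # an identical 16 m shot, scored by both models
print(f"trained on league A: {model_a.predict_proba(same_shot)[0, 1]:.3f}")
print(f"trained on league B: {model_b.predict_proba(same_shot)[0, 1]:.3f}")
```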

Reason 3: Different Model Architectures

Even with identical data and identical features, different modelling approaches produce different outputs.

  • Logistic regression: Simple, interpretable, well-calibrated on average. May miss complex interaction effects between variables.
  • Random forests / Gradient boosting: Can capture non-linear interactions. For example, the effect of distance might interact differently with shot pressure at different angles — a boosted tree model can learn these patterns, while logistic regression applies fixed linear weights.
  • Neural networks: Highly flexible and able to learn extremely complex patterns, but they require large training sets to avoid overfitting. Less interpretable: it's harder to understand why a particular shot received a certain value.
  • Bayesian models: Some researchers use Bayesian approaches that explicitly model uncertainty rather than producing a single point estimate.
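
The sketch below makes the architecture point concrete with synthetic data: the true scoring probability contains a distance-angle interaction, the gradient-boosted trees can learn it, and the logistic regression cannot, so the two disagree on the same shot despite identical training data and features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 50_000

# Synthetic shots whose true P(goal) includes a distance-angle interaction.
dist = rng.uniform(5, 30, n)
angle = rng.uniform(0.1, 1.2, n)
z_true = -0.5 - 0.18 * dist + 0.8 * angle + 0.09 * dist * angle
goals = (rng.random(n) < 1 / (1 + np.exp(-z_true))).astype(int)
X = np.column_stack([dist, angle])

linear = LogisticRegression().fit(X, goals)               # fixed linear weights
boosted = GradientBoostingClassifier(max_depth=3).fit(X, goals)  # learns interactions

same_shot = np.array([[22.0, 1.0]])  # identical shot, identical features
print(f"logistic regression xG: {linear.predict_proba(same_shot)[0, 1]:.3f}")
print(f"gradient boosting xG:   {boosted.predict_proba(same_shot)[0, 1]:.3f}")
```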

Reason 4: Different Calibration Decisions

All models need to be calibrated — meaning the output probabilities need to align with real-world scoring rates. A model that assigns 0.30 xG to a class of shots should see those shots score about 30% of the time in the data.

Providers make different calibration decisions, producing different final outputs even from similar underlying models.
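
One common way to check calibration, sketched below with scikit-learn's calibration_curve, is to bucket shots by predicted xG and compare each bucket's mean prediction to its actual scoring rate. The shots here are simulated (and well calibrated by construction) purely to show the mechanics.

```python
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(7)
xg_pred = rng.beta(1.2, 8.0, 50_000)                 # right-skewed, like real xG
scored = (rng.random(50_000) < xg_pred).astype(int)  # calibrated by construction

# Fraction of shots scored per bucket vs the bucket's mean predicted xG.
frac_scored, mean_xg = calibration_curve(scored, xg_pred, n_bins=10, strategy="quantile")
for pred, actual in zip(mean_xg, frac_scored):
    print(f"predicted {pred:.2f} -> actual scoring rate {actual:.2f}")
```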

Reason 5: Different Data Collection Methods

Finally, the raw event data itself can vary between providers, separate from any modelling decisions.

Did Collector A record the shot location as the point of contact with the ball, or the position of the player's feet? Are coordinate systems aligned — does (0,0) represent the same corner of the pitch for both providers?

When a player takes a shot from "just outside the penalty box," did both providers record it on the same side of the line? Shot location ambiguity in live data collection is real, and small location differences propagate into meaningful xG differences.
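
To see how much a small location discrepancy can matter, here is a toy geometric model (with invented coefficients) that computes xG from shot distance and the angle subtended by the goal mouth, evaluated at two plausible recorded positions for the same shot.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts

def goal_angle(x: float, y: float) -> float:
    """Angle subtended by the goal mouth from (x, y); goal line at y=0,
    goal centred on x=0, coordinates in metres."""
    return abs(math.atan2(x + GOAL_WIDTH / 2, y) - math.atan2(x - GOAL_WIDTH / 2, y))

def toy_xg(x: float, y: float) -> float:
    """Illustrative distance-and-angle model; coefficients are made up."""
    z = -1.2 - 0.10 * math.hypot(x, y) + 2.5 * goal_angle(x, y)
    return 1 / (1 + math.exp(-z))

# Two providers log the same central shot 1.5 m apart, either side of the
# penalty-area line (16.5 m from the goal line).
print(f"recorded at 15.5 m: {toy_xg(0.0, 15.5):.3f}")  # ~0.17
print(f"recorded at 17.0 m: {toy_xg(0.0, 17.0):.3f}")  # ~0.14
```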


A Direct Comparison: How Different Are the Numbers?

Research comparing provider xG values on the same matches has found:

  • On average, across large samples: Different providers' xG totals correlate strongly with each other. A match that StatsBomb records as having 2.1 xG will likely show up as somewhere between 1.7 and 2.5 on Understat, not wildly outside that range.
  • For individual shots: Disagreement can be substantial. A shot that StatsBomb values at 0.35 xG — because they've captured that the goalkeeper was off their line — might be valued at 0.18 by Understat, which doesn't have that information.
  • For set pieces: This is where disagreement tends to be largest. Corner kicks, free kicks, and complex set-piece situations have many variables that different providers capture differently.
  • For open-play central shots: Disagreement tends to be smaller. The geometry is clear enough that simple and complex models often agree.

What This Means for Analysis: Practical Rules

Rule 1: Never Compare Across Providers

This cannot be overstated. A 0.30 xG from StatsBomb is not the same as a 0.30 xG from Understat. Comparing a player's xG from two different sources is like comparing a weight in kilograms with one in pounds: both measurements are valid, but the numbers are not on the same scale. Always pick a single provider and use it consistently for all comparisons within a piece of analysis.

Rule 2: The Direction Is More Reliable Than the Magnitude

While exact values differ between providers, directional conclusions tend to hold across sources. If five major providers all agree that Team A outperformed Team B in xG over a season, that conclusion is robust. If one provider shows it and two others don't, treat the conclusion with scepticism.

Rule 3: Understand What Your Chosen Source Captures

Before relying on a provider's xG data for analysis, understand what they do and don't include in their model. For understanding chance quality and building analytical intuition, the free xG calculator at FindBest Tools uses a transparent educational model — you can see exactly which factors are included and how they're weighted, making it ideal for learning how the inputs drive the output.

Rule 4: Large Samples Reduce Provider Disagreement

Individual shot xG values can differ substantially between providers. But over a full season's worth of shots (400–600 per team), the aggregated xG totals tend to converge more closely. The errors and disagreements partially cancel out.
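
A quick simulation makes the cancellation visible. In the sketch below, provider B is modelled as provider A plus independent zero-mean per-shot noise; real providers also carry systematic offsets that don't cancel, so treat this as a best case.

```python
import numpy as np

rng = np.random.default_rng(1)
n_shots = 500  # roughly one team-season of shots

xg_a = rng.beta(1.2, 8.0, n_shots)  # provider A's per-shot xG
# Provider B: same shots, independent per-shot disagreement of ~0.05 xG.
xg_b = np.clip(xg_a + rng.normal(0, 0.05, n_shots), 0.01, 0.99)

print(f"mean per-shot gap: {np.abs(xg_a - xg_b).mean():.3f}")  # large next to a ~0.13 average shot
print(f"season total, A:   {xg_a.sum():.1f}")
print(f"season total, B:   {xg_b.sum():.1f}")  # totals land much closer in relative terms
```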


The FBref vs Understat vs Broadcast Graphic Reconciliation

The most common practical question is: "Why doesn't the xG on television match FBref, and which should I trust?"

  • Broadcast graphic (Opta): Based on Opta's proprietary model, which in its standard public form does not include goalkeeper positioning. Reliable for general match-level assessment.
  • FBref (StatsBomb): More detailed feature set including goalkeeper position. Excellent for in-depth player and team analysis.
  • Understat: Completely free with historical archives. Good for trend analysis over multiple seasons.

Which to trust? None is definitively "correct" — they're all valid measurements using different rulers. For rigorous analysis, pick the one with the richest data available for your question.


How Clubs Handle Multiple Models

Professional football clubs often build their own internal xG models or subscribe to multiple data providers simultaneously. Rather than trusting a single source, analysts cross-reference key shots, use ensemble approaches (aggregating multiple models), and build proprietary models using tracking data that can incorporate dozens of factors that event-based models miss.
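
In its simplest form, an ensemble is just a (possibly weighted) average of per-shot values across providers. The sketch below is a generic illustration rather than any club's actual method; the provider names are placeholders, and a club might instead derive the weights from each provider's historical calibration.

```python
from statistics import fmean

def ensemble_xg(values: dict[str, float], weights: dict[str, float] | None = None) -> float:
    """Blend one shot's xG across providers; equal weights by default."""
    if weights is None:
        return fmean(values.values())
    total = sum(weights[name] for name in values)
    return sum(xg * weights[name] for name, xg in values.items()) / total

# Hypothetical values for one shot from three unnamed providers.
shot = {"provider_a": 0.35, "provider_b": 0.18, "provider_c": 0.26}
print(f"equal-weight ensemble: {ensemble_xg(shot):.2f}")
print(f"weighted ensemble:     {ensemble_xg(shot, weights={'provider_a': 2.0, 'provider_b': 1.0, 'provider_c': 1.0}):.2f}")
```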


The Academic Perspective: How Significant Is the Disagreement?

Studies comparing major providers typically find correlation coefficients of 0.85–0.95 between providers' match-level xG totals. At the individual shot level, correlations are lower, often around 0.70–0.85, reflecting the genuine differences in what each model captures. The practical conclusion: for big-picture questions, the choice of provider matters less than consistency.
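
The gap between shot-level and match-level correlation falls out of simple aggregation, as the toy simulation below shows. The noise level was chosen so the outputs land near the reported ranges; it is synthetic data, not real provider measurements.

```python
import numpy as np

rng = np.random.default_rng(3)
all_a, all_b, totals_a, totals_b = [], [], [], []

for _ in range(400):            # 400 simulated matches
    n = rng.poisson(13) + 1     # shots in this match
    a = rng.beta(1.2, 8.0, n)   # provider A's per-shot xG
    b = np.clip(a + rng.normal(0, 0.08, n), 0.01, 0.99)  # provider B disagrees per shot
    all_a.append(a)
    all_b.append(b)
    totals_a.append(a.sum())
    totals_b.append(b.sum())

shots_a, shots_b = np.concatenate(all_a), np.concatenate(all_b)
print(f"shot-level r:  {np.corrcoef(shots_a, shots_b)[0, 1]:.2f}")    # ~0.80
print(f"match-level r: {np.corrcoef(totals_a, totals_b)[0, 1]:.2f}")  # ~0.90
```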


What This Means for the Casual Fan

For almost every question a casual fan wants to answer, any reputable single source will serve you well, provided you use it consistently. For building intuition about how xG works across all scenarios, the free xG calculator at FindBest Tools is ideal. It uses a transparent model where every input is visible, giving you full control over the methodology as you learn.


Conclusion: Different Rulers, Same Room

The fact that FBref, Understat, and Opta give different xG values for the same shot isn't a scandal. It's a natural consequence of different organisations building different tools for slightly different purposes. Understanding why they disagree turns provider disagreement from a source of confusion into a source of insight. Pick a source, understand what it measures, use it consistently, and recognise that the direction of conclusions matters more than the exact decimal.


Frequently Asked Questions

Which xG provider is most accurate?

StatsBomb, with their detailed 360 data, generally has the richest feature set. But 'accuracy' depends on your question — for historical trend analysis, Understat's long archive is valuable. For broadcast context, Opta's data is the most widely referenced.

Why does FBref sometimes show different xG for the same match on different dates?

Data providers often revise their data after initial publication as additional information is incorporated or data errors are corrected. Always use the most recent version of FBref data for serious analysis.

Can I mix FBref and Understat data for a player comparison?

No. Never compare values from different providers. Choose one and use it consistently throughout your analysis.

Is there a 'ground truth' xG that all providers are trying to approximate?

Not exactly. Each model is making slightly different design choices about what factors to include and how to weight them. There's no single 'true' xG model — rather, different models optimise for different things, which is why they legitimately disagree.

Do provider disagreements matter for betting analysis?

For broad analytical conclusions — whether a team is overperforming their xG — provider choice matters less than consistency. For fine-grained edge-seeking in specific markets, using the richest available model (StatsBomb where possible) is preferable.