findbest.tools

How is xG Calculated? The Factors Behind Every Shot

18 min read · Updated May 2026

Expected goals (xG) is football's most talked-about advanced statistic — but for many fans, it remains a black box. A number appears on the screen, and nobody explains where it came from. This guide opens that black box completely. You'll understand every input that shapes an xG value, how models are built, and why two providers can give the same shot different numbers.


Introduction: Why Understanding the Calculation Matters

When you understand how xG is calculated, the numbers stop feeling arbitrary and start carrying real meaning. You'll know why a 0.05 long-range shot is valued the way it is. You'll understand why a cutback from the byline dramatically raises xG compared to a floated cross from the same position. You'll be able to watch a match and develop near-instant intuition for what kind of chance each shot represents — before the broadcast graphic even appears.

More practically, understanding the methodology helps you know when to trust xG data and when to be sceptical of it.


The Foundation: Historical Pattern Recognition

At its most fundamental level, xG is a probability model. It doesn't decide a shot's value based on a person's opinion. Instead, it asks a computer: "Given a dataset of, say, 300,000 historical shots, how often did shots with similar characteristics result in a goal?"

If the computer finds 10,000 similar shots in its database and sees that 1,200 of them were goals, it assigns that shot an xG value of 0.12.
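
The arithmetic behind that 0.12 can be sketched in a few lines of Python. This is a deliberately naive bucket lookup, not how any provider actually computes xG; the bucket name `similar_shots` and the toy history are invented purely to reproduce the figures above.

```python
from collections import defaultdict

def bucket_scoring_rates(shots):
    """Return {bucket: goals / shots} for an iterable of (bucket, scored) pairs."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [goals, shots]
    for bucket, scored in shots:
        counts[bucket][0] += int(scored)
        counts[bucket][1] += 1
    return {bucket: goals / total for bucket, (goals, total) in counts.items()}

# Hypothetical history matching the text: 10,000 similar shots, 1,200 goals.
history = [("similar_shots", True)] * 1200 + [("similar_shots", False)] * 8800

rates = bucket_scoring_rates(history)
print(rates["similar_shots"])  # 0.12
```

Real models generalise this idea: rather than counting exact matches, they estimate a smooth probability from all shots at once, so that rare combinations of features still receive a sensible value.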


The Critical Input Factors

While different data providers (like Opta, StatsBomb, or Understat) use slightly different models, they all rely on several "core" factors to determine the probability of a goal.

1. Shot Location: Distance and Angle

These are the two most influential variables in most xG models.

  • Distance: The closer a shot is to the goal, the higher the xG. A shot from 2 yards carries a very high xG value; a shot from 30 yards rarely results in a goal and is valued accordingly.
  • Angle: The angle measures how much of the goal mouth the attacker can actually see. Geometrically, it is the angle formed at the ball by lines drawn to the two posts. A central position directly in front of the posts has a higher xG than a narrow angle near the byline, even if the distance is the same.
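
Both quantities can be computed from a shot's pitch coordinates. The sketch below assumes a coordinate system that is not taken from any provider: `x` is the distance in metres from the goal line and `y` is the sideways offset from the centre of the goal. The only fixed fact used is the 7.32 m distance between the posts.

```python
import math

GOAL_WIDTH_M = 7.32  # distance between the posts

def shot_distance(x, y):
    """Straight-line distance in metres from the ball to the centre of the goal."""
    return math.hypot(x, y)

def goal_angle(x, y):
    """Angle in radians between the lines from the ball to each post."""
    half = GOAL_WIDTH_M / 2
    to_near_post = math.atan2(half - y, x)
    to_far_post = math.atan2(-half - y, x)
    return to_near_post - to_far_post

# Two shots from 11 m: one central, one from a tight angle near the byline.
central = (11.0, 0.0)
wide = (2.0, math.sqrt(11.0**2 - 2.0**2))

for name, (x, y) in (("central", central), ("wide", wide)):
    print(name, round(shot_distance(x, y), 2), round(math.degrees(goal_angle(x, y)), 1))
```

The two shots are the same distance from goal, but the central one sees a far wider angle between the posts, which is why it would be valued more highly.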

2. Body Part: Foot vs. Head

Not all body parts are equal in football. Statistically, it is much harder to score with a header than with a shot from the foot at the same location. Most models apply a "reduction factor" to headers and volleys, meaning a header from 6 yards will have a lower xG than a standard foot shot from the same spot.
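
In a logistic-style model, a reduction like this usually acts on the odds of scoring rather than being subtracted from the probability directly. The sketch below illustrates that idea; the odds ratio of 0.5 and the starting xG of 0.30 are hypothetical values chosen for the example, not figures from any real model.

```python
def apply_odds_ratio(probability, odds_ratio):
    """Scale a probability on the odds scale, then convert back to a probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    odds = probability / (1.0 - probability)
    scaled_odds = odds * odds_ratio
    return scaled_odds / (1.0 + scaled_odds)

HEADER_ODDS_RATIO = 0.5  # hypothetical: a header halves the odds of scoring

foot_xg = 0.30
header_xg = apply_odds_ratio(foot_xg, HEADER_ODDS_RATIO)
print(round(header_xg, 3))  # 0.176
```

Working on the odds scale keeps the result between 0 and 1 however large or small the adjustment is.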

3. Assist Type: How the Ball Arrived

The situation leading up to the shot dramatically changes its probability.

  • Through balls and cutbacks typically produce high xG because the attacker is often facing an open net or moving toward the goal.
  • Shots from crosses and corner kicks have lower xG because the ball arrives at pace or at height, making it much harder to control and direct.
  • Rebounds often have very high xG because the goalkeeper is frequently out of position after the initial save.
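
A model cannot read a label like "cutback" directly, so categories such as these are commonly turned into 0/1 indicator features. The sketch below shows one simple way to do that; the feature names are illustrative and do not correspond to any provider's schema.

```python
ASSIST_TYPES = ("through_ball", "cutback", "cross", "corner", "rebound")

def encode_assist_type(assist_type):
    """Return one 0/1 indicator per known assist type (all zeros if unknown)."""
    return {f"assist_{name}": int(assist_type == name) for name in ASSIST_TYPES}

print(encode_assist_type("cutback"))
```

Each indicator can then receive its own weight, which is how a model can learn that a cutback is worth more than a cross.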

4. Defensive Pressure and Goalkeeper Position

The most advanced models (like StatsBomb's, which uses freeze-frame data recorded at the moment of the shot) include information about where defenders and the goalkeeper are positioned.

  • A shot with no defenders between the ball and the goal will have a much higher xG than one where three defenders are lunging to block it.
  • If the goalkeeper has been drawn out of position (an "open net" scenario), the xG value rises sharply.
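
One simple way to turn player positions into a number is to count how many opponents stand inside the triangle formed by the ball and the two posts. The sketch below is an illustration of that idea only, not a description of any provider's feature; it assumes `x` is metres from the goal line and `y` is the offset from the centre of the goal.

```python
HALF_GOAL_M = 3.66  # half the distance between the posts

def _cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c):
    """True if point p lies inside (or on the edge of) triangle a-b-c."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)

def players_in_shot_cone(shooter, players):
    """Count players inside the triangle between the ball and both posts."""
    left_post, right_post = (0.0, HALF_GOAL_M), (0.0, -HALF_GOAL_M)
    return sum(in_triangle(p, shooter, left_post, right_post) for p in players)

shooter = (11.0, 0.0)
# Hypothetical positions: two in the shooting lane, one wide, one behind the ball.
opponents = [(5.0, 0.0), (5.0, 1.0), (5.0, 5.0), (12.0, 0.0)]
print(players_in_shot_cone(shooter, opponents))  # 2
```

An empty triangle corresponds to the "no defenders between the ball and the goal" case; a crowded one corresponds to a shot that is likely to be blocked.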

5. Game Situation: Fast Breaks and Counter-Attacks

A shot taken during a counter-attack, when the defending team is still retreating and out of shape, is statistically more likely to score than a shot taken against a "set" defence. Models that include this factor identify such situations from event data and raise the xG accordingly.


How the Models Are Built: The Statistical Machinery

Data scientists typically build xG models with logistic regression or with tree-based machine-learning methods such as gradient boosting.

  • Logistic Regression: A traditional statistical method that maps a set of input variables to a probability between 0 and 1. It is highly transparent and lets analysts see exactly how much weight each factor (like distance) carries in the final number.
  • Machine Learning: More modern models use tree-based algorithms such as random forests or gradient boosting (XGBoost is a popular implementation). These are better at capturing "interactions": for example, the impact of distance might be different for a header than for a foot shot.
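
To make the logistic regression idea concrete, here is a minimal sketch that fits a one-variable model (distance only) with plain gradient descent. It uses only the Python standard library in place of the statistical packages an analyst would normally reach for, and the toy shot counts are invented for illustration.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.5, steps=5000):
    """Fit p = sigmoid(b0 + b1 * x) by gradient descent on the mean log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Invented toy data: (distance in metres, shots taken, goals scored).
TOY_SHOTS = [(5, 20, 10), (10, 20, 6), (15, 20, 3), (20, 20, 2), (25, 20, 1)]

xs, ys = [], []
for distance, shots, goals in TOY_SHOTS:
    xs += [distance / 10.0] * shots  # rescaled so gradient descent behaves well
    ys += [1] * goals + [0] * (shots - goals)

b0, b1 = fit_logistic(xs, ys)

def toy_xg(distance_m):
    """xG from the fitted toy model for a shot at the given distance."""
    return sigmoid(b0 + b1 * distance_m / 10.0)

for d in (5, 15, 25):
    print(d, round(toy_xg(d), 2))
```

The fitted `b1` is negative, which is the transparency the first bullet describes: the single weight says directly that each extra metre lowers the chance of scoring. Capturing an interaction in this kind of model would require adding a product term by hand, whereas tree-based methods can pick such effects up on their own.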

Why Do Different Providers Show Different Numbers?

If you check a match on FBref and then on Understat, the xG totals will likely differ. This is not because one is "wrong," but because they are using different rulers to measure the same thing.

  • Different Features: One provider might include goalkeeper position data while another doesn't.
  • Different Training Data: One model might be trained on 10 years of Premier League data, while another is trained on 3 years of data from 5 different European leagues.
  • Different Definitions: What one provider codes as a "Big Chance" or a "Volley," another might code differently.
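
Small per-shot disagreements add up over a match. The sketch below sums two sets of shot values for the same four shots; the providers and every number in it are made up to show the mechanism, not taken from real data.

```python
def match_xg(shot_values):
    """A team's match xG is the sum of its individual shot values."""
    return sum(shot_values)

# The same four shots as valued by two hypothetical providers.
provider_a = [0.08, 0.35, 0.04, 0.61]
provider_b = [0.06, 0.41, 0.05, 0.52]

total_a = match_xg(provider_a)
total_b = match_xg(provider_b)
print(round(total_a, 2), round(total_b, 2))  # 1.08 1.04

# Per-shot gaps show where the two models disagree most.
gaps = [round(a - b, 2) for a, b in zip(provider_a, provider_b)]
print(gaps)
```

Here the largest gap sits on the biggest chance, the kind of shot where extra inputs such as goalkeeper position tend to matter most.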

Summary: The Methodology Matters

Understanding xG calculation turns the metric from a "black box" into a useful lens for viewing the game. It reminds us that football is a game of probabilities, and that every decision a player makes — where to shoot from, which foot to use, how to arrive in the box — is an attempt to maximise those probabilities.

To see these factors in action yourself, experiment with our interactive xG calculator, where you can adjust every variable and watch the underlying calculation react in real time.


Frequently Asked Questions

Can I calculate xG by hand?

Simple versions are possible with distance and angle inputs using logistic regression coefficients. Professional models are too complex for manual calculation. The interactive calculator at FindBest Tools automates the process clearly.
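
As a rough illustration of what such a hand calculation looks like, the sketch below combines distance and angle into a score and passes it through the logistic function. The three coefficients are hypothetical placeholders chosen for the example; they are not the values used by any real model or by the calculator.

```python
import math

# Hypothetical coefficients for illustration only.
INTERCEPT = -1.0
PER_METRE = -0.12   # each extra metre of distance lowers the score
PER_RADIAN = 1.5    # a wider view of the goal raises the score

def simple_xg(distance_m, angle_rad):
    """Two-step hand calculation: linear score, then logistic transform."""
    score = INTERCEPT + PER_METRE * distance_m + PER_RADIAN * angle_rad
    return 1.0 / (1.0 + math.exp(-score))

print(round(simple_xg(11.0, 0.64), 2))  # central shot from 11 m
print(round(simple_xg(25.0, 0.29), 2))  # central shot from 25 m
```

With real coefficients the same two steps could be done on a pocket calculator, which is what makes simple models feasible by hand.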

What's the most important factor in xG?

Distance to goal, followed closely by angle. These two geometric factors account for most of the predictive power of a typical xG model.

Why does xGOT differ from xG?

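xG is based on the information available at the moment the shot is taken: position and context. xGOT (expected goals on target) is calculated after the shot, using where the ball was placed within the goal frame, so it applies only to shots on target. They measure different things.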

Is a more complex xG model necessarily better?

Not necessarily. A well-calibrated model that accurately reflects average scoring rates is more useful than a complex model that overfits to specific scenarios. Simplicity and calibration often matter more than complexity.
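
Calibration can be checked by grouping shots into xG bands and comparing the average predicted value in each band with the share of shots that were actually scored. The sketch below is one minimal way to build such a table; the toy predictions and outcomes are invented.

```python
def calibration_table(predictions, outcomes, n_bins=5):
    """Rows of (bin_low, bin_high, shots, mean_predicted, observed_rate)."""
    bins = [[] for _ in range(n_bins)]
    for p, scored in zip(predictions, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 joins the top bin
        bins[index].append((p, scored))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bands
        n = len(members)
        mean_predicted = sum(p for p, _ in members) / n
        observed_rate = sum(scored for _, scored in members) / n
        rows.append((i / n_bins, (i + 1) / n_bins, n, mean_predicted, observed_rate))
    return rows

# Toy data: ten 0.1 chances (one scored) and four 0.5 chances (two scored).
predictions = [0.1] * 10 + [0.5] * 4
outcomes = [1] + [0] * 9 + [1, 1, 0, 0]

for row in calibration_table(predictions, outcomes):
    print(row)
```

In a well-calibrated model the last two columns stay close to each other in every band, which is the property the answer above describes.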

Can xG be negative?

No. xG is a probability, and probabilities are always between 0 and 1. A very difficult chance might have xG of 0.01, but never negative.