Guide

What is Cohort Forecasting? A Complete Guide for Mobile Game Studios

How modern mobile game studios use machine learning to predict cohort revenue, retention, and lifetime value weeks or months before the data has matured.

TL;DR

What you will learn on this page

This guide covers the full discipline of cohort forecasting for mobile game studios: what it is, how it works step by step, the main methods used to forecast cohort LTV, the most common mistakes that produce unreliable forecasts, the landscape of tools mature studios use, and a plain-English glossary of the metrics that matter. If you run UA, finance, or analytics at a mobile studio and your forward LTV decisions cost more than $50K per month to get wrong, this page is written for you.

Definition

What is Cohort Forecasting?

Cohort forecasting is the practice of grouping users acquired in the same time window (a "cohort") and predicting that group's future revenue, retention, and lifetime value using machine learning models trained on historical cohort behaviour. The result is a forward view of profitability weeks or months before the data has matured.

The four pillars of cohort forecasting in mobile gaming:

Cohort definition

Grouping installs by acquisition date, channel, geo, or creative to isolate comparable user groups so they can be compared apples to apples.

LTV curve fitting

Modelling the revenue trajectory from Day 0 to D365+ based on early-day signals (retention, session depth, early in-app purchase events).

Prediction confidence

Quantifying uncertainty so decision makers know when a forecast is actionable versus when to wait for more data to accumulate.

Campaign attribution

Mapping cohort revenue back to specific campaigns, creatives, and channels so the forecast feeds budget decisions, not just retrospective reporting.

The fundamental challenge in mobile gaming is that you spend money today to acquire users whose full revenue value will not be known for 6 to 18 months. Cohort forecasting compresses that gap by using the early behaviour of a new cohort (first session length, D1/D3/D7 retention, early IAP signals) to predict how that cohort will perform at D90, D180, and D365.

The discipline draws on decades of work in customer cohort analysis from direct response and SaaS marketing, adapted to the specific shape of mobile gaming revenue curves: high variance, heavy-tailed distributions, IAP plus IAA hybrid streams, and the privacy-first attribution environment introduced by ATT and SKAdNetwork. It has become the primary competitive lever for mobile UA, finance, and M&A teams who need to make forward-looking budget and valuation decisions on cohorts that have not yet matured.

How it works

How Cohort Forecasting Works

Most studios still report cohort performance only after the fact, using trailing 30, 60, or 90 day windows. By the time those windows close, budget has been committed. Modern cohort forecasting inverts the workflow: forward predictions are generated in the first week of a cohort, refined daily as more data arrives, and used as the primary input to active budget and planning decisions.

Step 1

Cohort data ingestion

Cohort forecasting platforms connect to a studio's MMP (AppsFlyer, Singular, or Adjust) and pull structured cohort revenue data: installs, events, and revenue by day, segmented by channel, geo, and creative. Data is normalised and deduplicated to handle the common edge cases in MMP exports (re-attribution windows, view-through attribution, organic bleed-through). Output: clean per-cohort timeseries ready for model training.
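As a minimal sketch of the output of this step, the snippet below turns a handful of revenue rows into per-cohort timeseries keyed by days since install. The row layout, the `cohort_key` and `build_timeseries` helpers, and the sample values are assumptions made for illustration; real MMP exports carry more fields and need the deduplication described above before this point.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, simplified rows: (install_date, channel, event_date, revenue).
rows = [
    (date(2026, 5, 4), "meta", date(2026, 5, 4), 1.99),
    (date(2026, 5, 4), "meta", date(2026, 5, 6), 4.99),
    (date(2026, 5, 5), "meta", date(2026, 5, 5), 0.99),
    (date(2026, 5, 4), "applovin", date(2026, 5, 5), 0.10),
]

def cohort_key(install_date, channel):
    """Weekly cohort: ISO year and ISO week of install, plus channel."""
    iso = install_date.isocalendar()
    return (iso[0], iso[1], channel)

def build_timeseries(rows):
    """Return {cohort_key: {days_since_install: revenue}}."""
    series = defaultdict(lambda: defaultdict(float))
    for install_date, channel, event_date, revenue in rows:
        day_n = (event_date - install_date).days
        series[cohort_key(install_date, channel)][day_n] += revenue
    return series

series = build_timeseries(rows)
for key, by_day in series.items():
    print(key, dict(sorted(by_day.items())))
```

Keying on days since install rather than calendar date is what lets cohorts acquired in different weeks be compared on the same D-N axis.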

Step 2

LTV model training

ML models are trained on the studio's own historical cohort data, not industry averages. LTV curves vary significantly by game genre, monetisation model (IAP vs IAA vs hybrid), and market. The most accurate approaches combine studio-specific training data with cross-customer aggregates as a regularisation signal, especially for new titles where the studio's own training data is thin. Most production-grade platforms use gradient-boosted models with cohort-level features as inputs and D90/D180/D365 ARPU as targets, re-trained automatically as new cohort data arrives.
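The idea of using cross-customer aggregates as a regularisation signal can be illustrated with a simple shrinkage estimate. This is not the gradient-boosted approach described above, only a stand-in that shows the behaviour: the thinner the studio's own history, the more the estimate leans on the aggregate. The function name `shrunk_arpu` and the `prior_strength` constant are hypothetical.

```python
def shrunk_arpu(studio_arpu, studio_cohorts, aggregate_arpu, prior_strength=20):
    """Blend a studio's own ARPU estimate with a cross-customer aggregate.

    prior_strength is a hypothetical tuning constant: the number of studio
    cohorts at which own data and the aggregate are weighted equally.
    """
    weight = studio_cohorts / (studio_cohorts + prior_strength)
    return weight * studio_arpu + (1 - weight) * aggregate_arpu

# A new title with 2 matured cohorts leans mostly on the aggregate;
# an established title with 200 cohorts leans mostly on its own history.
print(shrunk_arpu(4.0, 2, 2.0))
print(shrunk_arpu(4.0, 200, 2.0))
```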

Step 3

Forecast generation and confidence scoring

For each new cohort (typically defined weekly or per campaign flight), the platform generates a point forecast plus a confidence interval at D90, D180, and D365. Confidence scoring tells UA managers when a prediction is reliable enough to act on. As a rule of thumb in Kohort's models, which use gradient-boosted ML trained on cross-customer cohort aggregates: cohorts with 500+ installs and 7+ days of data typically fall within plus or minus 15% of D365 actual; 2,000+ installs and 14+ days within plus or minus 8%. Below 500 installs, forecasts should be treated as directional rather than precise. These thresholds are tighter than what a single-studio curve-fitting model can achieve at the same data depth, as Eric Seufert documents in Mobile Dev Memo's classic analysis of LTV data requirements. The cross-customer training data is what compresses the data-depth requirement.
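The rule of thumb above can be written down as a small lookup. The thresholds are the ones quoted in the paragraph; the function name and the tier labels are illustrative, and treating a cohort with 500+ installs but fewer than 7 days of data as directional is an assumption, since the text does not cover that case.

```python
def forecast_tier(installs, days_of_data):
    """Map cohort size and data depth to the expected D365 error band."""
    if installs >= 2000 and days_of_data >= 14:
        return "within about 8%"
    if installs >= 500 and days_of_data >= 7:
        return "within about 15%"
    # Below 500 installs (or, by assumption, under 7 days of data).
    return "directional only"

for installs, days in [(3000, 21), (800, 10), (300, 30)]:
    print(installs, days, forecast_tier(installs, days))
```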

Step 4

ROAS mapping and budget signals

Predicted LTV multiplied by cohort size, divided by cohort-level ad spend, produces a predicted campaign ROAS at D90/D180/D365. Cohort forecasting platforms map this back to live campaigns in real time, so UA managers see which campaigns are tracking to target ROAS before the standard trailing window would surface the signal. This is the step that turns a cohort forecast from a reporting artifact into a budget allocation tool, the core feedback loop behind systematic user acquisition optimization.

Step 4 above is where cohort forecasting feeds directly into user acquisition optimization: the same forward signal that ranks cohorts by predicted LTV ranks live campaigns by predicted ROAS.
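The ROAS arithmetic in Step 4 is simple enough to show directly. The figures below are made up for illustration, and `predicted_roas` is a hypothetical helper name rather than part of any platform's API.

```python
def predicted_roas(predicted_ltv, cohort_size, ad_spend):
    """Predicted LTV per user x installs, divided by cohort-level ad spend."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return predicted_ltv * cohort_size / ad_spend

# Hypothetical cohort: 2,500 installs, $6,400 spend, predicted D365 LTV of $3.20.
roas_d365 = predicted_roas(predicted_ltv=3.20, cohort_size=2500, ad_spend=6400.0)
print(f"Predicted D365 ROAS: {roas_d365:.0%}")
```

Running the same calculation with the D90 and D180 predictions gives the ROAS trajectory a UA manager would compare against the campaign's target.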

Methods

Cohort Forecasting Methods Compared

There is no single "right" way to forecast cohort LTV. The right method for a given studio depends on training data depth, technical capability, and the LTV horizon being forecast.

| Method | How it works | Best for | Limitations |
| --- | --- | --- | --- |
| Linear extrapolation | Multiply D7 or D30 revenue by an industry multiplier to estimate D365 | Smallest studios with no in-house analytics; very early launch windows | Roughly 70% accuracy at D365; ignores cohort-specific behaviour; fails on long-tail and hybrid monetisation titles |
| Parametric curve fitting (power, Weibull, exponential) | Fit a parametric retention or revenue function (Russell Ovans' r(n)=a·n^b power function, Weibull decay, exponential, or the logarithmic "Golden Curve") to observed cohort data and project forward | Studios at any scale with clean retention data; the most widely used production approach for retention curve modelling outside large vendor stacks; used by practitioners like Russell Ovans at East Side Games | Curve shape assumption can be miscalibrated for new titles; combining IAP and IAA revenue streams in a single curve loses signal; tail behaviour at D365+ is sensitive to early-day noise |
| Probabilistic models (Pareto/NBD, BG/NBD) | Model purchase frequency and churn separately using classical probability distributions; widely cited in the customer-analytics literature (Schmittlein, Fader/Hardie) | Studios with strong analyst capacity and a need for interpretable, theory-grounded LTV models; common in academic and finance settings | Assumes purchase behaviour follows specific distributions that may not fit gaming cohorts well; less common in production mobile gaming workflows than curve fitting or ML |
| Cohort regression | Multivariate regression on cohort features (retention, session count, early ARPU) trained on historical cohorts | Studios with 6+ months of cohort history and analyst time to build and maintain models | Limited capacity to capture non-linear interactions; manual model maintenance |
| Gradient-boosted ML | Boosted decision trees (XGBoost, LightGBM) on cohort features, retrained automatically as data arrives | Most production-grade cohort forecasting today; balances accuracy with interpretability; particularly strong on high-LTV segments | Requires training data depth (90+ days minimum); benefits significantly from cross-customer aggregates; production setups often pair gradient boosting with Tweedie or zero-inflated regression for low-LTV stability |
| Deep learning sequence models | LSTM or Transformer architectures on cohort timeseries | Largest studios with extensive ML infrastructure and very long LTV tails | Compute-intensive; harder to interpret; marginal gains over gradient-boosted approaches at most scales |

Most current best-in-class platforms use gradient-boosted models as the production layer, often supplemented with parametric priors for sparse-data cohorts. In Kohort's back-testing across $6B+ of UA spend, gradient-boosted models trained on 6+ months of studio-specific cohort data achieve D365 prediction accuracy in the 88 to 92% range on cohorts of sufficient size, where simple linear extrapolation from D30 data sits closer to 70%. The gap widens for games with heavy long-tail LTV, where linear extrapolation breaks down structurally.
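To make the parametric curve-fitting row concrete, here is a minimal sketch that fits the power function r(n) = a·n^b by ordinary least squares on log-transformed data and projects it forward. The retention values are invented for illustration, the helper names are hypothetical, and a production fit would typically weight observations and validate the curve shape rather than rely on a plain log-log regression.

```python
import math

def fit_power_curve(days, retention):
    """Fit r(n) = a * n**b via least squares on ln r = ln a + b * ln n.

    Days must be >= 1 and retention > 0, because both are log-transformed.
    """
    xs = [math.log(d) for d in days]
    ys = [math.log(r) for r in retention]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def project(a, b, day):
    """Evaluate the fitted curve at a future day."""
    return a * day ** b

# Hypothetical observed retention at D1, D3, D7, D14, D30.
days = [1, 3, 7, 14, 30]
retention = [0.40, 0.23, 0.15, 0.11, 0.07]

a, b = fit_power_curve(days, retention)
print(f"a={a:.3f}, b={b:.3f}, projected D90 retention={project(a, b, 90):.3f}")
```

Because the projection at D90 and beyond depends entirely on the fitted exponent, small changes in the early data points move the tail noticeably, which is the sensitivity the table notes for this method.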

Pitfalls

Common Mistakes in Cohort Forecasting

The following six mistakes account for the largest share of bad cohort forecasts in mobile gaming.

Mistake 1

Aggregating cohorts too broadly

Studios that run cohort forecasting at the portfolio level rather than at the campaign or channel level lose most of the signal. A 'May 2026 cohort' containing installs from Meta, Google UAC, and AppLovin will have a blended LTV curve that hides material variance between sources. The fix is forecasting at the smallest cohort definition that still has enough installs for a stable estimate (500+ installs minimum), then rolling up.
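A short worked example shows how much a blended figure can hide. The install counts and ARPU values below are invented for illustration, and `portfolio_arpu` is a hypothetical helper.

```python
# Hypothetical per-source cohorts: (installs, predicted D365 ARPU in USD).
sources = {
    "Meta": (1000, 4.0),
    "Google UAC": (1000, 2.0),
    "AppLovin": (2000, 1.0),
}

def portfolio_arpu(sources):
    """Install-weighted ARPU across all sources (the 'blended' view)."""
    total_installs = sum(installs for installs, _ in sources.values())
    total_revenue = sum(installs * arpu for installs, arpu in sources.values())
    return total_revenue / total_installs

blended = portfolio_arpu(sources)
print(f"Blended ARPU: {blended:.2f}")
for name, (installs, arpu) in sources.items():
    print(f"{name}: {arpu:.2f} ({arpu / blended:.1f}x the blended figure)")
```

In this made-up case the blended number sits at half of the best source and double the worst, so a budget decision made on the blend would be wrong for two of the three channels.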

Mistake 2

Comparing predicted LTV to incomplete actual LTV

A common reporting error: comparing a model's D365 prediction against a cohort's current D60 actual revenue, and concluding the model is 'wrong' because the actual is lower. The actual will catch up. The right comparison is predicted D-N versus actual at the same horizon (predicted D365 against actual D365), which requires waiting for cohorts to mature before judging model accuracy. This is one of the most common reasons studios lose faith in cohort forecasting prematurely.
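One way to enforce the same-horizon rule is to exclude immature cohorts from accuracy reporting altogether. The sketch below assumes a hypothetical record layout (`age_days`, `predicted`, `actual_by_day`) and invented numbers.

```python
def same_horizon_errors(cohorts, horizon):
    """Relative error of predicted vs actual ARPU, for matured cohorts only."""
    errors = {}
    for cohort in cohorts:
        if cohort["age_days"] < horizon:
            continue  # actual revenue at this horizon does not exist yet
        actual = cohort["actual_by_day"][horizon]
        errors[cohort["name"]] = (cohort["predicted"][horizon] - actual) / actual
    return errors

cohorts = [
    # Matured cohort: D365 actual is available, so it can be scored.
    {"name": "2025-W10", "age_days": 400,
     "predicted": {365: 5.0}, "actual_by_day": {365: 4.0}},
    # Young cohort: only D60 actual exists, so it is skipped rather than
    # scored against its D365 prediction.
    {"name": "2026-W14", "age_days": 60,
     "predicted": {365: 5.0}, "actual_by_day": {60: 1.5}},
]

print(same_horizon_errors(cohorts, horizon=365))
```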

Mistake 3

Ignoring confidence intervals when making budget decisions

A point forecast with a plus or minus 25% confidence band is qualitatively different from a forecast with a plus or minus 5% band, even if the central estimate is the same. Studios that treat all forecasts as equally precise will scale campaigns based on noisy signal in the early weeks of a cohort and have to walk it back. The fix is requiring a minimum confidence threshold (typically plus or minus 15% or tighter) before letting forecasts drive material budget shifts.
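The minimum-confidence rule can be expressed as a simple gate on the relative width of the interval. The 15% default mirrors the threshold suggested above; the function name and sample forecasts are hypothetical.

```python
def is_actionable(point, lower, upper, max_half_width=0.15):
    """True when the interval's half-width is within max_half_width of the point."""
    half_width = (upper - lower) / 2 / point
    return half_width <= max_half_width

# Same central estimate, very different precision.
print(is_actionable(point=4.0, lower=3.6, upper=4.4))  # about a 10% band
print(is_actionable(point=4.0, lower=3.0, upper=5.0))  # a 25% band
```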

Mistake 4

Conflating retention and revenue curves

Cohort retention (the share of users still active at day N) and cohort revenue (ARPU at day N) are related but not the same. Two cohorts can have identical D30 retention and very different D30 revenue, depending on monetisation depth. Forecasting only on retention misses the monetisation signal; forecasting only on revenue misses early dropout. Mature cohort models use both as input features.

Mistake 5

Not retraining as the game economy changes

A cohort model trained on data from before a major monetisation update (new IAP currency, ad placement change, live ops cadence shift) will systematically misforecast cohorts acquired after the update. Production cohort forecasting requires automatic retraining as new cohort data arrives, plus the ability to detect and flag drift when actuals diverge from predictions in a structured way.
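A minimal drift check, assuming matured cohorts are scored at a common horizon, is to look at the average signed error rather than the absolute error: random noise tends to cancel out, while a systematic shift after an economy change does not. The 10% threshold and the function name are assumptions, and production systems would normally use a more careful statistical test.

```python
def drift_flag(predicted, actual, threshold=0.10):
    """Return (mean signed relative error, whether it exceeds the threshold)."""
    errors = [(p - a) / a for p, a in zip(predicted, actual)]
    bias = sum(errors) / len(errors)
    return bias, abs(bias) > threshold

# Errors in both directions: noisy but unbiased, so no drift is flagged.
print(drift_flag(predicted=[4.4, 3.6], actual=[4.0, 4.0]))

# Every recent cohort over-forecast by a similar margin: flagged as drift.
print(drift_flag(predicted=[5.0, 5.0, 5.0], actual=[4.0, 4.0, 4.0]))
```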

Mistake 6

Using cross-genre benchmarks as a forecast input

LTV curves vary dramatically by genre: hypercasual games monetise heavily in the first 7 days, primarily through IAA, while 4X and strategy games carry long, heavy IAP-driven tails where a meaningful share of D365 LTV (in some sub-genres, the majority) materialises after Day 90. Studios that evaluate a specific genre against generic mobile gaming benchmarks will systematically under-bid on slow-monetising titles, whose early revenue understates their eventual LTV, and over-bid on fast-monetising ones, whose early revenue overstates it. The fix is genre-calibrated training data, either from the studio's own portfolio or from a cross-customer aggregate filtered to comparable titles.

Landscape

The Cohort Forecasting Tool Landscape

Cohort forecasting draws from several categories of tooling, and most mature studios use a combination.

Mobile Measurement Partners (MMPs)

AppsFlyer, Singular, Adjust, and Kochava are the data foundation for cohort forecasting. They define cohorts at the install level, track in-app events and revenue, and produce the structured timeseries that prediction models consume. MMPs do not themselves produce forward LTV forecasts. They produce the data that forecasting tools predict on.

In-house BI and Data Warehouses

Metabase, Looker, Tableau, and similar BI tools pulling from a centralised data warehouse handle retrospective cohort analysis well. They answer "what was the D90 ARPU of our April cohort?" efficiently. They rarely contain the ML layer needed to answer "what will the D365 ARPU of our May cohort be?"

Statistical and ML Stacks

Studios with in-house data science teams sometimes build cohort forecasting on top of Python (with libraries like lifelines, scikit-learn, XGBoost) or R. The barrier is not the libraries (which are mature) but the productionisation: data pipelines, automatic retraining, drift detection, integration into the UA tooling stack, and ongoing maintenance against MMP API changes.

Specialist Cohort Forecasting Platforms

This category, which includes Kohort's Ktrl as well as tools like GameAnalytics (for smaller studios) and bespoke in-house ML stacks, sits on top of MMP data and adds the forward prediction layer. Differentiators within this category are model accuracy at early prediction windows (D7 and D14), training data approach (industry averages versus studio-specific versus cross-customer aggregate), and how tightly the forecast is integrated into campaign-level ROAS decisions.

Specialist platforms typically deliver production-grade cohort forecasts in weeks. Equivalent in-house builds usually take 6 to 18 months before producing trustworthy output, and run roughly $1M to $2M per year in fully-loaded data science headcount depending on team size and seniority. Current US ML engineer total compensation sits at $160K to $260K per role; a production-grade cohort forecasting pipeline typically requires 2 to 3 dedicated engineers plus shared infrastructure. Specialist platforms also aggregate cross-customer training data, which improves early-window prediction accuracy in ways a model trained only on a single studio's history cannot match.

Most studios spending $5M+ per month on UA, or running M&A processes on a mobile portfolio, run a specialist cohort forecasting platform for the foundational LTV prediction layer, then augment with in-house tooling for studio-specific extensions (custom cohort segmentations, investor reporting templates, integrations into the financial planning stack). The pattern mirrors how the same studios use MMPs like AppsFlyer or Singular rather than building attribution from scratch.

Glossary

Cohort Forecasting Glossary

The following terms appear throughout this page and across cohort forecasting discussions.

Cohort
A group of users defined by a shared acquisition characteristic, most commonly install date (daily, weekly, or campaign-flight cohorts).
LTV (Lifetime Value)
The total revenue a user or cohort is expected to generate over a defined period (D30, D90, D180, D365). Usually expressed as ARPU at a given day horizon.
ARPU (Average Revenue Per User)
Total cohort revenue divided by total installs in the cohort, at a specified day horizon (D7 ARPU, D90 ARPU, etc.).
ARPDAU (Average Revenue Per Daily Active User)
Total daily revenue divided by daily active users. A different denominator than ARPU; reports on engagement quality rather than cohort-wide value.
Retention curve
The percentage of users still active at each day after install. The shape of the retention curve is one of the strongest predictors of long-tail LTV.
LTV curve
Cumulative ARPU plotted against days since install. The shape of this curve determines payback period and forecast accuracy at each horizon.
Confidence interval
The predicted range around a point forecast at a stated probability (typically 80% or 95%). Wider intervals indicate noisier forecasts.
D-N notation
Shorthand for 'day N after install.' D7 = revenue or retention measured at 7 days; D365 = same metric at one year.
Whale tail
The long, slow-decaying portion of a cohort's LTV curve, driven by a small number of very high-spend users. Strategy, RPG, and 4X games typically have heavy whale tails.
Curve fitting
Fitting a parametric mathematical function (Weibull, exponential decay, power law) to observed cohort data to project forward.
Cross-customer training data
Aggregate cohort data from multiple studios used as a training input, allowing models to learn patterns that no single studio's data would expose.
Drift
The gradual divergence of model predictions from realised outcomes, typically caused by changes in the game economy, audience composition, or attribution environment.
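As a quick illustration of how three of the glossary metrics relate, the sketch below computes an LTV curve (cumulative ARPU) and ARPDAU from the same daily revenue series. All numbers are invented.

```python
from itertools import accumulate

installs = 100                        # cohort size at install
daily_revenue = [100.0, 50.0, 25.0]   # cohort revenue on D0, D1, D2
daily_active = [100, 40, 25]          # users from the cohort active each day

# LTV curve: cumulative revenue divided by installs (the fixed denominator).
ltv_curve = [total / installs for total in accumulate(daily_revenue)]

# ARPDAU: each day's revenue divided by that day's active users.
arpdau = [rev / dau for rev, dau in zip(daily_revenue, daily_active)]

print("Cumulative ARPU by day:", ltv_curve)
print("ARPDAU by day:", arpdau)
```

The LTV curve can only rise or flatten because its denominator is fixed, whereas ARPDAU can move in either direction as the active base shrinks.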
Solution

About Kohort's Ktrl Platform

Kohort built Ktrl specifically for the cohort forecasting problem described on this page. Ktrl connects directly to your MMP (AppsFlyer, Singular, or Adjust), trains gradient-boosted prediction models on your studio's own cohort data, and surfaces D90, D180, and D365 ARPU and ROAS forecasts at the campaign level within the first 7 days of each cohort, with explicit confidence intervals at each horizon.

Ktrl is trained on $6B+ of mobile gaming UA spend across studios from hypercasual to mid-core, and is the cohort forecasting platform of choice for studios that want forward LTV signals without building the prediction infrastructure themselves. The same forecasts are used inside leading gaming investors and M&A advisors for diligence on mobile portfolios.
