The Princeton GEO Study: Benchmark & Findings Explained
The Princeton/IIT Delhi/Georgia Tech GEO paper (KDD 2024) tested 9 optimization strategies on 10,000 queries. Here are the quantified results.
The Princeton GEO study, "GEO: Generative Engine Optimization" by Aggarwal et al., is the foundational research paper that turned GEO from a marketing buzzword into a measurable discipline. Published at KDD 2024, it is the first systematic, benchmarked study of what makes content get cited by AI search engines.
If you read only one source on GEO, this is it. The paper introduced GEO-bench, tested 9 optimization strategies on 10,000 queries, and quantified the visibility lift of each. Every claim about "statistics +33%" or "expert quotations +41%" traces back to this study.
Key numbers from the study:
- 10,000 search queries across 9 datasets
- 9 optimization strategies tested
- Top performer: expert quotations at +41% visibility
- Worst performer: keyword stuffing at −8%
- Position-5 pages gained 115% visibility; position-1 pages lost 30%
The research collaboration
The paper's authors are affiliated with institutions including Princeton University, the Indian Institute of Technology Delhi (IIT Delhi), and the Georgia Institute of Technology. It was presented at KDD 2024, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, one of the top-tier venues for data science research. The arXiv preprint is arXiv:2311.09735, posted in November 2023.
"Generative engines… now answer a significant fraction of user queries directly, often eliminating the need to click through to websites. We introduce Generative Engine Optimization (GEO)… the first framework to optimize content visibility in generative engine responses."
GEO-bench: how visibility was measured
The researchers built GEO-bench, a benchmark of 10,000 real search queries drawn from 9 datasets, including Google Search queries, Perplexity queries, and other AI engine sources. Each query was run against a generative engine, and the AI's answer was analyzed for which sources it cited and how prominently.
The visibility metric is position-adjusted word count. For each source cited in an AI answer, the researchers count the words in the parts of the answer attributed to that source, weighted by position. Words appearing earlier in the answer (closer to the top) count more than words appearing later. This captures both whether you are cited and how prominently.
This metric is more nuanced than "are you cited?" — it rewards sources that contribute substantial, prominent content to the answer, not just a passing mention.
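To make the metric concrete, here is a minimal Python sketch of a position-adjusted word count. It is an illustration under stated assumptions, not the paper's reference implementation: the exponential decay weight, the equal split of credit among co-cited sources, and the name `position_adjusted_word_counts` are all choices made for this example.

```python
import math


def position_adjusted_word_counts(answer_sentences):
    """Score each cited source in a generated answer.

    answer_sentences: list of (sentence_text, [source_ids]) in answer order.
    Returns {source_id: score}, normalized by the answer's total word count.
    """
    n = len(answer_sentences)
    total_words = sum(len(text.split()) for text, _ in answer_sentences)
    if total_words == 0:
        return {}

    scores = {}
    for pos, (text, sources) in enumerate(answer_sentences):
        if not sources:
            continue  # uncited sentences earn no source any credit
        words = len(text.split())
        weight = math.exp(-pos / n)  # assumption: earlier sentences weigh more
        share = words * weight / len(sources)  # assumption: equal split
        for source in sources:
            scores[source] = scores.get(source, 0.0) + share
    return {source: value / total_words for source, value in scores.items()}


# Hypothetical three-sentence answer citing two sources, "a" and "b".
answer = [
    ("GEO raises visibility for cited pages", ["a"]),
    ("Keyword stuffing hurts", ["b"]),
    ("Both sources agree on this", ["a", "b"]),
]
print(position_adjusted_word_counts(answer))
```

In this toy answer, source "a" scores higher than "b" because it contributes more words and its sentence appears first. A passing mention near the end of a long answer would score close to zero.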
The 9 strategies tested
The researchers tested 9 content modifications, each applied to a baseline page. The visibility lift was measured against the unmodified version.
| # | Strategy | Visibility lift | Best for |
|---|---|---|---|
| 1 | Expert quotations | +41% | Analysis, opinion, people |
| 2 | Statistics addition | +33% | Law, policy, business |
| 3 | Fluency optimization | +29% | Business, science, health |
| 4 | Cite sources | +28% | Factual queries |
| 5 | Quotation addition | +21% | History, biography |
| 6 | Easy-to-understand language | +11% | Technical topics |
| 7 | Technical jargon avoidance | +9% | General audience |
| 8 | Authoritative tone | +8% | Trust-building |
| 9 | Keyword stuffing | −8% | ⚠️ Harmful in GEO |
Source: Aggarwal et al., "GEO: Generative Engine Optimization," arXiv:2311.09735, KDD 2024. Visibility measured by position-adjusted word count on GEO-bench.
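The lift column is a relative change against the unmodified page. The sketch below shows that arithmetic, assuming lift is the percent change over baseline visibility. The function name `relative_lift` and the sample visibility values are hypothetical; the lift figures are copied from the table above.

```python
def relative_lift(baseline, modified):
    """Percent change in visibility relative to the unmodified page."""
    if baseline <= 0:
        raise ValueError("baseline visibility must be positive")
    return (modified - baseline) / baseline * 100.0


# Lifts in percent, as listed in the table above.
reported_lifts = {
    "Expert quotations": 41,
    "Statistics addition": 33,
    "Fluency optimization": 29,
    "Cite sources": 28,
    "Quotation addition": 21,
    "Easy-to-understand language": 11,
    "Technical jargon avoidance": 9,
    "Authoritative tone": 8,
    "Keyword stuffing": -8,
}

# Rank strategies from most to least helpful.
ranked = sorted(reported_lifts.items(), key=lambda item: item[1], reverse=True)
for name, lift in ranked:
    print(f"{lift:+4d}%  {name}")

# Hypothetical example: visibility moves from 0.20 to 0.25 after an edit.
print(round(relative_lift(0.20, 0.25), 1))
```

Because lift is relative, the same absolute gain looks larger on a page that started with low visibility.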
The democratization finding
The most strategically important finding is not the strategy ranking but the distribution of visibility gains across page positions. The Princeton study found that GEO has the largest impact on lower-ranked pages. The paper reports the following figures for its Cite Sources method:
- Position 5 pages: +115% visibility after optimization
- Position 1 pages: −30% visibility after the same optimization
This is the opposite of SEO, where the top positions absorb most of the gains from further optimization. In AI search, the dominant incumbents have less to gain and something to lose, while new entrants have the largest upside. GEO democratizes search visibility.
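One way to read these two figures together is to treat visibility as a share of the answer, so that a gain for one source comes out of another's share. The sketch below uses invented shares, chosen only so that rank 1 and rank 5 reproduce the −30% and +115% changes. The per-rank values are hypothetical and are not data from the paper, and the zero-sum framing is an assumption of this example.

```python
def percent_change(before, after):
    """Relative change in percent; `before` must be non-zero."""
    return (after - before) / before * 100.0


# Hypothetical visibility shares (percent of the answer) by search rank.
before = {1: 40.0, 2: 25.0, 3: 17.0, 4: 10.0, 5: 8.0}
after = {1: 28.0, 2: 25.0, 3: 17.0, 4: 12.8, 5: 17.2}

for rank in sorted(before):
    change = percent_change(before[rank], after[rank])
    points = after[rank] - before[rank]
    print(f"rank {rank}: {before[rank]:5.1f} -> {after[rank]:5.1f} "
          f"({points:+.1f} points, {change:+.0f}%)")
```

In this toy data, rank 5 gains fewer points than rank 1 loses, yet its percentage change is far larger because it starts from a small base. That is why low-ranked pages show the biggest relative upside.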
Why this matters: For two decades, SEO rewarded incumbents — high-authority domains with deep backlink profiles. GEO inverts this. A new publisher with original data, expert quotations, and clean structure can out-cite a top-ranked incumbent in AI answers. The barrier is content quality, not link equity.
How to apply the study to your content
The Princeton study's findings translate directly into a production checklist:
1. Add expert quotations (+41%): interview named experts, attribute fully, and use blockquotes for direct quotes.
2. Add statistics (+33%): replace vague claims with specific numbers, dates, and named sources.
3. Optimize fluency (+29%): eliminate redundancy, ensure subject-verb clarity, and make paragraphs progress logically.
4. Cite your sources (+28%): link to authoritative platforms and academic papers for every factual claim.
5. Stop keyword stuffing (−8%): replace keyword repetition with semantic coverage using synonyms and related entities.
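Parts of this checklist can be spot-checked automatically. The following sketch counts a few surface signals in a draft: blockquote lines, numeric tokens, links, and the density of a single-word keyword. These heuristics, the function name `checklist_signals`, and the sample text are hypothetical and do not come from the study. They flag what to review and do not predict visibility.

```python
import re


def checklist_signals(text, keyword):
    """Count rough surface signals in a plain-text or markdown draft."""
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    keyword_hits = sum(1 for word in words if word == keyword.lower())
    return {
        # Lines starting with ">" are treated as quoted material.
        "blockquotes": sum(
            1 for line in text.splitlines() if line.lstrip().startswith(">")
        ),
        # Integers, decimals, and percentages, e.g. "33%", "2024", "1.5".
        "numeric_tokens": len(re.findall(r"\d+(?:[.,]\d+)*%?", text)),
        # Bare http(s) URLs as a stand-in for cited sources.
        "links": len(re.findall(r"https?://\S+", text)),
        # Share of all words that exactly match the keyword.
        "keyword_density": keyword_hits / len(words) if words else 0.0,
    }


sample = (
    '> "Citations matter," says Dr. Example.\n'
    "Visibility rose 33% in 2024.\n"
    "Source: https://example.org/report\n"
)
print(checklist_signals(sample, "visibility"))
```

A draft with no quotes, numbers, or links, or with an unusually high keyword density, is a candidate for revision against items 1, 2, 4, and 5. Fluency (item 3) still needs a human read.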
Limitations of the study
The Princeton study has real limitations. Its headline numbers come from a single generative engine (the GPT-3.5-based system built for the research), not from production ChatGPT Search. The 9 strategies were applied in isolation, so interaction effects are unmeasured. The 10,000-query benchmark, while large, is still a sample, and niche topics may behave differently. Treat the numbers as directional, not exact.
That said, the directional findings — that factual density, citations, and quotations help while keyword stuffing hurts — have been independently observed by Previsible, Seer Interactive, and Profound in production AI search data. The Princeton study remains the most rigorous single source on GEO.
Frequently asked questions
What is the Princeton GEO study?
It is the paper "GEO: Generative Engine Optimization" by Aggarwal et al. from Princeton, IIT Delhi, and Georgia Tech, published at KDD 2024. It introduced GEO-bench (10,000 queries, 9 datasets) and tested 9 optimization strategies with quantified visibility lift.
What is GEO-bench?
GEO-bench is the benchmark introduced by the Princeton GEO study. It contains 10,000 real search queries across 9 datasets. Visibility is measured using position-adjusted word count — words from a source in the AI answer, weighted by position.
Which GEO strategy gives the biggest visibility lift?
Expert quotations at +41%, followed by statistics at +33%, fluency at +29%, and citing sources at +28%. Keyword stuffing is the only strategy with negative impact (−8%).
Does GEO work better for low-ranking or high-ranking pages?
GEO works dramatically better for low-ranking pages. Position-5 pages gained up to 115% visibility; position-1 pages lost 30%. GEO democratizes AI search visibility.
References: Aggarwal, P., et al. "GEO: Generative Engine Optimization." arXiv:2311.09735, KDD 2024 (Princeton University, IIT Delhi, Georgia Tech). · Previsible 2025 AI Search Traffic Report (independent confirmation of directional findings).