The Princeton GEO Study: Benchmark & Findings Explained

The Princeton/IIT Delhi/Georgia Tech GEO paper (KDD 2024) tested 9 optimization strategies on 10,000 queries. Here are the quantified results.

14 min read·Updated 2025-06-22

The Princeton GEO study, "GEO: Generative Engine Optimization" by Aggarwal, Murahari, et al., is the foundational research paper that turned GEO from a marketing buzzword into a measurable discipline. Published at KDD 2024, it is the first systematic, benchmarked study of what makes content get cited by AI search engines.

If you read only one source on GEO, this is it. The paper introduced GEO-bench, tested 9 optimization strategies on 10,000 queries, and quantified the visibility lift of each. Every claim about "statistics +33%" or "expert quotations +41%" traces back to this study.

Key numbers from the study: 10,000 search queries across 9 datasets. 9 optimization strategies tested. Top performer: expert quotations at +41% visibility. Worst performer: keyword stuffing at −8%. Position-5 pages gained +115% visibility; position-1 pages lost 30%.

The research collaboration

The paper has authors from three institutions: Princeton University, the Indian Institute of Technology Delhi (IIT Delhi), and the Georgia Institute of Technology. It was presented at KDD 2024, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, one of the top-tier venues for data science research. The arXiv preprint is arXiv:2311.09735, posted in November 2023.

"Generative engines… now answer a significant fraction of user queries directly, often eliminating the need to click through to websites. We introduce Generative Engine Optimization (GEO)… the first framework to optimize content visibility in generative engine responses."
— Aggarwal et al., "GEO: Generative Engine Optimization," arXiv:2311.09735, KDD 2024

GEO-bench: how visibility was measured

The researchers built GEO-bench, a benchmark of 10,000 real search queries drawn from 9 datasets, including Google Search queries, Perplexity queries, and other AI engine sources. Each query was run against a generative engine, and the AI's answer was analyzed for which sources it cited and how prominently.

The visibility metric is position-adjusted word count. For each source cited in an AI answer, the researchers count the words in the answer sentences attributed to that source and weight them by position: words earlier in the answer count more than words later. The score therefore captures both whether you are cited and how prominently.

This metric is more nuanced than "are you cited?" — it rewards sources that contribute substantial, prominent content to the answer, not just a passing mention.
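As a rough illustration of the metric described above, the following Python sketch scores each source by the words in the answer sentences that cite it, with earlier sentences weighted more. The exponential decay weight, the `CitedSentence` type, the even split of credit between co-cited sources, and the normalization by total answer length are all assumptions made for this example, not a reproduction of the paper's exact formula.

```python
import math
from dataclasses import dataclass


@dataclass
class CitedSentence:
    """One sentence of an AI answer plus the sources it cites (hypothetical type)."""
    text: str
    sources: list[str]


def position_adjusted_word_count(answer: list[CitedSentence]) -> dict[str, float]:
    """Return each source's share of answer words, discounted by sentence position."""
    n = len(answer)
    total_words = sum(len(s.text.split()) for s in answer)
    scores: dict[str, float] = {}
    if total_words == 0:
        return scores
    for pos, sentence in enumerate(answer):
        # Assumed weight: 1.0 for the first sentence, decaying toward e^-1 at the end.
        weight = math.exp(-pos / n)
        words = len(sentence.text.split())
        for src in sentence.sources:
            # Assumption: split credit evenly when a sentence cites several sources.
            share = words * weight / len(sentence.sources)
            scores[src] = scores.get(src, 0.0) + share
    return {src: score / total_words for src, score in scores.items()}


# Hypothetical answer with made-up source names, for illustration only.
answer = [
    CitedSentence("Quotations from experts raise visibility the most.", ["a.example"]),
    CitedSentence("Statistics help as well.", ["b.example"]),
    CitedSentence("Keyword stuffing lowers it.", ["a.example", "b.example"]),
]
scores = position_adjusted_word_count(answer)
```

In this toy answer, `a.example` scores higher than `b.example` because it contributes more words and appears first, which is the behavior the metric is meant to reward.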

The 9 strategies tested

The researchers tested 9 content modifications, each applied to a baseline page. The visibility lift was measured against the unmodified version.

#	Strategy	Visibility lift	Best for
1	Expert quotations	+41%	Analysis, opinion, people
2	Statistics addition	+33%	Law, policy, business
3	Fluency optimization	+29%	Business, science, health
4	Cite sources	+28%	Factual queries
5	Quotation addition	+21%	History, biography
6	Easy-to-understand language	+11%	Technical topics
7	Technical jargon avoidance	+9%	General audience
8	Authoritative tone	+8%	Trust-building
9	Keyword stuffing	−8%	⚠️ Harmful in GEO

Source: Aggarwal et al., "GEO: Generative Engine Optimization," arXiv:2311.09735, KDD 2024. Visibility measured by position-adjusted word count on GEO-bench.
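The lift figures in the table are relative changes against the unmodified page. A minimal sketch of that arithmetic follows; the function name and the sample scores are hypothetical and only illustrate the calculation.

```python
def visibility_lift(baseline: float, optimized: float) -> float:
    """Relative change in visibility, in percent, versus the unmodified page."""
    if baseline <= 0:
        raise ValueError("baseline visibility must be positive")
    return (optimized - baseline) / baseline * 100.0


# Hypothetical visibility scores, chosen only to show the arithmetic.
lift = visibility_lift(baseline=0.20, optimized=0.25)
print(f"{lift:+.0f}%")
```

A negative result, as with keyword stuffing in the table, means the modified page is less visible than the original.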

The democratization finding

The most strategically important finding is not the strategy ranking — it is the distribution of visibility gains across page positions. The Princeton study found that GEO optimization has the largest impact on lower-ranked pages:

▸ Position 5 pages: +115% visibility after GEO optimization
▸ Position 1 pages: −30% visibility after the same optimization

This is the opposite of SEO, where the top positions absorb most of the gains from further optimization. In AI search, the dominant incumbents have less to gain and something to lose, while new entrants have the largest upside. GEO democratizes search visibility.
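To see what these relative changes can mean in absolute terms, the sketch below applies the two reported percentages to baseline visibility shares. The baseline shares are hypothetical, picked only for illustration, so the outcome shows the direction of the effect rather than a result from the study.

```python
# Relative visibility changes by search rank, as reported above.
REPORTED_CHANGE_PCT = {1: -30.0, 5: 115.0}


def apply_change(baseline_share: float, change_pct: float) -> float:
    """Apply a relative percentage change to a baseline visibility share."""
    return baseline_share * (1 + change_pct / 100.0)


# Hypothetical baseline shares of the AI answer (not from the study).
baseline = {1: 0.30, 5: 0.10}
after = {
    rank: apply_change(share, REPORTED_CHANGE_PCT[rank])
    for rank, share in baseline.items()
}
```

With these assumed baselines, the position-5 page ends up roughly level with the position-1 page, even though it started with a third of the visibility.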

Why this matters: For two decades, SEO rewarded incumbents — high-authority domains with deep backlink profiles. GEO inverts this. A new publisher with original data, expert quotations, and clean structure can out-cite a top-ranked incumbent in AI answers. The barrier is content quality, not link equity.

How to apply the study to your content

The Princeton study's findings translate directly into a production checklist:

1. Add expert quotations (+41%): interview named experts, attribute fully, and use blockquotes for direct quotes.
2. Add statistics (+33%): replace vague claims with specific numbers, dates, and named sources.
3. Optimize fluency (+29%): eliminate redundancy, ensure subject-verb clarity, and make paragraphs progress logically.
4. Cite your sources (+28%): link to authoritative platforms and academic papers for every factual claim.
5. Stop keyword stuffing (−8%): replace keyword repetition with semantic coverage using synonyms and related entities.
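One way to turn the checklist into a quick self-check is to count rough signals in a draft. The sketch below is a heuristic written for this article, not a tool from the study: the function name, the regular expressions, and the choice of signals are all assumptions, and it only handles single-word keywords.

```python
import re
from collections import Counter


def audit_text(text: str, keyword: str) -> dict[str, float]:
    """Count rough checklist signals in a draft; interpreting them is left to the reader."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {
        # Quoted passages of at least 20 characters, straight or curly double quotes.
        "quotations": len(re.findall(r'["“][^"”]{20,}["”]', text)),
        # Numeric tokens such as 41, 33%, 1,000 or 2.5 (digits inside URLs count too).
        "numbers": len(re.findall(r"\d[\d,]*(?:\.\d+)?%?", text)),
        # Bare http(s) links as a crude proxy for cited sources.
        "links": len(re.findall(r"https?://\S+", text)),
        # Share of all words that are the single-word target keyword.
        "keyword_density": counts[keyword.lower()] / len(words) if words else 0.0,
    }


draft = (
    'According to the study, "expert quotations gave the largest lift of all '
    'strategies" at 41%. See https://arxiv.org/abs/2311.09735 for details.'
)
report = audit_text(draft, keyword="quotations")
```

A draft with no quotations, numbers, or links misses the first four checklist items, while a high keyword density is a hint that the fifth item may apply.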

Limitations of the study

The Princeton study has real limitations. Its main experiments ran on a single research generative engine built on GPT-3.5, with only a smaller follow-up evaluation on Perplexity.ai, and it did not test production ChatGPT Search. The 9 strategies were applied in isolation, so interaction effects are unmeasured. The 10,000-query benchmark, while large, is a sample, and niche topics may behave differently. Treat the numbers as directional, not exact.

That said, the directional findings — that factual density, citations, and quotations help while keyword stuffing hurts — have been independently observed by Previsible, Seer Interactive, and Profound in production AI search data. The Princeton study remains the most rigorous single source on GEO.

Frequently asked questions

What is the Princeton GEO study?

It is the paper "GEO: Generative Engine Optimization" by Aggarwal, Murahari, et al. from Princeton, IIT Delhi, and Georgia Tech, published at KDD 2024. It introduced GEO-bench (10,000 queries, 9 datasets) and tested 9 optimization strategies with quantified visibility lift.

What is GEO-bench?

GEO-bench is the benchmark introduced by the Princeton GEO study. It contains 10,000 real search queries across 9 datasets. Visibility is measured using position-adjusted word count — words from a source in the AI answer, weighted by position.

Which GEO strategy gives the biggest visibility lift?

Expert quotations at +41%, followed by statistics at +33%, fluency at +29%, and citing sources at +28%. Keyword stuffing is the only strategy with negative impact (−8%).

Does GEO work better for low-ranking or high-ranking pages?

GEO works dramatically better for low-ranking pages. Position-5 pages gained up to 115% visibility; position-1 pages lost 30%. GEO democratizes AI search visibility.

References: Aggarwal, P., Murahari, V., et al. "GEO: Generative Engine Optimization." arXiv:2311.09735, KDD 2024. · Princeton University, IIT Delhi, Georgia Tech. · Previsible 2025 AI Search Traffic Report (independent confirmation of directional findings).
