Configure what matters

Content clusters

A cluster is a named group of pages. Instead of watching 500 individual URLs, you watch “Healthcare” or “Pricing pages” as a cohort: views, impressions, average position, bounce rate, and AI referral share all roll up together.

How clusters work

Each cluster holds a list of URL patterns. During analysis, the engine in api/services/analysis.py tests every page path against those patterns. Pages that match get aggregated into a SnapshotCluster row — total views, total impressions, total clicks, average position, average bounce rate, and AI referral percentage.

A page can belong to more than one cluster if its path matches multiple pattern sets. The underlying page row is stored once; aggregation happens per cluster during analysis, so the page's metrics count toward each cluster it matches.
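The roll-up described above can be sketched as follows. This is a minimal illustration, not the code in api/services/analysis.py: the PageRow shape, the aggregate_clusters name, and the unweighted average for position are all assumptions, and fnmatchcase from the standard library stands in for the real matcher (it also treats ? and [...] as special, which the real pattern syntax does not).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class PageRow:
    # Hypothetical per-page metrics; the real page row has more fields.
    path: str
    views: int
    impressions: int
    clicks: int
    position: float


def aggregate_clusters(pages, clusters):
    """Roll pages up per cluster. `clusters` maps a name to its URL patterns."""
    result = {}
    for name, patterns in clusters.items():
        # A page joins the cluster if ANY of the cluster's patterns matches.
        members = [
            p for p in pages
            if any(fnmatchcase(p.path, pat) for pat in patterns)
        ]
        result[name] = {
            "pages": len(members),
            "views": sum(p.views for p in members),
            "impressions": sum(p.impressions for p in members),
            "clicks": sum(p.clicks for p in members),
            # Assumption: simple mean; the real engine may weight by impressions.
            "avg_position": (
                sum(p.position for p in members) / len(members)
                if members else None
            ),
        }
    return result
```

Because each cluster is aggregated independently, a page such as /pricing that matches both a “Pricing” and a “High-intent” cluster contributes its views to both totals.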

Pattern syntax

Patterns match the URL path component only — no domain, no query string. One wildcard is supported: * matches any sequence of characters including slashes.

/healthcare/*          matches /healthcare/ehr, /healthcare/hipaa/overview, etc.
/blog/*                matches every post under /blog/
/plans/                matches only the exact path /plans/
/lp/*                  matches all landing pages under /lp/
/*/pricing             matches /product/pricing, /enterprise/pricing, etc.

The matching implementation is in _path_matches() in api/services/analysis.py. It converts the pattern to a regex (* becomes .*) and tests the full path from start to end. Patterns are anchored — a pattern must match the full path, not a substring.
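The conversion described above can be approximated in a few lines. This is a sketch under stated assumptions rather than the actual _path_matches() source: the function name path_matches and the choice to escape every character other than * are illustrative.

```python
import re


def path_matches(pattern: str, path: str) -> bool:
    """Return True only if `pattern` matches the whole of `path`."""
    # Escape each literal piece so characters like "." are not treated
    # as regex syntax, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # fullmatch anchors at both ends, so substrings do not count.
    return re.fullmatch(regex, path) is not None


print(path_matches("/blog/*", "/blog/what-is-seo"))   # True
print(path_matches("/blog/", "/blog/what-is-seo"))    # False
print(path_matches("/*/pricing", "/product/pricing")) # True
```

Since .* also matches slashes, /healthcare/* covers nested paths such as /healthcare/hipaa/overview, which is consistent with the examples listed earlier.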

Test before saving
Before saving, check a sample path against your pattern by hand. Does /blog/* match /blog/what-is-seo? Yes. Does /blog/ (no wildcard) match it? No, because the pattern must match the full path.

Auto-suggest: let the LLM propose clusters

The fastest way to get started is Auto-configure & analyze on the dashboard. That button triggers the cluster suggester in api/services/cluster_suggest.py, which:

01

Pulls your top 40 pages by views

It calls pull_ga4_page_data over the last 30 days, sorts by views descending, and takes the top 40 paths. A GA4 property must be connected — the endpoint returns a 400 if not.
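The ranking step can be sketched like this. The row shape (dicts with "path" and "views" keys) and the function name top_paths_by_views are assumptions for illustration; the actual return type of pull_ga4_page_data is not shown here.

```python
def top_paths_by_views(rows, limit=40):
    """Return the paths of the `limit` most-viewed pages, highest first."""
    # sorted() is stable, so pages with equal views keep their input order.
    ranked = sorted(rows, key=lambda row: row["views"], reverse=True)
    return [row["path"] for row in ranked[:limit]]


rows = [
    {"path": "/blog/a", "views": 120},
    {"path": "/plans/", "views": 900},
    {"path": "/lp/demo-call/", "views": 450},
]
print(top_paths_by_views(rows, limit=2))  # ['/plans/', '/lp/demo-call/']
```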

02

Sends paths + business context to Claude

The top pages plus the What matters text you wrote at signup (stored as business_context on the site row) are sent to claude-sonnet-4-20250514. The model is told to propose 3–6 clusters, skip patterns that match nothing, and avoid a single catch-all cluster.

03

Returns suggestions — you accept or discard

The API (POST /{site_id}/clusters/suggest in api/routes/site_config.py) returns suggestions without persisting them. The dashboard lets you review and accept. Accepted suggestions are bulk-created via POST /{site_id}/clusters/bulk.
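If you want to drive the same two-step flow from a script instead of the dashboard, the requests might be built roughly as below. Only the two route paths come from this page. The base URL, the bearer-token header, and the {"clusters": [...]} body shape are assumptions, so check them against api/routes/site_config.py before relying on this.

```python
import json
import urllib.request

# Hypothetical base URL; replace with wherever your API is mounted.
BASE_URL = "https://api.example.com"


def build_suggest_request(site_id, token):
    """POST that asks for suggestions; the server does not persist them."""
    return urllib.request.Request(
        f"{BASE_URL}/{site_id}/clusters/suggest",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def build_bulk_request(site_id, token, accepted):
    """POST that creates the suggestions you decided to keep."""
    # Assumed body shape: a "clusters" list of {name, patterns} objects.
    body = json.dumps({"clusters": accepted}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{site_id}/clusters/bulk",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing either request to urllib.request.urlopen() would send it. Keeping suggest and bulk-create as separate calls mirrors the review step in the dashboard: nothing is saved until you choose which suggestions to accept.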

Business context matters
The suggester feeds your business context into the prompt. A one-sentence description like “B2B analytics SaaS — conversions happen on /plans/ and /lp/demo-call/” produces more relevant clusters than leaving the field blank.

Adding and editing clusters manually

Open Settings and find the Clusters section. Each cluster needs:

  • Name — what shows in the briefing and dashboard (e.g. “Healthcare”, “Pricing”, “Blog”).
  • URL patterns — one or more patterns, one per line.

To rename a cluster or change its patterns, edit it in place. The update hits PUT /{site_id}/clusters/{cluster_id} in api/routes/site_config.py. Only the fields you send are changed.
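The "only the fields you send are changed" behavior is a partial update. A minimal sketch of that merge, assuming a cluster is represented as a dict with hypothetical "name" and "patterns" keys:

```python
# Assumed set of editable fields, based on what the Settings form collects.
EDITABLE_FIELDS = {"name", "patterns"}


def apply_partial_update(cluster: dict, patch: dict) -> dict:
    """Return a copy of `cluster` with only the fields present in `patch` changed."""
    updated = dict(cluster)
    for key, value in patch.items():
        # Ignore unknown keys rather than adding new fields.
        if key in EDITABLE_FIELDS:
            updated[key] = value
    return updated


cluster = {"id": 7, "name": "Blog", "patterns": ["/blog/*"]}
print(apply_partial_update(cluster, {"name": "Articles"}))
# {'id': 7, 'name': 'Articles', 'patterns': ['/blog/*']}
```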

To remove a cluster, call DELETE /{site_id}/clusters/{cluster_id}. This removes the cluster definition, so future analyses no longer write SnapshotCluster rows for it. Existing rows in historical snapshots are not deleted.

Plan limits

The free plan allows 3 clusters per site. Pro and Agency plans have no cluster limit. The limit is enforced server-side in api/routes/site_config.py: both the single-create and bulk-create endpoints check the site's current cluster count before inserting.

Starter plan
The starter plan label appears in the codebase but has no active Stripe product and is treated the same as free: 3 clusters per site.
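The limit check can be pictured as below. This is a sketch, not the code in api/routes/site_config.py: the lowercase plan keys and the can_add_clusters name are hypothetical, and it assumes bulk-create counts the whole incoming batch against the limit, which this page does not confirm.

```python
# None means "no cluster limit". Starter is treated the same as free.
CLUSTER_LIMITS = {"free": 3, "starter": 3, "pro": None, "agency": None}


def can_add_clusters(plan: str, current_count: int, to_add: int = 1) -> bool:
    """Return True if adding `to_add` clusters keeps the site within its plan limit."""
    limit = CLUSTER_LIMITS[plan]
    if limit is None:
        return True
    return current_count + to_add <= limit


print(can_add_clusters("free", current_count=2))            # True
print(can_add_clusters("free", current_count=2, to_add=2))  # False
print(can_add_clusters("pro", current_count=50, to_add=10)) # True
```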

Clusters in the briefing and dashboard

Every analysis snapshot aggregates each cluster. The briefing in api/services/report_generator.py receives cluster metrics as part of its prompt context and surfaces clusters with notable changes — momentum swings, AI referral growth, unusually high bounce rates.

The dashboard cluster table shows per-cluster: total views, total impressions, total clicks, average position, average bounce rate, and AI referral percentage. These numbers come directly from the SnapshotCluster rows written during the last analysis run.

The MCP tool get_cluster_performance(site_id, cluster?) exposes the same data to AI assistants — pass a cluster name to filter to one, or omit to get all.

Practical tips

  • Keep clusters meaningful to your business — the report writer uses cluster names in its output. “Healthcare content” is better than “Cluster 1”.
  • Avoid overlapping clusters unless the overlap is intentional. For example, a pricing page can belong to both “Pricing” and “High-intent” if you want to track it two ways.
  • If a cluster shows 0 views, check that your patterns actually match paths in GA4. The GA4 path dimension includes the leading slash, so a pattern written without one (blog/* instead of /blog/*) matches nothing.
  • After changing patterns, run a new analysis to see updated cluster metrics. Pattern changes don't retroactively update older snapshots.

Related: Conversion pages and Vanity filters.