We already know that ranking pages based on Views across an entire website is a very bad idea. It's even worse to use such data to decide what to prioritise for attention.

The reason is that web pages are far too varied in length, scope and density.

There is no such thing as a "standard" web page, so comparing or prioritising them across an entire site using Views just leads to bad decisions.

Apples and oranges

For example, imagine I have a website that contains content on 2 topics: Domestic Policy and Foreign Policy.

Domestic Policy has bad content design. It consists of just 1 very long page. It is packed with information on several themes and contains thousands of words.

Pages within Domestic Policy and Foreign Policy

Foreign Policy has a somewhat better design solution. Although it contains approximately the same volume of content as Domestic Policy, it is separated into 25 discrete pages. Each of these is short and targeted at an individual sub-theme.

If you were to rank these pages across the entire site based on Views, the page about Domestic Policy would almost certainly get a higher ranking than any in Foreign Policy.

The reason is that anyone looking for information on Domestic Policy is compelled to locate it on a single page, no matter what aspect of that topic they are interested in.

In contrast, each of the shorter pages about Foreign Policy will only attract readers interested in that particular sub-theme. As such, they will get a much lower number of Views per page.

The danger is that—by relying on Views alone to track interest across the website—the web team could ignore all the meaningful and important differences in content design (length, scope, density, numbers of pages, etc) between these 2 topics.

This could lead them to conclude that Domestic Policy is far and away the most popular content on the site. They may then decide to concentrate all available resources on it, and perhaps even consider deleting pages about Foreign Policy because they seem so little used.

This is bad decision-making driven by bad data.
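To make the distortion concrete, here is a minimal sketch with invented numbers. The URL paths and View counts are purely hypothetical. Summing Views per section is only a rough stand-in for interest in a topic, but it is enough to show how a per-page ranking can point the wrong way.

```python
# Hypothetical Views: one long Domestic Policy page versus
# 25 short Foreign Policy pages (all numbers are invented).
views = {"/domestic-policy": 2000}
views.update({f"/foreign-policy/page-{i}": 150 for i in range(1, 26)})

# Per-page ranking: the single long page comes out on top.
top_page = max(views, key=views.get)


def section(path):
    """Return the first path segment, e.g. 'foreign-policy'."""
    return path.split("/")[1]


# Totals per section tell a different story.
totals = {}
for path, count in views.items():
    totals[section(path)] = totals.get(section(path), 0) + count

print(top_page)
print(totals)
```

In this made-up case the Domestic Policy page wins the per-page ranking, even though the Foreign Policy pages attract more activity in total.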

You can make far better decisions by ditching pages and tracking user interest using content topics as a 'standard set'.

Topics as the new standard set

Exit the web page "Hall of Mirrors"

A content topic is a set of pages that encompass some discrete subject matter.

For example, I might define a topic "Brexit" for all pages on my site about the UK's exit from the EU. Importantly, these pages do not need to be located in a single place. They can be anywhere on my website.

I then transpose these topics (very carefully!) into Google Analytics as Content Groups, using Regular Expressions. A measure of each topic based on Visits will soon emerge.
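The idea can be sketched offline in Python. This is not the Google Analytics configuration itself, only an illustration of the logic: the URL paths, the patterns and the `visit_id` values are all hypothetical. Each topic is a Regular Expression over page paths, and a Visit counts once per topic no matter how many pages it views within that topic.

```python
import re
from collections import defaultdict

# Hypothetical topic definitions: topic name -> regex over URL paths.
# Real patterns depend entirely on how a site's URLs are organised.
TOPIC_PATTERNS = {
    "Brexit": re.compile(r"^/(news|policy|guides)/[^?]*brexit", re.IGNORECASE),
    "Foreign Policy": re.compile(r"^/foreign-policy/", re.IGNORECASE),
}


def topic_for(path):
    """Return the first topic whose pattern matches the path, else None."""
    for topic, pattern in TOPIC_PATTERNS.items():
        if pattern.search(path):
            return topic
    return None


def visits_per_topic(page_views):
    """Count distinct Visits per topic from (visit_id, path) pairs."""
    visits = defaultdict(set)
    for visit_id, path in page_views:
        topic = topic_for(path)
        if topic is not None:
            visits[topic].add(visit_id)
    return {topic: len(ids) for topic, ids in visits.items()}


# Invented sample data: visit v1 views two Brexit pages but counts once.
sample = [
    ("v1", "/news/brexit-trade-deal"),
    ("v1", "/guides/travel-after-brexit"),
    ("v2", "/foreign-policy/eu/overview"),
    ("v3", "/policy/brexit-customs"),
    ("v3", "/about/contact"),
]
print(visits_per_topic(sample))
```

Note that the Brexit pages sit in three different folders here, yet they all land in the same topic.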

In this way, topics are perfect as a standard set for identifying and ranking the high-level content that users are most interested in.

Topics mean you don't have to worry whether one set of content has 25 pages and another just 1—or indeed any other difference in length, scope, volume, density or content design. These distinctions no longer matter and can be ignored.

You can finally exit the confusing hall of mirrors created by tracking Views of web pages. Content topics reveal users' true interests across the entire site as measured by Visits.

Topics make comparisons much more meaningful and drive better decision-making.

Hall of mirrors for web pages

Note: Page Views do have an important role when analysing content within individual topics. At that granular level, they are useful for isolating top information—but not at a cross-site level. This is something I will explore in my next article.

Focus on the top topics and ignore the rest

To see how this works, let's look at summary data from a sample site. After analysing and categorising thousands of pages, the site is shown to contain roughly 100 topics. Of these:

  • 25 topics attract about 80% of Visits
  • the remaining 75 or so topics share the other 20%

This is exactly the type of information a web team needs to decide which content to prioritise—and which to ignore.

Comparative ranking of Domestic Policy and Foreign Policy using Views and Visits

As you can see in the image above, although the Domestic Policy landing page looks like the most popular content on the site (based on Views), when we examine the entire website using topics we discover that Foreign Policy is much more popular.

The cumulative Visits to Foreign Policy as a unified topic far outstrip those to Domestic Policy. This tells us it would be a bad mistake to spend time on Domestic Policy. Foreign Policy must get priority.

Indeed, based on the numbers above, we see that 25 topics attract the vast majority (80%) of user interest across the entire website. Concentrating all available effort on these will generate by far the biggest benefit for users.
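A cut-off like this can be found with a simple cumulative ranking. The sketch below is a minimal illustration, not a Google Analytics feature: the function name `top_topics`, the topic names and the Visit counts are all invented. It also treats the sum of topic Visits as the total, which is an assumption, since a single Visit can touch more than one topic.

```python
def top_topics(visits_by_topic, share_percent=80):
    """Return the highest-ranked topics that together reach the given share."""
    total = sum(visits_by_topic.values())
    ranked = sorted(visits_by_topic.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0
    for topic, visits in ranked:
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= share_percent * total:
            break
        selected.append(topic)
        running += visits
    return selected


# Invented Visit counts for five hypothetical topics.
sample_visits = {
    "Foreign Policy": 500,
    "Brexit": 300,
    "Domestic Policy": 120,
    "Careers": 50,
    "Press": 30,
}
print(top_topics(sample_visits))
```

With these made-up numbers, two of the five topics already account for 80% of Visits. Everything below that line is the long tail.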

And the remaining topics?

Of course, it would be nice to improve everything, but the Visit ranking tells us that 75 topics are simply of little interest to users.

Not only would upgrading these topics demand significant effort, but the marginal benefit would be vanishingly small. It would be crazy to spend limited resources on the content that gets the least engagement.

As such, aside from some basic Quality Assurance (QA), you must ignore this long tail. That's the reality for under-resourced web teams.

All effort must go into top content first. Only when that is as good as it can possibly be should secondary content be considered.

In my next article, we'll dig deeper. I'll show how to analyse data from Content Groups to isolate the topics that will benefit most from investment in readability.


And lastly—what to do with the navigation and wayfinding pages

As you create your Content Groups, remember to build your RegEx queries to exclude pages that are primarily used for navigation or wayfinding.

My rationale is that such pages do not (usually) include the destination information that users are looking for—they are mere signposts to the destination.

Indeed, the ideal form of a website (the "platonic website"!) would have no navigation or wayfinding at all. Everything desired by the user would be immediately present to them.

(As an aside, perhaps Google fulfils this platonic role? Its search results are so good that many navigation and wayfinding features within a website often go unused, especially homepages.)

For instance, imagine the Foreign Policy topic above contains the following pages:

  • 1 landing (navigation) page
  • 4 navigation pages to sub-themes, e.g. US, EU, Asia, Africa
  • 5 information pages per sub-theme

In terms of gauging user interest, I only want to track Visits that actually engage with the core information.

A Visit that merely browses the landing or navigation pages does not demonstrate attention to the core topic. The user may be lost or looking for something else. I do not want that activity to affect my rankings.

As such, when transposing topics into Content Groups, I write the RegEx to specifically exclude landing and navigation pages.

This makes my data much cleaner and more accurate, as it includes only engagement by users who feel compelled to engage with the destination information.

Of course, I don't completely ignore landing and navigation pages.

I also create separate "wayfinding" Content Groups to capture their activity. This can then be used to assist other UX investigations, e.g. findability, actionability.
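Here is a minimal sketch of that split, assuming a hypothetical URL layout in which depth signals the page type. The paths, patterns and group names are my own inventions for illustration, and a real site will need patterns that fit its own structure. I avoid lookaheads on purpose, since some regex engines (RE2, for example) do not support them.

```python
import re

# Hypothetical URL layout (an assumption, not a rule):
#   /foreign-policy/            landing page
#   /foreign-policy/us/         sub-theme navigation page
#   /foreign-policy/us/trade    information page
INFO_PAGES = re.compile(r"^/foreign-policy/[^/]+/[^/]+/?$")
WAYFINDING_PAGES = re.compile(r"^/foreign-policy/([^/]+/?)?$")


def content_group(path):
    """Assign a path to the topic group, the wayfinding group, or neither."""
    if INFO_PAGES.match(path):
        return "Foreign Policy"
    if WAYFINDING_PAGES.match(path):
        return "Foreign Policy (wayfinding)"
    return None


for path in [
    "/foreign-policy/",
    "/foreign-policy/us/",
    "/foreign-policy/us/trade",
    "/domestic-policy",
]:
    print(path, "->", content_group(path))
```

Only the information pages feed the topic ranking. The landing and sub-theme pages go into the separate wayfinding group, so their activity is still captured.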

Read my previous article: "How to set-up Content Groups to track what your users are interested in" (May 2021).


Note: The article above refers to Content Groups in GA Universal. GA4 also includes Content Groups, but the configuration is more complex. I have not set it up yet, but I look forward to exploring it, as it includes useful new metrics, e.g. Visitors as well as Visits.