Using TF-IDF Analysis In SEO

As SEO has become more sophisticated, numerous tools have appeared that can help you develop well-targeted content, optimised for Google and other search engines. One of these is TF-IDF analysis. It may sound complicated, but at a basic level it looks at the words most commonly used by the top ranking web pages for a search and shows how your own page compares.

TF-IDF stands for Term Frequency – Inverse Document Frequency. It is a statistical measure used in information retrieval to estimate how important a word is to one document relative to a larger collection of documents. Term Frequency measures how often a term appears within a document relative to the document’s length, whereas Inverse Document Frequency measures how much information the term carries: a term that appears in almost every document scores low, and a rare one scores high. Multiplied together, the two give a score that shows how prominent a keyword or phrase is on one page compared with how widely it is used across a set of web pages.
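To make the calculation concrete, here is a minimal Python sketch. It assumes one common textbook variant (term count divided by document length, multiplied by the natural log of total documents over documents containing the term). SEO tools may weight terms differently, and the function names and sample pages are purely illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real tools also strip punctuation and stop words.
    return text.lower().split()


def term_frequency(tokens):
    # Share of the document's words taken up by each term.
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def inverse_document_frequency(docs_tokens):
    # log(N / df): zero for a term found in every document, higher for rarer terms.
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(docs):
    # One {term: score} dictionary per document.
    docs_tokens = [tokenize(d) for d in docs]
    idf = inverse_document_frequency(docs_tokens)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in docs_tokens
    ]


# Made-up example pages, not real search results.
pages = [
    "seo tools help seo content",
    "content tools for writers",
    "seo tips for content",
]
scores = tf_idf(pages)
```

In this sample, "content" appears on every page, so its score is zero everywhere, while "writers" appears on only one page and scores highest there.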

From an SEO perspective, there are a number of online tools – such as Seobility’s – that analyse the top 10 ranking pages for a particular search term and compare how a specific page uses the same terms in relation to the average.

[Image: TF-IDF Analysis Tool from Seobility]

You might then see a chart with the following metrics listed:

AVG TF*IDF (Total) – shows the TF-IDF score for a given term, averaged across all of the pages found. The highest-scoring terms are shown first, and the scores decline as the chart extends to the right.

Max TF*IDF – indicates the highest TF-IDF score that the term reaches on any single page among those found.

URL WDF*IDF – relates to the page you have chosen to compare against the analysis. WDF means “Within Document Frequency” and refers to the frequency of a keyword within a single document. A line shows whether your usage is above or below the average, to help indicate where text changes could be applied.
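As a rough illustration of how these three metrics relate, the Python sketch below takes per-page term scores that have already been calculated and builds the average, the maximum and your own page's score for each term. The function name `summarise_terms` and all of the numbers are hypothetical, and this is not a description of how Seobility calculates its chart.

```python
def summarise_terms(competitor_scores, own_scores):
    # competitor_scores: one {term: score} dict per top ranking page.
    # own_scores: {term: score} for the page being compared.
    terms = set()
    for page in competitor_scores:
        terms.update(page)

    rows = []
    for term in terms:
        # A page that never uses the term contributes a score of 0.
        values = [page.get(term, 0.0) for page in competitor_scores]
        rows.append({
            "term": term,
            "avg": sum(values) / len(values),
            "max": max(values),
            "own": own_scores.get(term, 0.0),
        })

    # Highest average first, mirroring the left-to-right order of the chart.
    rows.sort(key=lambda r: (-r["avg"], r["term"]))
    return rows


# Illustrative scores only.
competitors = [
    {"seo": 0.5, "tools": 0.25},
    {"seo": 0.25, "content": 0.5},
]
own_page = {"seo": 0.125, "tools": 0.5}

rows = summarise_terms(competitors, own_page)
for row in rows:
    print(row["term"], row["avg"], row["max"], row["own"])
```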

The idea behind this analysis is to consider how Google may be ranking web pages for a search, by looking not only at the specific term or terms but also at the surrounding content and related terms that provide the context for the search theme. As Google’s ranking system has become more sophisticated and uses machine learning to look at signals for meaning and intent, the assumption is that Google uses a similar system to analyse word usage and patterns.

By running this TF-IDF report for the top ranking pages and comparing it with your own page, you can see which terms are used most and how your content relates to them. This shows where you are using a word more or less frequently than the average, and so where you could rewrite content to see if this improves your ranking for the term.
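The comparison described here can be sketched in a few lines of Python. The example assumes you already have an average score per term for the top ranking pages and a score per term for your own page. The `tolerance` threshold of 25% is an arbitrary assumption chosen for illustration, not a recommended value, and the scores are made up.

```python
def usage_gaps(avg_scores, own_scores, tolerance=0.25):
    # Returns (under_used, over_used) term lists relative to the average.
    under_used, over_used = [], []
    for term, avg in sorted(avg_scores.items()):
        if avg == 0:
            continue  # Nothing to compare against.
        ratio = own_scores.get(term, 0.0) / avg
        if ratio < 1 - tolerance:
            under_used.append(term)
        elif ratio > 1 + tolerance:
            over_used.append(term)
    return under_used, over_used


# Illustrative scores only.
average = {"seo": 0.375, "content": 0.25, "tools": 0.125, "ranking": 0.5}
own_page = {"seo": 0.125, "tools": 0.5, "ranking": 0.5}

under_used, over_used = usage_gaps(average, own_page)
print("Consider using more:", under_used)
print("Consider using less:", over_used)
```

A term flagged here is only a prompt to review the copy, since whether a change helps ranking still has to be tested.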

It may not be a magic solution, but some web pages have seen good improvements after using this analysis to revise the page content around the core search term being targeted. If your web page is ranking just below the first page of results, this could help tip it onto the first page, increasing your search visibility and the chance of receiving more clicks from your target market. It’s certainly one way of testing changes to your content to see what impact they may have on your ranking performance.

You can also use this tool prior to writing a new page of content – such as a blog or news article – and see what words are being used by the top ranking sites, and so you can create your own content around this pattern to see if it increases your ranking chances for the main search term you are targeting.

As ever, if you have any questions about optimising your website, please contact us for a discussion. We have been SEO specialists since 2000.