N-gram analysis for keywords




An overview of how n-gram analysis enables in-depth analysis of keywords and search terms.


Target group

Google Ads Specialists


Table of Contents:


N-gram analysis for keywords

The idea of analyzing search terms – what are we doing this for? 

What is n-gram analysis?

N-gram – how to perform an analysis?

What are the conclusions from the n-gram

Find new keywords / negatives, improve quality score

Check the results of individual words

Collect large amounts of data in bulk

Decide to enter a negative keyword and match it

Investigate the effectiveness of long phrases

Is it worth using the combinations


Choosing the right keywords for your Google Ads campaign is one of the most important aspects of running an advertising account. According to internetlivestats.com, about 16-20% of the queries entered in Google every day are unique, i.e. they have never been entered in the search engine before (!).


What does this mean for us? An almost infinite number of possibilities: combinations of keywords that we can use in campaigns.


This can be seen as a huge advantage and almost unlimited potential. On the other hand, it is precisely this enormity of possibilities, and the need to “break through” to the most important and valuable phrases, that challenges digital marketers around the world.




The idea of analyzing search terms – what are we doing this for?

When running a Google Ads campaign, you need to choose the right keyword match type. If we want our ad to appear only for one specific phrase (and its close variants), the obvious choice is exact match. This gives us a great advantage in setting the optimal bid for a click and matching the ad message to the query.


This significantly affects the CTR of our ads, as well as the Quality Score. With this approach, it would seem that we spend our budget exactly where we should. The disadvantage, however, is that we significantly limit the potential of our campaigns and the number of queries for which our ads could appear.

Search terms report – inspiration and warning

When looking for new phrases, you cannot skip the search terms report. It shows which phrases users entered before clicking on our ad, as well as which keyword each of these queries matched.


The report is a great source of inspiration when adding new keywords, but it also shows us whether our ads are appearing for unwanted terms.

Optimize with a list of negative keywords


The next step in optimization is to add lists of negative keywords. They effectively limit the display of ads for searches that are related to the promoted product or service but do not indicate an intention to buy it, e.g. searches for information on interior design (“how …”, “what …”, etc.). Example below:





Going deeper – let’s search for some other queries connected with arranging your house:





On the one hand, there can be many different search terms about arranging a living room. On the other hand, some of them should be avoided because they are not related to your offer. For example:






A good way to start is to think of queries that are related to your brand and to designing or arranging space, but not to the services you actually provide. In this case, arranging a garden or a garage is not part of the company’s services.

So one approach is to exclude “garden” and “garage” (in phrase match) at the very beginning of your campaign. Excluding words in phrase match prevents ads from showing for queries that contain them. Your ads won’t be displayed for searches such as “arrange garden flowers” or “arrange garage space” (so basically for any query that contains the word “garden” or “garage”).
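
The exclusion logic described above can be illustrated with a small Python sketch. It is a simplification and not Google's actual matching code: the function name `blocked_by_negative_phrase` and the sample queries are assumptions made for this example. It treats a query as blocked when the words of the negative keyword appear in it as whole words, next to each other and in the same order.

```python
def blocked_by_negative_phrase(query: str, negative: str) -> bool:
    """Simplified negative phrase match: True when the negative's words
    appear in the query as a contiguous run, in the same order."""
    q = query.lower().split()
    n = negative.lower().split()
    if not n:
        return False
    return any(q[i:i + len(n)] == n for i in range(len(q) - len(n) + 1))


negatives = ["garden", "garage"]
queries = [
    "arrange garden flowers",
    "arrange garage space",
    "arrange living room",
    "gardening ideas",
]

for query in queries:
    blocked = any(blocked_by_negative_phrase(query, neg) for neg in negatives)
    print(f"{query!r}: {'blocked' if blocked else 'eligible'}")
```

Note that in this sketch “gardening ideas” stays eligible, because only whole words are compared. Negative keywords in Google Ads do not cover close variants either, so plurals and other forms usually need to be added separately.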


Try to find queries that show little buying intent for your service. Over the course of a campaign you will find hundreds of examples like the ones above, or even more, depending on the scale of the campaign and its history.


What results from keyword analysis?

  • By adding new keywords, we find new ways to reach potential customers,
  • We have better control over the cost of these keywords,
  • We can match the ad to the new keyword more precisely, which has a positive impact on the Quality Score,
  • We avoid unwanted clicks.

What is n-gram analysis?

The search terms report is a great way to make the most important changes to our set of negative keywords. However, the products we promote can be found using many phrases that are not obvious at first glance. A phrase match or broad match modifier keyword (“interior design” or +interior +design) can match a practically unlimited number of search term combinations.


The trick is to use negative keywords to effectively exclude the terms that will not bring us valuable users, such as the word “garage” from the example above. One solution is to look for such terms manually by reviewing the search terms report. However, this is quite time-consuming :). A better idea is to analyze all the data at once, using a script.


How do you find phrases using the n-gram script?

In such a situation, so-called n-gram analysis comes to save the day! Unlike the standard search terms report, the n-gram report has an additional dimension: it splits the search terms entered by users into one-, two-, three-word (and longer) phrases.


The n-gram results consist of a number of tables, each of which covers phrases of one length: one word, two words, etc. For example, if we set the maximum number of words to 7, the last table will contain phrases made up of 7 words. The tables contain:


1-gram – one word phrases,

2-gram – two word phrases,

3-gram – three word phrases,



For example, we have two entries in the search terms report:

  • interior design loft inspirations,
  • interior design studio.

In the n-gram analysis, the above searches will be broken down into the following cases (below is a list of individual n-gram tables):


1-gram – interior, design, loft, inspirations, studio,

2-gram – interior design, design loft, loft inspirations, design studio,

3-gram – interior design loft, design loft inspirations, interior design studio,

4-gram – interior design loft inspirations.
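
The breakdown above can be reproduced with a few lines of Python. This is only a sketch of the idea, not the code of the script discussed in this article; the function names `ngrams` and `ngram_tables` are assumptions made for the example.

```python
from collections import defaultdict


def ngrams(query, n):
    """Return all runs of n consecutive words in the query."""
    words = query.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_tables(queries, min_n=1, max_n=4):
    """Build one list of distinct n-grams per phrase length."""
    tables = defaultdict(list)
    for n in range(min_n, max_n + 1):
        for query in queries:
            for gram in ngrams(query, n):
                if gram not in tables[n]:  # keep each phrase once, in order of appearance
                    tables[n].append(gram)
    return dict(tables)


queries = ["interior design loft inspirations", "interior design studio"]
tables = ngram_tables(queries)

for n, grams in tables.items():
    print(f"{n}-gram: {', '.join(grams)}")
```

Running it prints the same four lists as in the example: five 1-grams, four 2-grams, three 3-grams and a single 4-gram.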


The results table shows the impact of each individual word or phrase on the campaign results. In addition to the word itself, we have the following information:

  • Query count (number of search queries) – how many different search queries contained a given word or phrase,
  • Clicks,
  • Impressions,
  • Cost,
  • Number of conversions,
  • Conversion value,
  • CTR,
  • CPC,
  • Conversion Rate,
  • Cost / conversion,
  • Conversion value / cost (conversion value / cost = ROAS).
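
To show how these columns relate to each other, here is a minimal Python sketch that sums the basic metrics per n-gram and derives the ratios from them. The input rows are made-up numbers, the names `aggregate` and `ratio` are hypothetical, and counting each query once per n-gram is an assumption of this example rather than a documented behavior of the script.

```python
from collections import defaultdict

# Hypothetical search term rows:
# (query, impressions, clicks, cost, conversions, conversion value)
rows = [
    ("interior design studio", 200, 20, 10.0, 2, 100.0),
    ("interior design ideas", 300, 10, 5.0, 0, 0.0),
]


def ngrams(query, n):
    words = query.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ratio(numerator, denominator):
    """Divide safely; None marks a ratio that cannot be computed."""
    return numerator / denominator if denominator else None


def aggregate(rows, n):
    totals = defaultdict(lambda: {"query_count": 0, "impressions": 0, "clicks": 0,
                                  "cost": 0.0, "conversions": 0, "conv_value": 0.0})
    for query, impressions, clicks, cost, conversions, conv_value in rows:
        for gram in set(ngrams(query, n)):  # assumption: one count per query
            t = totals[gram]
            t["query_count"] += 1
            t["impressions"] += impressions
            t["clicks"] += clicks
            t["cost"] += cost
            t["conversions"] += conversions
            t["conv_value"] += conv_value
    for t in totals.values():
        t["ctr"] = ratio(t["clicks"], t["impressions"])
        t["cpc"] = ratio(t["cost"], t["clicks"])
        t["conv_rate"] = ratio(t["conversions"], t["clicks"])
        t["cost_per_conv"] = ratio(t["cost"], t["conversions"])
        t["roas"] = ratio(t["conv_value"], t["cost"])
    return dict(totals)


table = aggregate(rows, 2)
for gram, stats in table.items():
    print(gram, stats)
```

The ratios are calculated from the summed totals, not averaged across queries, so a phrase that appears in many small queries gets one set of figures that reflects all of them.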

Below is an example of the tabs that make up the report:


The report breaks phrases down at the account, campaign and ad group level.


By far the most interesting tabs are “Account Word Analysis” and “Account 2-Gram Analysis”. By filtering and sorting them, you can find critical words in search terms that have the lowest (or zero) ROAS, or the highest cost per conversion (or no conversions at all). It is worth considering only those words that have a minimum number of clicks, e.g. 100, or that clearly have little value for users in our target group. These points are discussed in more detail later in this article.

N-gram – how to perform the analysis?

How do you run an n-gram analysis? Brainlabs, a London-based agency, comes to the rescue and provides a free script on its website. The site also includes detailed instructions for use.


The script should be entered into the tool: 

Google Ads – Tools & Settings -> Bulk Actions -> Scripts


After pasting the script, remember to first fill in the address of the Google Sheets file (the spreadsheetUrl variable) in which the data will be generated. Once this variable is set, the script is ready to be used.


The script has optional parameters that can be changed directly in the script code.

These include, for example:

  • startDate / endDate – the date range from which data is taken,
  • campaignNameContains,
  • campaignNameDoesNotContain – it is worth considering excluding brand campaigns here (this requires a consistent campaign naming convention),
  • ignorePausedCampaigns – if your account contains past, paused campaigns, consider setting this to “false” to analyze all historical data,
  • ignorePausedAdGroups,
  • checkNegatives – when enabled, the script leaves out search terms that would be blocked by your negative keywords. If our negative keyword lists are verified and well organized, enabling this option gives us less data to analyze,
  • minNGramLength / maxNGramLength – the minimum and maximum number of words in generated phrases,
  • clearSpreadsheet – clears the sheet before generating the report.

Below is the code screen with variables.





Further down we find another section where we set minimum values for individual metrics, the so-called thresholds. What does this mean? Using this section, we can hide phrases whose results are lower than the number indicated.


For example, with “impressionThreshold = 5”, the script will skip phrases that have fewer than 5 impressions in the report.
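
The effect of a threshold can be pictured with a short Python sketch. The rows, the `thresholds` dictionary and the helper `passes_thresholds` are assumptions made for this illustration; the real script applies its own thresholds internally.

```python
# Hypothetical aggregated n-gram rows.
ngram_rows = [
    {"ngram": "interior design", "impressions": 120, "clicks": 14},
    {"ngram": "design loft", "impressions": 4, "clicks": 1},
    {"ngram": "design studio", "impressions": 5, "clicks": 0},
]

# Minimum values a row must reach to stay in the report (illustrative).
thresholds = {"impressions": 5, "clicks": 0}


def passes_thresholds(row, thresholds):
    """Keep a row only if every metric reaches its minimum value."""
    return all(row[metric] >= minimum for metric, minimum in thresholds.items())


kept = [row["ngram"] for row in ngram_rows if passes_thresholds(row, thresholds)]
print(kept)
```

Here “design loft” is dropped because it has only 4 impressions, while “design studio” stays because it reaches exactly 5.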






What are the conclusions from the n-gram?

Analyzing search terms is a basic responsibility of anyone who maintains a Google Ads account. Suppose you have added the following phrase match keyword to your keyword set:


Interior design


For example, the 2-gram table shows that this phrase appeared in 2 queries. When we check the 3-gram table, it turns out that these two words came up in 2 variants:

  • Interior design studio,
  • Interior design website.

This could be a hint to move these queries out of the phrase match keyword “interior design” into separate ad groups as exact match keywords: [Interior design studio], [Interior design website]. Later on, you could write more precise first headlines for these ad groups.


In this way, we can improve ad relevance and Quality Score and, consequently, lower the cost of clicks.


This is one of the basic approaches to keyword analysis. However, n-gram analysis also gives us a more detailed, aggregated view of search terms broken down into specific words. So what answers can it give us?


Examples of conclusions that we can draw from the analysis.


Collect large amounts of data in bulk


Does your new client’s account come with large data sets (many keywords, a long account history)? Great!


The most important advantage of the n-gram report is that it allows for aggregated analysis of search terms, which in turn shortens the analysis time. Let’s assume that we are once again considering the phrase “interior design”. Over the past year, it could have appeared dozens of times in search terms through various match types. For example, clicks may have come from the following searches:


Interior design inspirations

Interior design project

Interior design ideas

Interior design work

Creative ways of interior design

Interior design designers

How much costs interior design

Interior design cost



Going through several years of search term history can be very tedious. The advantage of the n-gram report is that each word from the above searches, such as “inspirations” or “ideas”, appears in it as a single row. An example from the table above: “interior design” (query count – 6).


This is a huge time saver!


The n-gram report tells us in how many different searches a word appeared and how many impressions, clicks and conversions it brought us. We also see in which campaign such a search took place, so we can easily assign the word to the appropriate negative keyword list, at the campaign level or at the account level. All this based on one row of the table.


Check the performance of individual words


Consider the same example:


Interior design


Search phrases:


Interior design studio

Interior design designers


Both words could indicate the right buying intent, but:

  • The search term “designers” brought us one conversion for $50,
  • The search term “studio” brought 5 conversions, and the cost per conversion was only $5.

Both terms seem to show the right buying intent, so what is the conclusion? You should definitely consider excluding search terms like “designers”, which are simply too expensive in terms of cost per conversion. Of course, this decision should take your strategy, budget, etc. into account.


Tip – first of all, focus on the words with the highest conversion value.




Aggregating phrases in this way gives us a broader view of which phrases generated a higher conversion cost than expected. Just set a filter on the Cost / conv. (cost per conversion) column: greater than a medium or high value (or a value specified by the client).





Money saved for the next year (based on data from the last year):



Another approach is to add a condition that identifies searches that did not generate any conversions and only generated cost.


Set the filter: conversions = 0.
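
The two filters described in this section (cost per conversion above a target, and conversions = 0) can also be applied outside the spreadsheet. Below is a minimal Python sketch, assuming the n-gram tab has been exported to a list of rows. All figures, the target cost per conversion and the minimum number of clicks are illustrative assumptions.

```python
# Hypothetical rows exported from an n-gram tab.
rows = [
    {"ngram": "studio", "clicks": 150, "cost": 25.0, "conversions": 5},
    {"ngram": "designers", "clicks": 120, "cost": 50.0, "conversions": 1},
    {"ngram": "free", "clicks": 200, "cost": 40.0, "conversions": 0},
    {"ngram": "loft", "clicks": 8, "cost": 3.0, "conversions": 0},
]

TARGET_COST_PER_CONV = 20.0  # assumed target, e.g. agreed with the client
MIN_CLICKS = 100             # ignore words with too little data


def cost_per_conversion(row):
    """Cost per conversion, or None when there were no conversions."""
    return row["cost"] / row["conversions"] if row["conversions"] else None


enough_data = [r for r in rows if r["clicks"] >= MIN_CLICKS]

# Filter 1: converting words whose cost per conversion exceeds the target.
too_expensive = [r["ngram"] for r in enough_data
                 if r["conversions"] > 0 and cost_per_conversion(r) > TARGET_COST_PER_CONV]

# Filter 2: words that generated cost but no conversions.
no_conversions = [r for r in enough_data if r["conversions"] == 0]
wasted_cost = sum(r["cost"] for r in no_conversions)

print("Too expensive:", too_expensive)
print("No conversions:", [r["ngram"] for r in no_conversions], "cost:", wasted_cost)
```

The result is only a list of candidates for exclusion. Each word still needs to be reviewed by hand before it goes onto a negative keyword list.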







Money saved:



Customer response – priceless;)




Of course, keep in mind that every phrase we intend to exclude requires an individual decision. Not all phrases that meet the two exclusion conditions above should actually be excluded (we do not want to exclude high-quality phrases).


Decide whether to add a negative keyword and how to match it


There are words that, at first glance, we should exclude at the level of the entire account or campaign. An example is the word “free”: the first impulse is to exclude it.


When analyzing the data, however, it may turn out that the word “free” has brought some conversions for our store. Why?


A potential customer could be looking for information about free delivery of product x, not about a free version of it.


In this case, depending on our offer, we exclude:

  • “Free x” (where x is the product name),
  • “Free” – unless we offer free delivery and promote it in our offer.
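
Before deciding on the match type, it can help to see in which two-word contexts the word “free” actually appears and how each of them performed. The Python sketch below does this for a handful of made-up search terms. The data and the function names `contexts` and `breakdown` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical search terms: (query, cost, conversions)
rows = [
    ("sofa free delivery", 12.0, 3),
    ("free delivery armchair", 8.0, 1),
    ("free sofa", 15.0, 0),
    ("sofa for free", 5.0, 0),
]


def contexts(query, word):
    """Return the 2-grams of the query that contain the given word."""
    words = query.lower().split()
    pairs = [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]
    return [pair for pair in pairs if word in pair.split()]


def breakdown(rows, word):
    """Sum cost and conversions per 2-gram that contains the word."""
    totals = defaultdict(lambda: {"cost": 0.0, "conversions": 0})
    for query, cost, conversions in rows:
        for pair in set(contexts(query, word)):
            totals[pair]["cost"] += cost
            totals[pair]["conversions"] += conversions
    return dict(totals)


result = breakdown(rows, "free")
for pair, stats in sorted(result.items()):
    print(pair, stats)
```

In this made-up data, “free delivery” converts while “free sofa” and “for free” only generate cost, which would support excluding the longer phrases rather than the single word “free”.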



Have I excluded too much?

You may already have an extensive structure of negative keyword lists and use lists at the campaign or ad group level. It is worth checking whether you are excluding phrases that brought you results in the past but that, for some reason, you decided to exclude.


How to verify it?


Take all the phrases from all your negative keyword lists and copy them to Google Sheets. The first thing to do is to remove all characters that indicate the match type (broad match modifier, phrase, exact): “+”, quotation marks, “[” and “]”. Use the find and replace function in Google Sheets for this.


Example for phrase match:






We remove the match type characters so that we can compare words from the lists with searches from the n-gram analysis. So go to the one-, two- and three-word n-gram tabs and check whether each word or phrase appears in your list of negatives.


You can use the COUNTIF formula:


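=COUNTIF('lista negatywów bez znaku dopasowania'!$H:$H,A2)

  • 'lista negatywów bez znaku dopasowania' (Polish for “list of negatives without match characters”) is the name of the sheet that holds the cleaned negatives,
  • H:H is the column where your negatives are located,
  • A2 is a word from the n-gram list in any given row.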

Then check the word’s past results and decide whether to remove it from the negative list.
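
For larger accounts, the same comparison can be scripted. The following Python sketch mirrors the steps above: it strips the match type characters from the negatives and then flags n-grams that are on the list but brought conversions in the past. The sample negatives, the conversion figures and the function name `strip_match_symbols` are assumptions made for this example.

```python
import re


def strip_match_symbols(negative):
    """Remove +, quotation marks and square brackets, then tidy the spacing."""
    cleaned = re.sub(r'[+"\[\]]', " ", negative)
    return " ".join(cleaned.lower().split())


# Hypothetical negatives as exported, still with match type characters.
negatives = ['"garden"', "[free sofa]", "+interior +design +jobs"]

# Hypothetical n-gram rows: phrase -> conversions in the analyzed period.
ngram_conversions = {"garden": 0, "free sofa": 7, "design studio": 5}

cleaned_negatives = {strip_match_symbols(n) for n in negatives}

# N-grams that are excluded today but converted in the past.
to_review = {gram: conversions
             for gram, conversions in ngram_conversions.items()
             if gram in cleaned_negatives and conversions > 0}

print(to_review)
```

In this example only “free sofa” is flagged: “garden” is on the list but never converted, and “design studio” converted but is not excluded.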





In this way, it may turn out that the exclusion lists contain phrases that brought us conversions in the past (several, or a dozen or so) at a satisfactory cost per conversion. Currently, however, they bring no results because they are on the exclusion list.



Two things to keep in mind. First, the n-gram analysis must cover a period long enough to “catch” the time before the word was added to the exclusions. Secondly, any changes to account exclusions should be discussed in detail with the client. Although excluding a phrase may seem like a waste at first glance, there may be a deeper reason why the word was added to the list.


However, if you find phrases that should definitely “get back in the saddle”, you will have found another source of value for your client 🙂



Investigate the effectiveness of long phrases


When running the script, we can set the maximum number of words that we want to analyze. The higher the number, the longer the phrases that will be analyzed.


It can be valuable to learn that the most valuable conversions came from a phrase consisting of 5 words, for example a question about the promoted product. Of course, the number of results for 5-word phrases may not be impressive, but using such a phrase deliberately can bring great results.


We can also add this phrase to Google Ads as a separate keyword.


Is it worth using the combinations?

N-gram analysis is undoubtedly one of the most effective ways to thoroughly analyze the keywords in your advertising account. By drawing the right conclusions from it, we can significantly reduce the cost of clicks.


However, it should be remembered that n-gram analysis makes the most sense when we have a relatively long account history and sufficient data for analysis.


PS There is also an n-gram script for analyzing ads – n-gram for ads


If you would like to start digital marketing activities, you need help optimizing results or you would like to improve your performance – contact us!
