
How Google identifies this type of content

As SEOs, we know a lot about Google. Algorithm updates are usually based on published patents. The fundamental purpose of these updates is to eliminate questionable SEO practices.

Questionable practices are those that attempt to exploit loopholes in Google’s algorithm to achieve higher rankings. Google penalizes websites that do this because the content they push into search results pages is generally of low quality, which degrades the quality of the results themselves.

Anyone who has been playing the SEO game for several years is well aware of the main black hat tactics that Google penalizes (we will look at some concrete examples later in the article).

Quick read: 3 Google patents to know to avoid an SEO penalty

  1. Patent of October 8, 2013, on “content spinning”: automatically rewriting identical pages to evade duplicate content detection.
  2. Patent of December 13, 2011, on “keyword stuffing”: cramming a page with keywords to rank a site for a single term.
  3. Patent of March 5, 2013, on “cloaking”: disguising content to deceive the algorithm.

To learn all about these Google patents and discover concrete examples of penalties related to them, simply read on.

Why is it important to know how Google identifies black hat tactics?

Because you don’t want to accidentally make SEO mistakes that result in Google penalizing you! Google will assume you are trying to game the system.

In fact, you may simply have made some costly SEO mistakes without knowing it. To better understand how Google’s algorithm identifies bad SEO practices (and thus how to avoid these mistakes), you should review Google’s patents on some of the most common black hat tactics.

Content spinning

The patent in question: “Identifying gibberish content in resources” (October 8, 2013) [1]

Content spinning is often used for link building purposes.

A website will rewrite the same post hundreds of times in an attempt to increase its number of links and its traffic, while avoiding being flagged as duplicate content. Some sites even manage to generate revenue from this type of content through advertising links.

However, since rewriting content is quite a tedious task, many sites turn to spinning software that automatically swaps out nouns and verbs. This usually produces very poor quality content, or, in other words, gibberish.
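To see why the output degrades, here is a minimal Python sketch of such a spinner. The synonym table and the replacement rule are invented for the example; real spinning tools are more elaborate, but the failure mode is the same.

```python
import random

# Toy illustration only: a naive "spinner" that swaps words for dictionary
# synonyms with no regard for context. The synonym table is invented for
# this example.
SYNONYMS = {
    "increase": ["boost", "amplify", "escalate"],
    "traffic": ["visitors", "footfall", "circulation"],
    "content": ["material", "substance", "filling"],
    "write": ["compose", "inscribe", "scribble"],
}

def spin(text: str) -> str:
    """Replace each recognized word with a randomly chosen synonym."""
    out = []
    for word in text.split():
        key = word.lower().strip(".,")
        out.append(random.choice(SYNONYMS[key]) if key in SYNONYMS else word)
    return " ".join(out)

print(spin("We write content to increase traffic"))
# e.g. "We scribble filling to escalate circulation"
```

The output is grammatical on the surface, but the meaning has degraded into exactly the kind of gibberish the patent targets.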

The patent explains how Google detects this content: by identifying incomprehensible or incorrect sentences within a web page. Google’s system uses several factors to assign a contextual score to the page, known as the “gibberish score.”

Google uses a language model that can recognize when a string of words is artificial. It identifies and analyzes the different n-grams on a page and compares them to n-gram groupings on other websites. An n-gram is a contiguous sequence of elements (in this case, words).

From this, Google generates a language model score and a “query stuffing” score, the latter reflecting how often certain terms are repeated in the content. These two scores are then combined to calculate the gibberish score, which determines whether the content’s position in the results pages should be changed.
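As a rough illustration of how such signals might combine, here is a toy Python sketch. The reference bigram counts, the scoring formulas, and the 50/50 weighting are all invented for the example; the patent does not disclose Google’s actual model or weights.

```python
from collections import Counter

def ngrams(words, n=2):
    """Contiguous word sequences of length n."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Hypothetical reference counts standing in for Google's web-scale
# language model; real systems are trained on billions of documents.
REFERENCE = Counter({
    ("search", "engine"): 900,
    ("engine", "ranking"): 400,
    ("quality", "content"): 700,
})

def language_model_score(words):
    """Fraction of the page's bigrams that also appear in the reference
    corpus; artificial strings of words score low."""
    grams = ngrams(words)
    if not grams:
        return 0.0
    return sum(1 for g in grams if REFERENCE[g] > 0) / len(grams)

def stuffing_score(words):
    """Share of the page taken up by its single most repeated term, a
    crude stand-in for the patent's query stuffing signal."""
    return max(Counter(words).values()) / len(words)

def gibberish_score(text):
    words = text.lower().split()
    # Invented 50/50 combination: a weak language-model score and heavy
    # repetition both push the gibberish score up.
    return 0.5 * (1 - language_model_score(words)) + 0.5 * stuffing_score(words)

print(gibberish_score("search engine ranking quality content"))    # low (~0.23)
print(gibberish_score("cheap cheap cheap watches cheap watches"))  # high (~0.83)
```

Natural prose produces familiar n-grams and little repetition, so it scores low; spun or stuffed text fails on both counts and scores high.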

Although the patent does not explicitly state that this system is intended to penalize spun articles, these often contain a lot of gibberish and are therefore among the first to be penalized.

Keyword stuffing

The patent in question: “Detecting spam documents in a phrase based information retrieval system” (December 13, 2011) [2]

Keyword stuffing is one of the oldest “black hat” practices. It involves the unnecessary use of numerous keywords in order to improve the SEO of a piece of content.

At one time, many pages contained little or no useful information because they were just strings of keywords stuffed together, with little regard for the meaning of the sentences. Google’s algorithm updates have helped put a stop to this strategy.

The patent

The way Google indexes pages based on complete phrases is extremely complex. Addressing this patent (which is not the only one on this topic) is a first step toward understanding the impact of keywords on indexing.

Google’s system for understanding phrases can be broken down into three steps:

  1. The system collects the expressions used as well as statistics relating to their frequency and co-occurrence.
  2. It then classifies them as good or bad based on the frequency statistics it collected.
  3. Finally, it refines the list of expressions considered good, using a predictive measure built from the co-occurrence statistics.

The technology Google uses to accomplish these steps can be a headache! That’s why we’re going to get straight to the point.
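To keep it simple, here is a minimal Python sketch of those three steps. The corpus, the thresholds, and the co-occurrence measure are all invented for illustration; the patent’s actual statistics and predictive measure are far more sophisticated.

```python
from collections import Counter
from itertools import combinations

# Tiny made-up corpus standing in for the documents Google crawls.
docs = [
    "buy cheap shoes online today",
    "cheap shoes cheap shoes cheap shoes",
    "our guide compares cheap shoes on comfort and price",
]

# Step 1: collect expressions (here, single terms) with frequency and
# per-document co-occurrence statistics.
freq = Counter()
cooc = Counter()
for doc in docs:
    terms = set(doc.split())
    freq.update(terms)
    cooc.update(combinations(sorted(terms), 2))

# Step 2: classify expressions as good or bad from frequency alone
# (invented rule: a term must appear in more than one document).
good = {t for t, c in freq.items() if c > 1}

# Step 3: refine the good list with a co-occurrence measure (invented
# rule: keep terms whose total co-occurrence with other good terms
# reaches a threshold, a crude stand-in for the predictive measure).
def cooc_strength(term: str) -> int:
    return sum(c for pair, c in cooc.items()
               if term in pair and pair[0] in good and pair[1] in good)

refined = {t for t in good if cooc_strength(t) >= 2}
print(sorted(refined))  # ['cheap', 'shoes']
```

Once the system has this refined list of good expressions and their co-occurrence profile, a page that repeats an expression far outside those statistical norms stands out as a likely spam document.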

 
