The Ultimate Guide To Yandex Algorithms
When embarking on an international SEO campaign in Russia (and some countries in the Middle East and Eastern Europe), it’s impossible to look past one of Google’s biggest competitors, Yandex.
Yandex was founded in the year 1997 as Yandex Search, by CompTek.
Although its market share outside of Russia is small on the world stage (around 1.5% of global search), it holds a slim lead over Google in Russia with a 47.87% market share (in comparison to Google’s 47.33% by August 2017 estimates).
Recent news and court rulings in Russia have also gone against Google, setting a precedent that could potentially be used by the European Union to diminish Google’s power over Android devices.
So now is the opportune time to get to know one of Google’s biggest competitors a little better.
Yandex, like Google, has evolved since its inception in 1997 and has adapted to changes in user behaviour (switch from Desktop to Mobile), market forces, and has taken steps to limit the manipulation of search results from SEOs.
Yandex Mobile Algorithms
It’s estimated that Russia has almost 79-million smartphone users, and this is expected to grow to more than 93-million by 2021.
As a result, the mobile search experience and mobile SEO has taken heavy focus in the past couple of years.
In May 2017 the Russian Federal Antimonopoly Service ruled that Google’s default and restrictive search engine settings on Android devices were not in the best interest of the consumer and were anti-competitive.
As a result, Google was ordered to open up the Android platform and allow other search-engines to be selected as the device’s default.
Introduced in February 2016, Vladivostok changed Yandex’s core algorithm and how websites and content cater for the mobile user and mobile experience.
Since November 2015 Yandex had been tagging websites within search results as mobile friendly, which was only around 18-percent of Russian websites. This was an early signal to alert webmasters that they needed to begin making mobile first, responsive or mobile versions of their websites.
When Vladivostok was released, websites not yet mobile friendly weren’t automatically relegated down the rankings in favour of those that were. They did begin to see ranking fluctuations however, depending on the device a user was using.
While mobile friendliness is only one of hundreds of ranking factors, it is a critical factor in user experience.
Yandex Artificial Intelligence Algorithms
Yandex first entered the machine learning game in 2009, introducing MatrixNet. MatrixNet builds a formula based on thousands of variables and ranking factors, and weights them differently depending on the search query, intent and interpretations, thus returning more relevant search results.
This was also known as the Snezhinsk algorithm. For short queries with multiple common interpretations, non-commercial content began to outrank commercial content and the core algorithm began to take into account the domain ecosystem and value as a whole, rather than the value of individual pages.
Snezhinsk version 1.1 was released in March 2010 and worked to improve the search result quality for location-dependent queries.
On November 2, 2016 when the engine introduced the Palekh algorithm.
Palekh was designed to improve how the core algorithm understood and interpreted long tail search queries.
Using a series of neural networks, it worked to understand semantics and establish dominant and common interpretations of a long tail query to return search results that were relevant, even if the search terms themselves weren’t physically present in the content.
The Palekh algorithm was then further improved in August 2017 by the Korolyov update.
Speaking with Yandex’s International Communications Director, Melissa McDonald, I was able to get an explanation Andrey Styskin, the Head of Yandex Search, as to how Korolyov works.
Korolyov works to improve the foundations of Palekh, analysing more potential search results in a much more comprehensive fashion, in real-time, a lot faster.
Korolyov builds off of Palekh, Yandex’s first neural network based algorithm that works with the semantic side of queries and learns from users’ search behavior. Korolyov is able to match the meaning of a query with the meaning of pages, as opposed to the way Palekh used to work with headlines only. It also improves off the 150 pages Palekh was analyzing, by its ability to work with 200 000 pages at once.
Similar to RankBrain, Korolyov becomes more efficient with each incremental data point and its results feed into MatrixNet.
At the same time of the Korolyov announcement, Yandex also confirmed that MatrixNet had started to incorporate data from the crowdsourcing platform Yandex.
Toloka, as well as processing further anonymised user data to improve learning. As explained by Andrey:
The queries and clicks from the tens of millions of Yandex users will greatly contribute to Korolyov’s growing improvement. Since introducing Palekh, Yandex search quality has increased 2.8%. We are excited to see how Korolyov will impact the quality of Yandex search for our users too.
Like modern machine learning algorithms, Korolyov also takes steps to better understand a search querie’s dominant and common interpretation by conducting a “meaning analysis”:
Finally, Korolyov has improved semantic vectors that help it to conduct a ‘meaning analysis’, so the algorithm can also take into account the meaning of all queries that led users to a certain page and match them with every new query to that page. Every page is converted into semantic vectors during indexing, allowing Korolyov to work quickly with every new query and improve with the more queries it receives.
As technologies and networks begin to improve, as well as the increasing use of voice search, and the recent Russian Federal Antimonopoly Service settlement with Google, we could see Russia’s ~55 million Android smartphone users move away from Google (default search engine) to Yandex, or others.
Released in November 2018, and building on the previous Korolyov and Palekh updates, Andromeda included over one thousand improvements and features to Yandex’s search algorithms.
These notably included improvements such as enhancements to the “quick answer” functionality and the introduction of “experts” to aid users in getting better answers (quicker) for their queries and giving users an easier way to recognise the most accurate and relevant sites directly on search results pages through the allocation of official site badges.
Andromeda also saw the introduction of Yandex.Collections, a visual way for users to store and navigate content. Users have the ability to save search results in a visual tile format, whether they be links, images, movies, locations, etc… Logged in, a user can access these from mobile and desktop devices and also follow other, public collections – for example, if a user is interested in Zenit (St Petersburg), they can follow public collections for Зенит:
On December 17th, 2019, Yandex announced the Vega update.
The Yandex algorithm now uniquely trains neural networks through the knowledge of real-life subject matter experts to improve search results.
Vega also updated the search algorithms to group similar pages together using AI, meaning that through the clustering technique the resources required to crawl, store, and score the web can be used more efficiently, increasing Yandex’s search index to 200 billion documents.
As part of the Vega update, Yandex introduced Yandex.Q, a new question-and-answer service.
Yandex.Q makes use of the Yandex.Experts tool, which was introduced in the 2018 Andromeda update, and TheQuestion, a Q&A forum acquired by Yandex in early 2019 to form the new service.
Yandex.Q contains more than one million questions and answers from a gamut of subject matter experts. Users will type questions as normal in the Yandex.ru search bar, and Q answers will appear at the top of search results (akin to featured snippets in Google). The example given by Yandex in the Vega blog post is:
For example, someone looking for information on Alexander Pushkin can see answers from a literary critic, or a search on the behaviors of seals will result in a response from the head of the National Arctic and Antarctic Museum.
Yandex Link Based Algorithms
Like Google, Yandex’s algorithm was once subject to abuse from link spammers and PBNs.
They have however taken a number of steps to prevent such manipulation and poor quality results ranking in positions that they don’t deserve.
Yandex’s first venture into controlling link spam came in 2005 with the Nepot Filter (unofficial name).
This was in response to the increasing volumes of link exchanges and PBNs, where link volume was used to judge search rankings (similar to a pre-Penguin Google).
Rather than link quality, this filter imposed a penalty on the links themselves and specifically sought out websites that acquired big volumes of unnatural links in short spaces of time.
In March 2008 there was then an unnamed algorithm update that appeared to further tackle link spam.
Ranking Without Links
In December 2013, the then head of search Alexander Sadovsky announced that the company was preparing an algorithm without links being weighted or taken into account.
This was then released as a beta in Moscow for verticals such as real estate, tourism and consumer household appliances.
At the ByNet Week conference in 2015, Sadovsky announced a new, link based algorithm – Minusinsk.
Immediately after the announcement, thousands of websites received a notification through Yandex.Webmaster that they should stop using link spam and other manipulative methods to influence search results.
The Minusinsk roll-out saw three key events; May 15, 27, and June 23.
Yandex Local Algorithms
Local SEO in Russia and Yandex works a little differently to how Google handles similarly large countries, such as the united states.
In 2006 the Yandex algorithm introduced an automatic geo-classification of websites, so users got more relevant, local queries better matching some search intents.
Arzamas & Konakovo
The first named local algorithm, following the Magadan updates of 2008, came in April 2009, Arzamas.
The 2006 update enabled users to manually determine if they only saw results from their own region, where Arzamas made this decision automatically based on whether or not the query needed local or national results.
Webmasters became able to set regions within their Yandex.Webmaster tools and regional search results pages were introduced in Moscow, Saint-Petersburg, Ukraine, Belarus and Kazakhstan.
Arzamas was updated further in June and August 2009, where the ranking formulas were improved for cities such as Yekaterinburg, and older (trusted) websites gained higher rankings for non-location dependent queries. In December 2009, the Konakovo update expanded Yandex’s local ranking factors to 1,250 cities.
In September 2010 Yandex further improved the algorithm’s ability to detect which region a website was based in, even if the webmaster had neglected to indicate a region in Yandex.Webmaster. The Obninsk algorithm also had a negative impact on websites using low quality link spam.
In 2012, the local results algorithm was expanded to incorporate image results, so a user would see different results for an image query depending on if they were in Russia, Ukraine, Belarus or Kazakhstan.
Yandex Content & Quality Algorithms
Over many years, like Google, Yandex has taken a number of steps to ensure that users are presented with quality content that matches their search intents.
Unnamed 2007 Update
In July 2007 Yandex began introducing new ranking and factor weighting formulas for single, and multi-word queries. It also marked the start of Yandex search team support.
This was followed in 2008 by the first officially named Yandex algorithm, 8 SP1. At this point in time, typically, the top ten results were dominated by older, larger websites, and 8 SP1 introduced “Trust Rank” to determine to credibility (not age) of the website as a ranking factor. This also took some of the weighting off backlinks as a factor.
Magadan was introduced in May 2008, four months after 8 SP1, as Yandex learned how to interpret abbreviations. Magadan also came with a “test version”, and webmasters could visit the now defunct buki.yandex.ru, test the search results, and leave feedback for the Yandex search team.
Version 2.0 of Magadan came two months later in July and took into account the uniqueness of content, and the search engine began to understand the difference between commercial/non-commercial and local/national queries.
Nakhodka was introduced in September (again, 2008), as Yandex worked on promoted websites’ internal pages within search results, rather than just the homepage. Nakhodka also aimed to tackle the issue of cloaking much more aggressively.
The AGS filter was officially mentioned in September 2008, a filter Yandex claimed it had been working on since 2006. AGS was a penalty, and if you were hit by it Yandex only showed one to ten pages of your whole website within its search results (this was changed in 2014 so that there were no more page restrictions).
The first iteration of the AGS filter mainly tackled websites with low quality, duplicate content. This was updated in December 2009 (AGS 30) to also penalise websites that contained unique, but very low quality content that offers no user value.
AGS 40 was updated in November 2013 and targeted websites that were built to drive affiliate link clicks or advertising revenues through impressions. This quality update isn’t too dissimilar from the Google Fred update of 2017.
The next major update of AGS came in 2015, where Yandex focused on websites selling and placing links.
2010 saw the introduction of Spectrum in an event known as the Krasnodar update.
Spectrum divides queries into 60 different semantic categories and then calculates the weightings and proportions in which the answers should be presented to the user, diversifying the top ten search results.
So for a generic commercial product query, a user may see eCommerce stores, informational sites (such as Wikipedia), blogs, and forums rather than just commercial product websites.
This essentially covers all common interpretations of a query and improves the likelihood of a user finding the search result they want.
Reykjavik & Kaliningrad
In 2011 the Reykjavik update based search results on a user’s browser language and search language preference, it can be argued that this was Yandex’s first step towards user level SERP personalisation.
This was then expanded in 2012 by the Kaliningrad update, which took into account a user’s search history and behaviours on search results pages.
You Are Spammy
2011 was also the year Yandex introduced the You Are Spammy filter that specifically targeted over optimised, spammy, keyword stuffed text.
Website usability also became a factor, and a specific algorithm for e-commerce website usability, content levels and trust was also rolled out after an initial beta test in the Moscow region.
Obtrusive & Fake Pop-Ups
In May 2012 Yandex took steps in tackling websites using fake pop-ups, such as Windows or social media notifications.
This was further updated in 2014 to demote websites within organic search results with adult or obscene advertisements, and those with aggressive and obtrusive adverts.
Search Result Manipulation
Similar to how Google’s algorithms use search result user behaviour data to adjust the results displayed, Yandex does the same. Towards the end of 2014 Russian link networks performed a “link strengthening” service, which was becoming a real problem.
They would manipulate search results user data by excessively clicking on client results, or websites within their network linking to client websites under the premise it would improve the performance of the client site, directly and indirectly.
Introduction of the website quality index and removal of the TIC indicator
In August 2018 Yandex decided to move away from the TIC (Thematic Citation Index), which was based on a qualitative assessment of inbound links to a website, and replace it with the website quality index (IKS in Russian).
In calculating the new index, Yandex uses data from a wider variety of sources including Metrica, Maps, Zen et al to better understand what the website represents.
Last updated: December 2019