Advanced HREFLANG: Multilingual SEO for locales and regions in a single country
Often, when people talk about international SEO, they focus on expanding into new markets, countries and languages to increase their market share.
However, there can also be great gains in optimising for another language within your own country.
Even then, it’s not always straightforward, as there are regional differences and dialects to take into account.
One of the biggest opportunities for single-country language optimisation is the United States, due to the vast number of people who speak Spanish as a first language.
In fact, approximately 40 million people in the United States speak Spanish as their primary language, representing around 12 per cent of the total population. Although New Mexico has no official state language, Spanish has long had a recognised place in its government and public life.
In terms of size, if this Spanish-speaking population were a European country, it would be the 9th largest by population.
Regional variations in country codes
When implementing HREFLANG, and optimising your website correctly for international SEO, it’s important that you use the correct ISO country and language codes, as well as regional variants.
That being said, a single language such as Spanish can be quite complex, as the table below shows:
Language Code | Language Code Meaning |
es-419 | Spanish as used in Latin America and the Caribbean |
es-us | Spanish for the United States |
es | Spanish |
es-mx | Spanish as used in Mexico |
es-xl | Spanish as used in Latin America |
es-es | Spanish as used in Spain |
es-ar | Spanish as used in Argentina |
es-pe | Spanish as used in Peru |
es-la | Spanish as used in Laos (LA is the ISO 3166-1 code for Laos; it is often used in error for Latin America) |
es-co | Spanish as used in Colombia |
es-do | Spanish as used in the Dominican Republic |
es-ec | Spanish as used in Ecuador |
Users with all of these language settings visited a single American website in 2017, one that actually targets Mexican customers living in the United States.
The company is based in California, where around 28% of the population speaks Spanish. Even allowing for that regional variation, the data shows that optimising for just “es” within the langtag might not always be the best solution: this website’s primary source of traffic was users whose browser language was set to es-419.
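To see which language settings your own visitors actually use, you can tally the preferred tag from each request's Accept-Language header. The following is a minimal sketch, not a definitive implementation: the function names and the sample headers are hypothetical, and it assumes you already have the raw header values available (for example from server logs).

```python
from collections import Counter


def primary_language_tag(accept_language: str) -> str:
    """Return the highest-weighted tag in an Accept-Language value, lowercased.

    When several tags share the top weight, the first one listed wins.
    """
    best_tag, best_q = "", -1.0
    for part in accept_language.split(","):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q value has weight 1.0
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as lowest priority
        if q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def tally_language_tags(headers):
    """Count how often each preferred language tag appears."""
    return Counter(primary_language_tag(h) for h in headers if h.strip())


# Hypothetical sample data, for illustration only.
sample_headers = [
    "es-419,es;q=0.9,en;q=0.8",
    "es-MX,es;q=0.9",
    "es-419,es;q=0.8",
    "es-US,es;q=0.9,en;q=0.7",
    "es",
]
language_counts = tally_language_tags(sample_headers)
```

With the sample data, `language_counts.most_common(1)` reports es-419 as the most frequent setting, which is the kind of finding described above.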
Langtags, regional subtags, variants and extensions
A language tag, or langtag, is made up of a number of subtags, including:
- language (“en”, “es”, “zh”, or a registered value)
- script (“Latn”, “Cyrl”, or other ISO 15924 codes)
- region (ISO 3166 codes, or UN M.49 codes)
- variant (such as “1901” or “rozaj”)
- extension (single letter followed by additional subtags)
Region subtags are mostly based on ISO 3166-1 codes and can indicate the country, or regional variation.
The region code can also include UN M.49 region codes (such as the 419 in es-419).
UN M.49 codes (United Nations M.49 codes) cover larger geographical areas, and they also provide a fallback where an ISO 3166 code would conflict with an existing subtag, for example when a code has been reassigned in the registry.
ISO 3166-1 does in fact rely on UN M.49 to define what is, and isn’t, a geographical area (country or region) worthy of its own code.
This means that the variations for language targeting can be expanded from the traditional “es” and “es-419”, to be more specific and targeted with HREFLANG implementations:
Tag | Form | Meaning |
en | language | English |
pt-BR | language-region | Portuguese as used in Brazil |
es-419 | language-region (UN M.49) | Spanish as used in Latin America and the Caribbean |
de-CH-1901 | language-region-variant | German as used in Switzerland, orthography of 1901 |
ru-Cyrl | language-script | Russian as written in Cyrillic |
ru-Cyrl-CS | language-script-region | Russian as written in Cyrillic as used in Serbia and Montenegro |
sl-Latn-IT-rozaj | language-script-region-variant | Slovenian as written in Latin as used in Italy, Resian dialect |
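The forms in the table can be recognised from the shape of each subtag: four letters for a script, two letters or three digits for a region, and longer or digit-led subtags for variants. The sketch below illustrates that idea under some loud assumptions: `parse_langtag` is a hypothetical helper, it only checks subtag shape (it does not validate against the IANA subtag registry), and it deliberately ignores extlang, extension and private-use subtags.

```python
import re


def parse_langtag(tag: str) -> dict:
    """Split a language tag into language, script, region and variants.

    Shape-based only: no registry lookup, and no support for extlang,
    extension or private-use subtags (those raise ValueError here).
    """
    subtags = tag.split("-")
    parsed = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
    }
    for sub in subtags[1:]:
        nothing_after_script = parsed["region"] is None and not parsed["variants"]
        if (re.fullmatch(r"[A-Za-z]{4}", sub)
                and parsed["script"] is None and nothing_after_script):
            parsed["script"] = sub.title()      # e.g. Latn, Cyrl
        elif (re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", sub)
                and nothing_after_script):
            parsed["region"] = sub.upper()      # e.g. CH, 419
        elif re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", sub):
            parsed["variants"].append(sub.lower())  # e.g. 1901, rozaj
        else:
            raise ValueError(f"Unrecognised subtag {sub!r} in {tag!r}")
    return parsed


example = parse_langtag("sl-Latn-IT-rozaj")
```

Here `example` holds the language "sl", the script "Latn", the region "IT" and the single variant "rozaj", matching the last row of the table.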
Constructing language tags
When constructing language tags, you should follow the format below:
{language}-{extlangtag}-{script}-{region}-{variant}-{extension}
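As an illustration of that ordering, the sketch below assembles a tag from its parts and applies the conventional casing (lowercase language, title-case script, uppercase region). The helper name `build_langtag` and its parameters are hypothetical, and it performs no validation of the values you pass in, so treat it as a starting point rather than a complete implementation.

```python
def build_langtag(language, extlang=None, script=None, region=None,
                  variants=(), extensions=()):
    """Join subtags in the order language-extlang-script-region-variant-extension.

    No validation is performed; callers are assumed to pass sensible values.
    """
    parts = [language.lower()]
    if extlang:
        parts.append(extlang.lower())
    if script:
        parts.append(script.title())   # e.g. "latn" -> "Latn"
    if region:
        parts.append(region.upper())   # e.g. "ch" -> "CH"; "419" is unchanged
    parts.extend(v.lower() for v in variants)
    parts.extend(e.lower() for e in extensions)
    return "-".join(parts)


swiss_german_1901 = build_langtag("de", region="ch", variants=["1901"])
latin_american_spanish = build_langtag("es", region="419")
```

In this example `swiss_german_1901` is "de-CH-1901" and `latin_american_spanish` is "es-419".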
This is an important practice in larger countries such as Russia or China, where dialects can vary greatly.
Because of how Yandex ranks and forms search results pages (personalised for users), if you only want to target specific population areas, you may want to consider dialects as a part of your strategy for more focused targeting.
That is, unless you’re willing to pay for air shipping to Khabarovsk or Vladivostok from Western Europe.
Conclusion
While narrow targeting helps you reach exactly the users you want, the more specific you make your HREFLANG targeting, the fewer matches you will have with general browser and language settings.
For example, de will match with de, de-CH, de-CH-1901, de-CH-1996, de-AT, de-DE, de-1901, de-AT-1901; but de-CH will only match de-CH, de-CH-1901, de-CH-1996.
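The matching behaviour in this example can be reproduced with simple prefix matching, where a range matches a tag if the two are equal or the tag starts with the range followed by a hyphen. This is the style of matching described as basic filtering in RFC 4647; whether any particular browser or search engine matches in exactly this way is an assumption, and the function names below are hypothetical.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Case-insensitive prefix match on whole subtags."""
    rng, candidate = language_range.lower(), tag.lower()
    return rng == "*" or candidate == rng or candidate.startswith(rng + "-")


def matching_tags(language_range, tags):
    """Return the tags matched by the given range, in their original order."""
    return [t for t in tags if range_matches(language_range, t)]


german_tags = ["de", "de-CH", "de-CH-1901", "de-CH-1996",
               "de-AT", "de-DE", "de-1901", "de-AT-1901"]

broad = matching_tags("de", german_tags)      # all eight tags
narrow = matching_tags("de-CH", german_tags)  # ['de-CH', 'de-CH-1901', 'de-CH-1996']
```

Appending the hyphen before comparing prefixes is what stops "de" from accidentally matching an unrelated tag that merely begins with the same letters.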
This is where business intelligence and aims need to play an important role.
Being specific, and targeting Swiss German (de-CH), means that the content will not match users whose settings are plain German (de) or Austrian German (de-AT).
These language codes are important not only for getting HREFLANG correct, but also for other technical aspects such as specifying the language in the Content-Language HTTP response header, or in the HTML lang attribute.
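To show how one tag feeds all three places, the sketch below generates the hreflang link elements, the Content-Language header and the opening html element from the same values. The helper names and the example.com URLs are hypothetical, and the choice of tags is for illustration only.

```python
from html import escape


def hreflang_links(alternates: dict) -> list:
    """Build one <link rel="alternate"> element per language tag and URL."""
    return [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(tag, quote=True), escape(url, quote=True))
        for tag, url in alternates.items()
    ]


def content_language_header(tag: str) -> tuple:
    """Return the (name, value) pair for the HTTP response header."""
    return ("Content-Language", tag)


def html_open_tag(tag: str) -> str:
    """Return the opening html element carrying the lang attribute."""
    return '<html lang="{}">'.format(escape(tag, quote=True))


# Hypothetical URLs, for illustration only.
alternates = {
    "es-419": "https://example.com/es-419/",
    "es-MX": "https://example.com/es-mx/",
    "en": "https://example.com/",
}
links = hreflang_links(alternates)
header = content_language_header("es-419")
opening = html_open_tag("es-419")
```

Generating all three from a single source keeps the tags consistent, which matters because a mismatch between the hreflang value and the page's own declared language is easy to introduce by hand.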