Research: Can service workers solve technical SEO problems?
This article represents a write-up of our technical research paper, which won the TechSEO Boost 2018 Call For Research competition, alongside entries from 2018 Drum USA Search Personality of the Year Eric Enge, and French software company OVH.
Synopsis
Over the past few years we’ve noticed an increasing trend of large brands being unable to implement certain elements of technical SEO that we often take for granted when aiming for best practice and technical excellence. Notably in the past 18 months we’ve encountered:
- An established, international travel brand with a multi-million dollar turnover unable to implement Hreflang, and other elements, with duplicate English content between versions and incorrect versions ranking in target markets.
- A cloud services tech company migrating its URLs to a new architecture on GitHub Pages, which doesn’t support redirects.
- A retail brand with both an online store and 150 physical stores, looking to embark on a full PPC campaign but unable to disallow PPC landing pages in the robots.txt, or implement page level robots tags.
These issues have come about through a mixture of legacy platform issues, lack of SEO guidance during previous development phases, and lengthy and unproductive bureaucratic internal signoff processes.
As a result, each of these businesses was facing performance issues and was unable to enact the key changes needed to resolve them.
On the face of it, there were three options:
- Do nothing.
- Re-engineer and rebuild on a new, modern tech stack and platform at a significant cost (time and money).
- Proceed with making changes, not to best practice guidelines, and risk further issues.
Our research was born from the need for a fourth solution, to enable businesses to implement things like Hreflang, redirects and page level scripts, circumventing the obstacles posed.
Research hypothesis
The purpose of this research is to establish whether or not serverless technology can provide workable and accessible solutions to businesses. In summary:
- Using the latest serverless technology, can we implement technical elements of SEO, such as basic Hreflang, redirects, and page level meta robots tags, through service workers?
- If the first is successful, can Google pick up the implementations in the rendered HTML of the page (making them useful)?
Through serverless technology, specifically service workers, we are attempting to modify page level content (HTML, JavaScript, CSS) while it streams from the backend, with the least possible latency impact from the processing. To achieve this, all data is processed in streaming mode, with minimal buffering during the search/replace phase. To reduce the impact on garbage collection (GC) and memory allocation, all data processing is performed on byte-strings.
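As a rough illustration of what streaming search/replace on bytes can look like, the sketch below replaces a byte sequence across chunk boundaries while holding back only a few bytes at a time. It is a minimal sketch and not the code used in the research: the names `createStreamReplacer`, `indexOfBytes`, `concatBytes` and `toTransformStream` are our own for this example, and the last helper assumes a runtime whose `TransformStream` accepts a custom transformer.

```javascript
// Find `needle` in `haystack` (both Uint8Array), starting at `from`.
function indexOfBytes(haystack, needle, from = 0) {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Join two byte arrays into a new one.
function concatBytes(a, b) {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}

// Hypothetical helper: replaces every occurrence of `needle` with
// `replacement`, even when the needle is split across two chunks.
function createStreamReplacer(needle, replacement) {
  if (needle.length === 0) throw new Error("needle must not be empty");
  let tail = new Uint8Array(0); // bytes held back between chunks
  return {
    push(chunk) {
      const data = concatBytes(tail, chunk);
      const out = [];
      let pos = 0;
      let hit;
      while ((hit = indexOfBytes(data, needle, pos)) !== -1) {
        out.push(data.subarray(pos, hit), replacement);
        pos = hit + needle.length;
      }
      // Hold back at most needle.length - 1 bytes, since they could be
      // the start of a match completed by the next chunk.
      const keepFrom = Math.max(pos, data.length - (needle.length - 1));
      out.push(data.subarray(pos, keepFrom));
      tail = data.slice(keepFrom);
      return out;
    },
    flush() {
      const rest = tail;
      tail = new Uint8Array(0);
      return [rest];
    },
  };
}

// Optional wiring, e.g. response.body.pipeThrough(toTransformStream(r)).
function toTransformStream(replacer) {
  return new TransformStream({
    transform(chunk, controller) {
      for (const part of replacer.push(chunk)) {
        if (part.length) controller.enqueue(part);
      }
    },
    flush(controller) {
      for (const part of replacer.flush()) {
        if (part.length) controller.enqueue(part);
      }
    },
  });
}
```

The amount of data buffered at any moment is bounded by the length of the search term, which is what keeps the added latency and memory use small.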
Many large businesses still rely on legacy technology, which in itself is not necessarily a problem as long as the websites function.
However, problems arise when attempting to upgrade or adapt the platform to meet the needs of the business, such as implementing redirects to move to HTTPS, or implementing Hreflang to accompany international expansion.
In some cases, many of these implementations are retrospective because search engine optimization consultants weren’t consulted earlier, and development success was judged on whether or not the site worked.
Our research found that we were able to implement Hreflang and page level scripts that Google was able to view and process in the rendered HTML, as well as redirects and IP redirects through service workers.
These solutions give webmasters and development teams the ability to modify content with painless code deployment and minimal associated DevOps costs. An alternative with similar results would be to maintain your own reverse proxy, such as Nginx, with content transformation written in Lua.
The principle behind what we’re trying to achieve is code generation: given a configuration, input data and a specific runtime, we generate a self-sufficient worker bundle that can be deployed with no DevOps.
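To make the code generation idea more concrete, here is a minimal sketch of a generator that turns a configuration object into worker source code. It is an assumption about how such a generator could look, not the tool built for this research; `generateRedirectWorker` and the example paths are hypothetical.

```javascript
// Hypothetical generator: takes a plain object of old path -> new path
// and returns the source of a self-contained worker as a string.
function generateRedirectWorker(redirects) {
  // JSON.stringify produces a valid JavaScript literal for plain data,
  // so the configuration is baked directly into the bundle.
  const table = JSON.stringify(redirects, null, 2);
  return [
    `const REDIRECTS = ${table};`,
    `addEventListener("fetch", (event) => {`,
    `  const url = new URL(event.request.url);`,
    `  const target = REDIRECTS[url.pathname];`,
    `  event.respondWith(`,
    `    target`,
    `      ? Response.redirect(new URL(target, url).toString(), 301)`,
    `      : fetch(event.request)`,
    `  );`,
    `});`,
  ].join("\n");
}

// Example: print the generated bundle for one illustrative rule.
console.log(generateRedirectWorker({ "/old-page/": "/new-page/" }));
```

Because the output has no external dependencies, the generated text can be pasted or uploaded as a worker script as-is.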
Research procedures and methodology
In order to test our hypothesis, we used a mixture of real-world applications and testing websites. The participant websites within this research were:
- A personal website – testing Hreflang implementations injected via Service Worker into the <head>
- A San Francisco based OS API gateway provider – using service workers to implement 301 redirects following a migration
For the purposes of these experiments we utilized Cloudflare Workers:
https://www.cloudflare.com/products/cloudflare-workers/
Test One – Hreflang Implementations through the <head>
In order to test whether or not Hreflang could be implemented, and picked up by Google as part of the rendered HTML, we used my personal site https://dantaylor.online (WordPress) for this test, with the language options being:
URL Path | Target Language |
https://dantaylor.online | “en” – Non-regionally targeted English |
https://dantaylor.online/ru/ | “ru” – Non-regionally targeted Russian |
https://dantaylor.online/fr/ | “fr” – Non-regionally targeted French |
We first put the website onto Cloudflare by changing the DNS, and then set about creating a serverless application to augment the current site with a Hreflang injection into the <head>.
Process:
- Defining the Hreflang implementation required for the URL
- Configuration of the worker bundle to represent the necessary Hreflang implementation
- Deployment of worker bundle to inject Hreflang before the </head>, and removal of existing Hreflang tags implemented through other means
In order to make this process “accessible” to non-developers, the input method needs to be simple. So we’ve developed this to take a CSV input file.
In this example for the test site homepage, we enter one “page” per row, with each language variation in its own column. From this the worker bundle will inject the correct Hreflang output:
<link rel="alternate" href="https://dantaylor.online/fr/" hreflang="fr" />
<link rel="alternate" href="https://dantaylor.online/" hreflang="en" />
<link rel="alternate" href="https://dantaylor.online/ru/" hreflang="ru" />
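The process above can be sketched roughly as follows. This is a simplified illustration rather than the worker bundle used in the test: the CSV layout (a `url` column followed by one column per language) is an assumption, the function names are our own, and for brevity it reads the whole HTML body into a string instead of transforming it as a stream.

```javascript
// Assumed CSV layout: first column is the page URL, then one column per language.
const CSV = `url,en,ru,fr
https://dantaylor.online/,https://dantaylor.online/,https://dantaylor.online/ru/,https://dantaylor.online/fr/`;

// Parse the CSV into a Map of page URL -> [{ lang, href }, ...].
function parseHreflangCsv(csv) {
  const [headerLine, ...rows] = csv.trim().split(/\r?\n/);
  const langs = headerLine.split(",").slice(1).map((s) => s.trim());
  const pages = new Map();
  for (const row of rows) {
    const cells = row.split(",").map((s) => s.trim());
    const alternates = langs
      .map((lang, i) => ({ lang, href: cells[i + 1] }))
      .filter((alt) => alt.href);
    pages.set(cells[0], alternates);
  }
  return pages;
}

// Build one <link> tag per language variation.
function buildHreflangTags(alternates) {
  return alternates
    .map((a) => `<link rel="alternate" href="${a.href}" hreflang="${a.lang}" />`)
    .join("\n");
}

// Remove existing Hreflang tags, then inject the new ones before </head>.
function injectHreflang(html, tags) {
  const existing = /<link\b[^>]*\bhreflang\s*=[^>]*>\s*/gi;
  return html
    .replace(existing, "")
    .replace(/<\/head>/i, () => `${tags}\n</head>`);
}

// Fetch the origin page and rewrite it only if it is HTML with a CSV entry.
async function handle(request, pages) {
  const response = await fetch(request);
  const alternates = pages.get(request.url);
  const type = response.headers.get("content-type") || "";
  if (!alternates || !type.includes("text/html")) return response;
  const headers = new Headers(response.headers);
  headers.delete("content-length"); // body length changes after injection
  const html = await response.text();
  return new Response(injectHreflang(html, buildHreflangTags(alternates)), {
    status: response.status,
    headers,
  });
}

// Worker wiring (service-worker syntax); skipped outside a Workers runtime.
if (typeof addEventListener === "function") {
  const pages = parseHreflangCsv(CSV);
  addEventListener("fetch", (event) => {
    event.respondWith(handle(event.request, pages));
  });
}
```

Keeping the parsing and injection as plain functions means they can be checked locally before anything is deployed to live traffic.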
Test Two – Implementing redirects
For this test we took redirects on a live website that we would typically implement through .htaccess or on the server, and implemented them through service workers due to the limitations of the client platform (GitHub Pages).
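A minimal sketch of how a worker can issue such redirects is shown below. The redirect rules and the `resolveRedirect` helper are hypothetical and only illustrate the approach; they are not the client's actual rules.

```javascript
// Hypothetical redirect rules: old path -> new path.
const REDIRECTS = new Map([
  ["/old-docs/", "/docs/"],
  ["/blog/2017/launch.html", "/blog/launch/"],
]);

// Return the absolute destination URL for a request, or null if no rule matches.
function resolveRedirect(requestUrl, redirects) {
  const url = new URL(requestUrl);
  const target = redirects.get(url.pathname);
  if (!target) return null;
  // Resolve against the request URL so relative targets become absolute,
  // and carry the original query string over.
  const destination = new URL(target, url);
  destination.search = url.search;
  return destination.toString();
}

// Worker wiring (service-worker syntax); skipped outside a Workers runtime.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    const destination = resolveRedirect(event.request.url, REDIRECTS);
    event.respondWith(
      destination ? Response.redirect(destination, 301) : fetch(event.request)
    );
  });
}
```

Requests without a matching rule are passed through to the origin untouched, so only the migrated URLs are affected.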
Results
The below table summarizes the results seen:
Test Conducted | Test Results |
Hreflang implementations in the <head> | Successful |
301 redirects following an HTTPS migration | Successful |
The first experiment, implementing Hreflang tags through service workers, was successful: the injected Hreflang was discovered by Google within its rendered HTML.
The check was performed using the Mobile-Friendly Test as advised by John Mueller. Searching for the site within Google.fr and Google.ru also presents the desired alternate version for the market.
The second test, implementing 301 redirects on GitHub Pages, was also successful. Organic performance and rankings have adjusted as they would following a normal migration, with the worker redirects behaving like standard 301 redirects.
Conclusion
Using service worker technology (in this experiment Cloudflare Workers) is a viable way of implementing technical “fixes” and basics, bypassing the limitations of legacy technology stacks, platform restrictions and heavily congested development queues.
There are pros and cons to using Cloudflare Workers and stream transformation in this way, but we believe the pros outweigh the cons and can help large, legacy Frankenstein-esque platforms overcome a lot of obstacles in terms of development for technical SEO best practice.
Pros of using the service worker technology to implement these changes:
- Workers are written in JavaScript
- Simple API
- One-click deployment
- Next to zero DevOps
Cons of using workers/stream transformation in this way:
- Potential to affect and impact all requests
- Potential to add additional latency and slow page load times, depending on implementation
- Has the potential to introduce front-end bugs that are difficult to debug, when there is little/no access to the backend, and when it is unclear what is being modified/injected through stream transformation.
From these experiments, the principle of self-sufficient worker bundles can also be extended to implementing:
- JSON-LD schema
- Page level meta robots tags
- JavaScript overlays (which serve dynamic content based on user IP detection)
- Geo-IP redirects
- HTTP header manipulation (such as returning a 451 code for EU IP users if your site isn’t GDPR ready)
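As one example of these extensions, the sketch below shows how a worker might return a 451 status based on the visitor's country. It is illustrative only: the country list is deliberately incomplete, `statusForCountry` is a hypothetical helper, and it assumes Cloudflare's IP geolocation is enabled so that the `CF-IPCountry` request header is present.

```javascript
// Illustrative and deliberately incomplete list of country codes (an assumption).
const BLOCKED_COUNTRIES = new Set(["DE", "FR", "IE", "NL"]);

// Return 451 for blocked countries, or null to let the request through.
function statusForCountry(country) {
  return BLOCKED_COUNTRIES.has(country) ? 451 : null;
}

// Worker wiring (service-worker syntax); skipped outside a Workers runtime.
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => {
    // CF-IPCountry is added by Cloudflare when IP geolocation is enabled.
    const country = event.request.headers.get("CF-IPCountry");
    const status = statusForCountry(country);
    event.respondWith(
      status
        ? new Response("Unavailable For Legal Reasons", { status })
        : fetch(event.request)
    );
  });
}
```

The same pattern, a small decision function plus a thin fetch handler, would also apply to Geo-IP redirects by returning a redirect response instead of a 451.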