How do you tidy .htaccess redirects?
A .htaccess file is a per-directory configuration file for the Apache web server. It lets you enable or disable the server’s features and functionality and, crucially for SEO purposes, manage your website redirects directly.
By listing website redirects in .htaccess you can ensure that visitors are sent to the best place if they attempt to visit a page that has been moved or deleted, using standard methods such as a 301 (Moved Permanently) redirect status code.
This has short-term applications, for example during redevelopment projects and site structure updates, as well as long-term applications for content that is permanently moved but still attracts traffic to its old URL via inbound links.
Over time, multiple successive projects can lead to .htaccess files becoming unwieldy and out of date – so it’s important to tidy them up from time to time, to keep things manageable and avoid errors.
How .htaccess files get messy
There are many different ways for your .htaccess file to get messy over time:
- Legacy redirects
- Past server migrations
- Obsolete and deleted pages
- Expired content
- Protocol switches (e.g. HTTP to HTTPS)
A particular problem arises when you redirect a deleted page to a new URL, later delete that new page and redirect it to another URL, and so on. This creates redirect chains that search engine robots may not follow in full, and each extra hop slows things down for visitors.
Inconsistencies in the way your redirect rules are handled can also lead to confusion and messiness in your .htaccess file, for example, if URLs are treated differently with or without a trailing slash.
How to start tidying .htaccess
A good way to start a .htaccess audit is by checking your Analytics and Search Console data for current crawl errors, so you can prioritise these.
There are various tools to do this via Search Console, or you can analyse your own server logs for Googlebot requests.
Large lists can be confusing, so check for duplicated data and break down what’s left into manageable chunks, for example by splitting URLs at folder-level.
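As a rough illustration of that deduplicate-and-split step, here is a minimal Python sketch. It assumes you have exported your crawl errors as a plain list of URLs; the function name and the example.com URLs are hypothetical, not part of any particular tool.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_folder(urls):
    """Deduplicate URLs and group their paths by top-level folder."""
    groups = defaultdict(list)
    seen = set()
    for url in urls:
        path = urlsplit(url).path or "/"
        if path in seen:
            continue  # skip exact duplicate paths
        seen.add(path)
        segments = [s for s in path.split("/") if s]
        # Paths with a parent folder (or a folder index such as /blog/)
        # are filed under that folder; everything else is root level.
        in_folder = len(segments) > 1 or (segments and path.endswith("/"))
        folder = "/" + segments[0] + "/" if in_folder else "/"
        groups[folder].append(path)
    return dict(groups)


# Hypothetical crawl-error export
crawl_errors = [
    "https://example.com/blog/old-post",
    "https://example.com/blog/old-post",
    "https://example.com/jobs/designer-2019",
    "https://example.com/about-us",
]
grouped = group_by_folder(crawl_errors)
```

Each key in `grouped` is then a smaller batch you can work through on its own.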
What status codes are interesting for .htaccess?
When checking your data, there are a few HTTP status codes to look out for:
- 404 Not Found
- 301 Moved Permanently
You should also look out for any of the following, to give you an idea of any new page redirects you might need to set up:
- 3xx redirects
- 4xx client errors
- 5xx server errors
If you have an existing .htaccess file (or at least, one that already contains redirects), it’s a good idea to keep a copy of each existing 301 rule for reference too.
You can remove 1xx and 2xx statuses from your data – these generally just indicate that a server request has been received, understood and actioned, and are unlikely to tell you anything useful with respect to .htaccess redirects.
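If you are working from your own server logs, a short Python sketch like the one below could do this filtering for you. It assumes your logs use Apache’s common or combined log format, and it matches Googlebot by user agent string only, which does not prove the request really came from Google. The function name and sample lines are made up for illustration.

```python
import re

# Matches the request and status part of a common/combined log line,
# e.g. "GET /blog/old-post HTTP/1.1" 404 512
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def interesting_requests(log_lines, user_agent_hint="Googlebot"):
    """Return (path, status) pairs for 3xx, 4xx and 5xx responses."""
    results = []
    for line in log_lines:
        if user_agent_hint not in line:
            continue  # not a (claimed) Googlebot request
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # line is not in the assumed format
        status = int(match.group("status"))
        if status >= 300:  # drop 1xx and 2xx responses
            results.append((match.group("path"), status))
    return results


GOOGLEBOT = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
sample_log = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/old-post HTTP/1.1" 404 512 "-" ' + GOOGLEBOT,
    '66.249.66.1 - - [10/Oct/2023:13:56:01 +0000] "GET /blog/ HTTP/1.1" 200 2048 "-" ' + GOOGLEBOT,
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    '66.249.66.1 - - [10/Oct/2023:13:58:40 +0000] "GET /old-page HTTP/1.1" 301 0 "-" ' + GOOGLEBOT,
]
errors = interesting_requests(sample_log)
```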
Tasks to tidy up old .htaccess redirects
It’s useful to import your old .htaccess file into software such as Excel, where you can identify errors more easily and make edits relatively safely.
Add any newly discovered errors to your existing rules. If you want your .htaccess rules to be in alphabetical order or to match the logical structure of your site, this is a good time to do that – just remember to sort all the columns at once to keep your rows aligned.
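If you would rather not copy rules into a spreadsheet by hand, a sketch along these lines could produce a CSV file that Excel will open. It assumes your rules use the simple four-part form `Redirect <status> <old-path> <new-URL>`; rules in other forms, such as RewriteRule lines, are skipped here and would still need checking manually. The function names are hypothetical.

```python
import csv
import io

FIELDS = ["line", "status", "source", "target"]


def parse_redirects(htaccess_text):
    """Collect simple four-part Redirect rules as dictionaries."""
    rows = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        parts = line.split()
        # Apache directive names are case-insensitive
        if len(parts) == 4 and parts[0].lower() == "redirect":
            _, status, source, target = parts
            rows.append(
                {"line": number, "status": status, "source": source, "target": target}
            )
    return rows


def to_csv(rows):
    """Render the parsed rules as CSV text, one rule per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical .htaccess contents
sample = "\n".join([
    "# Old blog posts",
    "Redirect 301 /blog/old-post https://example.com/blog/",
    "RewriteEngine On",
    "redirect permanent /jobs/designer-2019 https://example.com/careers/",
])
rows = parse_redirects(sample)
csv_text = to_csv(rows)
```

Keeping the original line number in its own column means you can always sort back to the order the rules had in the file.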
From there, you’re ready to undertake some simple tasks to tidy up your .htaccess rules:
- Eliminate duplicate data, so each missing URL is only redirected by one rule.
- Look for gaps in data caused by 404 errors and fill them in with valid redirects.
- Identify any chains and loops, and replace each one with a single 301 redirect straight to the final destination URL.
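The chain and loop check in particular lends itself to automation. The Python sketch below assumes you have already reduced your rules to a simple mapping of old path to new path (the dictionary shown is invented for illustration); it points every source at its final destination and reports any source that ends up in a loop.

```python
def resolve_chains(redirects):
    """Map each source to its final destination; collect sources caught in loops."""
    resolved = {}
    loops = set()
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Keep following while the target is itself redirected elsewhere
        while target in redirects:
            if target in seen:
                loops.add(source)  # we have come back to a URL already visited
                break
            seen.add(target)
            target = redirects[target]
        else:
            # Loop ended without a break: target is a final destination
            resolved[source] = target
    return resolved, loops


# Hypothetical rules: /a -> /b -> /c is a chain, /x <-> /y is a loop
rules = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}
resolved, loops = resolve_chains(rules)
```

Anything left in `loops` has no valid destination at all, so those rules need a manual decision rather than an automatic fix.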
It can be difficult to decide where to redirect obsolete pages that don’t have a present-day equivalent, but be realistic about the current value of those pages and the traffic they bring in – a manageable .htaccess file may be worth more to you than a few irrelevant website hits.
From specific to generic
One good option is to redirect specific but obsolete pages to a more general landing page. An example of this is to redirect deleted job advertisements to your careers page, or old deleted blog posts to your blog’s index page.
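One way to apply this consistently is to keep a small table of fallback landing pages and look up each obsolete path against it. The sketch below is only an illustration: the folder names and landing pages are hypothetical, and you would substitute your own site structure.

```python
# Hypothetical mapping of folder prefix to general landing page
FALLBACKS = {
    "/jobs/": "/careers/",
    "/blog/": "/blog/",
}


def fallback_destination(path, fallbacks=FALLBACKS):
    """Return a general landing page for an obsolete path, or None."""
    for prefix, landing_page in fallbacks.items():
        # Never redirect a landing page to itself
        if path.startswith(prefix) and path != landing_page:
            return landing_page
    return None  # no sensible fallback: decide manually or serve a 404
```

For example, `fallback_destination("/jobs/designer-2019")` gives `"/careers/"`, while a path outside the listed folders returns `None` so it can be reviewed by hand.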
By doing this, you lose a bit of relevancy, but many visitors will understand that a redirect means the original page is no longer available, and you give them a second chance to engage with your content instead of just facing a 404 error page.
How long does a .htaccess audit take?
It depends on your data. Checking existing redirects can be quite quick – you just need to make sure the destination URL exists and avoid any chains.
Filling in destinations for 404 errors can take longer, especially if you need to decide on the best landing page for each situation manually.
Generally speaking, it’s a good idea to set aside a solid chunk of time without distractions, so you can update your .htaccess redirects in full and use a consistent method to decide the destination URLs throughout.
Remember to allow time for testing too. A poorly configured .htaccess file can create crawl problems for the search engine robots, which may, in turn, have a significant detrimental impact on your rankings, so don’t rush the final stage of implementation.
How to test .htaccess
There are a few things to look out for when testing .htaccess redirects:
- Each redirect starts from a URL that would otherwise return a 3xx, 4xx or 5xx status code
- Each redirect ends in a 200 ‘OK’ status code indicating success
- Every 404 error leads to a relevant redirect or a custom 404 error page
In some cases when there is no appropriate page to redirect to, a well crafted 404 error page might be the better option – this can display generic information along with links to key areas of your website, contact details and a search box to keep visitors engaged.
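The redirect checks above can be sketched in Python as follows. To keep the example self-contained, the `fetch` function is an assumption: it stands for any function you supply that requests a URL without following redirects and returns the status code and the Location header. Here it is replaced by a fake lookup table, so the URLs and responses are invented.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirect(url, fetch, max_hops=5):
    """Follow redirects hop by hop and return (final_status, hops)."""
    hops = [url]
    for _ in range(max_hops):
        status, location = fetch(hops[-1])
        if status in REDIRECT_CODES and location:
            hops.append(location)
            continue
        return status, hops
    return None, hops  # gave up: too many hops, possibly a loop


def is_clean(url, fetch):
    """True if the URL reaches a 200 response in exactly one redirect."""
    status, hops = trace_redirect(url, fetch)
    return status == 200 and len(hops) == 2


# Fake responses standing in for a real site (hypothetical data)
fake_site = {
    "/old": (301, "/mid"),
    "/mid": (301, "/new"),
    "/new": (200, None),
    "/loop": (301, "/loop"),
}


def fake_fetch(url):
    return fake_site.get(url, (404, None))


chain_result = trace_redirect("/old", fake_fetch)
```

In this fake data `/mid` passes `is_clean`, while `/old` does not because it takes two hops to reach the final page.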
Finally, check your .htaccess thoroughly for mistakes. The single most important thing to look for is a typo that makes a rule apply to your homepage instead of an old blog post or deleted page. This is easily done, but it has the potential to send traffic away from your entire site.
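A quick automated scan can catch the most obvious version of this mistake. Apache’s Redirect directive matches any request that begins with the given path, so a rule whose source is just `/` applies to every page. The Python sketch below flags such rules; it only looks at simple Redirect lines and will not catch equivalent mistakes in RewriteRule or RedirectMatch patterns. The function name and sample rules are hypothetical.

```python
def risky_rules(htaccess_text):
    """Flag Redirect rules whose source path is the site root."""
    flagged = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        parts = line.split()
        if len(parts) >= 3 and parts[0].lower() == "redirect":
            # The source is the first argument that looks like a URL path
            source = next((p for p in parts[1:] if p.startswith("/")), None)
            if source == "/":
                flagged.append((number, line.strip()))
    return flagged


# Hypothetical rules: the second one is missing its old path
sample_rules = "\n".join([
    "Redirect 301 /blog/old-post https://example.com/blog/",
    "Redirect 301 / https://example.com/blog/new-post",
])
flagged = risky_rules(sample_rules)
```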
With a careful final check and a logical approach throughout, you can tidy up old rules and create new ones for missing resources, leaving you with a neat and logical .htaccess file for the future.