The Complete Guide to Python for SEO
Python is an open-source programming language that has become popular among many in the SEO community as a way to automate repetitive tasks, improve technical SEO campaigns and save time without sacrificing results.
In this article, we will look at areas where Python can help you in your technical SEO efforts. By the end, you should have a picture of what you can do with Python in the context of your SEO.
Can I Learn Python?
First of all, if you don’t already know how to program using Python, you might be surprised by how easy it is to learn. Many SEO consultants appreciate its object-orientated approach, which helps to make it understandable if you don’t have a background in computer science.
Python runs a script one statement at a time, from top to bottom. So if you can understand each line individually, you’re a good way towards understanding the complete function or script you’re looking at.
Like any language, you’ll need to pick up the grammar and vocabulary. Python has relatively basic syntax, making it easier to grasp. It also supports a range of external libraries and modules that allow you to extend the functionality of your scripts, without having to program those capabilities from scratch.
But Python is not just for individuals and small developers. Google’s first web crawler was reportedly written in Python and, although the company has grown far beyond its humble beginnings, Python reportedly remains one of the server-side languages used by the search engine giant to this day.
How Do I Run Python?
Python scripts can be run from your terminal or command line, from an Integrated Development Environment (IDE), or in a notebook environment such as Google Colab (which is cloud-based) or Jupyter Notebook.
Be aware that Python 2 reached its end of life on 1 January 2020 and no longer receives updates, so going forward you should ensure that Python 3 is the version you use.
Cloud services are a good option. They allow you to test code line by line, which is helpful for beginners. They also mean you don’t need to worry about version updates, as the cloud provider should (in theory) update to the latest version of Python each time one becomes available and stable.
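As a first script to try in any of these environments, the short sketch below reports which version of Python is running. The function name is_supported_python is made up for this example; the check itself relies only on the standard library.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True when the interpreter is Python 3 or newer."""
    return version_info[0] >= 3


if is_supported_python():
    print("Running Python {}.{}".format(sys.version_info[0], sys.version_info[1]))
else:
    print("This interpreter is Python 2; switch to Python 3.")
```

Save it as a .py file and run it from the terminal, or paste it into a notebook cell and run the cell.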
What Are Python Libraries?
Python libraries are add-ons for the programming language which enable additional functions, without you needing to program them yourself.
Some examples of library-based capabilities for Python include:
- Data analysis and extraction
- Language processing
- Machine learning
- Scientific computing
Libraries have names, so you might see discussions about NumPy (a scientific computing library) or Pandas (a data analysis library), among others. The Python library Requests may be especially useful if you need to make HTTP requests.
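To give a feel for what a library adds, here is a minimal sketch that uses Requests to fetch a URL and describe the response status. The function names and the status labels are assumptions made for this example, and Requests has to be installed separately (for instance with pip) before check_url will work.

```python
def describe_status(status_code):
    """Translate an HTTP status code into a short, SEO-relevant label."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code in (301, 308):
        return "permanent redirect"
    if status_code in (302, 303, 307):
        return "temporary redirect"
    if status_code in (404, 410):
        return "missing page"
    if status_code >= 500:
        return "server error"
    return "other"


def check_url(url):
    """Fetch a URL with Requests and describe the first response received."""
    # Imported here so describe_status() still works without Requests installed.
    import requests

    # allow_redirects=False reports the first response instead of following it.
    response = requests.get(url, timeout=10, allow_redirects=False)
    return describe_status(response.status_code)
```

Calling check_url with one of your own page addresses returns a label such as "ok" or "permanent redirect".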
Using Python for Technical SEO
While markup languages such as HTML and CSS, and programming languages like JavaScript, all affect the way websites are built and displayed to end-users, Python can be used directly on technical SEO campaigns.
One of its greatest powers is its ability to automate time-consuming, labour-intensive repetitive tasks, which can potentially save you many hours.
You can also use Python for data analysis, which may be especially beneficial if you need to analyse very large data sets, including website analytics data, eCommerce transactions and in-depth keyword reports.
This is not about being lazy – quite the opposite. Automating SEO tasks using Python means you can avoid spending time on low-level analysis, and instead put your expertise into other elements of your SEO campaign that require more lateral thinking and past experience.
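As a small illustration of the data analysis mentioned above, the sketch below totals clicks per page from a keyword report. The column names and the sample rows are invented for the example; a real export would be read from a file and would probably use different headings.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data standing in for an exported keyword report.
SAMPLE_REPORT = """keyword,page,clicks
python seo,/blog/python-seo,120
seo automation,/blog/python-seo,45
image compression,/blog/images,30
"""


def clicks_per_page(csv_file):
    """Sum the 'clicks' column for each value in the 'page' column."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["page"]] += int(row["clicks"])
    return dict(totals)


totals = clicks_per_page(io.StringIO(SAMPLE_REPORT))
# Print pages from most to fewest clicks.
for page, clicks in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(page, clicks)
```

For very large data sets, a library such as Pandas is likely to be a better fit than the standard csv module used here.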
What Tasks Can I Automate with Python?
Python has broad and ever-increasing capabilities, so if you have a repetitive SEO task you would like to automate, it’s worth checking if a library is available to help.
Some simple SEO tasks you can automate with Python include:
- Image optimisation
- Keyword research
- Link analysis
- Mapping and migrations
- User intent analysis
As an example, you can write Python scripts that use deep learning to automatically generate image captions and alt attribute text when supplied with the image URL, allowing you to quickly fill any gaps in your existing SEO content and metadata.
This is not only good for SEO but, by adding descriptive image captions that match the contents of the pictures and photos on your site, you also make your content more accessible for partially sighted visitors who use screen reader software.
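Mapping and migrations, from the list above, make a simpler first automation. The sketch below suggests a redirect target for each old URL path by picking the most similar new path, using difflib from the standard library. The function name, the paths and the 0.6 similarity cutoff are all assumptions for illustration, and any suggested mapping should be reviewed by hand before redirects go live.

```python
from difflib import get_close_matches


def map_redirects(old_paths, new_paths, cutoff=0.6):
    """Pair each old path with its most similar new path.

    Paths with no candidate above the similarity cutoff map to None,
    which flags them for manual review.
    """
    mapping = {}
    for old in old_paths:
        matches = get_close_matches(old, new_paths, n=1, cutoff=cutoff)
        mapping[old] = matches[0] if matches else None
    return mapping


# Hypothetical paths from an old and a new site structure.
old_paths = ["/products/red-shoes.html", "/about-us.html"]
new_paths = ["/shop/red-shoes/", "/about/", "/contact/"]

for old, new in map_redirects(old_paths, new_paths).items():
    print(old, "->", new)
```

Raising the cutoff makes the matching stricter, so more paths are left as None rather than being paired with a weak match.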
SEO Audit Scripts in Python
Scripts like Seth Black’s SEO Analyzer, which can be downloaded from GitHub, can crawl your website and provide SEO recommendations based on a list of commonly encountered problems.
You can crawl your website starting from the homepage, or by supplying the script with a sitemap in XML format – a good option if not all pages are yet linked from the homepage.
The script returns a set of data for each page including meta tags like the page title and description text, as well as the length of the page in words, so you can easily identify pages that could benefit from having more content added to them.
If pages are missing important SEO meta tags or image alt text, the script can alert you to this too. Filling in these basics is a good first step towards improving on-page SEO, and both identifying the missing elements and generating suggested content to fill those tags can be automated in Python.
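To show the kind of checks such an audit performs, here is a minimal sketch built on html.parser from the standard library. It is not how SEO Analyzer itself works; the class name, the report keys and the sample HTML are assumptions made for this example.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the title, meta description, word count and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.images_missing_alt = []
        self.word_count = 0
        self._in_title = False
        self._skip_depth = 0  # greater than zero inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            # Empty alt text is valid for decorative images, so treat this as a prompt to review.
            self.images_missing_alt.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            self.word_count += len(data.split())


def audit_html(html):
    """Run the audit over one HTML document and return a small report."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "word_count": parser.word_count,
        "images_missing_alt": parser.images_missing_alt,
    }


# Made-up page used only to demonstrate the report.
SAMPLE_HTML = """<html><head><title>Red Shoes</title>
<meta name="description" content="Buy red shoes online.">
</head><body><h1>Red shoes</h1><p>Our best sellers.</p>
<img src="/img/a.jpg" alt="Red running shoe"><img src="/img/b.jpg">
</body></html>"""

report = audit_html(SAMPLE_HTML)
print(report)
```

In practice you would feed audit_html the HTML of each crawled page and flag any report with an empty title, a missing description or a low word count.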
Optimise Images in Python
Reducing image file sizes can increase page load speed, which in turn is good news for your search rankings.
You can do this using a Python script to compress images either individually or across your entire site. A small percentage increase in the compression ratio, multiplied by the number of images on your site, can have a big impact, especially on pages with lots of pictures.
This saves you bandwidth with little to no visible difference in the quality of your images, and can be combined with other techniques such as ‘lazy’ image loading to complete the first render of your pages even faster.
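A rough sketch of both ideas follows: one function re-saves an image as an optimised JPEG using the third-party Pillow library, and another does the arithmetic for the site-wide saving. The function names, the quality setting of 80 and the figures in the usage line are assumptions for illustration only.

```python
from pathlib import Path


def compress_jpeg(source, destination, quality=80):
    """Re-save an image as an optimised JPEG and return the number of bytes saved.

    The result can be negative if the re-saved file turns out larger.
    """
    # Pillow is a third-party package; it is imported here so the
    # arithmetic helper below can be used without it installed.
    from PIL import Image

    source, destination = Path(source), Path(destination)
    with Image.open(source) as image:
        # JPEG has no alpha channel, so convert to plain RGB first.
        image.convert("RGB").save(destination, "JPEG", quality=quality, optimize=True)
    return source.stat().st_size - destination.stat().st_size


def estimated_saving(image_count, average_bytes, reduction):
    """Estimate total bytes saved when every image shrinks by `reduction` (0 to 1)."""
    return round(image_count * average_bytes * reduction)


# Hypothetical site: 200 images averaging 150 KB, each reduced by 10%.
print(estimated_saving(200, 150_000, 0.10), "bytes saved per full set of image loads")
```

It is worth comparing a few compressed images against the originals by eye before running compress_jpeg across a whole site, since the right quality setting depends on the pictures.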
Introducing Machine Learning in Python Scripts for SEO
Machine learning is a branch of Artificial Intelligence (AI) in which systems improve at a task as they are exposed to more data, rather than being explicitly programmed for it. Python’s intuitive syntax has made it a popular option when scripting machine learning applications.
This is enhanced further by the availability of libraries containing machine learning functions, and the community support available from fellow developers online. Because Python is open-source, there is a large community behind it, and there are examples of how to achieve the most common tasks too.
Machine learning works by analysing data, identifying patterns, and making predictions based on those patterns. As a model is exposed to more data, it becomes ‘trained’ by the examples it inspects, effectively improving its results without the need for any additional programming input.
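The analyse, find a pattern, predict cycle can be sketched in its simplest form as a straight-line fit. The example below uses ordinary least squares written in plain Python rather than a machine learning library, and the training data is made up; a real project would more likely use a dedicated library and far more data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: page word count against average seconds spent on the page.
word_counts = [300, 600, 900, 1200]
seconds_on_page = [40, 70, 100, 130]

model = fit_line(word_counts, seconds_on_page)
print("Predicted seconds for a 1500-word page:", round(predict(model, 1500)))
```

Here the "training" is the call to fit_line, which learns the slope and intercept from the examples, and predict applies that learned pattern to a page the model has not seen.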
Using Machine Learning in Python
You can run Python scripts to train a model on a dataset using machine learning methods, and then visualise the results. This can enable the kinds of features you may be familiar with on many websites and social networks, which use advanced, detailed algorithms to make decisions.
Some places you might have encountered this kind of AI-driven website content include:
- Curated timelines and personalised trends on social networks.
- Recommendations on music and video streaming services.
- Personal product picks on ecommerce sites.
Machine learning is a way for algorithms to ‘get to know you’ so they can return more personal, and theoretically more relevant, results. You may also have noticed this in the Google Ads you are shown, and even in your Google Search results if you allow them to be tailored according to your search history, location and other personal data.
How Can Machine Learning Help With SEO?
Machine learning is a powerful tool in marketing. Many of the examples given above are a way to increase your engagement, which in turn increases your dwell time on a particular website, search engine or social network.
By carrying out successive rounds of training, an AI can become more complex, sophisticated, and ultimately more accurate, producing output that helps to relieve the burden on human marketers.
It’s a real-world example of making a long story short, allowing marketers to jump to the end and apply the AI’s findings, without having to carry out detailed calculations by hand.
Machine learning models in SEO campaigns can offer insight into:
- Content quality
- Keyword opportunities
- Metadata optimisation
- Transcription of audio content
- User engagement
With time, these capabilities are becoming more and more sophisticated. AI can now add reasonably accurate subtitles or captions to video content and can transcribe podcast audio, allowing both to be provided in a more accessible way to users with visual or auditory impairments.
Google Natural Language Processing API
The Google Cloud Natural Language API applies natural language processing (NLP) to text. It attempts to determine not only the technical structure of the text, such as its syntax and parts of speech, but also what the text as a whole actually means.
Its output provides an interpretation of the sentiment of the text, highlighting certain core information contained within it. You can use the API to examine your own content and learn more about how a Google-built language model interprets it, as well as to train a machine learning model specifically on your own content, rather than using text from third-party sources.
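A minimal sketch of a sentiment request is shown below, assuming the google-cloud-language client library is installed and Google Cloud credentials are configured in your environment. The function names and the 0.25 threshold used to label a score are assumptions for this example, not values defined by Google.

```python
def label_sentiment(score, threshold=0.25):
    """Turn a sentiment score between -1.0 and 1.0 into a rough label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def analyse_sentiment(text):
    """Send text to the Cloud Natural Language API and label its overall sentiment."""
    # Third-party client library; imported here so label_sentiment()
    # can be used without it installed.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    sentiment = response.document_sentiment
    # score is the overall polarity; magnitude is the overall strength of emotion.
    return label_sentiment(sentiment.score), sentiment.score, sentiment.magnitude
```

Running analyse_sentiment over the main copy of each page gives a quick, comparable signal, though the labels are only as meaningful as the threshold you choose.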
Conclusion
The Turing Test, devised by computing genius Alan Turing in 1950 and originally called ‘the Imitation Game’, is a measure of the sophistication of AI, based on a computer’s ability to hold a convincing conversation in natural human-like language.
While most website chatbots still have a way to go to come close to passing that test, capabilities like machine learning and NLP are bringing it closer with every passing day, and Python is unlocking the creative potential of people who might not normally take the time to learn a programming language.
Combining all of those elements is, like SEO itself, a way of optimising the web design process, making it easier to tweak page and website content, embedded multimedia and metadata in order to achieve the best possible search result rankings.