Welcome! In this article we will explain how to find on-page SEO issues, what causes them, why they should be resolved, and what the benefits of doing so are.
Technical issues such as missing title tags and Meta descriptions not only cause problems for search engines but can also harm user experience.
So fill your coffee mug, get comfortable and allow us to explain why it is important to spend a little extra time fixing a website’s on-page issues.
How To Find On-Page SEO Issues
There are several tools that will effectively simulate a search engine crawl of a site and flag any on-page issues. We recommend these three: Moz Campaign Manager, Google Webmaster Tools and Screaming Frog.
In Moz Campaign Manager, the site crawler, Rogerbot, provides insight into a website’s issues. The bot crawls the site once every seven days and displays the results in an interactive graph; you can set filters and download .csv files to work on solutions. Underneath the graph you are given an overview list of priorities.
So now we have the list of issues, which leads us to ask the question…
What Causes These Issues?
Poorly Configured CMS
Some issues are caused by technical faults, but many are produced by a lack of understanding and simply overlooked by the site administrator.
Duplicate Page Content
Duplicate page content can occur for numerous reasons, most of them technical: it is unlikely that a website administrator would deliberately publish the same content in several different locations without distinguishing it from the original source.
Most of the time, duplicate content is caused by a page having more than one access point, leading to different URLs being generated. For example, an article that can be reached both directly and through its category page, so the same content is served at two different URLs.
Session IDs can also appear in URLs. For example, when a visitor selects items to buy, a session ID is assigned to their shopping bag; on some systems the ID is written into the URL, so every time the customer clicks an internal link they arrive at a separate URL with duplicate content.
Query strings and URL parameters can be a source of duplicate content; they allow you to track visitors on the site but may prevent you from ranking well. For example, Google sees www.example.co.uk/page/ and www.example.co.uk/page/?source=rss as duplicates.
Any other parameters that do not change vital parts of the page content will also be flagged as duplicates, such as layered category pages on Magento sites when an attribute filter is applied. For example, a filtered product page such as www.example.co.uk/category.html?price=200-300 differs only slightly from the original, and if it contains the same lengthy product description it can cause major duplication issues.
Sometimes other websites will use your content without linking to the original source; this is known as content syndication. Search engines won’t know which source is the original and will have to deal with another version of the same content. A canonical tag on the syndicated copy, pointing back to the original URL, tells search engines which version should be indexed.
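As a sketch, a canonical tag is a single line placed in the `<head>` of the duplicate or syndicated page (the URL here is illustrative; it should point at the original article):

```html
<!-- In the <head> of the duplicate/syndicated page.
     The href below is illustrative; use the URL of your original article. -->
<link rel="canonical" href="http://www.example.co.uk/original-article/" />
```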
If your system creates printer-friendly pages, these could be crawled and flagged as duplicate content.
Now for the oldest trick in the book: to have the www. prefix or not? If both versions are accessible, every page on the website could be duplicated. The administrator needs to choose one version and 301 redirect the other to it.
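On an Apache server this is typically done with rewrite rules in the site’s .htaccess file. A minimal sketch, assuming the administrator has chosen the www. version as the preferred one (the domain is illustrative):

```apache
# Illustrative .htaccess rules: 301 redirect the non-www host to www.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.co\.uk$ [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]
```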
Duplicate pagination pages are most common on ecommerce sites with product categories that are not set up with the correct HTML code. Paginated pages can use rel="canonical" to tell search engines that they may contain duplicate content, though not intentionally. The rel="next" and rel="prev" attributes should also be used; without them Google will not understand how the multiple pages relate to one another.
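A sketch of how this looks in practice, placed in the `<head>` of the second page of a paginated category (the URLs and query parameter are illustrative):

```html
<!-- In the <head> of page 2, pointing at the pages either side of it -->
<link rel="prev" href="http://www.example.co.uk/category.html?page=1" />
<link rel="next" href="http://www.example.co.uk/category.html?page=3" />
```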
Thin & Boilerplate Content
Thin content has little or no added value: a doorway page (“click to enter”), for example, or multiple pages created for slightly different phrases. This can be seen on some affiliate sites; if the product description is simply copied with nothing added, it is thin content. Added value is important to create a user experience different from other affiliate sites or from visiting the merchant directly.
Boilerplate content is repeated non-content across web pages. If Google recognises that the primary content of several pages is repeated, it will most likely filter them out, or may even penalise the practice.
Title tags are very important for catching a potential visitor’s attention, so if they are duplicated, badly structured or missing, your site could be losing out. The recommended length of a title tag is around 55 characters; anything wider than 512 pixels will be cut off by Google with an ellipsis. If a title tag is not encouraging clicks, it needs to be fixed.
The Meta description is not a ranking factor in Google, but it can have a significant effect on click-through rates from SERPs. Duplicate or missing Meta descriptions can reduce your chances of success in SERPs – think of them as the window to your business on Google; they are the first contact point for a potential visitor. If the searcher does not know what is on the page and how it relates to their search query, they are less likely to trust the result. The recommended length of a Meta description is no longer than 160 characters (418 pixels wide).
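Putting both together, a sketch of a well-formed title tag and Meta description for an illustrative shop page (all names and figures are invented for the example):

```html
<head>
  <!-- Around 55 characters, written to encourage clicks -->
  <title>Leather Handbags | Free UK Delivery Over £50</title>
  <!-- Unique to this page and no longer than ~160 characters -->
  <meta name="description" content="Browse our range of leather handbags. Free UK delivery on orders over £50 and a 30-day money-back guarantee." />
</head>
```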
Missing alt tags on images mean that you are less likely to receive traffic from Google image search and image-based social media sites. The alt tag tells search engines the subject of the image.
Uncompressed images slow page load times, and page speed has been confirmed as a ranking factor for Google because slow pages create a bad user experience.
Missing structured data is not necessarily an issue, but having it helps Google get a sense of what the content is about and can enhance your search rankings.
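A sketch of a descriptive alt tag (the filename and text are illustrative):

```html
<!-- The alt text describes the subject of the image to search engines -->
<img src="/images/red-leather-handbag.jpg" alt="Red leather handbag with gold clasp" />
```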
Schema.org microdata makes it easier for search engines to interpret the information on web pages in order to serve more relevant results to search queries. When Google recognises schema code, content relating to a search query can be used in rich snippets, such as a search for “how do you make fajitas?” The recipe is displayed below with the source link:
Looking at the source code of the referenced page, we find the code from schema.org:
<article itemscope itemtype="http://schema.org/Recipe" >
It is important to use schema code as: “These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model… Many applications from Google, Microsoft, Pinterest, Yandex and others already use these vocabularies to power rich, extensible experiences.”
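To give a fuller picture, a minimal sketch of schema.org Recipe microdata; the recipe name, ingredients and instructions are all illustrative values:

```html
<!-- Illustrative schema.org Recipe markup; itemprop names come from schema.org -->
<article itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Chicken Fajitas</h1>
  <p>Ready in <span itemprop="totalTime" content="PT30M">30 minutes</span></p>
  <ul>
    <li itemprop="recipeIngredient">2 chicken breasts</li>
    <li itemprop="recipeIngredient">1 red pepper</li>
  </ul>
  <div itemprop="recipeInstructions">Fry the chicken and peppers, then serve in warm tortillas.</div>
</article>
```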
Temporary redirects, known as 302 redirects, need to be analysed to see whether they are necessary. Having a few 302 redirects is fine, but using them in excess could reduce your rankings: a temporary redirect passes 0% of the ranking power to the target page. The only reason to use this type of redirect is for a page that has been temporarily moved, such as a sold-out product; if it is coming back, the page needs to keep its ranking power.
Broken links are a big upset to user experience, and a large number of them could damage your website’s reputation with Google.
If a page returns a 500 HTTP status code, there is a server error. There are many reasons why a website shows this error, but the two main causes are a PHP timeout or a permissions error. A 503 Service Unavailable code should only be used as a temporary ‘closed for repair’ message from the site’s server.
A 404 error is returned when the server cannot find anything matching the requested URI. Usually a 404 page is shown to the visitor; these pages can be customised to discourage visitors from leaving the site altogether.
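On an Apache server, pointing the 404 status at a customised page is a one-line directive; the path below is illustrative:

```apache
# Serve a branded, helpful page whenever a URL is not found
ErrorDocument 404 /errors/not-found.html
```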
Why should these issues be resolved?
The reasons to resolve on-page issues range from improving user experience to avoiding Google penalties.
Duplicate content is not so much a usability issue, as long as the visitor gets the information they were searching for; however, the website’s visibility will suffer if a large amount of duplicate content is found. Moz breaks down the three largest issues with duplicate content:
- Search engines are not sure which version of the page(s) to include/exclude from their indices.
- Search engines don’t know whether to transfer the link metrics, such as trust, authority, anchor text, link juice etc. to one of the pages or separate them between the multiple versions.
- Search engines are confused which version(s) to rank for search query results.
Solving duplicate content will undoubtedly help your site, as search engines will display one version of a page with 100% of its link metrics. This may send your content soaring up the rankings. From an SEO perspective, Google has neither confirmed nor denied that the Panda 4.2 update has any ranking effect on duplicate content. John Mueller, Google Webmaster Trends Analyst, points out that cleaning up duplicate content caused by technical faults should be a low priority. It should still be dealt with, but the question Google Panda is really asking is: is the content on this page the best of its kind?
This leads us to solving thin or boilerplate content which should be made a priority. Fixing these on-site issues will certainly decrease the chances of your site being hit by a Google penalty.
Solving missing or duplicate title tags and Meta descriptions can improve organic click-through rates from the site’s listings in SERPs. This can be enhanced by the use of structured data enabling Google to display rich snippets of content.
Keeping the number of errors on the website to a minimum should be made a priority. Google will read a large number of errors as a sign that the site does not care for its visitors, which could reduce rankings. The same goes for 302 redirects: if they are deemed unnecessary, they should be changed to 301 redirects.
Adding missing alt tags to images helps search engines decipher the context of each image, thus making it visible in search results. Reducing your website’s page load time by compressing images will most likely improve your rankings.
Recently Google Panda has been confirmed as part of Google’s core ranking algorithm. Currently at version 4.2, it was first introduced in February 2011 and is Google’s most powerful spam-fighting algorithm.
A PR piece written by Jennifer Slegg, and overseen by Google, states that “Panda is an algorithm that’s applied to sites overall and has become one of our core ranking signals. It measures the quality of a site, which you can read more about in our guidelines. Panda allows Google to take quality into account and adjust ranking accordingly.”
In the long term it is best to avoid:
- Poorly written content (content produced with software)
- Short content with no added value
- Content which is duplicated
Avoiding these will reduce the chances of a website falling in the rankings due to a Google algorithm update or a penalty.
In conclusion, it is important to solve on-page issues to improve user experience, gain visibility in SERPs and avoid Google penalties. We would also advise keeping on top of these issues as general housekeeping for your website.