How Google Knows What Sites You Control And Why it Matters – Whiteboard Friday
Increasing Search Traffic By 20,000 Visitors Per Month Without Full CMS Access – Here’s How…
Posted by RoryT11
This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of Moz, Inc.
Trying to do SEO for a website without full access to its CMS is like trying to win a sword fight with one hand tied behind your back. You can still use your weapon, but there is always going to be a limit to what you can do.
Before this metaphor gets any further out of hand, I should explain. One year ago, the agency I work for was asked to run an SEO campaign for a client. The catch was, it would be impossible for us to gain full access to the CMS that the website was built on. Initially I was doubtful about the results that could be achieved.
Why no CMS?
The reason we couldn’t access the CMS is that the client was part of a global group. All sites within this group were centrally controlled on a third-party CMS, based in another country. If we did want to make any ‘technical changes’, it would have to go through a painfully slow helpdesk process.
We could still add and remove content, edit metadata, and exercise some basic control over the navigation.
Despite this, we took on the challenge. We already had a strong relationship with the client because we handled their PR, and a good understanding of their niche and target audience. With this in mind, we were confident that we could improve the site in a number of ways that would enhance user experience, which we hoped would lead to increased visibility in the SERPs.
What has happened in the last year since we started managing the search marketing campaign has emphasised to me just how important it is to implement well-structured on-page SEO. The client’s website is now receiving over 20,000 more visits from organic search per month than it did when we took over the account.
I want to share with you how we achieved this without having full access to the CMS. The following screenshots are a direct comparison of January 2013 and January 2014.
Corresponding figures can be viewed in the summary at the end of the post.
Analytics
When we were granted access to analytics for the website, we got our first real insight into how the site was performing, and what we could do to help it perform better.
By analysing the way visitors were using the site (visitor journeys, drop-off points, most visited pages, which pages had highest avg. time etc.), we could start to structure our on-page strategy.
We identified how we could streamline the navigation to help people find what they were looking for more quickly. We also decided it was necessary to create clearer calls to action, which would shorten the distance from popular landing pages to the most valuable pages on the website.
We also looked at the top landing pages, and with what keyword data we had access to, we were able to define more clearly why people were visiting the site, and what they expected when they landed on a page.
For example, the site was receiving a lot of traffic for one of its products, with visitors coming into the site from a range of relevant short and longtail keywords. However, they would almost always land on the product page.
By analysing visitor journeys from this page, we noticed that visitors would leave it to try to find more information on the item, because the majority weren't entering the site at the buying stage of the conversion cycle.
However, where this supporting information lived on the site wasn’t immediately obvious. In fact, it was nearly four clicks away from the product landing page!
It was obvious we'd have to address this and other similar issues, which we identified simply by conducting some fairly straightforward analysis of the analytics data.
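The "four clicks away" problem above can be quantified rather than eyeballed. As a sketch (the page names and link graph here are hypothetical, not the client's actual site), a breadth-first search over a site's internal link graph gives the minimum click depth of every page from a given landing page:

```python
from collections import deque

def click_depth(links, start):
    """Breadth-first search over a site's internal link graph.

    links: dict mapping each page to the pages it links to.
    Returns a dict of page -> minimum number of clicks from `start`.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical structure: supporting info buried four clicks from the product page.
site = {
    "/product": ["/home"],
    "/home": ["/about", "/resources"],
    "/resources": ["/guides"],
    "/guides": ["/product-support"],
}
print(click_depth(site, "/product")["/product-support"])  # 4
```

Run this against a crawl export of your internal links and any page with an unexpectedly large depth from your top landing pages is a candidate for better interlinking or navigation changes.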
Product Pages
The product pages were generated from a global product catalogue built into the content management system. They aren’t great, but because we didn’t have access to the catalogue or the CMS, there was not much we could do directly to the product pages.
Rewriting content
I don’t necessarily believe that there is such a thing as ‘writing for SEO’. Yes, you can structure a page in a certain formulaic way with keywords in header tags, alt tags and title tags.
You can factor low-competition longtail phrases and target keywords into the copy as well…but if you sacrifice UX in favour of anything that I’ve just mentioned, then I’ll just be honest, you’re doing it wrong.
From looking at the data in Google Analytics (low avg. time on site and a bounce rate that was higher than it should have been), and reading through the website ourselves, it became clear that the content needed to be rewritten.
We did have a list of target keywords, but our main objective was to make the content more valuable to the users.
To do this, we worked closely with the PR team, who had a great understanding of the client’s products and key messages. They had also developed personas about the type of visitor that would come to the client’s site.
We were able to use this knowledge as a foundation to rewrite, restructure and streamline sections of the website that we knew could be performing better.
Another thing we noticed from analysing the content is that interlinking was almost non-existent. If a visitor wanted to get to another piece of information or section of the website, they’d be restricted to using the main navigation bar. Not good…
We addressed this in the rewriting process by keeping a spreadsheet of what we were writing and key themes in those pages. We could then use this to structure interlinking on the website in a way that would direct visitors easily to the most relevant resources.
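The spreadsheet-driven approach can be sketched in a few lines of Python (the pages and themes below are invented for illustration; ours lived in a shared spreadsheet, not code): group pages by shared theme, and treat any pair of pages sharing a theme as an interlinking candidate:

```python
from collections import defaultdict

def suggest_interlinks(page_themes):
    """page_themes: dict of page -> set of themes (as kept in our spreadsheet).
    Returns pairs of pages that share a theme and are candidates for interlinking."""
    by_theme = defaultdict(set)
    for page, themes in page_themes.items():
        for theme in themes:
            by_theme[theme].add(page)
    suggestions = set()
    for pages in by_theme.values():
        for a in pages:
            for b in pages:
                if a != b:
                    suggestions.add((a, b))
    return suggestions

pages = {
    "/product-x": {"maintenance", "specs"},
    "/faq": {"maintenance"},
    "/tips": {"maintenance", "buying-advice"},
}
# /product-x, /faq and /tips all cover "maintenance",
# so each is a candidate to link to the others.
print(sorted(suggest_interlinks(pages)))
```

The human judgment still matters — not every suggested pair deserves a link — but a pass like this stops relevant pages from being missed entirely.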
As a result of this we have seen time on site increase by 14.61% for visitors from organic search:
Working with the PR team
As I have mentioned, we also handled PR for this client. Luckily, the PR team provided brilliant support to the search marketing side of the account.
This has proved integral to the success of this campaign for two reasons:
1) The PR team know the client better than anyone. It might even be fair to say they know more about the products and target audience than the client’s own marketing team.
This helped us build a firm understanding of why people would come to the site, what they’d expect to see, and what the client wanted to achieve with its web presence.
This was great in terms of helping us identify what people would search for to find the site, which in turn allowed us to structure the content rewrite more effectively.
2) By working with the PR team, we were able to co-ordinate the on-page and off-page work we were doing, to align with PR campaigns.
For example, if they were pushing a certain product, or raising awareness of a specific campaign, we knew we’d see an increase in search volume in those areas. The SEO team would then also focus efforts on promoting the same product.
When the search volume increased, our site was there to capture the traffic. Unlike in the previous example when the traffic was sent to a product page, we were able to create a fully optimised landing page.
With this approach we knew we’d get a good volume of targeted traffic – we just needed to be there to capture it and give a friendly nudge in the right direction.
Restructuring navigation
The main navigation menu on the site proved to be a source of great frustration. Functionality was extremely limited…we couldn’t even create dropdown menus as that wasn’t built into the CMS.
That meant we needed to be really tight with our navigation options, as well as making it obvious where each navigation link would lead.
Again, we worked with the PR team and the client, as well as using information from Google Analytics to learn about how visitors were using the site, and how the client wanted them to use the site.
Armed with this information, we streamlined the navigation to support user experience by creating better landing pages for the navigation links and making the most popular and valuable pages of the website more accessible.
The result has been that although people are spending more time on the site than 12 months ago, they are visiting fewer pages. This helped us show the client that the navigation was working better and that visitors were able to find the information they required more easily:
Valuable content
There’s a vicious rumour circulating at the moment that quality content (no… not 300-word blog posts) can help drive SEO success. Well, we decided to test this for ourselves…
As well as rewriting existing copy, we also created new content that we hoped would drive more organic search traffic to the site.
We created infographics (good ones), product-specific and general FAQs, video and text based tips and advice pages, as well as specific landing pages for the client’s three ‘hero’ products.
We knew from looking at the analytics that there was definitely opportunity to get more longtail traffic, but we wanted to combine this with creating a genuinely useful resource for the visitors.
Nothing we did was hugely resource intensive in terms of content creation, but what we did create was driven by what the data told us people wanted to see.
As a result, the tips and advice pages and FAQs have both pulled in significant volumes of organic search traffic, and given users something of value.
The screenshots below illustrating this are taken from the middle of August 2013, when the pages went live, to the end of January 2014:
Fixing Errors
With the site plugged into Moz, we were pretty shocked to see the crawl diagnostics return 825 errors, 901 warnings and 976 notices. This equated to almost one warning and one error on every single page of the site. The biggest culprits were duplicate page titles, duplicate page content and missing meta tags.
The good news – I got to spend tonnes of time doing what every SEO claims to hate but secretly loves – handcrafting new metadata!
The bad news – the majority of errors were caused by the CMS: how it dealt with pagination, the poor integration of the product catalogue, and the way it handled non-public (protected) pages.
As part of our initial audit on the site, we noticed the site didn’t even have a robots.txt. As you know, this meant the search engine bots were crawling every nook and cranny, getting in places that they had no business going in.
So, as well as manually crafting new metadata for many pages, we also had to try to get a robots.txt file that we had written onto the site. This meant going through a helpdesk where they didn’t understand SEO and where English wasn’t their first language.
A gruelling process – but after several months of trying, we got that robots.txt in place, making the site a lot more crawler friendly.
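For anyone facing a similar gap, a minimal robots.txt looks something like the following. This is purely illustrative – the blocked paths here are hypothetical, not the ones we actually submitted – but it shows the kind of directives that keep crawlers out of feeds, internal search results and protected areas (note that the `*` wildcard in a Disallow path is an extension supported by Google and Bing, not part of the original robots.txt standard):

```text
User-agent: *
Disallow: /search/
Disallow: /protected/
Disallow: /*?print=

Sitemap: http://www.example.com/sitemap.xml
```

Even a short file like this can stop crawl budget being wasted on pages that should never appear in search results.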
Now we’re down to 122 errors and 377 warnings. Okay, I know it should be lower than that, but when you can’t change how the CMS works, or add functionality to it, you do the best you can.
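If you want to spot duplicate page titles yourself from a crawl export rather than relying on a tool, a simple grouping pass is enough. This sketch assumes you already have a URL-to-title mapping (the example data is invented):

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """pages: dict of URL -> <title> text (e.g. from a crawl export).
    Returns titles that appear on more than one URL."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        by_title[title.strip().lower()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

crawl = {
    "/products/widget?page=1": "Widgets | Example Co",
    "/products/widget?page=2": "Widgets | Example Co",
    "/about": "About Us | Example Co",
}
print(find_duplicate_titles(crawl))
```

In our case a pass like this would have flagged the CMS's pagination immediately: paginated pages sharing one title accounted for a large share of the duplicates.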
Conversions
The client does not sell directly through its website, but through a network of distributors. The quickest way for a customer to learn about their closest distributor is to use the ‘Contact Us’ page. Again, admittedly, this is far from the best system but unfortunately, it is not something we’re able to change at this stage.
Because of this, we made people visiting the ‘Contact Us’ page a conversion goal that would be a KPI for the campaign. We have seen this increase by over 21% in the last 12 months, which has helped us prove value to the client, as these are the kinds of visits that will have a positive impact on their bottom line. It’s good to know you’re not only driving a high volume of traffic, but also a good quality of traffic.
Off-page
The reason I’ve saved off-page to last is that I really don’t dwell on it. Yes, we did follow traditional ‘best practices’: blogger and influencer outreach, producing quality content for people to link to – but we didn’t do anything revolutionary or game-changing.
The truth is, we had so much work to do on-page, that we kind of let the off-page take care of itself.
I’d in no way advocate this approach all the time, but here we prioritised getting the website working as hard as it could – and it paid dividends. I’ll tell you why.
Conclusions – Play to your strengths
Managing an SEO campaign without full access to a CMS undoubtedly poses a unique set of challenges. But what it also forced us to do was play to our strengths.
Instead of overcomplicating any of the more ‘technical’ SEO issues, we focused on getting the basics right and using data to structure our strategy. We took an unfocused, poorly structured website and shaped it into something valuable and user-friendly.
That’s why we’re now seeing 20,000 more unique visits per month than when we took over the campaign a year ago – we did what many people would consider ‘basic SEO’ really well. That is what I want the key takeaway from this case study to be.
It’s probably true that SEOs are experiencing something of an identity crisis, but as Rand eloquently argued in his recent post, we still have a unique skill set that can be incredibly valuable to any business with an online presence. What we may consider ‘basic’ still has the potential to deliver fantastic results.
Really, all we’re trying to do is make our websites more user-friendly and more crawlable. If you do that, you’ll get the results. Hopefully that’s what I’ve illustrated in this post.
Link Audit Guide for Effective Link Removals & Risk Mitigation
Posted by Modestos
This step-by-step guide aims to help you through the link auditing process using your own judgment, without blindly relying on automation. Because links are still a very important ranking factor, link audits should be carried out by experienced link auditors rather than third-party automated services. A flawed link audit can have detrimental implications.
The guide consists of the following sections:
- How to make sure that your site’s issues are links-related.
- Which common misconceptions you should avoid when judging the impact of backlinks.
- How to shape a solid link removal strategy.
- How to improve the backlink data collection process.
- Why you need to re-crawl all collected backlink data.
- Why you need to find the genuine URLs of your backlinks.
- How to build a bespoke backlink classification model.
- Why you need to weight and aggregate all negative signals.
- How to prioritise backlinks for removal.
- How to measure success after having removed/disavowed links.
In the process that follows, automation is required only for data collection, crawling and metric gathering purposes.
Disclaimer: The present process is by no means a panacea for all link-related issues – feel free to share your thoughts, processes, experiences or questions within the comments section – we can all learn from each other 🙂
#1 Rule out all other possibilities
Nowadays link removals and/or making use of Google’s disavow tool are the first courses of action that come to mind following typical negative events such as ranking drops, traffic loss or de-indexation of one or more key-pages on a website.
However, this doesn’t necessarily mean that whenever rankings drop or traffic dips links are the sole culprits.
For instance, some of the actual reasons that these events may have occurred can relate to:
- Tracking issues – Before trying anything else, make sure the reported traffic data are accurate. If traffic appears to be down make sure there aren’t any issues with your analytics tracking. It happens sometimes that the tracking code goes missing from one or more pages for no immediately apparent reason.
- Content issues – E.g. the content of the site is shallow, scraped or of very low quality, meaning that the site could have been hit by an algorithm (Panda) or by a manual penalty.
- Technical issues – E.g. a poorly planned or executed site migration, a disallow directive in robots.txt, wrong implementation of rel=”canonical”, severe site performance issues etc.
- Outbound linking issues – These may arise when a website is linking out to spam sites or websites operating in untrustworthy niches, e.g. adult, gambling etc. Linking out to such sites isn’t always deliberate and in many cases, webmasters have no idea where their websites are linking out to. Outbound follow links need to be regularly checked because the hijacking of external links is a very common hacking practice. Equally risky are outbound links pointing to pages that have been redirected to bad neighborhood sites.
- Hacking – This includes unintentional hosting of spam, malware or viruses as a consequence of hacking.
In all these cases, trying to recover any loss in traffic has nothing to do with the quality of the inbound links as the real reasons are to be found elsewhere.
Remember: There is nothing worse than spending time on link removals when in reality your site is suffering from non-link-related issues.
#2 Avoid common misconceptions
If you have lost rankings or traffic and you can’t spot any of the issues presented in the previous step, you are left with the possibility of checking out your backlinks.
Nevertheless, you should avoid falling victim to the following three misconceptions before being reassured that there aren’t any issues with your site’s backlinks.
a) It’s not just about Penguin
The problem: Minor algorithm updates take place pretty much every day and not just on the dates Google’s reps announce them, such as the Penguin updates. According to Matt Cutts, in 2012 alone Google launched 665 algorithmic updates, which averages at about two per day during the entire year!
If your site hasn’t gained or lost rankings on the exact dates Penguin was refreshed or other official updates rolled out, this does not mean that your site is immune to all Google updates. In fact, your site may have been hit already by less commonly known updates.
The solution: The best way to spot unofficial Google updates is by regularly keeping an eye on the various SERP volatility tools, as well as on updates from credible forums where site owners and SEOs whose sites have been hit share their experiences.
SERPs volatility (credit: SERPs.com)
b) Total organic traffic has not dropped
The problem: Even though year-over-year traffic is a great KPI, when it’s not correlated with rankings, many issues may remain invisible. To make things even more complicated, “not provided” makes it almost impossible to break down your organic traffic into brand and non-brand queries.
The solution: Check your rankings regularly (i.e. weekly) so you can easily spot manual penalties or algorithmic devaluations that may be attributed to your site’s link graph. Make sure that you not only track the keywords with the highest search volumes but also several other mid- or even long-tail ones. This will help you diagnose which keyword groups or pages have been affected.
c) It’s not just about the links you have built
The problem: Another common misconception is to assume that because you haven’t built any unnatural links, your site’s backlink profile is squeaky-clean. Google evaluates all links pointing to your site, even ones that were built five or 10 years ago and are still live, which you may or may not be aware of. In a similar fashion, any new links coming into your site matter equally, whether they’re organic, inorganic, built by you or by someone else. Whether you like it or not, every site is accountable and responsible for all inbound links pointing at it.
The solution: First, make sure you’re regularly auditing your links against potential negative SEO attempts. Check out Glen Gabe’s 4 ways of carrying out negative SEO checks and try adopting at least two of them. In addition, carry out a thorough backlink audit to get a better understanding of your site’s backlinks. You may be surprised to find out which sites have been linking to yours without your knowledge.
#3 Shape a solid link removal strategy
Coming up with a solid strategy should largely depend on whether:
- You have received a manual penalty.
- You have lost traffic following an official or unofficial algorithmic update (e.g. Penguin).
- You want to remove links proactively to mitigate risk.
I have covered thoroughly in another post the cases where link removals can be worthwhile, so let’s move on to the details of each of the three scenarios.
Manual penalties vs. algorithmic devaluations
If you’ve concluded that the ranking drops and/or traffic loss seem to relate to backlink issues, the first thing you need to figure out is whether your site has been hit manually or algorithmically.
Many people confuse manually imposed penalties with algorithmic devaluations, hence making strategic mistakes.
- If you have received a Google notification and/or a manual ‘Impacts Links’ action (like the one below) appears within Webmaster Tools, it means that your site has already been flagged for unnatural links and sooner or later it will receive a manual penalty. In this case, you should definitely try to identify which links may be in violation and try to remove them.
- If no site-wide or partial manual actions appear in your Webmaster Tools account, your entire site or just a few pages may have been affected by an official (e.g. Penguin update/refresh) or unofficial algorithmic update in Google’s link valuation. For more information on unofficial updates keep an eye on Moz’s Google update history.
There is also the possibility that a site has been hit manually and algorithmically at the same time, although this is a rather rare case.
Tips for manual penalties
If you’ve received a manual penalty, you’ll need to remove as many unnatural links as possible to please Google’s webspam team when requesting a review. But before you get there, you need to figure out what type of penalty you have received:
- Keyword level penalty – Rankings for one or more keywords appear to have dropped significantly.
- Page (URL) level penalty – The page no longer ranks for any of its targeted keywords, including head and long-tail ones. In some cases, the affected page may even appear to be de-indexed.
- Site-wide penalty – The entire site has been de-indexed and consequently no longer ranks for any keywords, including the site’s own domain name.
1. If one (or more) targeted keyword(s) has received a penalty, you should first focus on the backlinks pointing to the page(s) that used to rank for the penalized keyword(s) BEFORE the penalty took place. Carrying out granular audits against the pages of your best ranking competitors can give you a rough idea of how much work you need to do in order to rebalance your backlink profile.
Also, make sure you review all backlinks pointing to URLs that 301 redirect or have a rel=”canonical” to the penalized pages. Penalties can flow in the same way PageRank flows through 301 redirects or rel=”canonical” tags.
2. If one (or more) pages (URLs) have received a penalty, you should definitely focus on the backlinks pointing to these pages first. Although there are no guarantees that resolving the issues with the backlinks of the penalized pages will be enough to lift the penalty, it makes sense not to make drastic changes to the backlinks of other parts of the site unless you really have to, e.g. after failing a first reconsideration request.
3. If the penalty is site-wide, you should look at all backlinks pointing to the penalized domain or subdomain.
In terms of the process you can follow to manually identify and document the toxic links, Lewis Seller’s excellent Ultimate Guide to Google Penalty Removal covers pretty much all you need to be doing.
Tips for algorithmic devaluations
Pleasing Google’s algorithm is quite different to pleasing a human reviewer. If you have lost rankings due to an algorithmic update, the first thing you need to do is to carry out a backlink audit against the top 3-4 best-ranking websites in your niche.
It is really important to study the backlink profiles of the sites that are still ranking well, making sure you exclude Exact Match Domains (EMDs) and Partial Match Domains (PMDs).
This will help you spot:
- Unnatural signals when comparing your site’s backlink profile to your best ranking competitors.
- Common trends amongst the best ranking websites.
Once you have done the above you should then be in a much better position to decide which actions you need to take in order to rebalance the site’s backlink profile.
Tips for proactive link removals
Making sure that your site’s backlink profile is in better shape compared to your competitors should always be one of your top priorities, regardless of whether or not you’ve been penalized. Mitigating potential link-related risks that may arise as a result of the next Penguin update, or a future manual review of your site from Google’s webspam team, can help you stay safe.
There is nothing wrong with proactively removing and/or disavowing inorganic links because some of the most notorious links from the past may one day in the future hold you back for an indefinite period of time, or in extreme cases, ruin your entire business.
Removing obsolete low quality links is highly unlikely to cause any ranking drops as Google is already discounting (most of) these unnatural links. However, by not removing them you’re risking getting a manual penalty or getting hit by the next algorithm update.
Undoubtedly, proactively removing links may not be the easiest thing to sell to a client. Those in charge of sites that have been penalized in the past are always much more likely to invest in this activity without any hesitation.
Unrealistic growth expectations can easily be avoided by honestly educating clients about Google’s current stance towards SEO. Investing in this may save you a lot of trouble later, avoiding misconceptions or misunderstandings.
A reasonable site owner would rather invest today into minimizing the risks and sacrifice growth for a few months rather than risk the long-term sustainability of their business. Growth is what makes site owners happy, but sustaining what has already been achieved should be their number one priority.
So, if you have doubts about how your client may perceive your suggestion of spending the next few months re-balancing their site’s backlink profile so it conforms with Google’s latest quality guidelines, try challenging them with the following questions:
- How long could you afford to run your business without getting any organic traffic from Google?
- What would be the impact on your business if your five best-performing keywords stopped ranking for six months?
#4 Perfect the data collection process
Contrary to Google’s recommendation, relying on link data from Webmaster Tools alone isn’t enough in most cases, as Google doesn’t provide every piece of link data known to it. A good justification for this argument is the fact that many webmasters have received examples of unnatural links from Google that do not appear in the backlink data available in WMT.
Therefore, it makes perfect sense to try combining link data from as many different data sources as possible.
- Try including ALL data from at least one of the services with the biggest indexes (Majestic SEO, Ahrefs), as well as the data provided for free by the two major search engines (Google and Bing webmaster tools) to all verified site owners.
- Take advantage of the backlink data provided by additional third party services such as Open Site Explorer, Blekko, Open Link Profiler, SEO Kicks etc.
Note that most of the automated link audit tools aren’t very transparent about the data sources they’re using, nor about the percentage of data they are pulling in for processing.
Being in charge of the data to be analyzed will give you a big advantage and the more you increase the quantity and quality of your backlink data the better chances you will have to rectify the issues.
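As a sketch of the merging step (the tool names and URLs here are placeholders), combining exports from several sources into one normalised, deduplicated set can be as simple as:

```python
from urllib.parse import urlsplit

def merge_backlinks(*sources):
    """Combine backlink URL lists exported from different tools
    (e.g. WMT, Majestic SEO, Ahrefs) into one deduplicated set.
    Lowercases scheme/host and strips URL fragments."""
    merged = set()
    for source in sources:
        for url in source:
            parts = urlsplit(url.strip())
            normalised = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path}"
            if parts.query:
                normalised += "?" + parts.query
            merged.add(normalised)
    return merged

wmt = ["http://Example.com/page#ref", "http://blog.example.org/post"]
ahrefs = ["http://example.com/page", "http://spam-directory.net/listing?id=7"]
print(len(merge_backlinks(wmt, ahrefs)))  # 3
```

Without the normalisation, the same linking page reported slightly differently by two tools would be counted twice and audited twice.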
#5 Re-crawl all collected data
Now that you have collected as much backlink data as possible, you need to separate the wheat from the chaff. This is necessary because:
- Not all the links you have already collected may still be pointing to your site.
- Not all links pose the same risk, e.g. Google discounts nofollow links.
All you need to do is crawl all backlink data and filter out the following:
- Dead links – Not all links reported by Webmaster Tools, Majestic SEO, OSE and Ahrefs are still live, as most of them were discovered weeks or even months ago. Make sure you get rid of URLs that no longer link to your site, such as URLs that return a 403, 404, 410 or 503 server response. Disavowing links (or domains) that no longer exist can reduce the chances of a reconsideration request being successful.
- Nofollow links – Because nofollow links pass neither PageRank nor anchor text, there is no immediate need to try to remove them – unless their number is excessive compared to your site’s follow links or the follow/nofollow split of your competitors.
Tip: There are many tools that can help with crawling the backlink data, but I would strongly recommend Cognitive SEO because of its high accuracy, speed and low cost per crawled link.
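Whichever crawler you use, the filtering logic itself is straightforward. This sketch assumes you have crawl results exported as status-code/nofollow records (the field names and URLs are hypothetical):

```python
# Status codes treated as dead, per the list above.
DEAD_STATUSES = {403, 404, 410, 503}

def triage_backlinks(crawl_results):
    """crawl_results: list of dicts with the HTTP status of each linking page
    and whether the link to our site carries rel="nofollow".
    Splits links into dead, nofollow and live follow buckets."""
    buckets = {"dead": [], "nofollow": [], "follow": []}
    for link in crawl_results:
        if link["status"] in DEAD_STATUSES:
            buckets["dead"].append(link["url"])
        elif link["nofollow"]:
            buckets["nofollow"].append(link["url"])
        else:
            buckets["follow"].append(link["url"])
    return buckets

crawled = [
    {"url": "http://a.com/1", "status": 404, "nofollow": False},
    {"url": "http://b.com/2", "status": 200, "nofollow": True},
    {"url": "http://c.com/3", "status": 200, "nofollow": False},
]
print(triage_backlinks(crawled))
```

Only the "follow" bucket needs to move on to the next steps; the other two can be set aside (with the dead links logged in case they reappear in a later crawl).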
#6 Identify the authentic URLs
Once you have identified all live, followed links, you should then try to identify the authentic (canonical) URLs of the links. Note that this step is essential only if you want to try to remove the toxic links. Otherwise, if you just want to disavow the links, you can skip this step, making sure you disavow the entire domain of each toxic linking site rather than the specific pages linking to your site.
Often, a link appearing on a web page can be discovered and reported by a crawler several times as in most cases it would appear under many different URLs. Such URLs may include a blog’s homepage, category pages, paginated pages, feeds, pages with parameters in the URL and other typical duplicate pages.
Identifying the authentic URL of the page where the link was originally placed (and getting rid of the URLs of all other duplicate pages) is very important because:
- It will help with making reasonable link removal requests, which in turn can result in a higher success rate. For example, it’s pretty pointless contacting a Webmaster and requesting link removals from feeds, archived or paginated pages.
- It will help with monitoring progress, as well as gathering evidence for all the hard work you have carried out. The latter will be extremely important later if you need to request a review from Google.
Example 1 – Press release
In this example the first URL is the “authentic” one and all the other ones need to be removed. Removing the links contained in the canonical URL will remove the links from all the other URLs too.
Example 2 – Directory URLs
In the below example it isn’t immediately obvious which page the actual link sits on:
http://www.192.com/business/derby-de24/telecom-services/comex-2000-uk/18991da6-6025-4617-9cc0-627117122e08/ugc/?sk=c6670c37-0b01-4ab1-845d-99de47e8032a (non canonical URL with appended parameter/value pair: disregard)
http://www.192.com/atoz/business/derby-de24/telecom-services/comex-2000-uk/18991da6-6025-4617-9cc0-627117122e08/ugc/ (canonical page: keep URL)
http://www.192.com/places/de/de24-8/de24-8hp/ (directory category page: disregard URL)
Unfortunately, this step can be quite time-consuming, and I haven’t yet come across a service able to automatically detect the authentic URL and discard the redundant ones. If you are aware of any accurate and reliable ones, please feel free to share them in the comments 🙂
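Absent such a service, a rough first pass can be scripted. The sketch below is a minimal illustration in pure Python; the URL patterns it treats as duplicates are assumptions you would adapt to each site, not a complete rule set. It strips tracking parameters and discards obvious feed, category and pagination duplicates:

```python
import re
from urllib.parse import urlparse, urlunparse

# Path patterns that usually indicate duplicate/derived pages rather than
# the page where a link was originally placed. Illustrative assumptions only.
DUPLICATE_PATTERNS = [
    r"/feed/?$",              # RSS/Atom feeds
    r"/page/\d+",             # paginated archives
    r"/category/", r"/tag/",  # taxonomy pages
]

def strip_tracking_params(url: str) -> str:
    """Drop query strings and fragments, which rarely change the page."""
    parts = urlparse(url)
    return urlunparse((parts.scheme, parts.netloc, parts.path, "", "", ""))

def likely_authentic(url: str) -> bool:
    """Heuristically flag URLs that look like the original placement."""
    return not any(re.search(p, urlparse(url).path) for p in DUPLICATE_PATTERNS)

def dedupe(urls):
    """Normalize, then keep only candidate authentic URLs (deduplicated)."""
    cleaned = {strip_tracking_params(u) for u in urls}
    return sorted(u for u in cleaned if likely_authentic(u))

urls = [
    "http://example-blog.com/my-review/?utm_source=feed",
    "http://example-blog.com/my-review/",
    "http://example-blog.com/category/reviews/",
    "http://example-blog.com/feed/",
    "http://example-blog.com/page/2",
]
print(dedupe(urls))  # → ['http://example-blog.com/my-review/']
```

A script like this only shortlists candidates; a manual check of each surviving URL is still needed before you send removal requests.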
#7 Build your own link classification model
There are many good reasons for building your own link classification model rather than relying on fully automated services, most of which aren’t transparent about their toxic link classification formulas.
Although there are many commercial tools available, all claiming to offer the most accurate link classification methodology, the decision whether a link qualifies for removal should sit with you and not with a (secret) algorithm. If Google, a multi-billion-dollar business, still fails on many occasions to detect manipulative links and relies to some extent on humans to carry out manual reviews of backlinks, you should do the same rather than relying on a $99/month tool.
Unnatural link signals check-list
What you need to do in this stage is to check each one of the “authentic” URLs (you have identified from the previous step) against the most common and easily detectable signals of manipulative and unnatural links, including:
- Links with commercial anchor text, including both exact and broad match.
- Links with an obvious manipulative intent e.g. footer/sidebar text links, links placed on low quality sites (with/without commercial anchor text), blog comments sitting on irrelevant sites, duplicate listings on generic directories, low quality guest posts, widget links, press releases, site-wide links, blog-rolls etc. Just take a look at Google’s constantly expanding link-schemes page for the entire and up-to-date list.
- Links placed on authoritative yet untrustworthy websites. Typically these are sites that have bumped up their SEO metrics with unnatural links, so they look attractive for paid link placements. They can be identified when one (or more) of the below conditions are met:
- MozRank is significantly greater than MozTrust.
- PageRank is much greater than MozRank.
- Citation flow is much greater than Trust Flow.
- Links appearing on pages or sites with low quality content, poor language and poor readability such as spun, scraped, translated or paraphrased content.
- Links sitting on domains with little or no topical relevance. E.g. too many links placed on generic directories or too many technology sites linking to financial pages.
- Links that are part of a link network. Although these aren’t always easy to detect, you can try identifying footprints such as backlink commonality, identical or similar IP addresses, identical Whois registration details, etc.
- Links placed only on the homepages of referring sites. As the homepage is the most authoritative page on most websites, links appearing there can be easily deemed as paid links – especially if their number is excessive. Pay extra attention to these links and make sure they are organic.
- Links appearing on sites with content in foreign languages e.g. Articles about gadgets in Chinese linking to a US site with commercial anchor text in English.
- Site-wide links. Not all site-wide links are toxic but it is worth manually checking them for manipulative intent e.g. when combined with commercial anchor text or when there is no topical relevance between the linked sites.
- Links appearing on hacked, adult, pharmaceutical and other “bad neighborhood” spam sites.
- Links appearing on de-indexed domains. Google de-indexes websites that add no value to users (e.g. low-quality directories), hence getting links from de-indexed websites isn’t a quality signal.
- Redirected domains to specific money-making pages. These can include EMDs or just authoritative domains carrying historical backlinks, usually unnatural and irrelevant.
Note that the above checklist isn’t exhaustive but should be sufficient to assess the overall risk score of each one of your backlinks. Each backlink profile is different and depending on its size, history and niche you may not need to carry out all of the aforementioned 12 checks.
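As a minimal illustration of the “authoritative yet untrustworthy” check (point 3 above), the three metric ratios can be wrapped in a single flag. The metric names mirror the Moz and Majestic scores mentioned in the checklist, but the dictionary layout and the 1.5 threshold are assumptions for this sketch, not official cut-offs:

```python
def looks_untrustworthy(metrics, ratio_threshold=1.5):
    """
    Flag sites whose authority metrics outpace their trust metrics.
    `metrics` is a dict of third-party scores; any missing metric pair
    is simply skipped. The threshold is an illustrative assumption.
    """
    checks = [
        ("mozrank", "moztrust"),
        ("pagerank", "mozrank"),
        ("citation_flow", "trust_flow"),
    ]
    for authority, trust in checks:
        a, t = metrics.get(authority), metrics.get(trust)
        if a is not None and t is not None and t > 0 and a / t >= ratio_threshold:
            return True
    return False

# A site whose Citation Flow dwarfs its Trust Flow is a likely paid-link host.
print(looks_untrustworthy({"citation_flow": 45, "trust_flow": 12}))  # → True
print(looks_untrustworthy({"citation_flow": 30, "trust_flow": 28}))  # → False
```

In practice you would calibrate the threshold per niche, since “normal” ratios vary widely between industries.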
Handy Tools
There are several paid and free tools that can massively speed things up when checking your backlinks against the above checklist.
- Cognitive SEO – Ideal for points 1, 2, 4, 5, 6, 7, 9
- Google Backlink Tool for Penguin & Disavow Analysis – Can greatly help with point 2 to identify unnatural links based on various footprints.
- Netpeak Checker (Free) – Can extract all metrics needed in point 3 and help with WhoIs scraping in 6.
- LinkStat by MattSight – Can assist with 4.
- Scrapebox – Can assist with 6, 11.
- Net Comber – Can help with points 6, 12.
Although some automated solutions can assist with points 2, 4, 5, 8 and 10, it is recommended to manually carry out these activities for more accurate results.
Jim Boykin’s Google Backlink Tool for Penguin & Disavow in action
#8 Weighting & aggregating the negative signals
Now that you have audited all links you can calculate the total risk score for each one of them. To do that you just need to aggregate all manipulative signals that have been identified in the previous step.
In the most simplistic form of this classification model, you can allocate one point to each detected negative signal. Later, you can try up-weighting the most important signals – I usually do this for commercial anchor text, hacked/spam sites, etc.
However, because each niche is unique and consists of a different ecosystem, a one-size-fits-all approach wouldn’t work. Therefore, I would recommend trying out a few different combinations to improve the efficiency of your unnatural link detection formula.
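A minimal version of such a weighting scheme might look like the following. Every weight, signal name and URL below is an illustrative assumption to be tuned for your own niche, not a recommended standard:

```python
# Commercial anchor text and hacked/spam hosts are up-weighted, as
# suggested in the text; all other detected signals count one point.
SIGNAL_WEIGHTS = {
    "commercial_anchor": 3,
    "hacked_or_spam_site": 3,
    "deindexed_domain": 2,
    "link_network": 2,
    "sitewide": 1,
    "low_quality_content": 1,
    "no_topical_relevance": 1,
    "foreign_language": 1,
}

def risk_score(detected_signals):
    """Aggregate the per-link negative signals into a single risk score."""
    return sum(SIGNAL_WEIGHTS.get(s, 1) for s in detected_signals)

# Hypothetical audit output: each authentic URL with its detected signals.
link_audit = {
    "http://spammy-directory.example/listing": ["commercial_anchor", "no_topical_relevance"],
    "http://hacked-blog.example/post": ["hacked_or_spam_site", "sitewide"],
    "http://trade-mag.example/article": [],
}
scores = {url: risk_score(signals) for url, signals in link_audit.items()}
print(scores)  # the clean trade-magazine article scores 0
```

The point of keeping the weights in one dictionary is that re-running the whole audit with a different formula becomes a one-line change.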
Sample of weighted and aggregated unnatural link signals
Turning the data into a pivot chart makes it much easier to summarize the risk of all backlinks visually. This will also help with estimating the effort and resources needed, depending on the number of links you decide to remove.
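If your audit lives in a spreadsheet export, the same pivot can be produced in a couple of lines with pandas. The column names and rows here are hypothetical, standing in for your own audit data:

```python
import pandas as pd

# Hypothetical audit output: one row per backlink, with its risk score
# already bucketed into bands.
df = pd.DataFrame({
    "domain": ["a.example", "a.example", "b.example", "c.example", "c.example"],
    "risk_band": ["high", "high", "medium", "low", "high"],
})

# Count links per referring domain and risk band - the same view an
# Excel pivot chart gives, which makes the clean-up easier to size.
summary = pd.crosstab(df["domain"], df["risk_band"])
print(summary)
```

From here, `summary.plot(kind="bar", stacked=True)` gives the visual version with any matplotlib-enabled setup.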
#9 Prioritizing links for removal
Unfortunately, there isn’t a magic number (or percentage) of links you need to remove in order to rebalance your site’s backlink profile. How much is enough will largely depend on whether:
- You have already lost rankings/traffic.
- Your site has been manually penalized or hit by an algorithm update.
- You are trying to avoid a future penalty.
- Your competitors have healthier backlink profiles.
Whichever the case, it makes sense to focus first on the pages (and keywords) that are most critical to your business. Therefore, unnatural links pointing to pages with high commercial value should be prioritized for removal.
Often, these pages are the ones that have been heavily targeted with links in the past, hence it’s always worth paying extra attention to the backlinks of the most heavily linked pages. On the other hand, it would be pretty pointless to spend time analyzing backlinks pointing at pages with very few inbound links; these should be de-prioritized.
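That prioritization rule is easy to script once you have the numbers. In this sketch the pages, their commercial values (a business judgement, e.g. revenue driven) and their link counts are all hypothetical:

```python
# Hypothetical page records from a backlink export plus business input.
pages = [
    {"url": "/blog/old-post", "commercial_value": 1, "inbound_links": 4},
    {"url": "/insurance-quotes", "commercial_value": 9, "inbound_links": 3200},
    {"url": "/about-us", "commercial_value": 2, "inbound_links": 150},
]

# Audit high-value, heavily linked pages first; pages with very few
# inbound links drop to the bottom of the queue.
queue = sorted(pages,
               key=lambda p: (p["commercial_value"], p["inbound_links"]),
               reverse=True)
print([p["url"] for p in queue])  # → ['/insurance-quotes', '/about-us', '/blog/old-post']
```

The sort key is deliberately crude; in a real audit you might weight the two factors or cap the queue at whatever your removal resources can handle.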
To get an idea of your most important pages’ backlink vulnerability scores, try Virante’s Penguin Analysis tool.
#10 Defining & measuring success
After all backlinks have been assessed and the most unnatural ones have been identified for removal, you need to figure out a way to measure the effectiveness of your actions. This will largely depend on the situation you’re in (see step 3). There are three different scenarios:
- If you have received a manual penalty and have worked hard before requesting Google to review your backlinks, receiving a “Manual spam action revoked” message is the ultimate goal. However, this isn’t to say that if you get rid of the penalty your site’s traffic levels will recover to their pre-penalty levels.
- If you have been hit algorithmically you may need to wait for several weeks or even months until you notice the impact of your work. Penguin updates are rare and typically there is one every 3-6 months, therefore you need to be very patient. In any case, recovering fully from Penguin is very difficult and can take a very long time.
- If you have proactively removed links, things are less clear-cut. Certainly, avoiding a manual penalty or future algorithmic devaluations should be considered a success, especially on sites that have engaged in heavy unnatural linking activities in the past.
Marie Haynes has written a very thorough post about traffic increases following the removal of link-based penalties.
Summary
Links may not always be the sole reason why a site has lost rankings and/or organic search visibility. Therefore before making any decision about removing or disavowing links you need to rule out other potential reasons such as technical or content issues.
If you are convinced that there are link-based issues at play, then you should carry out an extensive manual backlink audit. Building your own link classification model will help you assess the overall risk score of each backlink based on the most common signals of manipulation. This way you can effectively identify the most inorganic links and prioritize which ones should be removed or disavowed.
Remember: All automated unnatural link risk diagnosis solutions come with many and significant caveats. Study your site’s ecosystem, make your own decisions based on your gut feeling and avoid taking blanket approaches.
…and if you still feel nervous or uncomfortable sacrificing resources from other SEO activities to spend time on link removals, I’ve recently written a post highlighting the reasons why link removals can be very valuable, if done correctly.
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!
The Panda Patent: Brand Mentions Are the Future of Link Building
Posted by SimonPenson
There has long been speculation about how Google actually measures “brand authority.” Many times over the past couple of years, Googlers speaking outside those fortified Googleplex walls have made it clear that brand building is the way to win the organic visibility war.
That statement, however, has always sounded woolly in the extreme. How is it possible to measure an almost intangible thing at scale and via a complex formula? If you are Google, it seems there is ALWAYS a way.
A fairly innocent-looking patent filed last month, which some say could be the Panda patent, may have gone some way toward answering that question.
Within this post we dive into that patent and other supporting evidence in an attempt to understand what the opportunity may be for digital marketers in the future. As part of that attempt, we offer our interpretation of various pieces of the patent itself, and also look at actual data to see if mentions are already playing a part in the ranking of sites.
The patent in question, which can be found here and has been expertly covered by Bill Slawski, may cover the Panda Algorithm’s key workings, but the piece we are really interested in right now is the information around measuring site authority and relevance using a ratio of links and mentions, or “implied links.” It’s this specific area that got both the team here at Zazzle Media and also at Moz excited. You can see Cyrus Shepard and Rand Fishkin’s reaction right here:
I knew it! RT @CyrusShepard: This Google patent defines non-linking citations as “implied links” http://t.co/cOxv0irklk
— Rand Fishkin (@randfish) March 26, 2014
So, what exactly does the patent imply? It is complex, wordy, and difficult to interpret, but it starts by talking about what Google calls “reference queries:”
“A reference query for a particular group of resources is a previously submitted search query that has been categorized as referring to a resource in the particular group of resources.”
To most of us, that statement reads like bad English at best, but there are a couple of ways it could be used. First, it allows Google to look at which terms people have previously searched to find and then click through to a site, or group of sites. In doing so, it would also allow them to “map” semantically relevant queries (and thus mentions of a brand) to a site, further extending their understanding of the “popularity” or “authority” of that entity.
The patent also covers a mechanism for allowing Google to discount some links and give others greater weighting based on a modification factor:
“The method of claim 1, wherein determining a respective group-specific modification factor for a particular group of resources comprises: determining an initial modification factor for the particular group of resources, wherein the initial modification factor is a ratio of a number of independent links counted for the particular group to the number of reference queries counted for the particular group.”
This could be especially important where lots of links from the same company (or “group”) point at a site, as the search engine could discount those from the true overall picture. It then also gives engineers the ability to look at “quality” as a measure of the overall relevance to the queried subject matter, which is called out in a separate bit of the patent:
“For example, the initial score can be, e.g., a measure of the relevance of the resource to the search query, a measure of the quality of the resource, or both.”
Then, the patent specifically mentions that links can either be “express” or “implied,” calling out non-linking mentions in a rather unmistakable way:
“The system determines a count of independent links for the group (step 302). A link for a group of resources is an incoming link to a resource in the group, i.e., a link having a resource in the group as its target. Links for the group can include express links, implied links, or both. […] An implied link is a reference to a target resource, e.g., a citation to the target resource, which is included in a source resource but is not an express link to the target resource. Thus, a resource in the group can be the target of an implied link without a user being able to navigate to the resource by following the implied link.”
What does all this mean? It means that once a connection is made by someone typing in a brand name or other search query and then clicking on a site it creates a connection in Google’s eyes. The search engine can then store that info and use it in the context of unlinked mentions around the web in order to help weight rankings of particular sites.
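To make the express/implied distinction concrete, here is a deliberately simplified sketch (my own illustration, not Google’s method) that counts a brand’s linked and unlinked mentions in a page’s HTML using Python’s standard-library parser:

```python
from html.parser import HTMLParser

class MentionFinder(HTMLParser):
    """Separate express links (brand mentioned inside an anchor) from
    implied links (plain-text brand mentions with no link to follow)."""
    def __init__(self, brand):
        super().__init__()
        self.brand = brand.lower()
        self.in_anchor = False
        self.express = 0   # brand appears inside an <a> tag
        self.implied = 0   # brand appears in plain text

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        count = data.lower().count(self.brand)
        if self.in_anchor:
            self.express += count
        else:
            self.implied += count

# A hypothetical page mentioning the (made-up) brand "Acme" twice:
# once as a link, once as plain text.
html = '<p>I love <a href="http://acme.example">Acme</a>. Acme ships fast.</p>'
finder = MentionFinder("acme")
finder.feed(html)
print(finder.express, finder.implied)  # → 1 1
```

A real system would obviously need entity disambiguation (“Next” the retailer vs. “next” the word, as discussed later in this post), but the parsing distinction itself is this simple.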
If this is the Panda patent, then, as part of a wider algorithm, it would also look at the quality of pages on a site and how “commercial” they are in their targeting. Panda was originally designed to negatively impact content farms that created content targeted aggressively at commercial terms: sites that think “search engine-first” as opposed to “audience-first.”
The patent publication was closely followed by a related Webmaster video by Matt Cutts, in which the head of Google’s webspam team talked about a “forthcoming update” that would change how the search engine measures authority:
By arguing that there is a difference between popularity and links, Cutts made clear that his engineers are looking very closely at how to continue to tweak the existing algo to make more popular sites rank higher.
That’s a big deal.
Those two pieces of new evidence suggest there is a seismic shift underway in how links are weighted and how relevance is measured – the two building blocks of search.
It’s something Rand here at Moz first touched on back in late 2012 and I covered in detail in this post a few weeks later.
In it I gave a little background into why Google is hell bent on getting away from the concept of a link-based economy:
“My view is that Google is really trying to clear up the link graph and with it valueless links so that it can clearly understand relevance and associations again.
It’s something that web ‘creator’ Tim Berners-Lee first wrote about back in 2006 in this magazine article and Google has been talking about ever since, ploughing lots of cash into acquisitions to get it there.
So, why invest so much time, effort and resource into such a game-changing project? Quite simply because its existing keyword based info retrieval model is in danger of being bettered by semantic search engines.”
Let’s look into the how and why in a little more detail.
Links
Everybody who reads this article will be more than aware of the importance of links in building authority.
A lot has changed in this space over recent years, however, as Google has developed systems to measure where the link is coming from and assign more (or less) value to it as a result.
The next logical step in that process could be the downgrading of the follow link within the overall picture of ranking factors. It is something that has been expected for some time and was reiterated by Moz’s own panel of experts in the annual Ranking Factor survey.
We know that follow links have been gamed “to death” in the past, so it would make sense to make that particular element a little less important in the overall mix. Nofollows can still tell Google a lot about a site, as can how much people are talking about it.
Nofollow
By definition a nofollow link does not pass equity, or PageRank, to a site. We know that for certain. What is less clear is what, if anything, it does pass.
Google certainly knows these links are there, and this latest patent could suggest they are taking a little more notice of them than they are letting on.
The patent highlights the importance of “reference queries” and “implied links,” and also that Google looks to discount links from the same “group” or brand and instead wants to concentrate on independent links from unassociated domains.
Critically, PageRank is not mentioned, which suggests that other factors are being measured in terms of how much equity is being passed by each independent link, or mention.
Mentions
It is the “implied link” element that makes most interesting reading, as it is black-and-white evidence that Google is looking at mentions as a measure of authority.
It is logical, after all, that a popular brand would have more people talking about it online than one that is simply good at manipulating the algorithm and has invested heavily in link building to that end.
The results from our sample research also support this, with larger, better-known brands generally attracting greater numbers of mentions than others.
Testing
Of course, the question remains: is this already in use? Testing this properly would require a monumental amount of measurement across a plethora of verticals over an extended period of time, and sadly I did not have the time or resources to run that project for the sake of this post. It is still worth sharing some of the data, however, along with an overview of what may be going on.
The caveat here is absolutely that this does NOT constitute any kind of fact-finding mission, simply an informed commentary on a few anomalies that cannot be explained simply by looking at follow links alone.
To discover if there are any initial signs that this kind of system may already be in effect, I spent some time analyzing three separate, random SERPs here in the UK.
They were:
- “Car Insurance”
- “Mortgage Calculator”
- “Mens Clothing”
All three are competitive terms and are “owned” in the main by what we might know as brands in the wider world.
Below you can see a simple chart for each of these, showing:
- Follow links
- Nofollow links
- Follow/nofollow ratio
- The number of brand mentions in the last four weeks
- Ratio between links and mentions
Clearly this isn’t a scientific study, but it does serve as a “finger in the air” analysis from which a few interesting observations can be made.
The first two tables contain data examining the whole domain’s link profile, while the third looks specifically at links to the URL indexed for the particular term we are analyzing.
The raw data is below and here is an explanation of where that data was drawn from:
- Position – Records what position we saw the domain in for the given search term in google.co.uk.
- Follow links – For the first two terms (‘Car Insurance’ and ‘Mens Clothing’) this is the number of follow links across the whole domain. For ‘Mortgage Calculator’ it records just the follow links into the specific URL indexed for that term. The data is from Ahrefs.
- Nofollow links – As above, but for nofollow links.
- Referring domains – The number of referring domains into the domain (‘Car Insurance’ and ‘Mens Clothing’) or the URL (‘Mortgage Calculator’).
- Mentions – How many mentions of the brand there have been in the last four weeks (as taken from Moz Fresh Web Explorer and using exact match brand term only).
- Ratio of follow to no follow links – Designed to see if there is a correlation with this and position.
- Ratio of links to mentions – A look at the relationship between how many links a site has and how many mentions in the previous four weeks.
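For reference, the two ratios in the tables are straightforward to compute. The figures in this sketch are hypothetical, not the article’s data:

```python
def link_ratios(follow, nofollow, mentions):
    """Compute follow-to-nofollow and total-links-to-mentions ratios.
    Returns None for a ratio whose denominator is zero."""
    fn = follow / nofollow if nofollow else None
    lm = (follow + nofollow) / mentions if mentions else None
    return fn, lm

# Hypothetical numbers for one ranking domain.
fn_ratio, lm_ratio = link_ratios(follow=12000, nofollow=4000, mentions=800)
print(fn_ratio, lm_ratio)  # → 3.0 20.0
```

A low links-to-mentions ratio means a site earns proportionally more conversation than links, which is the pattern the analysis below looks for.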
Let’s now look at each table in a little more detail but with the understanding that there are a myriad of other factors that affect the result. After each piece of analysis I have added general comments:
“Car insurance”
- Despite having considerably fewer links overall across the domain, Go Compare is first. Does this suggest they have hit a sweet spot in terms of links versus mentions and brand metrics?
- Money Supermarket proves that more links doesn’t win. The site has many, many more than anyone else in the top five and yet is not first. Link volume clearly matters, but it is absolutely not the only factor at play.
- Do Compare the Market and Confused “win” and feature in the top five on the strength of their mention data? A higher ratio than the top three but MUCH lower link numbers suggests that might be the case.
- LV.com is the anomaly as its mention to link ratio is low. This suggests that sheer numbers of links are potentially helping it rank well as well as on-page factors that make the brand super-relevant for car insurance.
“Mens clothing”
- ASOS and House of Fraser walk away with it here, and interestingly both of these sites have very similar ratios for both follow and no-follow links AND for links and mentions.
- Next is an anomaly, but only because measuring true mentions is very hard when the brand name is a generic word. Could this cause issues for Google in measuring similar brands going forward?
- Again, Burton (in particular) and, to a degree, Topman show that you can earn your place with high link-to-mention ratios.
- Topman should rank higher. Is it because of link quality? Certainly the ratio of follow to nofollow is lower than those above them.
“Mortgage calculator”
- This batch looks just at the URL ranking for the term, not the whole domain, to allow us to see both sides of the story.
- The BBC clearly runs away with it in every sense and by these figures would be almost impossible to usurp.
- The data is less correlated here, suggesting that domain-wide factors are definitely at play in determining which URL should rank where.
- NatWest is a really interesting result as it has very few links relative to those around it. The ratio of no-follow-to-follow is very high, and the brand gets a lot of mentions. The site is also very relevant for “mortgages” as a percentage of overall content.
As a further piece of analysis we have also included a “random” site from page two to support the concept that a combination/ratio of links to mentions affects rankings.
That site is woolwich.co.uk, a small UK building society (bank). Its own mortgage calculator page was ranking 16th when we ran the analysis, and interestingly, we can see that its ratio of follow-to-nofollow is low, as is the number of mentions of the brand in the past four weeks. On pure followed link numbers the site should rank top five, but instead it languishes on page two.
Is there a perfect ratio?
It’s clear that the small sample above is no true reflection of how the ratio of links to mentions affects rankings, but it has certainly raised some interesting points for further discussion and testing.
What can be said is that measuring brand mentions is certainly possible, and Google now has the patent to cover it as a potential ranking factor.
Mentions alone do not tell the whole story, of course, and links are still very important in the overall mix of factors that affect rankings, but it is now time to start thinking about how you can create brand buzz and grow those mentions.
The possible good news for the smaller guys, though, is that it does seem that Panda is beginning to look at how much of a site is relevant to any one specific term and giving over extra “authority” to that site in that niche.
A great example is the building societies in the UK that rely on mortgages for the majority of their business. As a result, they do seem to rank better for mortgage-related terms.
This could really help specialist businesses compete again with the “big guys.”
What can you do?
The key point here is how you can utilize this data to improve your own strategy, and while there is no conclusive proof that mentions and nofollow links matter, the evidence is starting to pile up.
Given what Mr. Cutts said in the video mentioned earlier in the post, the Panda patent, and the increasing amount of data highlighting similar findings, the argument is certainly there.
My advice would be to begin thinking outside of follow links. Be happy to earn (and build) nofollowed links and mentions. Think outside the link, because there IS value in driving mentions and building brand.
How do you do that? The simple answer is to get creative with your communications strategy and build content that will make people talk about you and share. At the heart of that is a great ideation process, and I have previously shared Zazzle Media’s own way of creating great ideas consistently here.
Think particularly about creating content that plays on core emotions, and also give things away. In a detailed report first shared in 2010, titled “Social Transmission, Emotion, and the Virality of Online Content,” the authors discovered a strong relationship between emotion and how likely it is that your content will “go viral.”
Amongst the many eye-opening discoveries the publication discusses:
- Negative content tends to be less viral than positive content
- Awe-inspiring content and content that surprises or is humorous is more likely to be shared
- Content that causes sadness can become viral but is generally less likely to
- Content that evokes anger is more likely to be shared
And finally, make sure you start tracking and reporting mentions and nofollowed links. Knowing how you are performing can help you iterate your PR and content strategies to achieve greater traction.
In short, compelling content, created over the long term WILL now win, as it is being rewarded.