Recovering the Data Google Has Hidden Away – Whiteboard Friday

Posted by randfish

It’s no secret that Google keeps a lot of secrets. From keyword data to link data to traffic data (and surely more), there’s a lot that we could benefit from — if they’d only share it! Since that’s not likely to happen anytime soon, Rand takes us through various ways to access that all-important data in this week’s Whiteboard Friday.


Video Transcription

Hello. My name is Ranigo Montoya. You killed my SEO data. Prepare to die.

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we are celebrating Halloween six days late. Hopefully, that’s all right with all of you. For those of you outside the U.S., Halloween is this holiday where we dress up in costumes. We’ve done it a few times here on Whiteboard Friday, but it’s been a couple of years, so I’m thrilled to be able to bring this back.

I thought in keeping with the theme of Inigo Montoya, one of my favorite characters from “The Princess Bride,” we would talk about recovering some of the data that Google either hides or has taken away from us over the years. It’s actually kind of cool, because there are a number of tools and processes that we did not have access to three or four years ago, some of them even two years ago, that today enable us to do things that are really remarkable.

Let’s start with keyword data.

We need keyword data about traffic, which keywords send traffic to our website, which keywords send traffic to other people’s websites, which traffic comes from ads, which traffic comes from organic, where it goes, and what it does. This is critical because we want to see which terms are actually sending the traffic so we can know which ones to concentrate on.

We want to know which ones convert, because if a keyword converts well that probably means we should be focusing more energy on it. We should be trying to rank higher for it. Maybe we should be bidding on it. Perhaps we should be thinking about expanding the universe of terms and phrases around that term that could send us traffic. We should know which ones to target with paid and which ones deserve some organic rank boosting efforts.

How do we do this? Google now sends all traffic with the keyword “not provided.” What’s our process here? Well, for a while we’ve had some estimation tools. Moz has one in Moz Analytics. I think Conductor’s got one in their platform. A number of other platforms that track rankings and then connect to your analytics try to predict which keywords are sending traffic. We can do even better today, and that’s thanks to a couple of tools.

One of them is SimilarWeb. SimilarWeb has a panel of users, essentially people who’ve installed software and toolbars and all sorts of other things that track their browser activity. They’ve opted in to this. This is not without their knowledge. They know. They’ve consented. This panel actually includes millions and millions of users. Thus we can get a real sample set of web users, especially in places where SimilarWeb’s panel is relatively large, like the United States, Europe, and Israel. We can see that data at the clickstream level. Unfortunately, I can’t see through this wig.

Because they can see it at the clickstream level, Google is not hiding that. They see this person perform this search. They visited this page, then they visited this page, and then they went over here. That tracking means that if you go into SimilarWeb today, SimilarWeb Pro at least, and you plug in a domain, your domain or other people’s domains, you’ll actually be able to see a list of the keywords that sent them the most traffic and where that traffic went to.

That is killer. You can export this. You can put it into CSV. You can then match it up with the pages that get traffic. Really, really cool.

SEMrush is another one. SEMrush monitors tens of millions of keywords. I think it’s something in excess of 50 million keywords in the United States and many more millions outside the U.S. as well. They can show who ranks for what today as well as historically, so you can see over time the trend there, and you can see who’s advertising. If you say, “Hey I know that this competitor is targeting the same market space as us,” I can now go to SEMrush. I can plug them in. I can see all the keywords that they’re getting organic traffic from and paid traffic from. Then, I can start to say, “All right. Maybe I should add this to my keyword research list. Maybe I should target these, etc.”

Again, you can match it up with that visit data to say we know that this is the URL that ranked and we know the keyword, so since we know the page that ranked, we can see in our own visit data, from landing page reports, which ones got the traffic and which ones didn’t. That’s pretty darn cool as well. It maybe even gets us to a place of implied click-through rate, which is great.
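The matching step described above can be sketched in a few lines of Python. This is only an illustration, not something from the video: the field names (`keyword`, `url`, `monthly_searches`) and the shape of the landing page data are assumptions standing in for whatever your keyword tool export and analytics landing page report actually contain, and "implied click-through rate" is taken here to mean visits divided by the combined search volume of the keywords a page ranks for.

```python
from collections import defaultdict


def implied_ctr(keyword_rows, landing_visits):
    """Join ranking-keyword rows to landing page visits by URL.

    keyword_rows: iterable of dicts with the (hypothetical) keys
        'keyword', 'url', 'monthly_searches'.
    landing_visits: dict mapping URL -> organic visits for the same month.
    Returns a dict mapping URL -> summary with an implied click-through rate.
    """
    # Group the keywords and their search volume under the URL that ranks.
    by_url = defaultdict(lambda: {"keywords": [], "searches": 0})
    for row in keyword_rows:
        entry = by_url[row["url"]]
        entry["keywords"].append(row["keyword"])
        entry["searches"] += row["monthly_searches"]

    report = {}
    for url, entry in by_url.items():
        visits = landing_visits.get(url, 0)  # pages with no recorded visits count as 0
        searches = entry["searches"]
        report[url] = {
            "keywords": entry["keywords"],
            "searches": searches,
            "visits": visits,
            # Rough ratio only; undefined when there is no search volume.
            "implied_ctr": visits / searches if searches else None,
        }
    return report
```

Pages that rank for tracked keywords but show zero visits come out with an implied rate of 0, which is one way to spot the ones that "didn't" get the traffic.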

One thing to be aware of is you have to do a lot of this manually. Today there’s no tool that really connects the SEMrush and SimilarWeb data with a landing page report. That’s a little frustrating. Hopefully, we’ll get there soon.

Link data

Link data is one of those things that Google took away many, many years ago. They still provide some link data in Webmaster Tools, now called Search Console, but it is not fantastic. It’s not comprehensive. It’s a little bit of a pain to get through. It cuts off at certain limits, etc. Why do we want it? Because we want to know who’s linking to us and to our competitors. We want to watch for spam. We want to be able to compare our links versus the competition, understand ranking influence from those links potentially, and find new link opportunities. That’s especially true for competitive link analysis.

How? Well, we’ve got the traditional three tools: Majestic, Ahrefs, and Moz. There have been a bunch of analyses recently. The way that I think of them is Majestic has the largest index by a long shot. I think they’re two or three times larger than Ahrefs. Ahrefs is anywhere between about 100% of Moz’s size, so the same size, and 200%, depending on how Moz’s indices are doing, which hopefully will be a little better soon. Moz is the smallest of these three, which I’m embarrassed to say.

What Moz is really good at is metrics. It’s actually the metrics that keep Moz’s index so small, because it’s hard to process all those things like Page Authority, Domain Authority, Spam Score, etc.

Ahrefs is terrific for identifying fresh links and high value links. They also have a number of great features inside that tool, that I really like and many SEOs really like, around sorting, filtering, exporting, and visualization.

Majestic has got that huge index. They also have some really great features. They’re getting a little more sophisticated with their metrics. I think their metrics are doing nicely compared to Ahrefs.

Each of these is crawling the web and then building indices that are searchable by all of us, which is great. This means that a lot of this data is recovered.

Let’s be clear. None of those sources — Majestic, Moz, Ahrefs — none of them are the same size as Google. None of them are crawling exactly what Google is crawling. At least here at Moz, and I assume Majestic and Ahrefs do this as well, we try and model the web as best we can around what’s in Google.

When we look at large sets of search results, which we compare to our indices each time, between 75% and 80% of the URLs in Google’s results are in our index. That’s good. I think it’s not great, but it gives you a sense for how these folks are crawling. It’s really about estimation. Many, many SEOs are combining these sources along with Webmaster Tools in order to find all the links that they possibly can.
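Combining link sources mostly comes down to merging exports and removing duplicates. Here is a minimal sketch of that idea, assuming you have already pulled the linking page URLs out of each tool’s export into plain lists. The function names, the tool labels, and the very crude URL normalization are all illustrative assumptions, not anything these tools provide. The `coverage` helper shows the kind of arithmetic behind a figure like "75% to 80% of what’s in Google’s results."

```python
def normalize(url):
    """Crude normalization so the same page from two tools compares equal.

    Lowercases everything, drops the scheme, a leading 'www.', and any
    trailing slash. This is lossy (paths can be case-sensitive), so treat
    it as a rough matching key only.
    """
    u = url.strip().lower()
    for prefix in ("https://", "http://"):
        if u.startswith(prefix):
            u = u[len(prefix):]
            break
    if u.startswith("www."):
        u = u[4:]
    return u.rstrip("/")


def merge_link_sources(exports):
    """exports: dict mapping a tool label -> iterable of linking page URLs.

    Returns a dict mapping normalized URL -> sorted list of the tools
    that reported that link.
    """
    merged = {}
    for tool, urls in exports.items():
        for url in urls:
            merged.setdefault(normalize(url), set()).add(tool)
    return {url: sorted(tools) for url, tools in merged.items()}


def coverage(serp_urls, index_urls):
    """Share of search-result URLs that also appear in a link index."""
    serp = {normalize(u) for u in serp_urls}
    if not serp:
        return 0.0
    index = {normalize(u) for u in index_urls}
    return len(serp & index) / len(serp)
```

Links reported by only one tool are the ones you would have missed by relying on a single index, which is the point of combining them.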

Traffic data

You have your own traffic data. But competitive traffic data has always been a pain in the butt. We want that for competitive comparison. We want it to identify missed opportunities, like missed channels, or links where maybe we were thinking, “Hey, I’m not sure if I should get a link there,” but it looks like that’s very valuable because it’s sending a lot of traffic. It could also be a relationship with a partner, an API, a data source, or even an advertising partnership.

How can we do that?

Well, there are a few tools. I mentioned SimilarWeb before. They’re an excellent choice for this.

There’s also a tool out there called Jumpshot. If you use the virus checker AVG, which is relatively popular, Jumpshot is basically owned by them. Anyone with that virus checker has their browser activity monitored and then sent back to Jumpshot. It’s all anonymized, and yes it’s with consent. You agree to it in the Terms of Use when you download that.

I also think it’s fine to use Quantcast, but only when the site has been quantified, meaning they’ve opted in to the Quantcast program and they’ve put the Quantcast tracking pixel on their site. Otherwise, Quantcast in our view (and I did an analysis of this just about six months ago) is very, very inaccurate.

That’s true for Quantcast. It’s true for Alexa, comScore, and Compete. I would not recommend any of those other ones.

I haven’t tried Jumpshot personally. I’ve seen some folks say that they’re good but not as good as SimilarWeb. You can check out both of those and see what you think.

What’s really nice about this is being able to look at where my competitors are getting traffic, and how is that increasing or decreasing over time? What are they doing that I’m not doing? What are they doing that I should potentially investigate? It’s a lot like competitive link intelligence except on the traffic side. I think for a broadly-focused web marketer, it’s critical.

Finally, growth or shrinkage of search visibility

This is frustrating. I really wish Google provided this better, but thankfully there are some very good tools out there. What we want to be able to do is track our competitors’ successes in search, watch for potential penalties, and explain why traffic has gone up or down.

Explaining why traffic has gone up or down drives SEOs nuts. I’m sure most of you are sitting there going, “Yeah, that’s just the worst.” We can do this by using SEMrush for rankings, SimilarWeb for competitive traffic, and our own analytics for our own data. By combining these, we can essentially say, “Hey, I saw my search traffic go up. Is that because I now rank for more stuff?”

If you’re rank tracking with Moz Analytics or any of the many ranking solutions out there, you can of course go and look at the tracking that you’ve got there. If you’re not finding the solution in the keywords that you are tracking, you might check out SEMrush, because they might show you data about, hey, here are keywords you haven’t been tracking that they’ve been tracking that are showing why that traffic delta’s happening.

Same thing is true for SimilarWeb’s traffic. You can go and look at the people who are ranking alongside you and say, “Hey, are they seeing the same change in search traffic that we are?” Because if they’ve gone down or gone up along with you, that suggests a change in search volume is to blame, not a rankings change.
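That reasoning can be written down as a small decision rule. The sketch below is a hypothetical illustration only: the inputs are fractional changes (0.3 means up 30%) that you would have to work out yourself from your analytics, a competitive traffic tool, and a rankings tool, and the 10% threshold and the "more than half of competitors" rule are arbitrary assumptions.

```python
def diagnose_traffic_change(our_change, competitor_changes,
                            ranked_keywords_change, threshold=0.10):
    """Guess why search traffic moved.

    our_change: fractional change in our own search traffic.
    competitor_changes: fractional changes in search traffic for sites
        ranking alongside us.
    ranked_keywords_change: fractional change in how many keywords we rank for.
    threshold: smallest change treated as meaningful (an arbitrary default).
    """
    if abs(our_change) < threshold:
        return "no significant change"

    went_up = our_change > 0

    # Competitors that moved meaningfully in the same direction as we did.
    same_direction = [
        c for c in competitor_changes
        if abs(c) >= threshold and (c > 0) == went_up
    ]
    if competitor_changes and len(same_direction) > len(competitor_changes) / 2:
        return "likely search volume"

    # Otherwise, see whether our own ranking footprint moved the same way.
    if abs(ranked_keywords_change) >= threshold and (ranked_keywords_change > 0) == went_up:
        return "likely rankings"

    return "unexplained"
```

For example, `diagnose_traffic_change(0.3, [0.25, 0.4, 0.02], 0.0)` points at search volume, because most of the competitors rose too while our rankings stayed flat.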

Now we can start to sort through these things. We can really figure out who’s rising in the rankings, who’s falling, and why is their traffic going up or down if it is.

With this, we can recover a ton of the data that we’ve lost. None of these are super easy. They’re not completely plug and play. But many of them are friendly, usable, and really useful for when you have these problems.

All right, everyone, look forward to the comments. We will see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com
