
7 Empowering Presentations and More from MozCon

Posted by EricaMcGillivray

At the MozPlex, we’re all still coming down from the incredible energy, excitement, and new ideas that MozCon brings every year. Thank you again to all of you who joined us to make this year’s MozCon the best ever. For those of you who couldn’t join us, we wanted to share some of the best slide decks from the show (videos are coming next month!), along with downloads for all of the decks.

Additionally, for those planning ahead, make sure to buy your early bird ticket for MozCon 2015. We expect to sell out again, so grab this great deal now!


Mad Science Experiments in SEO & Social Media

by Rand Fishkin

Rand’s put on his lab coat, literally, and dived into SEO and social media experiments. He looked at the correlations and causations for everything from how rapid tweeting of photos affects follower gains/losses to how clicks might influence SERPs.


You are So Much More than an SEO

by Wil Reynolds

Wil once again brought his A-game and his push that SEO is a growing field and we SEOs must grow with it. He brings us all together in a presentation exploring how we, along with our colleagues in other marketing disciplines, are failing to capture business for our brands or clients.


Bad Data, Bad Decisions: The Art of Asking Better Questions

by Stephanie Beadell

Stephanie taught us to think differently about the questions we ask in surveys. Are we biasing our audience and leading them to answer our survey in a certain way? Are we collecting the wrong types of data?


YouTube: The Most Important Search Engine You Haven’t Optimized For

by Phil Nottingham

Phil brought his video expertise back to the MozCon stage. This year, he tackled YouTube, the world’s second largest search engine, but one often ignored by marketers. Phil puts you on track to stop being befuddled and make a YouTube plan for your brand.


How to Never Run Out of Great Ideas

by Dr. Pete Meyers

Pete surprised everyone this year by not talking about the Google algorithm. Instead, he dove into one of his other passions: creating great content. Pete shows you how to be brave and build out your big idea.


Scaling Creativity: Making Content Marketing More Efficient

by Stacey (Cavanagh) MacNaught

Stacey’s presentation followed Pete’s by diving into how to make the content process happen, especially if you have multiple clients or work at an agency. She addressed how to find the right audience for your content, and then how to throw all your ideas on the table and sort out the best ones.


How to Use Social Science to Build Addictive Communities

by Richard Millington

Rich believes in the power of communities. He walked the MozCon audience through how to build up a community through shared experiences and rituals. Rich also showed how to make a business case for community building.


Can’t get enough MozCon decks? You can download all of them in the Agenda section on the MozCon page.

Buy Your MozCon 2015 Ticket!



Retaining SEO Value in Syndicated Content and Partnerships

Posted by Laura.Lippay

Link exchanges vs. partnerships

Six years ago, Yahoo! was called out (on this very blog!) for buying text links. I was the lone SEO at Yahoo! in the US at the time, training, teaching, guiding and policing all of the people involved in over a dozen Yahoo! Media websites, and my heart stopped when I saw this post. The thing is, though, I knew the biz dev team at Yahoo! had absolutely no concept of link exchanges for SEO (that said, I have no idea about Great Schools – those were some nice anchor text links).

While most SEOs work on link relationships, most biz dev folks, especially in mid to larger sized online companies, work on business relationships. Every Yahoo! property had biz dev folks who were actively making deals to work with other sites for things like:

  • Access to complementary content that Yahoo! didn’t have on the site (like the partnership between Yahoo! Real Estate and Great Schools in that example).
  • Exchanging content or links in hopes of getting more visibility and traffic, like the links to partners Heavy and Bleacher Report at the bottom of Mandatory.com’s site.
  • Syndicating content to other sites for more visibility, like Eventbrite’s events syndicated out to distribution partners, or receiving syndicated content in order to provide more content to the visitors, like mom.me’s content from SheKnows or StyleList.

Most biz dev at Yahoo! was actually done horribly wrong in the SEO sense, with links in JavaScript or content in iFrames, or linking out to more SEO-savvy partners who were nofollowing their links back. So I set out to educate Yahoo! biz devs about the powerful opportunities they were missing, using this guide to retaining SEO value in partnerships (updated for today’s biz devs). I still use it often for the larger companies I work with, and I hope some of you find it useful for your clients or yourselves as well.

Whether you’re an SEO working with biz devs or a biz dev working on partnerships, it’s very important to consider the points below thoroughly before writing and signing a contract with a partner, since some of them will need to be spelled out in the contract, and oftentimes negotiated.

Any additional ideas are gladly welcome in the comments!

The importance of SEO in partnerships

Search engines follow links across the web to discover and classify content. The content and context of pages linking to each other is taken into consideration in classifying content and surfacing it in search results.

Consider these factors that contribute to a site and/or page’s ranking:

Links = votes

Links to a site are treated like votes to the site/page. The quality, quantity and context of the links from one page to another are used by search engines in classifying and ranking a page.

Links = relationships

Any pages linking to each other are related to each other. This can include links in articles, in footers, in content modules and in comments among others. This can be helpful when related content links to each other (on the same site or across different sites). This can be damaging when receiving links from low-quality, spammy sites (typical in link-building) or linking to low-quality or spammy sites (typical in UGC comments).

Syndication = content duplication

Any time the same or very similar content populates the majority of more than one page on the internet, there is a good chance that the duplicates will be hidden from search results. The search engine will attempt to pick the best version of the duplicate for searchers and hide the rest so that other content options can appear in the search results.

Search engines can’t always determine content source

When there is more than one version of the same content, search engines will try to determine the source and provide that in search results. Oftentimes when content is syndicated, the source does not actually rank first, especially if a small or newer site is syndicating out to larger, older and/or more popular sites with more activity.

Best practices for linking to partners

This depends on the nature of the partnership & competition. Consider what should be written into the contract ahead of time.

  • Options for linking to competitive content on partner sites (you are trying to rank for/drive traffic for the same thing as the partner):
    • Don’t link: If you don’t need to link to the competitive content on the partner site, don’t do it.
    • Add Nofollow: Adding a nofollow attribute to the link (in the code) tells search engines that you may not trust what is on the other end of that link, so you’re not officially “voting” for it. Not linking to/voting for the partner content can potentially help prevent it from outranking yours. This may need to be negotiated, since it’s possible both parties will want links without a nofollow on them.
    • JavaScript Links: You can link to the partner with the link in JavaScript code. Search engines often pick up on JavaScript links today, but still more often ignore them (so far). A small audit sketch for checking how your pages currently link to a partner follows this list.
  • Options for non-competitive content (you are not trying to rank for/drive traffic for the same thing):
    • Link freely and naturally, in ways that work best for user experience.
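
To make these options concrete, here is a minimal audit sketch in Python (the requests and beautifulsoup4 libraries and the URLs below are assumptions for illustration) that lists every link from one of your pages to a partner’s domain and flags whether it already carries a nofollow. It only sees links in the static HTML, so links you’ve moved into JavaScript won’t appear at all.

```python
# Hypothetical audit: which links on my page point at the partner, and are they nofollowed?
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

YOUR_PAGE = "https://www.example.com/some-article"   # placeholder
PARTNER_DOMAIN = "partner-site.com"                  # placeholder

html = requests.get(YOUR_PAGE, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

for a in soup.find_all("a", href=True):
    if PARTNER_DOMAIN not in urlparse(a["href"]).netloc:
        continue
    rel = a.get("rel", [])                           # bs4 returns rel as a list of tokens
    status = "nofollow" if "nofollow" in rel else "followed (a vote for the partner)"
    print(f"{a['href']}  ->  {status}")
```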

Best practices for getting links from partners

For any inbound links from partners (in articles, content modules, on the site, etc.), check how the links will be treated, and make sure the treatment specifications you want are written into the contract. Here are suggested options (a small spot-check sketch follows the list):

  • Require a link: Require that the article links back to the original on your site. This can be a text link such as “[Article Title] originally appeared on yoursite.com”, with the article title being the hyperlink back to the original article. Make sure the link goes to the original article URL on your site, and not to the home page.
  • Check the links from their site to yours:
    • No nofollow tags on links from the partner site to yours: This may need to be negotiated (for the same reasons as we’re saying to add nofollows on links from your site to partner sites above). Nofollow tags typically don’t pass value to the destination page.
    • No links in JavaScript: Since links in JavaScript typically aren’t crawled and/or utilized in ranking by search engines, links to your site from partners that are in JavaScript wouldn’t provide the value to your site/page that a regular crawlable text link would.
    • No links as images: The best link is a keyword-rich text link. A linked image (even if the image is of text) may not be interpreted the same way and will often not carry as much weight as a text link. Images may have alt attributes that describe the image (which search engines take into account), but that does not carry as much weight as a regular text link.
    • No 302 redirects on links: When Google encounters a 302 redirect it keeps the original page in the index and doesn’t pass PageRank onto the destination URL (since a 302 redirect is technically a temporary redirect). Do not allow partners to send the link through a 302 redirect to your site.
    • Keywords in links: When possible, try to have partners link to your page(s) using relevant anchor text. The anchor text of a link provides context for search engines and can help a page rank for that text. For example, if a partner is linking to your article about The Best Geeky Books of 2011, make sure they use the title of the article The Best Geeky Books of 2011 (or something similar and relevant) as the link text rather than something vague like click here or visit our partner (that’s not what you want to rank for).
    • Linking 1:1 relationships: Make sure that links from partner pages go to the most relevant pages on your site. Do not just have them link to your home page. If possible, link to related articles or similar content. This helps provide context for search engines, provides a better experience for users, and can help bring visibility to deeper pages on your site.
  • Check canonical tags: Check the canonical tag in the head section of the code on the articles you’ve syndicated to partners. Make sure that canonical points to the article on your site. Otherwise there should ideally be no canonical tag.
  • Specify linking and redirect rules: Specify rules for what domains should and should not be linking and redirecting to your site. A partner may want to redirect some old domains as part of a package of sites in their network that can send traffic to your site, but this might actually hurt your site’s performance in Google. Rankings and traffic should be tested any time a new domain is redirected to the site.
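
As a rough companion to the checklist above, here is a hedged spot-check sketch (the URLs are placeholders; requests and beautifulsoup4 are assumed) that looks at how a partner page links back to your domain: the rel attribute, whether the anchor is text or an image, and whether the href answers with a 302 on the first hop.

```python
# Hypothetical spot-check of a partner page's links back to your domain.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse

PARTNER_PAGE = "https://partner-site.com/syndicated-article"   # placeholder
YOUR_DOMAIN = "yoursite.com"                                    # placeholder

soup = BeautifulSoup(requests.get(PARTNER_PAGE, timeout=10).text, "html.parser")

for a in soup.find_all("a", href=True):
    if YOUR_DOMAIN not in urlparse(a["href"]).netloc:
        continue
    rel = a.get("rel", [])                     # e.g. ["nofollow"] or []
    text = a.get_text(strip=True)
    anchor = text if text else ("(image-only link)" if a.find("img") else "(empty anchor)")
    # Fetch the href once without following redirects to spot a 302 on the first hop.
    status = requests.get(a["href"], allow_redirects=False, timeout=10).status_code
    print(f"{a['href']}")
    print(f"  anchor: {anchor} | rel: {rel or 'none'} | first response: {status}")
```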

Content sharing/syndication best practices

For content syndicated from your site to a site on another domain or subdomain

Important considerations

  • Deep linking within your content: When possible (and user-friendly), provide links in your article/blog content to other areas on your site that are referred to in the content. For example, in an article about The San Francisco Giants Suspension of Guillermo Mota on a sports site, link the first mention of Guillermo Mota in the article back to the Guillermo Mota page on your website. Do not overdo it – only provide links where readers might want more information and only the first instance. User experience should always come first.
  • Absolute URLs: Make sure the links in content you’re syndicating are absolute (full URLs), not relative (partial URLs). Relative links in syndicated content will resolve to the partner site, whereas absolute links will point back to your site (see the sketch after this list).
  • No parameters: If possible, do not add parameters onto links in content you’re syndicating out. Search engines see parameters as a different URL. If you must use parameters, make sure the correct treatment of parameters is specified in Webmaster Tools.
  • No link stripping: Make sure the partners are not stripping/removing links that are in your content once it’s on their site.
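
Here is a small sketch of that absolute-URL point: resolve every relative href in the article body against the article’s own URL before handing the HTML to a partner. The URL and markup below are made up for illustration; beautifulsoup4 is assumed.

```python
# Hypothetical example: rewrite relative hrefs as absolute before syndication.
from bs4 import BeautifulSoup
from urllib.parse import urljoin

ARTICLE_URL = "https://www.yoursite.com/blog/geeky-books-2011"            # placeholder
body_html = '<p>See our <a href="/blog/geeky-books-2010">2010 list</a> too.</p>'

soup = BeautifulSoup(body_html, "html.parser")
for a in soup.find_all("a", href=True):
    a["href"] = urljoin(ARTICLE_URL, a["href"])   # relative links now point back to your site

print(soup)
# <p>See our <a href="https://www.yoursite.com/blog/geeky-books-2010">2010 list</a> too.</p>
```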

Highly Recommended Considerations (that may need to be negotiated)

  • Rel=canonical: Require that the partner add the rel=canonical tag to the head of pages specifying the article URL on your site as the canonical. This tells search engines that of these duplicates, the one at your site is the canonical, or primary version.
  • Publish first: Publish the content you’ll be syndicating on your own site before allowing partners to publish it. This can help identify your site as the source (and also generate more links from other sites and social networks).
  • Limited text syndication: You can allow partners to show a limited amount of text and then have the readers click to view all/more, bringing them to your site. This allows the full article to only live on your site and is also a better traffic driver.
  • Noindex: Allow partners to syndicate the content on their site, but require them to add a noindex robots meta tag to the head of those pages. This allows their site visitors to view and share the content, but the pages will not be indexed by search engines (a quick verification sketch follows this list).
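
A quick verification sketch for the two negotiated items above (placeholder URLs; requests and beautifulsoup4 assumed): fetch the partner’s copy of a syndicated article and check whether its head carries a cross-domain canonical pointing at your original, or a noindex robots meta tag.

```python
# Hypothetical check of a partner's copy: cross-domain canonical and noindex.
import requests
from bs4 import BeautifulSoup

PARTNER_COPY = "https://partner-site.com/your-syndicated-article"   # placeholder
ORIGINAL_URL = "https://www.yoursite.com/original-article"          # placeholder

soup = BeautifulSoup(requests.get(PARTNER_COPY, timeout=10).text, "html.parser")

canonical = None
for link in soup.find_all("link"):
    if "canonical" in (link.get("rel") or []):
        canonical = link.get("href")
        break

robots = soup.find("meta", attrs={"name": "robots"})
robots_content = (robots.get("content") or "").lower() if robots else ""

print("canonical on partner copy:", canonical or "missing")
print("points at your original:  ", canonical == ORIGINAL_URL)
print("noindex present:          ", "noindex" in robots_content)
```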

Links in blog and editorial content being syndicated

  • Editors: Editors can link straight to the end destination in blog posts and articles (whoever they’re linking to has earned it). No special linking rules to follow.
  • Developers:
    • Absolute URLs: Make sure all links in content being syndicated out are absolute URLs (the link is the full URL). This way when the article is picked up in other places the link is not broken, and it links back to your site.
    • Parameters: If using parameters on links (not recommended unless necessary), make sure to specify how Google should treat those parameters in Google Webmaster Tools.
    • No nofollows: Do not add the nofollow tag to links in content you’re syndicating out (if you control the HTML).

Links in user generated content (UGC) on your site

This depends on the nature of the UGC content.

  • Comment links: Links are ideally not allowed in comments because of the potential for comment spam. If they are allowed they should always have a nofollow tag (placed on the link in the code).
  • Options for profiles and other UGC content:
    • If content is not moderated, allow links as text only (not hyperlinked) or not at all.
    • If content is moderated, links should be ok, but moderators should be trained in how to recognize and combat link spam, as it can easily look like natural linking.

For more information

Cross-domain canonical tags:
http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html

Nofollow tags:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=96569

Absolute vs. Relative URLs:



What Happened after Google Pulled Author and Video Snippets: A Moz Case Study

Posted by Cyrus-Shepard

In the past two months, Google has made big changes to its search results.

Webmasters saw disappearing Google authorship photos, reduced video snippets, changes to local packs and in-depth articles, and more.

Here at Moz, we’ve closely monitored our own URLs to measure the effect of these changes on our actual traffic. The results surprised us.

Authorship traffic—surprising results

In the early days of authorship, many webmasters worked hard to get their photo in Google search results. I confess, I doubt anyone worked harder at author snippets than I did.

Search results soon became crowded with smiling faces staring back at us. Authors hired professional photographers. Publishers worked to correctly follow Google’s guidelines to set up authorship for thousands of authors.

The race for more clicks was on.

Then on June 28th, Google cleared the page. No more author photos. 

To gauge the effect on traffic, we examined eight weeks’ worth of data from Google Analytics and Webmaster Tools, before and after the change. We then compared our top 15 authorship URLs (where author photos were known to show consistently) against our top 15 non-authorship URLs.

The results broke down like this:

Change in Google organic traffic to Moz

  • Total Site:  -1.76%
  • Top 15 Non-Authorship URLs:  -5.96%
  • Top 15 Authorship URLs:  -2.86%

Surprisingly, the authorship URLs held up at least as well as the non-authorship URLs in terms of traffic. Even though Moz was highly optimized for authorship, overall traffic didn’t change significantly.

On an individual level, things looked much different: we observed individual authorship URLs increasing or decreasing in traffic by as much as 45%. There was no clear pattern: some went up, some went down, exactly like any URL would over an extended period.

Authorship photos don’t exist in a vacuum; each photo on the page competed for attention with all the other photos on the page. Each search result is as unique as a fingerprint. What worked for one result didn’t work for another.

Consider what happens visually when multiple author photos exist in the same search result:

One hypothesis is that multiple photos have the effect of drawing eyes down the page. In the absence of rich snippets, search click-through rates might follow more closely studied models, which dictate that results closer to the top earn more clicks.

In the absence of author photos, it’s likely click-through rate expectations have once again become more standardized.

Video snippets: a complex tale

Shortly after Google removed author photos, they took aim at video snippets as well. On July 17th, MozCast reported a sharp decline in video thumbnails.

Most sites, Moz included, lost 100% of their video results. Other sites appeared to be “white-listed” as reported by former Mozzer Casey Henry at Wistia. 

A few of the sites Casey found where Google continues to show video thumbnails:

  • youtube.com
  • vimeo.com
  • vevo.com
  • ted.com
  • today.com
  • discovery.com

Aside from these “giants,” most webmasters, even very large publishers at the top of the industry, saw their video snippets vanish in search results.

How did this loss affect traffic for our URLs with embedded videos? Fortunately, here at Moz we have a large collection of ready-made video URLs we could easily study: our Whiteboard Friday videos, which we produce every, well, Friday. 

To our surprise, most URLs actually saw more traffic.

On average, our Whiteboard Friday videos saw a 10% jump in organic traffic after losing video snippets.

A few other pages with video saw dramatic increases:

The last example, the Learn SEO page, didn’t have an actual video on it, but a bug on Google’s end caused an older video thumbnail to display. (Several folks we’ve talked to speculate that Google removed video snippets simply to clean up bugs like this in the system.)

We witnessed a significant increase in traffic after losing video snippets. How did this happen? 

Did Google change the way they rank and show video pages?

It turns out that many of our URLs that contained videos also saw a significant change in the number of search impressions at the exact same time.

According to Google, impressions for the majority of our video URLs shot up dramatically around July 14th.

Impressions for Whiteboard Friday URLs also rose 20% during this time. For Moz, most of the video URLs saw many more impressions, but for others, it appears rankings dropped.

While Moz saw video impressions rise, other publishers saw the opposite effect.

Casey Henry, our friend at video hosting company Wistia, reports seeing rankings drop for many video URLs that had thin or little content.

“…it’s only pages hosting video with thin content… the pages that only had video and a little bit of text went down.”
Casey Henry

For a broader perspective, we talked to Marshall Simmonds, founder of Define Media Group, who monitors traffic to millions of daily video pageviews for large publishers. 

Marshall found that despite the fact that most of the sites they monitor lost video snippets, they observed no visible change in either traffic or pageviews across hundreds of millions of visits.

Define Media Group also recently released its 2014 Mid-Year Digital Traffic Report which sheds fascinating light on current web traffic trends.

What does it all mean?

While we have anecdotal evidence of ranking and impression changes for video URLs on individual sites, on the grand scale across all Google search results these differences aren’t visible.

If you have video content, the evidence suggests it’s now worth more than ever to follow video SEO best practices (these come from video SEO expert Phil Nottingham):

  • Use a crawlable player (all the major video hosting platforms use these today)
  • Surround the video with supporting information (caption files and transcripts work great)
  • Include schema.org video markup
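
For that last point, here is a minimal sketch of what schema.org video markup can look like when emitted as a JSON-LD block for the page head. The property names come from schema.org’s VideoObject type; every value below is a placeholder.

```python
# Hypothetical VideoObject markup, printed as a JSON-LD script block.
import json

video_markup = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "Whiteboard Friday: Example Episode",            # placeholder
    "description": "Short summary of what the video covers.",
    "thumbnailUrl": "https://www.example.com/thumb.jpg",
    "uploadDate": "2014-08-01",
    "contentUrl": "https://www.example.com/video.mp4",
    "duration": "PT8M30S",                                    # ISO 8601 duration
}

print('<script type="application/ld+json">')
print(json.dumps(video_markup, indent=2))
print("</script>")
```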

SEO finds a way

For the past several years web marketers competed for image and video snippets, and it’s with a sense of sadness that they’ve been taken away.

The smart strategy follows the data, which suggest that more traditional click-through rate optimization techniques and strategies could now be more effective. This means strong titles, meta descriptions, rich snippets (those that remain), brand building and traditional ranking signals.

What happened to your site when Google removed author photos and video snippets? Let us know in the comments below.



CRO Statistics: How to Avoid Reporting Bad Data

Posted by CraigBradford

Without a basic understanding of statistics, you can often present misleading results to your clients or superiors. This can lead to underwhelming results when you roll out new versions of a page which on paper look like they should perform much better. In this post I want to cover the main aspects of planning, monitoring and interpreting CRO results so that when you do roll out new versions of pages, the results are much closer to what you would expect. I’ve also got a free tool to give away at the end, which does most of this for you.

Planning

A large part of running a successful conversion optimisation campaign happens before a single visitor reaches the site. Before starting a CRO test it’s important to have:

  1. A hypothesis of what you expect to happen
  2. An estimate of how long the test should take
  3. Analytics set up correctly so that you can measure the effect of the change accurately

Assuming you have a hypothesis, let’s look at predicting how long a test should take.

How long will it take?

As a general rule, the less traffic your site gets and/or the lower the existing conversion rate, the longer it will take to get statistically significant results. There’s a great tool by Evan Miller that I recommend using before starting any CRO project. Enter the baseline conversion rate and the minimum detectable effect (i.e. the minimum percentage change in conversion rate that you care about: 2%? 5%? 20%?) and you’ll get an estimate of how much traffic you’ll need to send to each version. Working backwards from the traffic your site normally gets, you can estimate how long your test is likely to take. When you arrive on the site, you’ll see the following defaults:

Notice the setting that allows you to swap between ‘absolute’ and ‘relative’. Toggling between them will help you understand the difference, but as a general rule, people tend to speak about conversion rate increases in relative terms. For example:

Using a baseline conversion rate of 20%

  • With a 5% absolute improvement – the new conversion rate would be 25%
  • With a 5% relative improvement – the new conversion rate would be 21%

There’s a huge difference in the sample size needed to detect any change as well. In the absolute example above, 1,030 visits are needed to each branch. If you’re running two test versions against the original, that looks like this:

  • Original – 1,030
  • Version A – 1,030
  • Version B – 1,030

Total 3,090 visits needed.

If you change that to relative, the numbers change drastically: 25,255 visits are needed for each version, a total of 75,765 visits.

If your site only gets 1,000 visits per month and you have a baseline conversion rate of 20%, it’s going to take you 6 years to detect a significant relative increase in conversion rate of 5% compared to only around 3 months for an absolute change of the same size.

This is why the question of whether or not small sites can do CRO often comes up. The answer is yes, they can, but you’ll want to aim higher than a 5% relative increase in conversions. For example, if you aim for a 35% relative increase (with a 20% baseline conversion rate), you’ll only need 530 visits to each version. In summary, go big if you’re a small site. Don’t test small changes like button tweaks; test complete new landing pages, otherwise it’s going to take you a very long time to get significantly better results.
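
For anyone who wants to sanity-check these numbers without the calculator, here is a back-of-the-envelope sketch using the standard two-proportion normal approximation (5% significance, 80% power, matching the tool’s defaults). It lands in the same ballpark as the figures above, though not to the exact visit, since the calculator’s exact formula may differ slightly; scipy is assumed.

```python
# Rough sample-size arithmetic for the 20% baseline examples above.
from math import sqrt, ceil
from scipy.stats import norm

def visits_per_branch(baseline, new_rate, alpha=0.05, power=0.80):
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (baseline + new_rate) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(baseline * (1 - baseline) + new_rate * (1 - new_rate))) ** 2
    return ceil(numerator / (new_rate - baseline) ** 2)

baseline = 0.20
for label, new_rate in [("5% absolute (20% -> 25%)", 0.25), ("5% relative (20% -> 21%)", 0.21)]:
    n = visits_per_branch(baseline, new_rate)
    total = n * 3                  # original + version A + version B
    months = total / 1000          # at 1,000 visits per month
    print(f"{label}: ~{n:,} per branch, ~{total:,} total, ~{months:.0f} months at 1k visits/month")
```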

Analytics

A critical part of understanding your test results is having appropriate tracking in place. At Distilled we use Optimizely, so that’s what I’ll cover today; fortunately Optimizely makes testing and tracking really easy. All you need is a Google Analytics account that has a custom variable slot (custom dimension in Universal Analytics) free. For either Classic or Universal Analytics, begin by going to the Optimizely Editor, then clicking Options > Analytics Integration. Select enable, enter the custom variable slot that you want to use, and that’s it. For more details, see the help section on the Optimizely website here.

With Google Analytics tracking enabled, when you go to the appropriate custom variable slot in Google Analytics you should see a custom variable named after the experiment. In the example below, the client was using custom variable slot 5:

This is a crucial step. While you can get by with just using Optimizely goals, like setting a thank-you page as a conversion, that doesn’t give you the full picture. As well as measuring conversions, you’ll also want to measure behavioral metrics. Using analytics allows you to measure not only conversions, but other metrics like average order value, bounce rates, time on site, secondary conversions, etc.

Measuring interaction

Another thing that’s easy to measure with Optimizely is interactions on the page, things like clicking buttons. Even if you don’t have event tracking set up in Google Analytics, you can still measure changes in how people interact with the site. It’s not as simple as it looks though. If you try and track an element in the new version of a page, you’ll get an error message saying that no items are being tracked. See the example from Optimizely below:

Ignore this message: as long as you’ve highlighted the correct button before selecting “track clicks”, the tracking should work just fine. See the help section on Optimizely for more details.

Interpreting results

Once you have a test up and running, you should start to see results in Google Analytics as well as Optimizely. At this point, there are a few things to understand before you get too disappointed or excited.

Understanding statistical significance

If you’re using Google Analytics for conversion rates, you’ll need something to tell you whether or not your results are statistically significant – I like this tool by Kiss Metrics, which looks like this:

It’s easy to look at the above and celebrate your 18% increase in conversions – but you’d be wrong to. It’s easier to explain what this means with an example. Let’s imagine you have a pair of dice that we know are exactly the same. If you were to roll each die 100 times, you would expect to see each of the numbers 1-6 the same number of times on both dice (which works out at around 17 times per side). Let’s say on this occasion, though, we are trying to see how good each die is at rolling a 6. Look at the results below:

  • Die A – 17/100 = 0.17 conversion rate
  • Die B – 30/100 = 0.30 conversion rate

A simplistic way to think about statistical significance is that it tells you how unlikely it is that getting more 6s on the second die was just a fluke, rather than the die having been loaded in some way to roll 6s.

This makes sense when we think about it. Given that out of 100 rolls we expect to roll a 6 around 17 times, if the second die rolled a 6 19 times out of 100, we could believe that we just got lucky. But if it rolled a 6 30 times out of 100 (76% more often), we would find it hard to believe that we just got lucky and that the second die wasn’t actually loaded. If you were to put these numbers into a statistical significance tool (a two-sided test), it would say that B performed better than A by 76% with 97% significance.

In statistics, statistical significance is the complement of the P value. The P value in this case is 3%, so the complement is 97% (100 - 3 = 97). This means there’s a 3% chance that we’d see results this extreme if the dice were identical.
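
If you want to reproduce that 3% figure yourself, here is a short sketch of a two-proportion z-test on the dice numbers (17/100 vs. 30/100); scipy is assumed.

```python
# Two-proportion z-test for the dice example: 17/100 vs 30/100.
from math import sqrt
from scipy.stats import norm

conv_a, n_a = 17, 100
conv_b, n_b = 30, 100
p_a, p_b = conv_a / n_a, conv_b / n_b

pooled = (conv_a + conv_b) / (n_a + n_b)
se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - norm.cdf(abs(z)))            # two-sided

print(f"observed uplift: {(p_b - p_a) / p_a:.0%}")   # ~76%
print(f"z = {z:.2f}, p-value = {p_value:.3f}")        # ~0.03, i.e. ~97% significance
```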

When we see statistical significance in tools like Optimizely, they have just taken the complement of the P-value (100-3 = 97%) and displayed it as the chance to beat baseline. In the example above, we would see a chance to beat baseline of 97%. Notice that I didn’t say there’s a 97% chance of B being 76% better – it’s just that on this occasion the difference was 76% better.

This means that if we were to throw each die 100 times again, we’re 97% sure we would see noticeable differences again, which may or may not be by as much as 76%. So, with that in mind, here is what we can accurately say about the dice experiment:

  • There’s a 97% chance that die B is different to die A

Here’s what we cannot say:

  • There’s a 97% chance that die B will perform 76% better than die A

This still leaves us with the question of what we can expect to happen if we roll version B out. To do this we need to use confidence intervals.

Confidence intervals

Confidence intervals help give us an estimate of how likely a change in a certain range is. To continue with the dice example, we saw an increase in conversions of 76%. Calculating confidence intervals allows us to say things like:

  • We’re 90% sure B will increase the number of 6s you roll by between 19% to 133%
  • We’re 99% sure B will increase the number of 6s you roll by between -13% to 166%

Note: these are relative ranges, i.e. 13% below 17% at the low end and 166% above 17% at the high end.
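
One way to reproduce ranges like these is a normal-approximation confidence interval on the absolute difference between the two proportions, expressed relative to the baseline. The sketch below (scipy assumed) matches the ranges above to within rounding.

```python
# Confidence intervals for the relative uplift in the dice example.
from math import sqrt
from scipy.stats import norm

conv_a, n_a = 17, 100
conv_b, n_b = 30, 100
p_a, p_b = conv_a / n_a, conv_b / n_b

diff = p_b - p_a
se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)   # unpooled standard error

for confidence in (0.90, 0.99):
    z = norm.ppf(1 - (1 - confidence) / 2)
    low, high = diff - z * se, diff + z * se
    print(f"{confidence:.0%} CI for relative uplift: {low / p_a:+.0%} to {high / p_a:+.0%}")
```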

The three questions you might be asking at this point are:

  1. Why is the range so large?
  2. Why is there a chance it could go negative?
  3. How likely is the difference to be on the negative side of the range?

The only way we can reduce the range of the confidence intervals is by collecting more data. To decrease the chance of the difference being less than 0 (we don’t want to roll out a version that performs worse than the original), we need to roll the dice more times. Assuming the same conversion rates for A (0.17) and B (0.30), look at the difference that increasing the sample size makes to the range of the confidence intervals.

As you can see, with a sample size of 100 we have a 99% confidence range of -13% to 166%. If we kept rolling the dice until we had a sample size of 10,000, the 99% confidence range looks much better: it’s now between 67% better and 85% better.

The point of this is to show that even if you have a statistically significant result, it’s often wise to keep the test running until you have tighter confidence intervals. At the very least, I don’t like to present results until the lower limit of the 90% interval is greater than or equal to 0.

Calculating average order value

Sometimes conversion rate on its own doesn’t matter. If you make a change that makes 10% fewer people buy, but those that do buy spend 10x more money, then the net effect is still positive.

To track this we need to be able to see the average order value of the control compared to the test value. If you’ve set up Google analytics integration like I showed previously, this is very easy to do.

If you go into Google analytics, select the custom variable tab, then select the e-commerce view, you’ll see something like:

  • Version A 1000 visits – 10 conversions – Average order value $50
  • Version B 1000 visits – 10 conversions – Average order value $100

It’s great that people who saw version B appear to spend twice as much, but how do we know we didn’t just get lucky? To find out, we need to do some more work. Luckily, there’s a tool that makes this very easy, again made by Evan Miller: the two-sample t-test tool.

To find out if the change in average order value is significant, we need a list of all the transaction amounts for version A and version B. The steps to do that are below:

1 – Create an advanced segment for version A and version B using the custom variable values.

2 – Individually apply the two segments you’ve just created, go to the transactions report under e-commerce and download all transaction data to a CSV.

3 – Dump data into the two-sample t-test tool

The tool doesn’t accept special characters like $ or £, so remember to remove those before pasting in your data. As you can see in the image below, I have the version A data in the sample 1 area and the transaction values for version B in the sample 2 area, with the output shown underneath:

Whether or not the difference is significant is shown below the graphs. In this case the verdict was that the two samples were in fact significantly different. To find the size of the difference, look at the “d” value where it says “difference of means”. In the example above, the transactions of those people who saw the test version were on average $19 more than those who saw the original.
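
In code, the same comparison is a two-sample t-test on the raw transaction values. The sketch below uses Welch’s t-test from scipy with made-up placeholder values; in practice the two lists would come from the CSV exports described above.

```python
# Welch's two-sample t-test on transaction values (placeholder data).
from statistics import mean
from scipy.stats import ttest_ind

version_a = [42.0, 55.5, 38.0, 61.0, 47.5, 52.0]    # placeholder transaction values
version_b = [70.0, 66.5, 88.0, 59.0, 95.5, 81.0]    # placeholder transaction values

t_stat, p_value = ttest_ind(version_b, version_a, equal_var=False)   # Welch's t-test

print(f"difference of means: ${mean(version_b) - mean(version_a):.2f}")
print(f"t = {t_stat:.2f}, p-value = {p_value:.3f}")   # small p-value -> unlikely to be luck
```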

A free tool for reading this far

If you run a lot of CRO tests you’ll find yourself using the above tools a lot. While they are all great tools, I like to have these in one place. One of my colleagues, Tom Capper, built a spreadsheet which does all of the above very quickly. There are two sheets: conversion rate and average order value. The only data you need to enter in the conversion rate sheet is conversions and sessions, and in the AOV sheet just paste in the transaction values for both data sets. The conversion rate sheet calculates:

  1. Conversion rate
  2. Percentage change
  3. Statistical significance (one sided and two sided)
  4. 90%, 95% and 99% confidence intervals (relative and absolute)

There’s an extra field that I’ve found really helpful (working agency side) that’s called “Chance of <=0 uplift”.

If, like the example above, you present results where the lower end of a confidence interval is negative:

  • We’re 90% sure B will increase the number of 6s you roll by between 19% and 133%
  • We’re 99% sure B will increase the number of 6s you roll by between -13% and 166%

The logical question a client is going to ask is: “What chance is there of the result being negative?”

That’s what this extra field calculates. It gives us the chance of rolling out the new version of a test and the difference being less than or equal to 0%. For the data above, the 99% confidence interval was -13% to +166%. The fact that the lower limit of the range is negative doesn’t look great, but using this calculation, the chance of the difference being <=0% is only 1.41%. Given the potential upside, most clients would agree that this is a chance worth taking.
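
For the curious, a figure like that 1.41% can be derived under the same normal approximation used for the confidence intervals: it’s the probability mass of the difference distribution at or below zero. The spreadsheet’s exact method may differ, but this sketch (scipy assumed) reproduces the number quoted above for the dice data.

```python
# Chance of a <= 0% uplift for the dice example (17/100 vs 30/100).
from math import sqrt
from scipy.stats import norm

p_a, n_a = 0.17, 100
p_b, n_b = 0.30, 100

diff = p_b - p_a
se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

chance_non_positive = norm.cdf(0, loc=diff, scale=se)
print(f"chance of <= 0% uplift: {chance_non_positive:.2%}")   # ~1.41%
```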

You can download the spreadsheet here: Statistical Significance.xls

Feel free to say thanks to Tom on Twitter.

This is an internal tool, so if it breaks, please don’t send Tom (or me) requests to fix, upgrade, or change it.

If you want to speed this process up even more, I recommend transferring this spreadsheet into Google docs and using the Google Analytics API to do it automatically. Here’s a good post on how you can do that.

I hope you’ve found this useful and if you have any questions or suggestions please leave a comment.

If you want to learn more about the numbers behind this spreadsheet and statistics in general, some blog posts I’d recommend reading are:

Why your CRO tests fail

How not to run an A/B test

Scientific method: Statistical errors



Real-World Panda Optimization – Whiteboard Friday

Posted by MichaelC

The Panda algorithm looks for high-quality content, but what exactly is it looking for, how is it finding what it deems to be high-quality, and—perhaps most pressingly—what in the world can we do to befriend the bear?

In today’s Whiteboard Friday, Michael Cottam explains what these things are, and more importantly, what we can do to be sure we get the nod from this particular bear.

For reference, here’s a still of this week’s whiteboard!

Video transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. I’m Michael Cottam. I’m an independent SEO consultant from Portland, Oregon and have been a Moz associate for many years.

Today we’re going to talk about Panda optimization. We’re going to talk about real world things you can do, no general hand waving. We’re going to talk about specific tactics you can use. We’re going to talk about first of all what does Panda measure, secondly, how might Panda actually go about measuring these factors on your site, and then lastly, what are you going to do to win based on those factors.

What does Panda measure (and what can we do about it)?

To start off, this is the list of the major factors we’re going to talk about for Panda: thin or thick content; the issues around duplicate or original content; the top heavy part of the Panda algorithm; how do you come up with fabulous images and how is Panda going to measure how fabulous they are; and rich interactive experience pieces.

Thin (thick) content

First of all, thin/thick content. Certainly, a lot of sites got penalized when Panda first came out where the site design had basically broken the content out into a lot of pages with just a few sentences on each. Here we’re talking about how much text there is per page. How might Panda actually go about measuring this? This is probably the easiest piece to measure of everything on here. It’s very simple programmatically to strip all the HTML tags out and then just do a word count.

There was a study done — I think it was last summer by serpIQ, and there’ll be a link to that in the notes — that showed that for reasonably competitive terms you needed 1,500 to 2,500 words on a page to rank on page 1. They averaged this over ten or twenty thousand different keyword searches. Strip out the HTML tags, count the words, and see what you have left. Analyze your own pages and see if you’re up near that 1,500 mark.
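
If you want to run that check yourself, here is a quick sketch (placeholder URL; requests and beautifulsoup4 assumed) that strips the tags and counts what’s left.

```python
# Rough word count of the visible text on a page.
import requests
from bs4 import BeautifulSoup

PAGE = "https://www.example.com/some-category-page"    # placeholder

soup = BeautifulSoup(requests.get(PAGE, timeout=10).text, "html.parser")
for tag in soup(["script", "style", "noscript"]):
    tag.decompose()                                     # drop code, keep content

words = soup.get_text(separator=" ").split()
print(f"{len(words)} words on the page (the study above suggests aiming for 1,500+)")
```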

How do we win on that? Well, this is all about size matters. At least 1,500 words, push to 2,000 or 2,500 if you can. Sometimes that may mean going through your site and condensing four or five pages of content all into one page. You might think, well, that might make a giant long page, terrible user experience. But you can solve this with tab navigation so all the content is on the page. When you click a tab, JavaScript changes the CSS style of the various tabs to make one part show versus the other part. Google’s going to see everything in all those tabs when they crawl the page, because it’s all in the HTML before you click.

Duplicate/original content

The second thing let’s talk about is duplicate and original content. Now there’s been a ton of stuff written about duplicate content and penalties and how does Google check this, that, and the other.

Lately we’ve seen a bunch of different blog posts from different places talking about press releases and how press releases, well, they’re evil. The links don’t count. Google didn’t spot them all. Google is much better at it than they used to be. But still, if you do a Google search on the first sentence or so of any e-press release you’ve done, you’ll generally find four or five indexed pages containing it. That’s way better than it was three years ago, when you’d find 60 pages all still indexed with nothing else on them.

The press release piece is probably the easiest piece for Google to measure for original content, because if you think about what happens when a press release is republished, you’ve got the site template from whichever news site or industry site is going to run it, header/footer, maybe some sidebar and some ads, you have the press release as one contiguous chunk, and that’s really it. If Google’s going to do page chunking to try to pull out the template, and the header and the footer, and things like that and see what is the core content of the page, that’s probably the simplest case for them to do.

If you’re interweaving bits of text you got from different places with your own text, customer reviews, things like that, that aren’t going to be the same as other sites, then it’s much harder for Google to spot.

What might Google be doing to try to decide does this block of text on your page exist on a hundred other sites? There are various techniques like hashing, or there are ways to record a thumbprint vaguely of what the word patterns are and things like that. That’s not the hard part. There’s lots of talk about the thumbprint and hashing.

The difficult part is when you’ve got a page with content from 12 different places, and it’s not just the manufacturer’s product content or whatever: you’ve got your own customer reviews or your own intro sentence at the top, things like that. If you interweave that, it makes it very difficult for Google to go and chunk the page up into meaningful pieces, know where the chunks start and end, and then compare that to what they found on all the other sites that happen to be selling the same product and carrying the same product description as your site.

What do you do to win there? You really want to interweave the original content that you’ve created. That might be your overview, your customers’ reviews, things like that, your ratings. Interweave that with the stock text and photos. Break it up a bit. What you don’t want is one giant block of text that is exactly the same as that giant block of text that’s on the other hundred sites that are selling the same product you’re trying to sell.

Top-heaviness

Let’s talk about top heavy, a pretty important part of the Panda algorithm. Mostly when people talk about the top heavy algorithm, the example they give is ads above the fold. But if you actually read what Google said about it when they launched it, the description of what they’re trying to solve, it’s not really just about ads above the fold. It’s about anything that’s not content above the fold and your structure of your website pushing that content down, so that when the user lands on your page, they can’t get anything useful without scrolling. That’s what it’s really about.

How might Google be going about measuring whether your site or your page is top heavy? Certainly, if you look at the tools that are built into the Chrome developer tools (Firefox developer tools has similar features), they can render the whole page, give you the dimensions, and highlight that on top of the page for you. So certainly it’s very easy for Google to go and render the whole page.

They’re not going to read through the HTML and assume the first X number of words is above the fold. No sites render that way any more. So they’re going to have to be rendering it to determine above the fold. There’s just too much CSS positioning happening today.

So render and measure the pixels. Then how do you know whether it’s ads or template or content? Now with a lot of the stuff I’m saying here we don’t know absolutely what Google is doing to measure these things, but we can guess and infer based on how we see it behave, what ranks and doesn’t, and also just knowing how parsers are written, how crawlers are written, things like that, what’s possible.

The simplest way, if I were Google Panda, the way I would decide whether something was content or not is I would see if it was clickable. It’s very easy to tell whether a given element there is linked to anything else. This is not going to be a foolproof thing, but your menus are going to be clickable, ads are going to be clickable for sure, navigation buttons are going to be clickable.

There are going to be some false positives with things like photo carousels that may be clickable to advance and things like that. But in general, if you’re trying to do a quick and dirty analysis and say what above the fold is content, if you wipe out everything that’s clickable and wipe out everything that’s white space, you should be left with various blocks around the screen which is probably going to be content. That’s probably what they’re doing. I pretty much bet on that.

How do you win? First of all, minimize your header. If your header has a lot of white space and things are stacked, that’s going to push the content down further on every single page on your site. Look at: Does the width of your main menu bar really have to have that much space above and below it? Has your logo got a lot of white space before the top of the page? Are you putting your share buttons down in a way that pushes everything down? Look for those sorts of things, because a little bit of win there moves a lot of content up the page above the fold on every page of your site.

Another question might be: Okay, so what’s above the fold? Obviously, we don’t know for sure, but we can guess: since the vast majority of people are running browsers at better than 1280 by 1000, that’s probably a good benchmark. If you’re analyzing your own site, look at it at 1280 by 1000, and that’s most likely about the kind of dimensions that Google’s looking at for above the fold.

Image fabulosity

Images are certainly rich content. Everybody loves images rather than text. It makes a much more engaging experience. How is Google going to go and measure how fabulous your images are?

If you’ve got great, fabulous original images, then that’s probably great content to show the user. If you’ve got the same product photos that the other hundred websites all have, then not so much.

What’s Google likely to be doing? First of all, if you’ve never played with Google reverse image search, give it a shot. It’s incredibly powerful. I do a lot of work in the travel industry, and the problem with the travel industry is that if you’re brochuring hotels on your site, really your only source for hotel photos (unless you travel to all the destinations and shoot them yourself, which is very expensive of course) is the hotel’s image library.

You could take those images. Maybe they show up as 5 MB photos in TIFF format. You can change them to JPEG. You can shrink them down to maybe 1000 pixels wide from the original 5000. You can do a little sharpening. You can convert the formats. You might change the contrast. You might even overlay some text and save it with a different file name. Google will still spot those.

If you do a reverse image search on a hotel photo from pick any site you want, you’ll find hundreds and hundreds and hundreds of other sites that all have the exact same photo. They’re all named differently. They have different dimensions. Some are JPEG, some are PNG files, etc.

Google reverse image search is really good. To think that Panda isn’t using that to decide whether you have original images I think is crazy. If they’re not doing it, they’ll be doing it next week. Don’t think that just because you renamed a file or cropped it or resized it a little, that you now have an original image. You do not.

Image dimensions are undoubtedly another factor that Google’s going to be looking at. Nobody really wants to decide to go to overwater bungalows in Bora-Bora by looking at little tiny, postage stamp size thumbnails. If you’ve got big thousand pixel wide pictures of these things, that’s fabulous content. You’ve got to expect Panda is going to like that because users are going to like that. Size and originality.

How do you win? Go big. Be original. Okay, you say, “But how do I be original? I’ve got X number of hundred or thousand products on the site. It all comes from manufacturers. I can’t shoot my own photos.”

Consider for your major search targets, like category pages, so not necessarily individual product pages but category pages, make up an image that’s a collage of some of those other images. Take those pieces, glue them together, use whatever Photoshop kind of software you want, but make up a new image that consists of images that are from the manufacturers of the products in that category, and that can be your new image header for that page. Make that category page, which is probably a better search target for you anyway, rank better.

Interactive experience

Certainly, a more engaging page is one where there’s a video to play, or a map you can zoom in on and browse around and see where the hotels are and click on, things like that. Undoubtedly, part of what Panda’s doing is measuring your site to see how much there is here for the user to play with.

How’s Google going to measure that? Well, this is an interesting issue, because if you look at how YouTube videos are embedded, by default it’s with an iframe. If you look at how a lot of the mapping tools are embedded by default, it’s with an iframe.

Why is that bad? Let’s think about how Google has considered iframe content in the past in terms of links and on-page content and things like that. If you iframe it in, Google has tended to treat that content as belonging to the page it was iframed from, not the page that is embedding it. So the risk you have here is that if you’re using iframes to embed maps or videos, things like that, I’m not sure Panda’s going to be able to spot that and realize you’ve got embedded rich content.

Chances are with YouTube, Wistia, Vimeo, and a few like that Google’s probably done a little bit of work to try to spot iframed in videos. But you know what? There’s a better solution there. With Wistia, you’ve got the SEO embed type that creates an embed object, not an iframe. YouTube, there’s the little checkbox, after you click Share Embed, that says “use old embed code.” So you can do that.

The other thing you can consider is where you don’t have a video already and you want to add rich content, make an introductory video for a category, for your company, for a product. It can be the same stuff that you’ve already written as content for that category or about your company, about us, that sort of stuff. Just talk to the camera and do a 30 second introductory video for that category, that product, or read your review out basically from a whiteboard behind the camera. Then use the transcript of that video as that extra text content on the page.

When we talk about maps, I really prefer to use the Google Maps API. It’s a JavaScript API. You might have some questions. Can Google follow the JavaScript? Well, I think in the case of maps it’s their own product, and certainly Google’s interested in knowing whether a page has a map embedded.

If you screenshot a map and then turn it into a JPEG, well that’s nice. It’s another big image, and it’s probably original now or looks original to Google, but it’s not that extra rich interactive content that a map is.

My advice is use the Google Maps API. I think they’re on Version 3.0. It’s actually a lot easier to use, once you’ve seen an example, than you might think. That seems to work very well for producing that other piece of interactive content.

I’ve talked a lot here. How much does this work? Links are still very important for ranking. Two or three years ago, I would say links were 80%, 90% of what it took to get something to rank. Panda has changed that in an insane way.

Here’s the test example. Go to Google and do a search for best time to visit Tahiti. You’ll find my little site, Visual Itineraries, up there at number one for that, ahead of TripAdvisor, Lonely Planet, USA Today, all these other sites. These other sites have between 10,000 and 250,000 domains linking to them. My site has under 100. I rank number one for that.

Now, in case you think okay, yeah, it’s internal link anchor text or page title match, things like that, here’s the other proof. Do a Google search for “when should I go to French Polynesia.” The only word in that that matches the page title or any anchor text is the word “to.” It’s a stop word, that’s not going to count. I’m still like number three or number four on page one, up with all these other guys that have tens or hundreds of thousands of domains linking.

Please click through to my site, because I don’t want bounce rate stuff happening, and actually have a look and see what I’ve done. See the thin header I’ve got at the top. Have a look at the images I’ve got in there. Some of them I created by screenshotting Excel charts. I’ve got embedded video. I’ve got an embedded Google Map.

There we go. Thanks everybody, and take care.

Video transcription by Speechpad.com

