Over the years many SEOs have used web directories as a very easy way of getting links for their clients. Many of those SEOs bought the directory links from us back in the day!

Post-Penguin, Google has been clamping down heavily on these blatant SEO techniques, so I was intrigued to see how far it would go with de-indexing these sites.

Data is the key to successful search marketing campaigns in 2014

As a link building agency owner I know a lot about web directories, and I probably have more data than most on the sites that site owners could have used to gain some quick and cheap web directory links back in the older days of SEO. So I thought I would conduct a little experiment.

I grabbed the raw URLs for 5489 web directories and crawled all home pages using LinkRisk – a link profile auditing tool that I am one of the founders of.

I used an old client report from January 2011 for the dataset so these are sites that were live and indexed in Google as of January 2011.

A lot has happened in SEO since then – let’s find out how much!

Here is a link to the raw data set with LinkRisk annotations

Ignore the fact that all sites have a link active flag of FALSE and that they all have a LinkRisk score of 500 – I needed to enter some false information to get the crawlers started.

I am sure you would not be surprised to hear that all these links came back as being “risky”!

But how risky? Hopefully the following analysis will give you some insight.

PageRank

Indexed or not

I used a website’s PageRank as an indication of whether it is indexed or not. Whilst this is not perfect, if the PageRank API tells me that a site has a PageRank of -1 (or greybar when viewed as a toolbar PageRank) then I class it as deindexed.

I was surprised to see that our dataset still had 2 PR7 directories – BOTW I understand, as it has age and some trust on its side, but I was shocked to see that www.abc-directory.com shows the following metrics –

  • PR – 7
  • DA – 57
  • PA – 64
  • TrustFlow – 63
  • CitationFlow – 62

Does that make it a good link? According to metrics: yes. According to common sense: no – although after a manual check I can see that whilst it is an SEO web directory it may add a small amount of value to the internet.

Personally, in 2014 I would not want a link from this site but we know a lot of our clients are still focussed on just using metrics so many of you may want to submit to this site.

Looking at the PageRank of all other sites – 51% are PR -1 and 30% are PR0, making roughly 80% of the sites I analysed useless to anyone for SEO purposes.
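For anyone wanting to repeat this kind of tally, the PageRank rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up, and it is not the code LinkRisk runs.

```python
from collections import Counter


def classify_pagerank(pr):
    """Bucket a toolbar PageRank value using the rule from the post."""
    if pr == -1:
        return "deindexed (greybar)"  # -1 is treated as deindexed
    if pr == 0:
        return "PR0"
    return "PR1+"


def bucket_shares(pageranks):
    """Return the percentage of sites that fall into each bucket."""
    counts = Counter(classify_pagerank(pr) for pr in pageranks)
    total = len(pageranks)
    return {bucket: round(100 * n / total, 1) for bucket, n in counts.items()}


# Hypothetical sample values, NOT the real dataset
sample = [-1, -1, -1, 0, 0, 1, 3, -1, 0, 7]
shares = bucket_shares(sample)
print(shares)
```

Remember that PageRank is only a proxy for indexation, so a greybar result here means "probably deindexed", not "confirmed deindexed".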

Server status code

Whilst looking at the indexation stats I also looked at the server codes returned.

As expected, we saw a lot of 408s (24%) – probably down to the low quality, slow servers the sites are hosted on. (I attempted to crawl each site 5 times with a 30 minute interval between pings in case the site was temporarily down – it doesn’t take much to DDoS a $3 per month server!)
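The retry approach described above (5 attempts, 30 minutes apart) could look something like the Python sketch below. It is a rough illustration under a few assumptions: `fetch` is a hypothetical function you supply that returns a status code or raises `TimeoutError`, and recording a site that never answers as 408 is my guess at how a crawler might log it, not a description of the LinkRisk crawler.

```python
import time


def check_status(url, fetch, attempts=5, wait_seconds=30 * 60, sleep=time.sleep):
    """Return the first status code obtained for url.

    fetch is a caller-supplied (hypothetical) function that takes a URL and
    returns an HTTP status code, or raises TimeoutError if the site does
    not answer. If every attempt times out, the site is recorded as 408.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except TimeoutError:
            if attempt < attempts:
                sleep(wait_seconds)  # wait before pinging the site again
    return 408


# Usage with a fake fetcher that times out twice and then answers
calls = {"n": 0}


def flaky_fetch(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError
    return 200


# sleep is replaced with a no-op so the example runs instantly
status = check_status("http://example.com", flaky_fetch, sleep=lambda s: None)
print(status, calls["n"])
```

Passing `fetch` and `sleep` in as parameters keeps the retry logic separate from the HTTP library, so it can be tested without touching a real server.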

[Chart: server status codes]

Only 0.11% returned a 404 status code which suggests that many site owners are keeping the domains live to try to earn money by charging for the removal of the link from that directory.

Site footprints

Google makes extensive use of pattern matching to detect spam networks, and looking at the list of domain names I used for this experiment it is clear that it does not take a team of Stanford PhD graduates to detect this. I started by looking for common “money” keywords in the domain names.

From our 5489 sites crawled –

  • 120 sites or 2% of all sites analysed contained the word “Submit”
  • 412 sites or 7.5% of all sites analysed contained the word “Free”
  • 330 sites or 6% of all sites analysed contained the word “SEO”
  • 883 sites or 16% of all sites analysed contained the word “Link”
  • 2486 sites or 45% of all sites analysed contained the word “Dir” *

*I used a wildcard for this term as we see many sites with dir instead of directory, so I wanted to cover both.

So, as we can clearly see, a lot of these directory owners used keyword-rich URLs to try to get their sites ranked for terms such as “free directory submit”, but this does leave their sites exposed to the axe of Google.
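The keyword count above is simple to reproduce. Here is a minimal Python sketch, assuming plain case-insensitive substring matching (so “dir” also catches “directory”, like the wildcard described). The domain names are made-up examples, not sites from the dataset.

```python
KEYWORDS = ["submit", "free", "seo", "link", "dir"]


def footprint_counts(domains, keywords=KEYWORDS):
    """Count how many domains contain each keyword as a substring."""
    counts = {kw: 0 for kw in keywords}
    for domain in domains:
        name = domain.lower()
        for kw in keywords:
            if kw in name:  # substring match, so "dir" also hits "directory"
                counts[kw] += 1
    return counts


# Hypothetical domains for illustration only
sample_domains = [
    "free-seo-directory.example",
    "submitlinkdir.example",
    "bestlinks.example",
    "plainsite.example",
]

counts = footprint_counts(sample_domains)
for kw, n in counts.items():
    print(f"{kw}: {n} sites ({100 * n / len(sample_domains):.1f}%)")
```

A domain can match more than one keyword, which is why the percentages in a count like this can add up to more than 100%.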

TLD of the directories

[Chart: TLD overview]

The aim of the game pre-Penguin, with directory links and running a network of web directories, was always to create as many as possible in as short a time and at as low a cost as possible.

Naturally, in this scenario we expect to see more .info than .co domains due to the economics: 99p for a .info domain vs £30 per year for a .co.

An overview of the breakdown is –

  • .com – 54.2%
  • .info – 19.9%
  • .net – 8.55%
  • .org – 6.72%
  • .biz – 0.95%
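A TLD breakdown like this one can be produced with a short Python sketch such as the one below. It assumes the TLD is simply the last dot-separated label of the domain, which is a simplification: a domain such as example.co.uk would be counted under .uk. The sample domains are made up.

```python
from collections import Counter


def tld_of(domain):
    """Return the last label of a domain, e.g. '.com' (naive approach)."""
    return "." + domain.lower().rstrip(".").rsplit(".", 1)[-1]


def tld_breakdown(domains):
    """Return each TLD's share of the list as a percentage, largest first."""
    counts = Counter(tld_of(d) for d in domains)
    total = sum(counts.values())
    return {tld: round(100 * n / total, 2) for tld, n in counts.most_common()}


# Hypothetical domains for illustration only
sample_domains = ["a-directory.com", "b-links.com", "c-dir.info", "d-seo.net"]
breakdown = tld_breakdown(sample_domains)
print(breakdown)
```

If you need accurate handling of multi-part suffixes such as .co.uk, you would want to check against the Public Suffix List rather than split on the last dot.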

I was surprised to see so many .coms, but I guess this allowed the directory owners to target the whole world rather than using a localised TLD like .co.uk that may only appeal to sites targeting the UK or another country.

Links to directory sites

Whilst experiencing flashbacks to the days of reciprocal linking, I also wanted to consider what links these directories have coming into them. For the purpose of this post I used raw link counts provided by the MajesticSEO API.

The site with the most links, directorymh.com, had 122,042,447 links – WOW, that is a lot of links. The site with the least had 0 links!

On average, a directory in my dataset has 208,235 inbound links! Bear in mind that the dataset has a few anomalies that skew this figure: 40 sites have 0 links, one has 122 million, and the rest fall everywhere in between.
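To get a feel for how much a single outlier moves that average, here is a quick back-of-the-envelope calculation in Python using only the figures quoted above. It assumes the 208,235 average is a rounded mean across all 5489 sites, so the results are approximations rather than exact values from the dataset.

```python
SITES = 5489
MEAN_LINKS = 208_235          # average quoted above (assumed to be a rounded mean)
TOP_SITE_LINKS = 122_042_447  # links to the biggest site

# How many links per site the single biggest site adds to the mean
top_site_contribution = TOP_SITE_LINKS / SITES

# Approximate total links, and the mean with the biggest site left out
approx_total = MEAN_LINKS * SITES
mean_without_top = (approx_total - TOP_SITE_LINKS) / (SITES - 1)

print(f"Top site adds about {top_site_contribution:,.0f} links to the mean")
print(f"Mean without the top site is about {mean_without_top:,.0f}")
```

On these figures, that one site accounts for roughly a tenth of the average on its own, which is why a median would describe a typical directory better than a mean.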

Flow Metrics from MajesticSEO

MajesticSEO’s flow metrics are used heavily in this office, so I wanted to include these in my analysis.

  • The average CitationFlow of a directory in our dataset – 13.58
  • The average TrustFlow of a directory in our dataset – 5.69
  • The site with the highest flow metrics – TrustFlow = 63, CitationFlow = 62
  • The site with the lowest flow metrics – TrustFlow = 0, CitationFlow = 0

Not really high quality sites, are they? Except for abc-directory again, which seems to have built up a pretty solid reputation!

Only 11 sites (or 0.2%) from our dataset can be found in the Majestic Million – no major surprises there though!

Social Signals

Personally I have never tweeted or liked a directory listing but we all know that social signals can easily be created.

From our 5489 sites,

  • The average number of tweets is 5 – The highest being 3,745
  • The average number of Facebook likes is 39 – The highest being 106,156
  • The average number of Google +1’s is 16 – The highest being 17,458

Evidently some directory owners also know how to create social media accounts!

Contact details

As part of the crawl I tried to detect either a contact page or an email ID for the directory owner – useful if you are trying to clean up these links.

From our 5489 sites I could only detect:

  • 1489 contact pages – or 27% of the dataset
  • 213 contact email addresses – or just under 4% of the dataset
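Detection like this can be roughly approximated with a couple of regular expressions. The Python sketch below is a simplified illustration, not the LinkRisk detection logic: it assumes that a contact page is any link whose href contains the word “contact” and that email addresses appear as plain text in the HTML. The sample HTML is made up.

```python
import re

# Simplified patterns; real-world pages will need more forgiving matching
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CONTACT_LINK_RE = re.compile(
    r'<a\s[^>]*href=["\']([^"\']*contact[^"\']*)["\']', re.IGNORECASE
)


def find_contact_details(html):
    """Return any email addresses and contact-page links found in the HTML."""
    emails = sorted(set(EMAIL_RE.findall(html)))
    contact_pages = sorted(set(CONTACT_LINK_RE.findall(html)))
    return {"emails": emails, "contact_pages": contact_pages}


# Hypothetical home page snippet for illustration only
sample_html = """
<a href="/contact-us.php">Contact</a>
<p>Email: owner@example.com</p>
<a href="/submit.php">Submit</a>
"""

details = find_contact_details(sample_html)
print(details)
```

This will miss addresses hidden behind images, JavaScript or contact forms, which may be part of why so few were detected.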

It did not surprise me that not many publish this information these days, as I imagine they get a lot of emails requesting that links be removed! We even saw some sites that have changed the contact page to contain just a PayPal link for $5 to remove a link!

Conclusions

So in conclusion, Google has done very well at detecting these directory networks and appears to have caught up with most of them! In fact, almost all of the thousands of sites we analysed are now pretty much dead, and the only reason to keep them alive is to charge for link removal.

There are many other variables I could have considered when analysing the site data and I am sure I will have missed some obvious ones – feel free to reach out to me if you feel I need to add something to the post and I will update it when I have done the analysis.

Q – Would I use web directories as part of my SEO strategy in 2014 and beyond?
A – Simple answer – No
Q – Should you use web directories as part of your SEO strategy in 2014 and beyond?
A – Probably not, but that is the thing with SEO – it is very much an opinion-based industry and you have to do what you feel is correct for your client.

Why not build links through local business citations on real websites that rank and send traffic for local phrases, rather than on spammy made-for-SEO websites?

Whilst this may all seem like common sense to most people, you would be surprised how many people are still using web directory submission as part of their strategy!

Why not contact the team at Marketing Signals and see how we can suggest a much safer way of gaining inbound links to your or your clients’ websites?