The Internet is full of exhaustive analyses of how well each SEO factor correlates with actual rankings. The underlying idea is to dissect Google’s algorithm and find out which factors matter most for getting ranked.
All right, a little bit of math is required to go further. The best possible correlation equals 1, which corresponds to perfect collinearity of two vectors in the same plane, i.e. two things that change in exact proportion to each other. To take a very simple and concrete example, collinearity corresponds to the little formula you use when you are cooking. If the recipe book calls for 8 apples, 2 eggs and 1 cup of flour for 4 people, you should use 16 apples, 4 eggs and 2 cups of flour for 8 people.
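To make the recipe example concrete, here is a minimal Python sketch that checks collinearity with the cosine of the angle between two ingredient vectors. The function name cosine and the "bad batch" are illustrative assumptions, not part of any SEO tool. Cosine is used here as the simplest test of collinearity; the Pearson correlation is the same calculation applied after subtracting each vector's mean.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two vectors (1 means perfectly collinear)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Ingredient vectors: [apples, eggs, cups of flour]
recipe_for_4 = [8, 2, 1]
recipe_for_8 = [16, 4, 2]   # every ingredient doubled
bad_batch = [16, 2, 2]      # hypothetical: the cook forgot to double the eggs

cos_scaled = cosine(recipe_for_4, recipe_for_8)  # 1, up to floating-point rounding
cos_bad = cosine(recipe_for_4, bad_batch)        # strictly below 1

print(f"doubled recipe: {cos_scaled:.4f}")
print(f"bad batch:      {cos_bad:.4f}")
```

The doubled recipe points in exactly the same direction as the original, while the bad batch drifts away from it, which is what a correlation below 1 expresses.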
The mathematical rationale behind studying ranking factors does seem quite straightforward, doesn’t it? What is the recipe to get ranked #1? Why, then, does it sound so wrong to me mathematically?
First, we know the impact of some ranking factors is not linear, which means the saying “the more, the better” does not hold. Take keywords as an example. Matt Cutts said it clearly: using a keyword 1, 2 or 3 times increases the chances of ranking well because it confirms the topic. Then comes a plateau phase where repeating the same keyword a few more times no longer changes the ranking. Finally, beyond a certain threshold, the keyword becomes deleterious to the ranking because it looks like spam.
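The rise, plateau and decline described above can be sketched in Python to show what a linear correlation does with such a shape. Everything numeric here is invented for illustration: keyword_effect is a hypothetical curve, not Google's, and the thresholds 3 and 9 are arbitrary assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def keyword_effect(count):
    """Hypothetical ranking effect of repeating a keyword `count` times."""
    if count <= 3:
        return count       # the first mentions confirm the topic
    if count <= 9:
        return 3           # plateau: extra mentions change nothing
    return 12 - count      # past the assumed threshold, the effect declines


counts = list(range(13))                       # 0 to 12 mentions
effects = [keyword_effect(c) for c in counts]

r_rising = pearson(counts[:4], effects[:4])    # only the rising part
r_full = pearson(counts, effects)              # the whole curve

print(f"rising part only: {r_rising:.3f}")
print(f"whole curve:      {r_full:.3f}")
```

On the rising part alone the correlation is perfect, yet over the whole (deliberately symmetric) curve it collapses to essentially zero, even though the keyword count fully determines the effect in this toy model. A weak correlation therefore does not mean a factor is unimportant.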
Second, we know the ranking factors are not independent variables. Take backlinks as an example. Not only does the same “bell-shaped” relationship apply between backlinks and ranking, but the number of backlinks is also weighted by other factors such as the number of referring domains or the relevance of those domains. “One thousand links from one single domain” has virtually no value compared to “one single link from each of one thousand domains”. And what would the outcome be if you had one thousand domains, but 900 of them were irrelevant to your field?
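As a rough illustration of why the raw backlink count cannot be read in isolation, the sketch below scores the three situations just described. The scoring rule is a deliberately naive assumption (one point per relevant referring domain, raw link count ignored) and the name link_value is hypothetical. Nobody outside Google knows the real weighting.

```python
def link_value(backlinks, referring_domains, irrelevant_domains):
    """Hypothetical link score: one point per relevant referring domain.

    `backlinks` is accepted but deliberately unused, to show that under this
    assumed rule the raw link count carries no information by itself.
    """
    return referring_domains - irrelevant_domains


# (backlinks, referring domains, irrelevant domains)
profiles = {
    "1000 links from 1 domain": (1000, 1, 0),
    "1 link from each of 1000 domains": (1000, 1000, 0),
    "1000 domains, 900 of them irrelevant": (1000, 1000, 900),
}

scores = {name: link_value(*profile) for name, profile in profiles.items()}

for name, score in scores.items():
    print(f"{name}: {score}")
```

All three profiles have exactly 1000 backlinks, yet under this assumed rule they score 1, 1000 and 100. A study that correlates ranking with backlink count alone would see three identical inputs with three different outcomes.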
Third, a correlation looks at 2 dimensions, one pair of variables at a time, i.e. like a recipe where each ingredient scales linearly with the number of people you wish to serve. The ranking algorithm does seem a little more complicated than ranking = function (ranking factor X), doesn’t it?
Finally, we do not even know how many ingredients are in Google’s secret recipe.
Taken together, if you think about the number of variables, the effect of each variable turning positive or negative as it grows, and the way that effect depends on the surrounding variables, you see immediately that interpreting such data is close to impossible. The Google ranking algorithm is not a cooking recipe.
Socrates said “the only true wisdom is in knowing you know nothing” and it is time to tell the truth: unless you are one very special Google staff member, you do not know what is going on, and you can only guess…