How does Page Ranking Happen: Search Engines and their Secrets



What is Page Ranking:

To understand its importance, we must first understand what page ranking means when it comes to search engines and your internet experience. When you go online and use Google, for example, you type in the words you think will best enable Google to find your desired result. But from the moment you type in your query to the moment the search engine finds relevant webpages for you, a myriad of processes occur in the depths of the mathematical space where the Google servers weave their magic.

Google will first check its index of all the websites and the webpages therein to see which pages contain each individual word in your search query. It will then sort these pages in descending order, from the pages that contain all the words you typed in down to the ones that contain only some of them. It will then apply its algorithms to figure out which pages are the most relevant and rank them accordingly, showing you the best results at the top of the first page and the least relevant pages it could find, among all the pages it checked, at the bottom.
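To make that first lookup-and-sort step concrete, here is a minimal sketch in Python. It only illustrates the idea described above, not Google's actual implementation: the page names, the page texts and the helper functions build_index, match_counts and first_pass are all made up for this example, and splitting text on spaces is a simplifying assumption.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def match_counts(index, query):
    """Count, for each page, how many distinct query words it contains."""
    counts = defaultdict(int)
    for word in set(query.lower().split()):
        for name in index.get(word, ()):
            counts[name] += 1
    return dict(counts)

def first_pass(index, query):
    """Order pages from most query words matched to fewest (ties by name)."""
    counts = match_counts(index, query)
    return sorted(counts, key=lambda name: (-counts[name], name))

# Hypothetical mini-web of three pages.
pages = {
    "a.html": "how to drive safely in the rain",
    "b.html": "drive safely",
    "c.html": "how to bake bread",
}
index = build_index(pages)
print(first_pass(index, "how to drive safely"))
# a.html matches all four words, so it comes first
```

This ordering is only the rough first cut. The relevance ranking that follows it is where the link-based scoring comes in.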

 

Where it all Started:

So, obviously the internet was not always so efficient. In the beginning, search engines would simply crawl the web for any and all webpages available and build an index. Every time someone tried to look for something, the search engine would go through all these pages and show the results. This process was nowhere near as sophisticated as the one Google employs today.

Sergey Brin and Larry Page, in a research paper written while they were PhD students at Stanford, proposed a way to sort through these pages so that the most relevant results would show up first for the user. Their algorithm ranked each page being searched based on how many other pages linked back to it. Not just that, it also accounted for how highly ranked those linking pages were themselves, so a link from a page with high authority counted for more.

For example, if you look up the words “how to drive safely”, something I hope you learned in driving school, Google will not just look for pages that contain the words in your search query, but also look for the results most relevant to your needs. It will do so by favoring the pages that have the most credibility, in that they either have many other pages linking back to them or are linked from pages that have high authority themselves. Widely trusted sources, such as government websites, tend to attract many such links. For example, a page that the New York Times website has linked back to gets a boost, the New York Times being a highly ranked source as far as credibility is concerned.

 

The Algorithm:

So, what is this algorithm that decides which pages are ranked higher and how does it decide? Glad you asked. Sergey and Larry of Google came up with the mathematical formula. It goes:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

Where

  • PR(A) is the PageRank of page A,
  • PR(Ti) is the PageRank of each page Ti (for i from 1 to n) which links to page A,
  • C(Ti) is the number of outbound links on page Ti and
  • d is a damping factor which can be set between 0 and 1
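
If you would rather see the formula as arithmetic, here is a small Python sketch that evaluates it for a single page. The function name page_rank_of and the numbers fed into it are invented for illustration. Only the formula itself comes from the definition above.

```python
def page_rank_of(inbound, d=0.85):
    """Evaluate PR(A) = (1 - d) + d * sum(PR(Ti) / C(Ti)) for one page A.

    `inbound` is a list of (PR(Ti), C(Ti)) pairs, one per page Ti linking to A.
    """
    return (1 - d) + d * sum(pr / c for pr, c in inbound)

# Hypothetical numbers: T1 has PageRank 1.0 and 2 outbound links,
# T2 has PageRank 0.5 and 5 outbound links.
score = page_rank_of([(1.0, 2), (0.5, 5)])
print(round(score, 4))  # 0.15 + 0.85 * (0.5 + 0.1), roughly 0.66
```

Notice that a page with no inbound links at all still gets the baseline value of 1 - d.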

Brin and Page’s paper notes that the damping factor is usually set to 0.85. It doesn’t have to be exactly 0.85, and when calculating for a small number of webpages the choice doesn’t make much difference. However, when calculating for something on the order of a trillion webpages, a commonly quoted figure for the size of the web, a value around 0.85 is generally considered a good balance: links keep most of their influence, and the calculation still settles on stable values reasonably quickly.

Google performs this calculation for every unique webpage it ranks. It starts by assigning the same value to all the webpages involved, then repeats the calculation until the values settle. I know, this is getting too technical, so we won’t go any deeper into the algorithm itself. There is plenty more to learn about this side of the Google search engine, the cold, calculated and precise part of it so to speak. Just understand that if webpage A has an inbound link from another webpage B that has very few outbound links but several inbound links of its own, the rank of webpage A will be high. But if webpage B has many outbound links to other webpages as well, its vote is split among them and the rank of webpage A will be lower.
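
For the curious, the repeated calculation and the webpage A versus webpage B comparison can be sketched in a few lines of Python. Treat this as a toy under stated assumptions, not as how Google runs it: the pagerank function, the fixed count of 50 rounds and the two tiny made-up webs (pages X and Y linking to B, which links to A) are all invented for this example.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    `links` maps each page to the pages it links to. The fixed number of
    iterations is an assumption made to keep the sketch short.
    """
    links = {page: set(targets) for page, targets in links.items()}
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    ranks = {page: 1.0 for page in pages}  # same starting value for all pages
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            incoming = sum(
                ranks[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks

# B has two inbound links (from X and Y) in both toy webs.
few_outbound = pagerank({"X": ["B"], "Y": ["B"], "B": ["A"]})
many_outbound = pagerank({"X": ["B"], "Y": ["B"], "B": ["A", "C", "D", "E"]})

print(round(few_outbound["A"], 4))   # B links only to A
print(round(many_outbound["A"], 4))  # B splits its vote four ways
```

In the first web B passes its whole vote to A, while in the second the same vote is divided among four pages, so A ends up with a noticeably lower value.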

 

Why Should You Care:

Well, because it affects almost everything you find online. The webpages that appear when you search for something don’t just pop up out of nowhere. This complicated algorithm, along with hundreds of other factors that the goliath search engine uses but does not reveal, determines and tailors the search results you see. Theoretically, there are a trillion pages you could visit online, but do you? No. No one has the time to do that, and it is also not necessary when all you wanted to know was where you can get the best burger near you. So, we trust Google, and other search engines, to do it for us.

One notable issue with page ranking is that pages with really good and relevant content might not get a high page rank, because no other webpages link back to them or the pages linking back to them are not ranked high themselves. And that is where the growth happens. Google is constantly trying to better its algorithm to account for such oversights, and it is no easy feat. The hundreds of other factors we mentioned above are a part of solving this puzzle. Many of the world’s best engineers, designers and user experience researchers work tirelessly on the Google campus in Mountain View, California, to make our search results as relevant as “machinely” possible. We know we just casually invented a word there, but who can resist coming up with a completely unique word? Not us.

Food for Thought: Ensuring that you receive only credible and precise results for each and every search query you enter into the Google search bar is one of the main goals that drive the search engine to constantly better and perfect its search algorithm. However, how can a machine know and comprehend all the emotional and sentimental intent that goes into a question or a query that a human generates? Find the answer in the next blog post.