Advanced SEO Topics
The Future of Search: Our Window to Knowledge
The human thirst for knowledge is unquenchable: a desire limited only by the constraints of time, with volumes and volumes of data to consume and only one lifetime to absorb it all. The index at the back of a book, the Dewey Decimal System and the mighty Google were all created to satisfy the same need: to divide the infinite mound of information and locate the proverbial needle in the haystack.
So where do we go when we need something? To the internet, of course. Most people use search engines to find the information they need right now, with little effort on their part. But technology has also launched an attack on the senses: we no longer merely read to gain understanding; the web is now an all-out audio-video library.
Somewhere out there, the answer to virtually every question ever asked can be found on the web. But how do we match the searcher with their solution? The old model is outdated: a system that pushes websites to the top of the search results based on how popular they are (in terms of links). This model was useful back in the day, when there weren't so many answers floating around and providing ten of the most popular would suffice. But today's consumer is information-hungry: they know a webpage that satisfies their exact query is out there, somewhere, but it may be so tangled in the web that it is out of reach of even the most ardent searcher.
Nowadays, amid the social media paradigm shift, with digital technologies ever more pervasive and devices ever more affordable, we have to consider other elements. Beyond the application of search itself and the change in users' searching needs and behaviour, we need to address the human factor: the user in relation to network technology, the internet. This raises the global problem of users' access to new technologies, but access is not everything; the user's skills are the essential factor for engagement, for active use of services, and for the ability to search effectively. Digital literacy is fundamental, and web creators should always keep the user and the user's needs in mind when building new applications, search tools and services. Users' needs are not the same in every region and every virtual habitat, so research and strategic planning before any action are essential: knowing users, their habits and their needs creates the playground for the developers' and engineers' next steps. Since the web has become increasingly social and semantic over the last decade, many software architects and engineers are trying to (re)build personalisation features. A recent example takes on Google's new personalisation feature head on by offering a bookmarklet that integrates data from other social networks into searches. The tool, called "Focus On The User", compares the results of Google's recent Search Plus Your World with results the engineers say would be more relevant. Search Plus Your World doesn't include content from social networks such as Twitter, Facebook or MySpace, but it includes plenty of content from Google's own social network, Google Plus.
The tool is a proof of concept from engineers at Facebook, Twitter and MySpace, built in consultation with several other social networking companies. The trend is towards open-sourcing the code so that anyone may use and re-use it, and so that it can engage users across these services and platforms. The internet is heading towards open-source, federated systems, and semantic metadata initiatives are already building decentralised silos of data and design. Viewed from a global perspective, decentralisation shapes the social design of the web, our understanding of how the web and search engines work, and, of course, the user, with their habits, needs and skills in mind, even in regions where connected devices such as mobile phones are widespread but poverty and economic problems remain an issue.
The search engines are listening: they tailor results to your past searches, they take into account where you are in the world, and now, with Google +, they base results on your friends' interests as well. But how do the search engines measure user intent? When you ask Google "what should I eat for breakfast today?", it doesn't understand that you are a person who needs to eat, that breakfast is a meal, or that you are having difficulty deciding what kind of breakfast to have. Instead, Google looks for web pages containing text similar to "what should I eat for breakfast today" and returns results it thinks are relevant based on the words alone, rather than the 'flavour' of the sentence.
Enter the Google Knowledge Graph. The idea is to make the algorithm understand that you want options for breakfast, so that instead of returning articles that merely gravitate around the words in the question "what should I eat for breakfast today", it interprets them: "what should" = suggest, "I" = user, "eat" = have a meal, "breakfast" = morning meal, and "today" = this day. Google can now read the query as "suggest the user have a morning meal this day". From there Google can spit out a number of options, such as nearby restaurants open for breakfast or breakfast recipes that require simple household ingredients.
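The word-by-word interpretation above can be sketched as a toy lookup. To be clear, the mapping table and the greedy matcher below are purely illustrative assumptions for this article; the real Knowledge Graph resolves entities against a vast structured database, not a hand-written dictionary.

```python
# Toy sketch of mapping query phrases to intent labels, as described above.
# The mapping table is invented for illustration only.
INTENT_MAP = {
    "what should": "suggest",
    "i": "user",
    "eat": "have a meal",
    "breakfast": "morning meal",
    "today": "this day",
}

def parse_intent(query: str) -> list[str]:
    """Greedily match the longest known phrase at each position."""
    tokens = query.lower().rstrip("?").split()
    labels, i = [], 0
    while i < len(tokens):
        # Try two-word phrases first, then single words.
        for span in (2, 1):
            phrase = " ".join(tokens[i:i + span])
            if phrase in INTENT_MAP:
                labels.append(INTENT_MAP[phrase])
                i += span
                break
        else:
            i += 1  # skip filler words like "for"
    return labels

print(parse_intent("What should I eat for breakfast today?"))
# -> ['suggest', 'user', 'have a meal', 'morning meal', 'this day']
```

The point of the sketch is the shift it represents: the query is reduced to intent labels first, and only then matched against content, rather than matched word-for-word against page text.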
The Google Knowledge Graph, in other words, is going to try to apply meaning and significance to words, rather than treating them as a random sentence to be matched against the same or similar words found across the internet. The Knowledge Graph is still in its infancy, and it remains to be seen what it can really do, but its output can already be found in a number of popular search terms.
So the Knowledge Graph looks to be a useful feature of search, and it has already been implemented, but what about the social factor? Since 2009, Google has been working on Confucius, where the idea is to have queries answered instantly via social Q&A sites such as Yahoo! Answers, WikiAnswers and Askville. Google has identified a gap in information for people who need highly relevant and timely answers.
For example, if a user needs to know how to remove the Facebook Timeline from their profile, you might expect an official Facebook page to show up as the answer. But Facebook wants users to keep Timeline and hasn't officially created a page explaining how to remove it, so no such result will be found there. Instead, typing "how to remove facebook timeline" into Google search will bring up results from Q&A sites, because Google now knows that pairing a searcher with a community discussing how to do something is more likely to provide genuine, thoughtful discussion than taking the user to the official site.
Finally, there is Google +, which attempts to make results more relevant to you based on who else has viewed a web page and whether or not they liked it. Google + still hasn't received the mass adoption that other social networks have, but the effect it has on results can already be seen. If users can see that their friends (or people they admire, such as industry experts, celebrities or athletes) endorse a web page, it helps them more easily identify a page they want to visit.
So, what is Google going to do for you in the future? It looks like three things are happening: Google wants to understand human language better in order to better understand user intent; it wants to plug searchers into the 'grass roots' discussion on Q&A sites, as an alternative to user-unfriendly sources of information; and it wants to integrate social influence into your results, helping you choose the right result based on endorsements by friends or influential people.
Last week I presented a method for search result hijacking. The story got a lot of coverage in the SEO community, perhaps because Rand Fishkin's authority pages were also compromised as part of our experiment. One thing I did not elaborate on in the original article was the peculiar way Google Webmaster Tools handles document canonicalisation.
You can see somebody else's links in your Google Webmaster Tools as if you were the authorised user of that site. The process involves creating an identical copy of the document, and the results become visible in about two weeks.
In the following screenshot we see what Google calls an "intermediate" link: an old link points to our old domain, which 301s to our new domain. Nothing unusual about this.
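The redirect chain behind an "intermediate" link can be modelled in a few lines. This is only a sketch under invented assumptions: the URLs and the redirect table below are made up, and a real crawler follows HTTP `Location` headers rather than an in-memory dict.

```python
# Toy model of the "intermediate link" case: a backlink points at the old
# domain, which 301-redirects to the new one. URLs are invented examples.
REDIRECTS = {"http://old.example/page": "http://new.example/page"}  # 301s

def resolve(url, redirects, max_hops=5):
    """Follow redirect mappings; return the final URL and the
    intermediate URLs passed through along the way."""
    hops = []
    while url in redirects and len(hops) < max_hops:
        hops.append(url)
        url = redirects[url]
    return url, hops

final, intermediates = resolve("http://old.example/page", REDIRECTS)
print(final)          # -> http://new.example/page
print(intermediates)  # -> ['http://old.example/page']
```

In Webmaster Tools terms, the link is reported against the final URL, with the old domain shown as the intermediate hop.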
There are other instances of the "intermediate link" in Google Webmaster Tools. One of them relates to the document canonicalisation process described in a paper called "Large-scale Incremental Processing Using Distributed Transactions and Notifications" by Daniel Peng and Frank Dabek. This is exactly the same process I used in the result hijack article, and the most interesting thing is that it works in reverse (I'll get to that later).
Here's an example of one such case; some of you may remember this website from my hijacking experiment:
As you can see, the "intermediate link" notification suggests that my page above receives a link from Rob's website, but the thing is, it doesn't. So what's going on?
Well, the page I created is a replica of the original page on htt
I am seeing the same links the owner of that site would see in their Google Webmaster Tools.
Here's the interesting part: it works in reverse. You don't even have to hijack the result for this to work; you can see the results with the 'loser' URL. If you create a duplicate, with lower PageRank, of any page on the web (not very difficult to do, is it?), you will be able to see its links in your own Google Webmaster Tools.
To test this concept I copied a PDF from another site and simply got it indexed. In about ten days I saw all its backlinks in Webmaster Tools. Here it is:
Now I can take any page or document from a competitor, place it on a domain of my choice, have it indexed, and within a few weeks I am able to see all their backlinks within Google Webmaster Tools.
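The observable behaviour described in this post can be summarised as a toy model: duplicate documents form a cluster, the copy with the higher PageRank wins the canonical slot, and links aimed at any member of the cluster become visible to every member. The PageRank values and URLs below are invented for illustration; this is a sketch of what we observed, not Google's actual pipeline.

```python
# Toy model of duplicate clustering as observed in Webmaster Tools.
# PageRank scores and URLs are invented for illustration.
def pick_canonical(cluster):
    """The cluster member with the highest PageRank wins the canonical slot."""
    return max(cluster, key=lambda url: cluster[url])

def visible_links(cluster, links, member):
    """Links aimed at any cluster member appear in every member's
    Webmaster Tools, which is why even the lower-PageRank 'loser'
    copy gets to see them."""
    assert member in cluster
    return [src for src, dst in links if dst in cluster]

# The original page outranks my copy, so it stays canonical...
cluster = {"http://original.example/whitepaper.pdf": 5.7,
           "http://mycopy.example/whitepaper.pdf": 2.1}
links = [("http://blog.example/review",
          "http://original.example/whitepaper.pdf")]

print(pick_canonical(cluster))
# -> http://original.example/whitepaper.pdf

# ...yet the low-PageRank copy still sees the original's backlinks.
print(visible_links(cluster, links, "http://mycopy.example/whitepaper.pdf"))
# -> ['http://blog.example/review']
```

This captures both cases from the article: when your copy outranks the original you hijack the canonical slot, and when it doesn't, you still inherit visibility into the cluster's link data.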
It took me exactly 14 days to see the link data of Rob’s website.