
Should SEO Care About Real Time Search?

The hype and the reality: One of the more popular buzzwords in search over the last while is real time search. For starters, that's a bit of a misnomer; there are NO actual real time engines. That's simply not possible (even Twitter search updates in intervals, not entirely in real time). Regardless, people keep getting worked up about this particular area, to the point of wondering how it will affect SEO efforts.

Last week the folks at Media Post contacted me for some quotable quips for a piece on real-time search and One Riot's PulseRank. They wanted to know if SEOs were considering this the wave of the future or something new to the lexicon.

For their part, One Riot has said, "We believe PulseRank will replace PageRank over time for the real-time Web. The reasons are clear. PageRank is based on the number of links to a page or a specific URL builds over time, as people link to pages. It provides the searcher with the 'authoritative answer'..."

These are some bold statements, and yet another in a long line of self-professed "Google Killers", but is there credence? We're going to look at the world of real time search and see if there really is anything to be looking at (as SEOs). Care to come along?

What is real time search?
OK, for starters, the commonly known real time search engines I see are simply seeking out social mentions, not really crawling the web and indexing it. This, to me, is the first part of the problem: are they really search engines, or simply buzz monitoring tools?

Real-time search is one of the more difficult areas for search engineers to deal with. Much of the problem lies in establishing the most authoritative answers without being bogged down by spam. Dealing with web spam requires a great many signals that are hard to come by in 'real time'. At the end of the day, if 'real time' search were an effective approach, Google and others would (likely) be doing it already.

There are more than a few problems associated with real time search, including:

* Spam: this would be the most problematic area for any search engine. It would be nearly impossible to combat web spam in real time in a large scale environment. This is why most of the major engines haven't moved beyond 'almost' real time (such as Google's 'query deserves freshness' approach).

* Ranking: as with QDF, there needs to be some type of evaluation of quality and authority to make ranking of documents effective. Some of the above engines use domain and (social) user authority, but this approach does kill some of the democratic nature of the web, and the rich get richer. What about lesser known domains and new users/content?

* Social dependencies: most of these real time engines are reliant on social signals, which really does limit their actual abilities as search engines. They are NOT indexing pages that aren't getting social luvin'. And despite popular belief, not ALL content on the web is social-worthy, which makes such search engines limited in scope.

If we consider the above, there is a lot of work to be done if such real time search approaches are to ever be of value. But

Rubber meets the road
In the testing we have run so far, there is no real sense to the ranking mechanisms. In looking at some of the more general queries (in this case, 'PageRank'), it seems that merely being the most recent citation is all that matters. Sadly, the post itself is devoid of any real content relating to PageRank and thus shouldn't really be ranking. But it did. See the problem here?
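To make the problem concrete, here is a rough sketch in Python of what "most recent citation wins" looks like next to an ordering that also asks whether the page is actually about the query. Everything in it is made up for illustration (the Doc class, the toy relevance measure, the 60 minute half-life, the sample posts); it is not how One Riot, Collecta, or Google actually score anything.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    text: str
    age_minutes: float  # hypothetical: minutes since the page was first mentioned


def relevance(doc, query):
    """Toy relevance signal: share of the document's words that equal the query."""
    words = doc.text.lower().split()
    if not words:
        return 0.0
    return words.count(query.lower()) / len(words)


def rank_by_recency(docs):
    """Newest first, ignoring content entirely."""
    return sorted(docs, key=lambda d: d.age_minutes)


def rank_blended(docs, query, half_life=60.0):
    """Relevance discounted by age; half_life is an assumed tuning knob."""
    def score(d):
        freshness = 0.5 ** (d.age_minutes / half_life)
        return relevance(d, query) * freshness
    return sorted(docs, key=score, reverse=True)


# Two invented posts: a fresh one that barely mentions the topic,
# and an older one that is actually about it.
docs = [
    Doc("Passing mention", "lunch was great also someone said pagerank today", 5),
    Doc("PageRank explained", "pagerank counts links pagerank weighs links by pagerank", 49),
]

print([d.title for d in rank_by_recency(docs)])
print([d.title for d in rank_blended(docs, "pagerank")])
```

With recency alone, the fresh post that barely mentions the topic comes out on top. Once even a crude relevance signal is mixed in, the older post that is actually about PageRank takes first place. A real engine would need far more signals than this, which is rather the point.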

One Riot - This is the application with the vaunted PulseRank, which is claimed to be a superior method to Google's PageRank. Our early testing showed it to be generally inferior to Collecta, but in this test it did OK comparatively.

Test time: 49 minutes. This was only AFTER it was Tweeted by someone who had my blog RSS hooked up to a Twitter account. What is interesting is that this was the first one to list the actual blog post, not the Tweets.