Posts on the search



April 12, 2007

Google's One Box for Weather

After several trials and false starts, Google seems to have entered the weather information sector using its one box result. Major weather sites like Weather Underground will feel the impact.

Similarly, Google will affect the shopping sector with its increasing use of the one box result to integrate its other services, such as Froogle or Google Base ecommerce items.

So far, the services being promoted with the one box are:

  • Music
  • Movies
  • News
  • Stock Quotes
  • Weather
  • Travel
  • Maps
  • Local Businesses
  • Images
  • Shopping
  • Books
  • News Archives
  • Google Groups
  • Blog Search
  • Search History
  • Desktop Search
  • Definitions
  • Questions
  • Patents
  • Local Time

Google's shadow lengthens


January 23, 2007

Wikipedia Reneges on Attribution by Using the "nofollow" Tag in Outbound Links

Wikipedia is the free online encyclopedia, which has recently surpassed all commercial encyclopedia services, like Encyclopaedia Britannica. In an effort to combat spammers who edit encyclopedia entries with external links in order to gain Google's PageRank favour, Wikipedia has implemented the "nofollow" tag on all external links.

Wikipedia has succumbed to this controversial measure, just like all blog services, the blog search engine Technorati, and the social filtering site del.icio.us. The measure was originally proposed by Google to remove the incentive for spammers to add bogus comments to blogs in an effort to increase backlinks to their spam sites.

The "nofollow" tag (strictly speaking, a rel="nofollow" attribute on the link) removes the link in question from Google's algorithm, basically announcing to search engines that "this-is-not-a-good-link"; the tag removes any PageRank assignment for that link. Google first proposed "nofollow" for fighting spam in blog comments and trackbacks, so that spammers cannot leech PageRank from high-PR blogs with automated comment generators.
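To make the mechanism concrete, here is a minimal sketch of how a crawler might separate followed links from "nofollow" ones. It relies only on the fact that the marker is a rel="nofollow" attribute on the anchor; the class name, function name and sample URLs are made up for illustration, and this is not a description of how Googlebot actually works.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Collect <a href> targets, split by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return  # named anchors without a target are not links
        # rel is a space-separated token list, e.g. rel="external nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def classify_links(html):
    """Return (followed, nofollowed) lists of link targets found in html."""
    parser = LinkClassifier()
    parser.feed(html)
    return parser.followed, parser.nofollowed


# Hypothetical page fragment: one editorial link, one comment link.
sample = (
    '<p><a href="http://example.org/source">a cited source</a> '
    '<a href="http://example.com/spam" rel="nofollow">a comment link</a></p>'
)
followed, nofollowed = classify_links(sample)
print("followed:", followed)
print("nofollow:", nofollowed)
```

Only the links in the first list would be candidates for passing on PageRank.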

The measure is a double-edged sword for Google, though. Google relies on counting links to a website to judge its importance. If webmasters use "nofollow" links, which do not count for PageRank, Google will have no links to count. The concept behind its algorithm loses power.

The "nofollow" tag also goes against the central principle of attribution, which plays a critical role, that of currency, in the Creative Commons and open source ecology. Given that Wikipedia lives on the open data and Creative Commons ecology, this could affect it critically.

Linking out of Wikipedia provides attribution to other sources and providers of information; short-changing them on attribution could reduce the number of Wikipedia content volunteers and the inbound links it receives.


October 2, 2006

del.icio.us implements the "nofollow" tag

The social tagging site del.icio.us has implemented the "nofollow" tag on its home page. Following the example of the blog search engine Technorati, del.icio.us has adopted the anti-spam measure proposed by Google.

The "nofollow" tag removes the link in question from Google's algorithm, basically announcing to search engines that "this-is-not-a-good-link"; the tag removes any PageRank assignment for that link. Google first proposed "nofollow" for fighting spam in blog comments and trackbacks, so that spammers cannot leech PageRank from high-PR blogs with automated comment generators.

The measure is a double-edged sword for Google, though. Google relies on counting links to a website to judge its importance. If webmasters use "nofollow" links, which do not count for PageRank, Google will have no links to count. The concept behind its algorithm loses power.
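The double-edged sword can be illustrated with a toy link count. The sketch below is a deliberate simplification (a plain inbound-link tally, not PageRank itself), and the site names are hypothetical; it only shows how a site's countable links shrink once its linkers switch to "nofollow".

```python
from collections import Counter


def inbound_link_counts(links):
    """Count inbound links per target, ignoring nofollow and self links.

    links: iterable of (source, target, nofollow) tuples.
    """
    counts = Counter()
    for source, target, nofollow in links:
        if not nofollow and source != target:
            counts[target] += 1
    return counts


# Hypothetical link graph: three sites link to shop.example.com.
links = [
    ("blog.example.org", "shop.example.com", False),
    ("wiki.example.net", "shop.example.com", True),       # nofollow
    ("bookmarks.example.info", "shop.example.com", True),  # nofollow
]
print(inbound_link_counts(links)["shop.example.com"])
```

Of three real citations, only one is left for the algorithm to count.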


September 14, 2006

Top 10 Official Google Search Engine Optimisation Tips

Matt Cutts, Google's official Search Engine Optimisation blogger, has been giving the SEO community dos and don'ts for 16 months now. With daily posts, the cumulative volume of knowledge and advice is impressive.

It is only on looking back that the value of Matt's advice becomes evident. Rather than trawl through thousands of Matt's blog entries, here are my top 10 favorite Google tips:

  1. Use user-friendly, human-readable URLs, e.g. rose-flowers.html.
  2. Minimize the number of parameters in your URL; use two at most.
  3. Use natural links as far as possible. Sites that link to you should be citing or referencing you for a reason; links should be earned and given by choice.
  4. Google penalizes bought links. Matt says the Google algorithm now identifies bought links and considers them outside Google's quality guidelines. High-PageRank websites that sell links widely have their links dampened, so that their effective PageRank is much reduced.
  5. Use the “id=” named parameter in a URL with care, i.e. only for a session ID.
  6. Use dashes rather than underscores to delimit words in a URL, as Matt recommends.
  7. Use 301 redirects wherever multiple URLs point to the same page. Where several URLs, e.g. www.enclick.com, enclick.com or www.enclick.com/index.html, point to the same place, you should 301 redirect to the URL where you want all the PageRank concentrated. Alternatively, use Google Sitemaps to help Google “canonicalize” to the URL you want to use.
  8. Google updates its index data continuously, but the toolbar backlinks and PageRank information are only refreshed every few months.
  9. If you use Flash, offer an HTML version as well.
  10. Metadata still has value. Assign useful "title" and "description" tags and headings to every page.
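The URL tips (1, 2, 6 and 7) lend themselves to a short sketch. The helper names below are hypothetical, and the choice of www.enclick.com as the canonical host is an assumption made purely for the example; the function only computes the URL a 301 redirect should point to, the redirect itself would be configured on the web server.

```python
import re
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# Assumption for this example: the "www" host is the one we want indexed.
CANONICAL_HOST = "www.enclick.com"
HOST_ALIASES = {"enclick.com", "www.enclick.com"}


def slugify(title):
    """Lowercase a title and join its words with dashes (tips 1 and 6)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def parameter_count(url):
    """Number of query-string parameters in a URL (tip 2)."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def canonical_url(url):
    """Map host and index-page variants onto one URL, the 301 target (tip 7)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = CANONICAL_HOST
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # keep the trailing slash
    return urlunsplit((parts.scheme or "http", host, path, parts.query, ""))


print(slugify("Rose Flowers") + ".html")
print(parameter_count("http://www.enclick.com/search?q=roses&page=2&id=7"))
for variant in ("http://enclick.com", "http://www.enclick.com/index.html"):
    print(variant, "->", canonical_url(variant))
```

A page whose parameter count comes back above two is a candidate for a cleaner, rewritten URL.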

It is pretty obvious that Google is making progress towards its ideal citation-model algorithm, in which only natural links and value-added references and citations count.


September 8, 2006

Always Mind your Keywords

Matt Cutts, Google's webmaster relations point man, has, unusually, given a tip on the use of keywords to improve ranking on Google:

having keywords from the post title in the url also can help search engines judge the quality of a page

including the keyword in the url just gives another chance for that keyword to match the user’s query in some way. That’s the way I’d put it.

Matt's comments must be taken in context, keeping in mind the extensive guidelines provided by Google as a warning that keyword abuse and unethical practices will lead to banning.

Nevertheless, keywords are a legitimate tool in structuring information. Communicating clearly requires a consistent vocabulary, which avoids confusion and improves style. At Enclick we are careful with keyword selection and keyword usage on web pages. A keyword analysis identifies the list of keywords that are most relevant and important to a client's site. The keyword list is taken into account in the web page copy and URLs. Keyword density measurement tools are used to keep track of whether the web page is staying relevant to the right keywords.
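As an illustration of what a keyword density measurement does, here is a minimal sketch. Definitions of density vary between tools; this one assumes density is the share of the page's words taken up by occurrences of the keyword phrase, and the function name is hypothetical rather than taken from any particular tool.

```python
import re


def keyword_density(text, keyword):
    """Share of the words in text taken up by occurrences of keyword.

    keyword may be a multi-word phrase; matching is case-insensitive.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = keyword.lower().split()
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    hits = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase
    )
    return hits * n / len(words)


copy = "Rose flowers for every occasion. Order rose flowers online today."
print(f"{keyword_density(copy, 'rose flowers'):.0%}")
```

Tracking this figure over successive edits of a page shows whether the copy is drifting away from, or being stuffed with, the keywords chosen in the analysis.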

The use of keywords is not about gaming Google, but about communicating clearly both with users and software programs like search engines.


August 31, 2006

Google's Video Broadcast of its Most Recent Update

Matt Cutts, Google's User Relations blogger, now does a video broadcast on the state of the Google nation.

It says a lot that a PageRank update, or algorithm & data push, now warrants a video broadcast of its own.

The transcript of the broadcast is:

Hey everybody, good to see you again! I thought I’d talk about datacenter updates, what to expect for the next few weeks in Google, and stuff like that this time. (...)

There is always an update going on, practically daily, if not daily. A pretty large fraction of our index is updated every day as we crawl the web.

We also have algorithms and data pushes that are going out on a less frequent basis. For example there was a data push on June 27th, July 27th, and then August 17th. And again, that’s not [recent?], that’s going on for 1.5 years. If you seem to be caught in that, you’re more likely to be reading on an SEO board. You might wanna think about ways that you could back your site off... think less about what the SEOs on the board are saying, sort of not be optimizing quite as much on your site. That’s about as much advice as I can give I’m afraid.

You can see the amount of short-term blackhat (guideline-violating) SEO work going on. Our analysis at Enclick is that Google search has taken a leap forward in quality, eliminating more spam from its results, while AdSense has vastly improved the quality of its advertisers by effectively removing spam pages from its client list.

One nice thing is we have another software infrastructure update which improves quality as the main aspect, but also improves our site crawling estimates as well. It’s just sort of like a side benefit. I know that that is out on all datacenters in the sense that it can run in some experimental modes, but it’s not fully on in every datacenter. We were shooting for the end of the summer to have that live everywhere, but again that’s a hope, not a promise. So if things need more testing they’ll work for longer to make sure everything goes smoothly, and if everything goes great, then they might roll it out faster.

The whole notion of watching datacenters is going to get harder and harder for individuals going forward. Because number 1, we have so much stuff launching in various ways. I’ve seen weekly launch meetings where there are a double digit number of things, and these are things that are under the hood. So they’re strictly quality, they’re not changing the UI or anything like that. If you’re not doing a specific search in Russian, or Chinese, you might not notice the difference. But it goes to show that we’re always rolling out different things, and at different data centers you might have slightly different data.

The other reason why it’s not as much worth watching datacenters is because there’s an entire set of IP addresses. And if you’re a super-duper gung ho SEO, you’ll know “72.2.14.whatever”. That IP address will typically go to one datacenter, but that’s not a guarantee. If that one datacenter comes out of rotation – you know, we’re gonna do something else to it, we’re gonna actually change the hardware infrastructure (and everything I’ve been talking about so far is software infrastructure) – then that IP address can point to a completely different datacenter.

So the currency, the ability to really compare changes and talk to a fellow datacenter watcher and say, “What do you see at 72.7.14.whatever?” is really pretty limited. I would definitely encourage you to spend more time worrying about the results you rank for, increasing the quality of your content, looking for high-quality people that you think should be linking to you and aren’t linking to you (and not even know about that), stuff like that. (...)

The fact of the matter is, we’re always going be working on improving our infrastructure, so you can never guarantee a ranking, or a number 1 for any given term. Because if we find out that we can improve quality by changing our algorithms or data or infrastructure, or anything else, we’re going to make that change. The best SEOs in my experience are the ones that can adapt, and that say “OK, this is the way the algorithms look right now to me, if I want to make a good site that will do well in search engines, this is the direction I want to head in next.” And if you work on those sort of skills, then you don’t have to worry as much about being up at 3am, and talk on a forum about “What does this datacenter look like to you, did it change a whole lot?” and stuff like that.

There seem to be a lot of datacenter watchers, just like weather enthusiasts. Fortunately, we undertake longer-term search engine optimisation. We focus on achieving good search positioning for long periods of time, building on the fundamentals, rather than on short-term weekly improvements. We can't imagine anybody on the team being this obsessive. [Transcript courtesy of Google Blogscoped]


August 1, 2006

What is Natural Linking? Five Tips on Google Guidelines

The success of Google's search algorithm centers on using inbound links to a website to judge its relevance and importance relative to a keyword.

Over the years, SEO companies have come up with link building techniques to push their clients' websites up Google's rankings. These techniques manipulate Google's original algorithm, and as a result many high-ranking sites are irrelevant and unimportant. But Google keeps fighting back.

Such have been Google's improvements that some declare SEO dead. Google has become skilled at detecting unnatural linking patterns: links to a website whose only purpose is to improve its Google ranking. Many SEO experts are going back to fundamentals: what is a natural link?

A Natural Link: The Referencing of Good Content by an Informative Author

Google's algorithm, as filed in Google's US Patent Application #20050071741 - Information Retrieval Based on Historical Data, is based on Garfield's Science Citation Index, which is used to judge the importance of research papers in the world of academic publishing.

Scientists write up their results in research papers. Each research paper references all the other work that has contributed to its results. Important research papers that make a big breakthrough give rise to more investigations and are referenced widely. This citing and referencing is what the Google algorithm is looking for: natural, value-added links.

So, what does natural linking look like?

1. Natural links are deeplinks. Links that point to specific material deep within the body of knowledge.

Deep linking is linking that points to a specific page or image within another website, as opposed to linking to a website's main or home page. Deeplinking goes hand in hand with the long tail.

2. Natural linking is not reciprocal. Scientific papers are published sequentially in time. More recent papers reference older papers as they try to build the body of knowledge. Link exchanging, where websites swap links with each other, is not natural.

3. Natural links are built slowly over time. An academic paper accumulates references (links) slowly over time. Only exceptional, highly regarded papers, such as the one describing the structure of DNA, accumulate a large number of references quickly.

4. Natural links make a point. Academics construct an argument around their references. So natural references are surrounded by relevant text and have specific anchor text, as each scientist tries to make his own point. The ratio of links to text has an upper limit, the density of links is relatively low, and the context surrounding the reference is relevant material.

5. Natural links come from everywhere. Scientists publish their research in many places. From important journals, like "Nature", to less important conference proceedings. So, natural links come from varied sources of relevant material.
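Some of these properties can be measured on your own link data. The sketch below is only an illustration of tips 1, 2 and 5: it reports the share of inbound links that are deep, the share that are reciprocated, and the number of distinct linking hosts. The function name, the sample hosts and the choice of metrics are our own assumptions, and nothing here claims to reproduce how Google scores links.

```python
from urllib.parse import urlsplit


def link_profile(site, links):
    """Summarise the inbound links of site (a host name).

    links: iterable of (source_url, target_url) pairs.
    """
    links = list(links)

    def host(url):
        return urlsplit(url).netloc.lower()

    inbound = [(s, t) for s, t in links if host(t) == site and host(s) != site]
    # Hosts that our own site links out to, used to spot link exchanges.
    outbound_hosts = {host(t) for s, t in links if host(s) == site and host(t) != site}

    total = len(inbound)
    deep = sum(1 for _, t in inbound if urlsplit(t).path not in ("", "/"))
    reciprocal = sum(1 for s, _ in inbound if host(s) in outbound_hosts)
    return {
        "inbound": total,
        "sources": len({host(s) for s, _ in inbound}),
        "deep_share": deep / total if total else 0.0,
        "reciprocal_share": reciprocal / total if total else 0.0,
    }


# Hypothetical link data for shop.example.com.
links = [
    ("http://blog.example.org/post1", "http://shop.example.com/roses/red.html"),
    ("http://news.example.net/a", "http://shop.example.com/"),
    ("http://partner.example.info/links", "http://shop.example.com/"),
    ("http://shop.example.com/links", "http://partner.example.info/"),
]
print(link_profile("shop.example.com", links))
```

A profile with a low deep share and a high reciprocal share looks more like a link exchange than like a body of citations.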

Search Engine Optimisation is not dead; it just got harder.


July 12, 2006

Google Battling for the Pay Per Click Long Tail

The number of keywords the average online shop bids on has risen sharply over the last few years; bidding on more than 30,000 keywords is commonplace.

[Figure: the long tail]

When your popular keywords are too expensive, go for a large number of low popularity keywords, a standard pay per click strategy known as the long tail, shown in the figure.

The cumulative traffic from low popularity keywords amounts to a lot of value for money clickthroughs, and the pay per click campaigns generally achieve high return on investment.
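To get a feel for how much traffic the tail can hold, here is a rough sketch. It assumes, purely for illustration, that keyword traffic follows a Zipf-like curve in which the keyword at rank r receives traffic proportional to 1/r; real campaigns will differ, and the function name and the head size of 100 are hypothetical.

```python
def tail_traffic_share(total_keywords, head_size):
    """Share of traffic from keywords ranked below the top head_size.

    Assumes traffic for the keyword at rank r is proportional to 1 / r.
    """
    weights = [1.0 / rank for rank in range(1, total_keywords + 1)]
    return sum(weights[head_size:]) / sum(weights)


# 30,000 keywords, treating the 100 most popular ones as the expensive head.
share = tail_traffic_share(total_keywords=30_000, head_size=100)
print(f"Tail share of traffic under the 1/r assumption: {share:.0%}")
```

Under this assumption the tail is far from negligible, which is the whole point of the strategy.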

Many pay per click management programs are available to manage this long tail of keywords. The SearchWorks Bidbuddy is one popular example. But the pay per click long tail suffers from spammers exploiting the low-cost traffic, driving up CPC prices for legitimate businesses.

In addition to online shops and merchants, the keyword long tail has also attracted spam artists practising clickthrough arbitrage. The scheme is to buy low-priced keyword traffic on AdWords and take the visitor to an AdSense-populated landing page with an expensive keyword context. The spam site makes a margin from the difference in clickthrough price, at the expense of advertisers and visitors.
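The arithmetic behind the scheme is simple, and it also shows why a higher minimum bid hurts it. The figures below are invented for the example, and the function names are hypothetical; only the structure of the calculation follows from the description above.

```python
def arbitrage_margin(cost_per_click, ad_click_rate, payout_per_click):
    """Expected profit per bought visitor.

    Revenue comes from the share of visitors who click an ad on the
    landing page; the cost is the price paid for the visit.
    """
    return ad_click_rate * payout_per_click - cost_per_click


def break_even_click_rate(cost_per_click, payout_per_click):
    """Ad click rate at which the scheme stops making money."""
    return cost_per_click / payout_per_click


# Invented figures: visits bought at 0.05, 30% of visitors click an ad
# that pays the site 0.50.
print(round(arbitrage_margin(0.05, 0.30, 0.50), 2))
# The same page after the minimum bid rises to 0.20 per click.
print(round(arbitrage_margin(0.20, 0.30, 0.50), 2))
print(break_even_click_rate(0.20, 0.50))
```

Once the cost per bought click exceeds the expected ad revenue per visitor, the margin turns negative and the incentive disappears.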

A recent move by Google has been to increase the minimum price for all keywords, in order to discourage clickthrough arbitrage merchants.

Google has just announced on its Inside AdWords blog that a new algorithm will start penalizing AdSense spam sites. Google is raising its AdWords landing page quality requirements (Inside AdWords: Landing page quality update). Spam sites will be penalized with a high minimum CPC in AdWords.

As you may recall, we began incorporating advertiser landing page quality into the Quality Score back in December 2005. Following that change, advertisers who are not providing useful landing pages to our users will have lower Quality Scores that in turn result in higher minimum bid requirements for their keywords. We realize that some minimum bids may be too high to be cost-effective -- indeed, these high minimum bids are our way of motivating advertisers to either improve their landing pages or to simply stop using AdWords for those pages, while still giving some control over which keywords to advertise on. Although it is counter-intuitive to some who hear it, we'd rather show one less ad than to show an ad which leads to a poor user experience -- since long-term user trust in AdWords is of overarching importance.

From time-to-time, we improve our algorithms for evaluating landing page quality (often based on feedback from our end-users), and next week we're launching another such improvement. Thus, over the coming days a small number of advertisers who are providing a low quality user experience on their landing pages will see increases in their minimum bids. It is important to note, however, that the vast majority of advertisers will not be affected at all by this change, as they link to quality landing pages.

If you do see an increase in minimum bids and you feel that your landing page is providing a great user experience, please contact AdWords support and we'll take a look. Also, for useful guidelines which will help to define what users look for in a high quality site, we hope you'll take a look at the landing page and site quality guidelines, from the AdWords Help Center.

Good news for the long tail pay per click strategy. Good news also for Adsense Publishers, who may now enjoy higher CPCs.

In spite of Google's Landing Page Guidelines, speculation will turn to whether AdSense adverts on your website are harming your pay per click traffic prices from Google: do you either buy Google advertising or sell advertising, but not both?


June 22, 2006

Google Answers Back: Bad Inbound Links Cause Shallow Googlebot Crawls

Matt Cutts answers some of the questions that have arisen lately:

Firstly, Google is not full. The problems of websites dropping out of the index and coming back during Google's BigDaddy upgrade were not due to a lack of machines. According to Matt, "the crawl/index team certainly has enough machines to do its job, and we definitely aren’t dropping documents because we’re “out of space.”"

The problem of pages dropping from the index was due to poor inbound link quality:

The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling. The Bigdaddy update is independent of our supplemental results, so when Bigdaddy didn’t select pages from a site, that would expose more supplemental results for a site.

Enclick's philosophy about link building is: there is no such thing as a free lunch. An easy link exchange with somebody you don't know is more likely to be a bad quality link, which might even get you de-indexed.

Your inclusion ratio also depends on the strength of your inbound linking. Google has a threshold of inbound link strength below which Googlebot just shallow crawls your site.

From Matt Cutts: Gadgets, Google, and SEO » Indexing timeline



June 16, 2006

Searching or Browsing - Ambient Findability

Peter Morville is an expert in "findability", making information findable; his book Ambient Findability is about designing user interfaces and information architecture such that anyone can find what they want quickly. Peter states that searching is interactive and iterative. Your query changes as you advance in your search, in stepwise refinements.

In particular, how do you enable people to move between searching and browsing modes? Enclick and Hispavista have been analysing the search usage patterns of millions of users, and over the years have arrived at a mix of directory and search results. The standard user has become more sophisticated in his search usage, expecting some sort of tagged or faceted search results.


June 13, 2006

Google Launches Refine Results

As expected, Google has started offering faceted or tagged search. Searching for a general term like Barcelona now brings up a refine results section alongside the results.

The refine results section allows you to drill down according to different tags, further assisting you in finding the areas you want.

Google seems to be providing the service for tourist destinations, and refining according to different activity tags. We provide the same tagged search solution for ecommerce and catalog sites, made to measure.
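For readers who have not seen tagged search from the inside, the drill-down can be sketched in a few lines. The data, tag names and function name below are invented for the example and say nothing about how Google implements its refine results: given the tags already selected, the code keeps the matching items and counts the remaining tags that can be offered as the next refinement.

```python
from collections import Counter


def facet_counts(items, selected=()):
    """Filter items by the selected tags and count the tags left to refine by.

    items: list of dicts with a "tags" list. Returns (matches, counts).
    """
    selected = set(selected)
    matches = [item for item in items if selected <= set(item["tags"])]
    counts = Counter(
        tag for item in matches for tag in set(item["tags"]) - selected
    )
    return matches, counts


# Hypothetical results for a search on "Barcelona".
results = [
    {"title": "Hostel near the Ramblas", "tags": ["lodging", "budget"]},
    {"title": "Seafront hotel", "tags": ["lodging", "luxury"]},
    {"title": "Gaudi walking tour", "tags": ["attractions", "budget"]},
]

matches, refinements = facet_counts(results, selected=["lodging"])
print([item["title"] for item in matches])
print(dict(refinements))
```

Each click on a refinement simply adds one more tag to the selection and repeats the same step.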


June 9, 2006

Endeca Granted Patent on Guided Navigation

The advantages of enhanced search facilities for finding information have only just started to be exploited. Endeca, the leader in our sector, has just been granted a patent on Guided Navigation: "Pioneering Innovation Has Played Key Role in Market's Evolution From Search to Information Access".

Guided Navigation helps people using Web or enterprise applications to find relevant information by guiding them in an interface that always keeps information in context. The context is created by dynamically exposing the dimensions, attributes and other relationships in the underlying data set. This experience helps people navigate content, analyze information sets, refine search results lists, and combine search and browse behaviors. Guided Navigation encourages exploration and discovery because people no longer have to guess the perfect search query or anticipate the specific classification schema of the underlying data

Congratulations to Endeca; credit where it is due, even though it competes with our tagged site search offering. We do have some concerns about the wording of the patent. As ever, staking a claim to an intangible idea is fraught with the risk of abuse at the expense of the rest of society.


May 8, 2006

Google is Full?

Webmasters are in a state of confusion about Google's behaviour over the last few months. Being de-indexed from Google is an unpleasant and stressful experience, but with the Florida and Jagger algorithm updates the de-indexing was justified and could be corrected. The last algorithm update has not shown this consistency. Some of our clients and websites have been de-indexed from the Google search results for no discernible reason, only to reappear a few days later.

Google has admitted, however, that the problems may be due to a shortage of machines (Full-up Google choking on web spam? | The Register).

More worrying, the machines are apparently full of spam and robot generated blog pages.


