
Tuesday, July 24, 2007

Don't believe everything you read



One thing we've all learned from the Internet is that just because you see something in "print," it doesn't mean that's the whole story, or that it's entirely factual. You always need to check the source and make sure it's trustworthy. I was reminded of that point recently when I received a white paper published by Autonomy, one of our enterprise search competitors.

The topic of the paper was, oddly enough, Google. Let me first note that I was surprised that the Autonomy marketing folks took the time to write a whole paper (nicely formatted and all) on our enterprise search efforts.

That notwithstanding, the more I read, the more concerned I became. The paper would lead a customer or prospect to believe a number of things about Google that are simply not true. It contains inaccuracies about our enterprise ranking algorithms and downright fabrications about our security and access control capabilities. The text is an amalgamation of hearsay and speculation that attempts to push customers away from Google and toward Autonomy's competing product.

I decided the best course of action was both to set the record straight and to remind everyone of a key lesson. So for the record, let me call out some specific points:

1. Relevancy: The paper states that Google "relies on rich linking technology that was built for the Web to determine relevancy." This is false, and it's misleading. Google's enterprise search algorithms rely on hundreds of factors, only one of which is PageRank, to determine the most relevant content within an enterprise. We leverage the work of the largest engineering team focused on search and information retrieval in the world to solve this complex search problem.

2. Reach / Aggregation: Autonomy states in their paper that "Non secure web servers can be indexed out of the box but, integrating information from databases, file systems and content management applications into Google is considerably more complex -- and in some cases impossible." Google's appliance can natively reach into all content stores in an enterprise, including web servers, file servers, databases, document management systems, and business applications. All of this is offered as out of the box (or, ironically enough in the case of the appliance, "in the box") functionality. You can take a Google Search Appliance or Google Mini from its cardboard box to serving content from file systems and databases in less than 30 minutes. What's the setup time for other enterprise search systems?

3. Language Support: The paper reports that Google's search is "language dependent technology that currently only supports 28 languages." It is true that we have a feature that auto-detects 28 languages: if your query is in one of those 28, we'll offer you results in that language and, of course, offer you all results as well. This is a popular end-user feature on Google.com. However, our indexing and search are by no means restricted to those 28 languages.

4. Stemming: Autonomy states that "Google does not provide advanced language support such as stemming." This one is just wrong. A while back we added a query expansion feature which performs the same function as stemming, but does it smarter. Anybody can take "park" and make it "parks" -- but in a lot of cases, we've seen that unintelligent stemming actually makes results worse. Drawing on the intelligence derived from billions of queries, we know that a good solution will detect context, and expand a query like "city park" to also include "public park" but not "city parking." So, whether you want to call what the appliance does "smart stemming" or "Context Sensitive Query Expansion" (the latter being what our marketing team chose), it's a core feature of our product.

5. Security: In perhaps the most egregious statement in the whole document, the paper states that "Google provides open access to most documents -- a potential hazard for businesses needing to keep proprietary information under wraps." From the beginning, we have provided fast, accurate, and SECURE search within the enterprise. Our document-level security and access control capabilities ensure that users only see the content they are allowed to see, without requiring customers to deploy a new security system or undergo complex integrations. Google's appliances are used in the most secure environments including Fortune 500 and Global 1000 companies as well as numerous government agencies.
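To make the stemming point (number 4 above) a bit more concrete, here is a toy sketch of the difference between blind suffix stemming and context-aware expansion. It is purely illustrative: the function names and the tiny lookup table are invented for this example and say nothing about how the appliance is actually implemented.

```python
# Toy illustration only. The names and the lookup table are hypothetical
# and do not describe any real product's implementation.

def naive_stem_expand(query):
    """Blindly add common suffixes to every term, ignoring context."""
    variants = set()
    for term in query.split():
        variants.add(term)
        variants.add(term + "s")
        variants.add(term + "ing")
    return variants

# Hypothetical table of whole-query rewrites that are known to fit the
# query's context. A real system would learn these from query data.
CONTEXT_EXPANSIONS = {
    "city park": ["public park"],
}

def context_expand(query):
    """Return the query plus only those rewrites that fit its context."""
    return [query] + CONTEXT_EXPANSIONS.get(query, [])

if __name__ == "__main__":
    print(sorted(naive_stem_expand("city park")))
    print(context_expand("city park"))
```

The naive version happily produces "parking" from "park," which is exactly the kind of expansion that drags in irrelevant results. The context-aware version only adds rewrites that make sense for the query as a whole.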

That's it for setting the record straight. I have by no means covered every point, but I think you get the picture. We have been working for more than 5 years with a team dedicated specifically to solving the enterprise search problem, and we hold a market leadership position with over 9,000 enterprise search customers. We leverage the work and innovation of the world's largest search company, and deliver that consumer-powered innovation to the enterprise. But you don't have to take my word for it: feel free to talk to any one of our thousands of delighted customers.

And about that lesson: just because something is printed and looks official doesn't mean it's accurate.
