Microsoft SharePoint Server 2010 is just around the corner now and it will once again raise the bar in the SharePoint Enterprise Search space. I witnessed the new SP2010 search experience for the first time at the SharePoint Conference 2009 in Vegas last year and was overall quite pleased with what I saw there. It looks like the search team in Redmond has really listened to the community and their customers and addressed many of the annoying pain points present in the SharePoint Server 2007 search experience.
However, working with SP2010 Beta 2 for a while now has revealed some pain points / annoyances left behind in the product. Some of them I can understand from a technical standpoint while others just make me wonder how they could miss it again. Don’t get me wrong here – I’m still a big fan of SharePoint Search and just want this part of the product to be near perfect. Seriously, Enterprise search is an ever more vital part of most SharePoint deployments.
Anyway, I am dedicating this post to sharing the good news and the bad news I have learned so far by working with SharePoint Server 2010 Search. Please note that the FAST Search Server for SharePoint 2010 is not included in my evaluation here – it is technically and financially a whole other ball game.
The Good News
- Improved relevance. More parameters included in score calculation. One cool new parameter is click-through rate on search results also known as popularity ranking. Other parameters include URL fuzzy matching, social tags, inferred metadata, detected language and implicit phrase matching.
- Enhanced query syntax. Enables power users to build advanced queries using Boolean operators like AND, OR, NOT. Also adds support for the range operators <, >, <=, and >= for searching numeric ranges or date ranges.
- Wildcard search. Now possible to search for partial words using the wildcard character *. For example, search for: Micro* author:bill*
- Enhanced multi-lingual support. Improved language detection from document text, better word breaker in more languages for better handling of compound words.
- Phonetic and nickname search. Useful in people search to match similar names with different spelling. E.g. a search for Chris also returns people named Kris or Christopher. I really like this new feature as it makes people search much more precise and useful.
- Faceted search, aka Refiners. Presents users with a list of relevant suggestions for refining the search results by document type, site, author, modified date, tags or any other managed property available in the index. Refiners, as Microsoft likes to call them, offer a very simple and intuitive way to filter results by metadata.
- Query suggestions. Presents a list of relevant search terms as you type – this is known as pre-query suggestions. There are also post-query suggestions, which are shown in a Web part listing related queries. The suggestions are based on past queries from other users.
- Improved "Did you mean?" suggestions. Support for more languages.
- View in browser. Link in the search results for viewing Office documents with full fidelity directly in the browser. Requires Office Web Applications 2010 to be installed on the server. This feature is very useful to users who do not have the Office clients installed or do not want to always download the entire document.
- Open Web Parts. The OOB search Web parts are no longer sealed! Consequently, it will be much easier for developers to build their own custom search Web parts simply by extending the built-in ones. With SharePoint 2007 you would have to build your own from scratch, which is a very daunting task.
- New Connector Framework. This is the next evolutionary step of the Business Data Catalog introduced in SharePoint Server 2007. Indexing external content has become a lot easier thanks to much better tool support in the form of the new SharePoint Designer 2010. Hooking up SharePoint to index content from a database is now a no-brainer: point SPD to your database to automatically reverse engineer a BDC model, then deploy that model to the indexer via the admin UI. To index more complex and dynamic repositories, developers can now also build custom connectors in managed code (the old C++ Protocol Handler API is still supported). Other great improvements over the old BDC are the ability to index document attachments and item security (ACLs).
- Improved admin dashboard. Offers a few improvements to the search admin dashboard introduced in SharePoint 2007 with the infrastructure update.
- New health analysis tool. Can generate reports useful for performance monitoring, capacity planning and troubleshooting.
- PowerShell scripting. Enables administrators to automate virtually all search administration tasks by using Windows PowerShell 2.0 scripts.
- New and improved deployment architecture. The search system has been componentized a lot more for improved performance, scalability and availability. With enough servers, SharePoint Search now scales to about 100 million documents while maintaining fresh indexes and sub-second query latency. The most welcome improvement is without doubt support for multiple stateless crawlers (aka indexers) on the same content source. Another biggie is support for partial indexes, i.e. support for splitting a large index across multiple query servers.
- Better support for indexing case sensitive repositories.
- Improved Search Analytics. SP2010 more or less includes the same type of search analytics reports as we know from SharePoint Server 2007. But they have received some nice improvements like nicer graphics and the ability to view data in any date range. Also, it is now possible to create custom reports thanks to a new and documented Data Warehouse.
- Desktop search integration in Windows 7.
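To make the enhanced query syntax and wildcard search from the list above a bit more concrete, here is a minimal Python sketch that only composes query strings of the kind a user could type into the search box. It does not call any SharePoint API. The helper names (`prop`, `all_of`, `any_of`, `exclude`) and the `size` property are my own assumptions for illustration, and the exact grammar should be checked against the official query syntax documentation before relying on it.

```python
# Hypothetical helpers that build keyword query strings; nothing here talks to SharePoint.
RANGE_OPERATORS = {"<", ">", "<=", ">="}


def prop(name, value, op=":"):
    """Build a property restriction such as author:bill* or size>=100000."""
    if op != ":" and op not in RANGE_OPERATORS:
        raise ValueError("unsupported operator: " + op)
    text = str(value)
    # Quote values that contain spaces so they stay a single token.
    if " " in text:
        text = '"' + text + '"'
    return name + op + text


def all_of(*parts):
    """Join sub-queries with the Boolean AND operator."""
    return "(" + " AND ".join(parts) + ")"


def any_of(*parts):
    """Join sub-queries with the Boolean OR operator."""
    return "(" + " OR ".join(parts) + ")"


def exclude(wanted, unwanted):
    """Keep results for `wanted` while excluding those matching `unwanted`."""
    return wanted + " NOT " + unwanted


# The wildcard example from the list above.
wildcard_query = " ".join(["Micro*", prop("author", "bill*")])

# A Boolean query combined with a numeric range; `size` is an assumed property name.
range_query = all_of(any_of("budget", "forecast"), prop("size", 100000, ">="))

# A simple exclusion.
not_query = exclude("search", prop("author", "bill*"))

print(wildcard_query)  # Micro* author:bill*
print(range_query)     # ((budget OR forecast) AND size>=100000)
print(not_query)       # search NOT author:bill*
```

Small helpers like these are also roughly what a custom advanced search Web part would need in order to turn form fields into a query string.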
The Bad News
- The same old advanced search Web part. Looks like it was brought over from SharePoint 2007 as is - it still does not offer a good parametric search experience with property value drop-downs and the like. Users will still need to know and type possible metadata values. However, with the introduction of faceted search this shortcoming is not as severe as it was for SharePoint Server 2007 when it came out. Furthermore, the enhanced query syntax will also make it a hell of a lot easier for developers to create their own advanced search Web part to assist users in constructing complex queries.
- Sub-optimal navigation experience. If search is great, users will adopt it as a good navigation tool. But in a SP2010 search center there is no way of navigating from a document search result to the document library where the document lives. Also, it can be hard for users to navigate back from the search center to the site they initiated the search from. These issues can be fixed with a little customization, but it would have been nice to see a better user experience out-of-the-box.
- Inconsistent search UI. The problem I am referring to here is that searching the "This site" search scope in the search box does not take the user to a search center; instead it takes her to the SP2010 Foundation search page in the _layouts folder. All other search scopes take her to the full search center. In other words you still have two different search interfaces in SharePoint. My recommendation is to turn off the “This site” scope, since results can be refined by Site in the search center anyway.
- No Visual Best Bets. I have heard Microsoft presenters get all excited about the Visual Best Bet feature available with FAST search. But it is really nothing special – just a Best Bet with an image! Seriously, this feature should also be available with the built-in search engine. In other words, the Best Bets feature does not seem to have received any improvements from SharePoint 2007 whatsoever.
- Changes to managed properties still require a full crawl. Managed metadata properties are still there and managed exactly the same way as in SharePoint 2007. This is also good - but there is still one major annoyance that I had hoped MS would find a solution for. Adding a new managed property or changing an existing one requires a new full crawl. This should not be necessary as the metadata is already indexed and searchable via a crawled property.
- No push based indexing. It is not possible to notify the search index about immediate/important changes to content. The index will still have to wait for the crawler to stop by and pick up the changes for the index.
- Incomplete indexing of system metadata. The indexer does not pick up all system metadata on documents. Forget about finding crawled properties for document information like CheckOutStatus, CheckedOutBy and CheckedOutDate. Then there is the ContentTypeId, which is indexed – but it seems to only happen for Office documents in the new 2007 format. A properly indexed ContentTypeId would make hierarchical searching on content types possible.
- No document preview with hit-highlighting. This feature is unfortunately only available for FAST Search and here it does not even look very convincing except for PowerPoint documents.
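A short note on why a properly indexed ContentTypeId would matter: SharePoint builds a content type ID by appending to the ID of its parent, so every descendant's ID starts with its ancestor's ID (Item is 0x01, Document is 0x0101, Picture is 0x010102). A prefix match on the ID is therefore enough to find a content type together with everything derived from it. The sketch below illustrates only that prefix idea in plain Python. The function names and the sample documents are hypothetical, and it does not call any SharePoint API.

```python
def descends_from(content_type_id, ancestor_id):
    """True if content_type_id is ancestor_id itself or derives from it."""
    return content_type_id.lower().startswith(ancestor_id.lower())


def filter_by_content_type(documents, ancestor_id):
    """Return the names of documents whose content type derives from ancestor_id."""
    return [name for name, ct_id in documents if descends_from(ct_id, ancestor_id)]


# Hypothetical sample data: (file name, content type ID).
documents = [
    ("budget.docx", "0x0101"),    # Document
    ("logo.png", "0x010102"),     # Picture, derives from Document
    ("Projects", "0x0120"),       # Folder, derives from Item but not from Document
]

print(filter_by_content_type(documents, "0x0101"))  # ['budget.docx', 'logo.png']
print(filter_by_content_type(documents, "0x01"))    # ['budget.docx', 'logo.png', 'Projects']
```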
Fortunately the list of bad news is shorter than it was for SharePoint Server 2007 back then. But let us see if we identify more good/bad news as we get to learn SharePoint 2010 Search better. I would love to hear from you out there – have you found other good or bad news on the topic?