SharePoint 2010 Search: The Good News and the Bad News

29. January 2010

Microsoft SharePoint Server 2010 is just around the corner now and it will once again raise the bar in the SharePoint Enterprise Search space. I witnessed the new SP2010 search experience for the first time at the SharePoint Conference 2009 in Vegas last year and was overall quite pleased with what I saw there. It looks like the search team in Redmond has really listened to the community and their customers and addressed many of the annoying pain points present in the SharePoint Server 2007 search experience.

However, working with SP2010 Beta 2 for a while now has revealed some pain points and annoyances left behind in the product. Some of them I can understand from a technical standpoint, while others just make me wonder how they could miss it again. Don’t get me wrong here – I’m still a big fan of SharePoint Search and just want this part of the product to be near perfect. Seriously, Enterprise search is an ever more vital part of most SharePoint deployments.

Anyway, I am dedicating this post to sharing the good news and the bad news I have learned so far by working with SharePoint Server 2010 Search. Please note that the FAST Search Server for SharePoint 2010 is not included in my evaluation here – it is technically and financially a whole other ball game.

The Good News

  • Improved relevance. More parameters included in score calculation. One cool new parameter is click-through rate on search results also known as popularity ranking. Other parameters include URL fuzzy matching, social tags, inferred metadata, detected language and implicit phrase matching.
  • Enhanced query syntax. Enables power users to build advanced queries using Boolean operators like AND, OR, NOT. Also support for the range operators <, >, <=, and >= for searching numeric ranges or date ranges.
  • Wildcard search. Now possible to search for partial words using the wildcard character *. For example search for: Micro* author:bill*
  • Enhanced multi-lingual support. Improved language detection from document text, better word breaker in more languages for better handling of compound words.
  • Phonetic and nickname search. Useful in people search to match similar names with different spelling. E.g. a search for Chris also returns people named Kris or Christopher. I really like this new feature as it makes people search much more precise and useful.
  • Faceted search aka. Refiners. Presents users with a list of relevant suggestions for refining the search results by document type, site, author, modified date, tags or any other managed property available in the index. The refiners, as Microsoft likes to call them, offer a very simple and intuitive way to filter results by metadata.
  • Query suggestions. Presents a list of relevant search terms as you type – this is known as pre-query suggestions. There are also post-query suggestions, which is just a Web part listing related queries. The suggestions are based on past queries from other users.
  • Improved "Did you mean?" suggestions. Support for more languages.
  • View in browser. Link in the search results for viewing office documents with full fidelity directly in the browser. Requires Office Web Applications 2010 to be installed on the server. This feature is very useful to users who do not have the Office clients installed or do not want to always download the entire document.
  • Open Web Parts. The OOB search Web parts are no longer sealed! Consequently, it will be much easier for developers to build their own custom search Web parts simply by extending the built-in ones. With SharePoint 2007 you would have to build your own from scratch, which is a very daunting task.
  • New Connector Framework. This is the next evolutionary step of the Business Data Catalog introduced in SharePoint Server 2007. Indexing external content has become a lot easier thanks to much better tool support in the form of the new SharePoint Designer 2010. Hooking up SharePoint to index content from a database is now a no-brainer; point SPD to your database to automatically reverse engineer a BDC model, then deploy that model to the indexer via the admin UI. To index more complex and dynamic repositories, developers can now also build custom connectors in managed code (The old C++ Protocol Handler API is still supported). Other great improvements over the old BDC are the ability to index document attachments and item security (ACLs).
  • Improved admin dashboard. Offers a few improvements to the search admin dashboard introduced in SharePoint 2007 with the infrastructure update.
  • New health analysis tool. Can generate reports useful for performance monitoring, capacity planning and troubleshooting.
  • PowerShell scripting. Enables administrators to automate virtually all search administration tasks by using Windows PowerShell 2.0 scripts.
  • New and improved deployment architecture. The search system has been componentized a lot more for improved performance, scalability and availability. With enough servers, SharePoint Search now scales to about 100 million documents while maintaining fresh indexes and sub-second query latency. The most welcome improvement is without doubt support for multiple stateless crawlers (aka. indexers) on the same content source. Another biggie is support for partial indexes, i.e. support for splitting a large index across multiple query servers.
  • Better support for indexing case sensitive repositories.
  • Improved Search Analytics. SP2010 more or less includes the same type of search analytics reports as we know from SharePoint Server 2007. But they have received some nice improvements like nicer graphics and the ability to view data in any date range. Also, it is now possible to create custom reports thanks to a new and documented Data Warehouse.
  • Desktop search integration in Windows 7.
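
To make the enhanced query syntax and wildcard search from the list above a little more concrete, here is a minimal Python sketch that assembles such query strings. The helper names (prop, all_of, any_of, exclude) are hypothetical and not part of any SharePoint API, the size property is only an illustration, and the use of parentheses for grouping is an assumption on my part. The code only builds text; it does not talk to a search server.

```python
# Hypothetical helpers for composing keyword queries as plain strings.
# Nothing here calls SharePoint; names and property choices are assumptions.

ALLOWED_OPERATORS = {":", "=", "<", ">", "<=", ">="}


def prop(name, op, value):
    """Build a property restriction such as author:bill* or size>=1000."""
    if op not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    return f"{name}{op}{value}"


def all_of(*terms):
    """Join terms with the Boolean AND operator."""
    return " AND ".join(terms)


def any_of(*terms):
    """Join terms with OR, wrapped in parentheses so it can nest inside AND."""
    return "(" + " OR ".join(terms) + ")"


def exclude(term):
    """Negate a term with the Boolean NOT operator."""
    return f"NOT {term}"


# The wildcard example from the list, extended with a numeric range.
query = all_of("Micro*", prop("author", ":", "bill*"), prop("size", ">=", 1000))

# A Boolean combination: either keyword, but nothing written by a given author.
combined = all_of(any_of("budget", "forecast"), exclude(prop("author", ":", "bill*")))

print(query)
print(combined)
```

A small builder like this is roughly what a custom advanced search Web part would do behind its form fields: collect the user's choices and emit one query string.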

The Bad News

  • The same old advanced search Web part. Looks like it was brought over from SharePoint 2007 as is - it still does not offer a good parametric search experience with property value drop-downs and the like. Users will still need to know and type possible metadata values. However, with the introduction of faceted search this shortcoming is not as severe as it was for SharePoint Server 2007 when it came out. Furthermore, the enhanced query syntax will also make it a hell of a lot easier for developers to create their own advanced search Web part to assist users in constructing complex queries.
  • Sub-optimal navigation experience. If search is great, users will adopt it as a good navigation tool. But in a SP2010 search center there is no way of navigating from a document search result to the document library where the document lives. Also, it can be hard for users to navigate back from the search center to the site they initiated the search from. However, these issues can be fixed with a little customization. But it would have been nice to see a greater user experience out-of-the-box.
  • Inconsistent search UI. The problem I am referring to here is that searching the "This site" search scope in the search box does not take the user to a search center; instead it takes her to the SP2010 Foundation search page in the _layouts folder. All other search scopes take her to the full search center. In other words you still have two different search interfaces in SharePoint. My recommendation will be to turn off the “This site” scope as results can anyway be refined by Site in the search center.
  • No Visual Best Bets. I have heard Microsoft presenters get all excited about the Visual Best Bet feature available with FAST search. But it is really nothing special – just a Best Bet with an image! Seriously, this feature should also be available with the built-in search engine. In other words, the Best Bets feature does not seem to have received any improvements from SharePoint 2007 whatsoever.
  • Changes to managed properties still require a full crawl. Managed metadata properties are still there and managed exactly the same way as in SharePoint 2007. This is also good - but there is still one major annoyance that I had hoped MS would find a solution for. Adding a new managed property or changing an existing one requires a new full crawl. This should not be necessary as the metadata is already indexed and searchable via a crawled property.
  • No push based indexing. It is not possible to notify the search index about immediate/important changes to content. The index will still have to wait for the crawler to stop by and pick up the changes for the index.
  • Incomplete indexing of system metadata. The indexer does not pick up all system metadata on documents. Forget about finding crawled properties for document information like CheckOutStatus, CheckedOutBy and CheckedOutDate. Then there is the ContentTypeId, which is indexed – but it seems to only happen for Office documents in the new 2007 format. A properly indexed ContentTypeId would make hierarchical searching on content types possible.
  • No document preview with hit-highlighting. This feature is unfortunately only available for FAST Search and here it does not even look very convincing except for PowerPoint documents.
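
The remark above about hierarchical searching on content types rests on how content type IDs are built: a child content type's ID starts with its parent's ID, so for example the built-in Document type (0x0101) derives from Item (0x01). Assuming ContentTypeId were available on every search result, a hierarchical filter would boil down to a prefix match. The Python sketch below runs over made-up result rows; the titles and the long custom ID are hypothetical, and this is not how the search engine itself works.

```python
# Built-in content type IDs; children extend the parent's ID.
ITEM = "0x01"
DOCUMENT = "0x0101"
FOLDER = "0x0120"

# Hypothetical custom content type derived from Document:
# parent ID + "00" + a made-up 32-digit hex GUID.
CONTRACT = DOCUMENT + "00" + "5F3C2A1B4D6E4F708192A3B4C5D6E7F8"


def is_descendant(content_type_id, ancestor_id):
    """True if content_type_id is ancestor_id itself or derives from it."""
    return content_type_id.lower().startswith(ancestor_id.lower())


def filter_by_content_type(results, ancestor_id):
    """Keep only results whose content type derives from ancestor_id."""
    return [r for r in results if is_descendant(r["ContentTypeId"], ancestor_id)]


# Made-up search results, as if ContentTypeId had been indexed for all of them.
results = [
    {"title": "Budget.xlsx", "ContentTypeId": DOCUMENT},
    {"title": "Contract.docx", "ContentTypeId": CONTRACT},
    {"title": "Projects", "ContentTypeId": FOLDER},
    {"title": "Announcement", "ContentTypeId": ITEM},
]

documents = [r["title"] for r in filter_by_content_type(results, DOCUMENT)]
print(documents)
```

With a fully populated ContentTypeId, one such filter would return all documents of a type and of every type derived from it.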

Fortunately the list of bad news is shorter than it was for SharePoint Server 2007 back then. But let us see if we identify more good/bad news as we get to learn SharePoint 2010 Search better. I would love to hear from you out there – have you found other good or bad news on the topic?

SharePoint Search

In Vegas and Ready for the SharePoint Conference 2009

19. October 2009

The time that I and many other SharePointers have anxiously been looking forward to has come: the SharePoint Conference 2009 in Las Vegas, where we will learn about all the new goodies SharePoint 2010 has to offer. I am here together with about 230 fellow Danes! The conference will altogether accommodate about 7000 attendees!! This is a very impressive figure that shows just how much momentum SharePoint has gained.

This morning (after a tasty but unhealthy breakfast buffet at the Luxor) I went to register myself for the conference and to get the SWAG. I have to admit that I was a little disappointed that it did not include a new cool SharePoint 2010 bag and a t-shirt with the new SharePoint logo on it. The best item in the SWAG is a 250 page book with technical details about all the improvements in SharePoint 2010. I have studied it and will in a moment share some of its content here.

There is a lot of new stuff in SharePoint 2010 – too much for me to cover it all here. Hence I will stick to my favorite topic here, namely SharePoint Search. My plan is to try and cover the news on Search starting today with the stuff I have just learned from the book. Then the next four days offer a good deal of sessions on SharePoint Search and I will publish technical details from these on a daily basis. The conference offers about 230 different sessions across all SP2010 topics, of which 13 are specific to Enterprise Search.

New and Improved Features in SharePoint Server 2010 Search

Here is the news I have got so far from studying the book handed out to all conference attendees.

  • Improved user experience.
  • Improved relevance ranking.
  • Faceted search. The search user experience now includes this concept, which in MOSS 2007 was available via the free faceted search Web part on CodePlex.
  • Improved people search with support for Wildcard search and phonetic name search. Faceted search is also available here.
  • Improved Search based on Social Behavior. Support for social tagging and rating of content, which will influence relevance. Another cool thing is that SharePoint tracks the Click-through rate on results to detect popular results. This information is in turn used to adapt and improve the relevance score from user behavior.
  • Re-architected Search Architecture. SP2010 includes a new Query architecture and a new Crawling architecture supporting greater redundancy and improvements to scaling up and out. The biggest change is the introduction of index partitions allowing a huge index to spread across multiple query servers. Hence, SP2010 now scales to about 100 million documents compared to 50 million in SharePoint 2007.
  • Improved development experience. Improved APIs and tools allowing developers to extend search and build applications on it.
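
The idea behind index partitions in the list above, spreading one large index across several query servers, can be illustrated with a toy Python sketch. It is an assumption-heavy model, not SharePoint's implementation: documents are assigned to partitions by a CRC32 hash of a made-up document ID, each partition is searched on its own, and the partial results are merged by a naive term-count score. All function names and sample documents are hypothetical.

```python
import zlib


def partition_for(doc_id, partition_count):
    """Pick a partition with a stable hash of the document ID."""
    return zlib.crc32(doc_id.encode("utf-8")) % partition_count


def build_partitions(docs, partition_count):
    """Split a {doc_id: text} mapping into partition_count smaller indexes."""
    partitions = [dict() for _ in range(partition_count)]
    for doc_id, text in docs.items():
        partitions[partition_for(doc_id, partition_count)][doc_id] = text
    return partitions


def search_partition(partition, term):
    """Return (score, doc_id) pairs for one partition; score = term count."""
    term = term.lower()
    hits = []
    for doc_id, text in partition.items():
        count = text.lower().split().count(term)
        if count:
            hits.append((count, doc_id))
    return hits


def search(partitions, term, top_n=10):
    """Query every partition, then merge: highest score first, ties by ID."""
    all_hits = []
    for partition in partitions:
        all_hits.extend(search_partition(partition, term))
    ranked = sorted(all_hits, key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in ranked[:top_n]]


docs = {
    "doc-1": "search search relevance",
    "doc-2": "crawl schedule",
    "doc-3": "enterprise search",
}
partitions = build_partitions(docs, 3)
print(search(partitions, "search"))
```

In this model each partition holds only part of the corpus, so adding query servers shrinks the slice each one has to search, while the merge step keeps the combined result list the same as for a single index.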

This list is by no means the complete list of improvements to SP2010 search – we will for sure learn more during the next four days. Also, please note that the news above applies to the standard search engine that ships with SharePoint 2010 – it does not apply to the FAST Enterprise Search engine. The book does not say much about FAST other than that there will be a new product named FAST Search for SharePoint. It will provide a more conversational and visual user experience including document previews and a more advanced faceted search component capable of delivering deep refinements with exact counts.

Sessions that I Plan to Attend

  • Monday 19/10: Enterprise Search Overview
  • Tuesday 20/10: SharePoint Server 2010 Search: Capabilities Deep Dive, FAST Search for SharePoint: Capabilities Deep Dive, Deploying FAST Search for SharePoint, Search Relevance and Relevance Tuning
  • Wednesday 21/10: A Tour of Great Enterprise Search Applications, Overview of Content Acquisition for Search in SharePoint 2010, Solving Information Chaos: Advanced Content Processing with FAST Search for SharePoint, Social Search in SharePoint 2010
  • Thursday 22/10: Customizing Search in SharePoint: Building Great Sites with Search

SharePoint 2010

Microsoft SharePoint 2010 news from TechEd US 2009

12. May 2009

The first day of TechEd 2009 in Los Angeles is wrapped up and there have been many great sessions here on everything but SharePoint. But I also came here to learn about other MS technologies and to network. However, being a SharePoint guy I still had my hopes up for learning just a little more about what the SharePoint team in Redmond has cooking for the next version. Having to wait until the big SharePoint conference in October is a long wait for me.

Hence I went to the OFC202 session (MOSS 2007: Overview and Roadmap) presented by General Manager Thomas Rizzo. Unfortunately the session was more Overview than Roadmap, making it quite boring with too much old stuff. But Thomas was not in a position to reveal many details about the roadmap for SharePoint 2010. Nevertheless, I made a note of the few things he could say and will share this in a moment along with a few more news items I picked up elsewhere.

A very funny thing happened in another, later non-SharePoint session that I attended. The presenters here actually used the new SharePoint 2010 in a few parts of their demo, giving us a quick glimpse of the new UI in SharePoint 2010. Let it remain unsaid which session this was - I'm sure they might otherwise get shot by someone in the SharePoint team!

With all this fresh in mind, let me share with you what I do know about the next SharePoint:

Release Information

  • The official name is Microsoft SharePoint Server 2010. Yes, Office has been removed from the name as it must only refer to the Office client apps. But there is no official acronym for it yet as MSS is already taken by the Microsoft Search Server product.
  • RTM in H1 2010. Pretty vague commitment if you ask me - but this is actually what Thomas had on the slides today.
  • CTP in July 2009. This is good news and I will be looking forward to seeing some bits.

System Requirements

  • SharePoint 2010 will be a 64-bit only release. I think this is a sound decision as 32-bit makes little sense anymore.
  • SQL Server 2005/2008 64-bit. SQL Server 2000 and SQL Server 2005 32-bit are not supported.
  • Internet Explorer 7+ , IE6 not supported.

General News

  • Revamped UI with a Web enabled Ribbon control to align it more with the Office clients.
  • More use of Silverlight controls.
  • Support for mapping lists to their own database tables, offering better performance and scalability on large SharePoint lists. This is actually no big secret as this was already announced at the SharePoint 2008 conference.
  • CMIS support to enable SharePoint to interact with other CMS systems and vice versa.
  • Painless upgrade. The SharePoint team is investing a lot to deliver a smooth upgrade experience. But it also seems the architecture does not fundamentally change as it did from SPS 2003 to MOSS 2007. However, I need to see this for myself before believing it.

Search News

  • Faceted Search. Thomas did not say anything about this - but you can be sure this will be included OOTB in SP2010.
  • FAST Search for SharePoint. A new version of FAST Search for SharePoint at a lower cost.
  • The SharePoint team have scrapped their efforts to make the SharePoint search engine scale beyond 50 million documents in a single index. The argument will be to move to the FAST search engine instead.

News on Development Tools

  • Visual Studio 2010 will ship with comprehensive support for developing Web parts, features, solutions, content types, etc.
  • VS 2010 will among other things ship with a visual Web part designer.
  • You can now build, deploy and debug SharePoint applications directly from VS. Just hit F5 to deploy and start debugging your Web Part - now that is cool!
  • New server explorer inside Visual Studio that lets you explore sites, lists, documents and other SharePoint objects.

SharePoint