Repost: The secrets of SQL Syntax Queries for Relevant Results in MOSS 2007 Enterprise Search

31. March 2009

This is a repost of an article on my old Mondosoft blog, which is not live anymore. For some time I thought the article had been lost forever as I did not have a local copy of it. But fortunately some other bloggers had copied the entire content to their own blog, which enabled me to recover it and now post it again.

This post is a tutorial on using the Microsoft.Office.Server.Search.Query.FullTextSqlQuery class for searching with the MOSS 2007 search engine with the Enterprise Search SQL syntax. I will at the end of this posting explain how to execute queries for optimal results rankings. But let me first establish the basics of using the FullTextSqlQuery class. You will only need to employ it for one of the following reasons:

  • You need to support Wildcard searches.
  • You need to search with date ranges.
  • You need to search properties with different operators (CONTAINS, =, >=, <=, <, >, LIKE).
  • You need to search properties for NULL values.
  • You need to use the NEAR operator.
  • You need nested Boolean queries.
  • You like the most flexible (or complicated ;-) ) solution.

 

Simple keyword and property queries are easier done with the Microsoft.Office.Server.Search.Query.KeywordQuery class. I will, however, not cover this class here. Ok, let us get started. I have outlined the format of the SQL Syntax accepted by the MOSS 2007 search engine below. After that I will show how to talk to the search engine via the official .NET API.

 

SQL Syntax

A SQL query is a string that must adhere to the following structure:

SELECT <columns>
FROM <content source>
WHERE <conditions>
ORDER BY <columns>

 

SQL Syntax Examples

Finds relevant results containing the keyword SharePoint.

SELECT WorkId,Path,Title,Write,Author,HitHighlightedSummary,
HitHighlightedProperties,CollapsingStatus
FROM Scope()
WHERE FREETEXT(defaultproperties, 'SharePoint')
ORDER BY Rank Desc

Finds relevant results containing at least one of the keywords SharePoint and Search.

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, 'SharePoint Search')
ORDER BY Rank Desc

Finds relevant results containing both the keywords SharePoint and Search.

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, '+SharePoint +Search')
ORDER BY Rank Desc

Finds relevant results containing the exact phrase SharePoint Search.

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, ' "SharePoint Search" ')
ORDER BY Rank Desc

Finds relevant results containing both the keywords SharePoint and Search but not the keyword WSS

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, '+SharePoint +Search -WSS')
ORDER BY Rank Desc

Finds relevant SharePoint results authored by persons named John.

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, 'SharePoint') AND CONTAINS(Author,' "John" ')
ORDER BY Rank Desc

Finds relevant SharePoint results modified within the last 30 days.

SELECT WorkId,Path,Title,Write,Author,...
FROM Scope()
WHERE FREETEXT(defaultproperties, 'SharePoint')
AND Write<=DATEADD(DAY,30,GETGMTDATE())
ORDER BY Rank Desc

 

.NET API

The following code should give you some inspiration how to execute a SQL query against the MOSS 2007 search engine.

// Execute Query
ResultTableCollection results = null;
using (FullTextSqlQuery query = new FullTextSqlQuery(ServerContext.Current))
{
query.StartRow = 0;
query.RowLimit = 10;
query.HighlightedSentenceCount = 3;
query.EnableStemming = true;
query.TrimDuplicates = true;
query.Culture = CultureInfo.CurrentCulture;

query.KeywordInclusion = KeywordInclusion.AnyKeyword;
query.SiteContext = new Uri("http://yourserver/sites/asite");
if (SPSecurity.AuthenticationMode != AuthenticationMode.Windows)
query.AuthenticationType = QueryAuthenticationType.PluggableAuthenticatedQuery;
else
    query.AuthenticationType = QueryAuthenticationType.NtAuthenticatedQuery;
query.QueryText = "SELECT ... FROM Scope() WHERE ... ORDER BY ...";
results = query.Execute();
}


// Parse results and create XML output
StringBuilder buffer = new StringBuilder(10240);
ResultTable relevantResults = results[ResultType.RelevantResults];
using (XmlTextWriter writer = new XmlTextWriter(new StringWriter(buffer)))
{
writer.Formatting = Formatting.Indented;
writer.WriteStartElement("Results");
writer.WriteAttributeString("hits", relevantResults.TotalRows.ToString());
while (relevantResults.Read())
{
writer.WriteStartElement("Result");
for (int i=0; i<relevantResults.FieldCount; i++)
{
writer.WriteStartElement(relevantResults.GetName(i));
object val = relevantResults.GetValue(i);
if (val != null)
writer.WriteString(val.ToString());
else
writer.WriteString("null");
writer.WriteEndElement();
}     
writer.WriteEndElement();
}
  writer.WriteEndElement(); string xml = buffer.ToString();
...
}

The using statement around the usage scope of the FullTextSqlQuery class is very important to remember as you will otherwise get an OutOfMemoryException after running a good number of queries. This is so because the search engine is a COM server that the .NET API maintains a handle to.

 

A closer look at the FREETEXT predicate

Take a look at following two keyword searches using the FREETEXT predicate (rest of SQL string omitted for simplicity):

FREETEXT(defaultproperties, '+sharepoint +search')
FREETEXT(defaultproperties, 'sharepoint') AND FREETEXT(defaultproperties, 'search')

Both queries are valid and yield the same results buth with different ranking. Which one yields the best results ranking? Answer: The first one! It is recommended that you only use one FREETEXT predicate in a search query. Results ranking will otherwise not be optimal.

The defaultproperties keywords references the default set of properties to include in the ranking algorithm. This is also recommended for optimal ranking of results. You can alternatively use the WITH predicate to define your own set of properties if you are not happy with the default one. This is helpful if you need to promote results containing the keywords on your own custom properties. Note: The standard MOSS 2007 search center simply uses the defaultproperties.

The FREETEXT predicate can also be configured for implicit AND search or implicit OR search. This means that a query like:

FREETEXT(defaultproperties, 'sharepoint search')

can respectively find results containing both the keywords sharepoint and search or find results containing at least one of the keywords. Use the the KeywordInclusion property on the FulltextSqlQuery object to control this behavior. See the code example earlier in this posting.

 

Conclusion

I have just described the key concepts of using the FulltextSqlQuery class. I have not described all of its features nor have I described the SQL Syntax in full. I refer you to the MOSS 2007 SDK for a complete reference.

SharePoint 2007, SharePoint Development, SharePoint Search , ,

SPListItem.SystemUpdate() drops version comment

5. February 2009

I recently discovered an unfortunate data loss bug in the Windows SharePoint Services 3.0 API. It turned out that calling the SystemUpdate method on an SPListItem object also deletes the version comment on the last version. This kind of defeats the idea behind the SystemUpdate method, which is designed to let applications change one or more properties on an item without leaving any trace like a new last modified time or creating a new version.

In general the method works as expected - but it does have this one unfortunate side effect. I have with no luck tried finding a work-around for it. For one of my clients it is an annoying problem as we have a SharePoint workflow that relies on the SystemUpdate method to update metadata on the associated list item.

Consequently, I ended up reporting the bug to Microsoft. Great news - they managed to reproduce and fix it very fast. It now looks like the fix will make it into the February CU. I must say MS did a very professional job all the way from the support team in Munich to the product team in Redmond. Kudos to you all!

SharePoint Development , ,

Visual Studio Unit Tests and 64-bit SharePoint does not play

3. August 2008

I like 64-bit SharePoint and have more or less forgotten about the 32-bit editions by now. Consequently, my development environment is also 64-bit all the way. Now, I am also a great fan of unit tests and find them very convenient when developing software for SharePoint. Besides the fact of having reproducible tests they are simply also a great instrument as you write code. It is simply great to run and test your central pieces of SharePoint code directly from Visual Studio before exposing them in a Web Part, Workflow or the like. I have been quite happy with the integrated unit testing facilities in Visual Studio until the day I switched my dev environment to 64-bit.

Then it suddenly stopped working. Most of my existing unit tests consistently failed with strange errors I had never seen before. In the old 32-bit environment the tests still ran like a dream. I then started researching some of the errors and it turned out that only unit tests resulting in calls to the SharePoint Object Model failed. For instance SPFarm.Local returned null and the next line of code then failed with a NullReferenceException! Yes, this is what I mean by strange errors - I never expected SPFarm.Local to return null when the farm is up and running just fine.

Well, to cut a long story short I finally figured out what was going on. The problem is simply Visual Studio that always runs the unit tests inside a 32-bit process named VSTESTHOST.EXE. This is fine as long as you are using a 32-bit edition of SharePoint but it is not fine for the 64-bit editions. The 32-bit VSTESTEST.EXE process cannot load the SharePoint assemblies that are compiled for the 64-bit platform. The problem was difficult to track down as the process did manage to load some assemblies like Microsoft.SharePoint.dll and did therefore not have any trouble locating the SPFarm class. I think the problem occurred deeper down under the hood when the SharePoint classes internally accessed other 64-bit only assemblies or 64-bit DCOM objects.

The solution? Use NUnit and forget about Visual Studio. I have tested with Visual Studio 2005 and 2008 and the story is unfortunately the same for both. I am not sure why Microsoft did not consider this scenario but I have a plea to the Visual Studio team: Please consider a fix for this problem in the next service pack.

SharePoint Development , ,