Examiness hints and tips from the trenches part 9 - Secure searching

In this post I am going to cover how to handle search results when you are working with a site that has a mix of secure and public content. Out of the box, Umbraco Examine will not index protected content. However, you may have a content-rich site that needs to provide a fully functioning search for both the public and extranet members, so there are a few things that need to be done to enable this.

Firstly, you need to update the Examine config so that supportProtected is enabled for the indexer you are interested in. Indexers are declared in ExamineSettings.config (ExamineIndex.config holds the index sets):

<add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
supportUnpublished="false"
supportProtected="true"
interval="10"
analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

This makes the index support protected content. The next step is to inject into the index the groups that have access to each piece of content. For this we make use of our old friend, the GatheringNodeData event.

using System;
using System.Text;
using System.Xml.Linq;
using Examine;
using Umbraco_Site_Extensions.Helpers;
using umbraco.BusinessLogic;
using UmbracoExamine;
using umbraco.cms.businesslogic.web;

namespace Umbraco_Site_Extensions.examineExtensions
{
    public class ExamineEvents : ApplicationBase
    {
        public ExamineEvents()
        {
            // Constants.CogWorksIndexer should hold the name of the indexer that has supportProtected enabled
            ExamineManager.Instance.IndexProviderCollection[Constants.CogWorksIndexer].GatheringNodeData += ExamineEvents_GatheringNodeData;
        }

        void ExamineEvents_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            InjectGroups(e);
        }

        /// <summary>
        /// Munge the groups that can access the node into one space-separated field
        /// </summary>
        /// <param name="e"></param>
        private void InjectGroups(IndexingNodeDataEventArgs e)
        {
            var node = new umbraco.NodeFactory.Node(e.NodeId);
            if (umbraco.library.IsProtected(e.NodeId, node.Path))
            {
                var groups = Access.GetAccessingMembershipRoles(node.Id, node.Path);
                var groupsAccess = new StringBuilder();
                // we now have the list of groups that have access to this content
                foreach (var group in groups)
                {
                    groupsAccess.Append(group);
                    groupsAccess.Append(" ");
                }

                e.Fields.Add("GroupAccess", groupsAccess.ToString().Trim());
                e.Fields.Add("IsPublic", "false");
            }
            else
            {
                // not protected, so flag it as public
                e.Fields.Add("GroupAccess", "0");
                e.Fields.Add("IsPublic", "true");
            }
        }
    }
}


So now the index has a new field called GroupAccess, holding space-separated values such as '1045 1080', where the ids correspond to member groups set up in the Members section. If the page is not protected, the field just has the value 0 and IsPublic is set to true.

Now that we have this data in the index, we can use it as a filter in our queries. As part of your search code, first determine whether the person executing the search is logged in. If they are, use the membership API to get the ids of the groups they belong to. Then, as part of your query, you could use a filter like:

query.And().Field("GroupAccess", memberGroupId.ToString());
query.Or().Field("IsPublic","true");

If they are not logged in, then your query will look like:

query.And().Field("IsPublic","true");
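The two cases above can also be folded into one filter clause. What follows is only a minimal sketch, not the code from my search page: SecureSearchFilter and Build are hypothetical names, and it assumes the group values contain no whitespace (they are space-separated in the GroupAccess field). It builds a required Lucene clause that matches public content plus anything the member's groups can access. Anonymous visitors pass null or an empty list and get public content only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of Examine or Umbraco.
public static class SecureSearchFilter
{
    // Characters with special meaning in Lucene query syntax.
    private const string LuceneSpecialChars = "+-&|!(){}[]^\"~*?:\\";

    // Backslash-escape special characters so a group value is treated as a literal term.
    private static string Escape(string term)
    {
        var sb = new StringBuilder();
        foreach (var c in term)
        {
            if (LuceneSpecialChars.IndexOf(c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Returns a required clause: public content OR content any of the groups can access.
    // Assumes field names GroupAccess and IsPublic as written by the indexing event above.
    public static string Build(IEnumerable<string> memberGroups)
    {
        var clauses = new List<string> { "IsPublic:true" };
        if (memberGroups != null)
        {
            clauses.AddRange(memberGroups
                .Where(g => !string.IsNullOrWhiteSpace(g))
                .Select(g => "GroupAccess:" + Escape(g.Trim())));
        }
        return "+(" + string.Join(" ", clauses) + ")";
    }
}
```

For a member in groups 1045 and 1080 this gives +(IsPublic:true GroupAccess:1045 GroupAccess:1080), which you would append to the Lucene string for the user's search terms. If your version of Examine exposes RawQuery on the search criteria, the combined string could be passed there, but treat that as an assumption to check against your install.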

Media Protect

If you happen to be using the excellent Media Protect package by Richard Soeteman, then in your GatheringNodeData event you would use the Media Protect API to get the groups that have access to the media item you are currently indexing:

MediaProtect.Library.AllowedGroups(nodeId, path);

The filtering is then performed in the same way as for content.
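Since content and media end up with the same two fields, the field-writing part can be shared between both. Here is a rough sketch of that idea. AccessFields and Apply are hypothetical names, and it assumes that e.Fields can be treated as a string dictionary and that whatever the access API returns (for content or for Media Protect) can be turned into a list of strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Examine, Umbraco or Media Protect.
public static class AccessFields
{
    // Writes GroupAccess and IsPublic using the same convention as the content indexer:
    // protected items get their groups space-separated, public items get "0".
    public static void Apply(IDictionary<string, string> fields, bool isProtected, IEnumerable<string> groups)
    {
        var cleaned = (groups ?? Enumerable.Empty<string>())
            .Where(g => !string.IsNullOrWhiteSpace(g))
            .Select(g => g.Trim());

        // The indexer assignment overwrites an existing key instead of throwing like Add would.
        fields["GroupAccess"] = isProtected ? string.Join(" ", cleaned) : "0";
        fields["IsPublic"] = isProtected ? "false" : "true";
    }
}
```

A protected item with no groups stays non-public with an empty GroupAccess, so it is not exposed to anyone by accident. In the event handler you would call something like AccessFields.Apply(e.Fields, isProtected, groups) for both content and media nodes.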