Level: 200
 

Sitecore OMS data querying – Most viewed pages

Graph

This article will describe how you query for data in Sitecore OMS analytics database, pulling out statistics about the visits on your site. In particular this article will show you, how you can use the analytics data to create a presentation of the most viewed pages on your site.

Written by: Jens Mikkelsen
Fri, Sep 11 2009

Introduction

 

This article will show sample code on how to query data from the Sitecore analytics database. It will focus on making the code as readable and simple as possible. You might be able to optimize the queries, if you focus on performance and you should definitely cache your presentations, as you don’t want to query the analytics database for each visitor coming to your site.

 

 

In the example in this article we will pull out the most viewed pages and present them in a sublayout. Content lists like these are used on many pages and they’re pretty simple to implement with Sitecore Analytics data.

 

 

The analytics database

 

There is no immediate API to query data in Sitecore OMS, so you have to query down in the database directly. To start out with I thought this would be a problem, but after trying it out, I am actually rather optimistic about this approach, as it gives you completely free hands to do whatever you like with the data.

However there isn’t really any documentation on the different tables and there are a lot of them. Luckily the most of them are pretty obvious, so you will probably understand what the different tables are used to. In this article we will only focus on a few of the tables, which we need to use to implement the “most viewed” list. In future articles we will look into a lot of other tables.

 

 

The easiest way to see a database diagram of the analytics database is by navigating to the database in Microsoft SQL Server Management Studio and clicks the database diagram. You will then discover the large number of tables. For now you just need to know that the database revolves around the Session table which tracks the different user sessions. The pages table holds the accessed pages, so we are going to use that to get the most viewed pages.

 

 

Querying for data using LINQ to SQL

 

You can access the data using LINQ to SQL. You can watch this video for an introduction http://www.microsoft.com/uk/msdn/nuggets/nugget/223/Introduction-to-LINQ-to-SQL.aspx.  And I also find these examples quite useful: http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx.

 

 

As we are building a presentation with access to a database, I am going to create a class representing  any functionality related to accessing the database; I will refer to this class as OMS_DAL. In here I can create the connection to the Analytics database. Sitecore provides a DataContext which gives the possibility to query the database using LINQ to SQL. You can get it like this:

 

 

private static SitecoreAnalyticsDataContext GetAnalyticsDataContext()

{

  return new Sitecore.Analytics.SitecoreAnalyticsDataContext(ConfigurationManager.ConnectionStrings["analytics"].ConnectionString);

}

 Note: From Sitecore 6.1 Sitecore doesn't provide a DataContext, so you should add it yourself, by creating a .dbml file in Visual Studio and connect to your database. The pull in the tables you need.

 

Where the “analytics” string represent the name of the connection string to the analytics database you have specified in the connectionStrings.config. I have wrapped the call in a private method in my data access class, so it is easy to retrieve the DataContext.

 

Now we are able to query the data. I have created a method for retrieving the data called GetPopularPages. To ensure we only get pages accessed in the same site as the list is being presented in, we need to pass in the name of the site. Further we need a specified database to retrieve the items from and an integer specifying the amount of items we want returned.

 

 

public static IEnumerable<PopulerItem> GetPopularPages(string website, Database database, int NoOfItems)

{

  //Get the analytics context

  SitecoreAnalyticsDataContext dataContext = GetAnalyticsDataContext();

 

  //Get all pages which has been visited in the current SiteContext

  var pagesInCurrentSite = from session in dataContext.Sessions

                           where session.MultiSite == website

                            join page in dataContext.Pages on session.SessionId equals page.SessionId

                           select page;

 

 

  //Select the item from the database and count the hits

  var popularPages = from page in pagesInCurrentSite

                     group page by page.ItemId into x

                     select new PopulerItem { ItemId = new ID(x.Key), Count = x.Count() };

 

 

  //Sort the items by how many hits it has received

  var popularPagesSorted = from popularPage in popularPages

                            orderby popularPage.Count descending

                           select popularPage;

 

 

  return popularPagesSorted.Take(NoOfItems);

 

 

}

 

 

  • In the first line of the method, we get the DataContext to query from.
  • In the next line we specify that it should only include hits from the specified site by using the session table. This tables stores the name of the Sitecore site, which was accessed during the session. After that we join the session table with the pages table.
  • Now we can group the same pages together, so we are able to count how many hits they have had. We do this by creating a small simple class holding the item accessed and the amount of hit it has had.
  • In the next line we sort the result after the amount of hits they have received.
  • And last but not least we only return the number of items needed.

 

You might want to add on to this code adding further constraints to the query. For instance you might not want to add the homepage to the most viewed pages, as it’ll probably always be at the top, which might not be what you want. Further you might ensure that the query only includes hits from the last month or so, ensuring that new pages can make the list.

 

 

It is really that simple to retrieve the data. You can look around in the database diagram and see if there is something interesting you want to present on your site.

 

 

Presenting the data 

 

To present the data we create a sublayout. In this we create a simple repeater, which we can bind the data to.

 

 

<%@ Control Language="C#" AutoEventWireup="true" CodeBehind="TestMostRead.ascx.cs" Inherits="TestProject.layouts.TestMostRead" %>

<ul>

<asp:Repeater ID="SessionRepeater" runat="server" OnItemDataBound="SessionRepeater_OnItemDataBound">

  <ItemTemplate>

    <li>

      <asp:HyperLink ID="ItemHyperLink" runat="server" />

    </li>

  </ItemTemplate>

</asp:Repeater>

</ul>

 

 

On the page_load event we need to retrieve the data and bind it to the repeater.

 

 

public partial class TestMostRead : System.Web.UI.UserControl

  {

    protected void Page_Load(object sender, EventArgs e)

    {

      IEnumerable<PopulerItem> sessions = OMS_DAL.GetPopularPages(Sitecore.Context.Site.Name,Sitecore.Context.Database, 5);

      SessionRepeater.DataSource = sessions;

      SessionRepeater.DataBind();

    }

 

    protected void SessionRepeater_OnItemDataBound(object sender, RepeaterItemEventArgs e)

    {

      if (e.Item.ItemType == ListItemType.Item || e.Item.ItemType == ListItemType.AlternatingItem)

      {

        PopulerItem popularItem = e.Item.DataItem as PopulerItem;

        HyperLink hyperList = e.Item.FindControl("ItemHyperLink") as HyperLink;

        ID itemId = popularItem.ItemId;

        Item item = Sitecore.Context.Database.GetItem(itemId);

        if (item == null)

          return;

        hyperList.Text = item["Title"];

        hyperList.NavigateUrl = LinkManager.GetItemUrl(item);

      }

    }

  }

 

 

What we do here, is getting the items from the DAL layer. We can bind them to a repeater. As the data in the analytics database might be out of sync with the live content, meaning that an item in the analytics database may have been deleted from the content database, we need to check whether the item exists or not. This could be added to the DAL layer ensuring that the list will always contain the specified number of items, but in this example we just handle it in the presentation.

 

 

You will now see that the list presents the most viewed pages. Easy right?

 

 

Caching

 

As you don’t want to query the analytics database each time you get a hit on your page, you want to ensure that the sublayout or the data used is cached. Failing to do so will probably take your site down, if you have a reasonable amount of visitors. Sitecores cache is bound to publishing, meaning that the cache will be cleared whenever a publish is performed. This rendering is not really related to new or changed items, but the data in the analytics database. Therefore I would recommend, that you use a custom cache for the data in the analytics database and then clear it every hour or so.

 

This should ensure that you don’t have too many queries to the analytics database and that the list is more or less up to date.

 

 

Please rate this article


7 rates / 3,86 avg.

  • About the author:

    Jens Mikkelsen

    Jens Mikkelsen is a partner at Inmento Solutions a Sitecore consulting firm. He works as a Sitecore specialist and consulting helping clients architect and build quality Sitecore solutions using the newest modules and tools. 

    Further he has been deeply envolved in various complex solutions and has built up a strong knowledge of Sitecore architecture and best practices. He has especially focused on and is specialized in debugging and analyzing Sitecore solutions.

     

    Jens is very interested in the technical mechanisms in the new marketing products such as Sitecore DMS and Sitecore ECM.

    My Sitecore Freelance CV

6 responses to "Sitecore OMS data querying – Most viewed pages"

Jens,

I just completed the OMS developer training which included this code snippet. I need to this for a client. My problem is I don't have SitecoreAnalyticsDataContext in my Sitecore.Analytics.dll? I can't find it via reflector either. I just Googled the term and their are only two hits, this page and the PPT. Am I missing something?
Posted: Saturday, October 30, 2010 7:35 PM
Hi Paul,

They removed the SitecoreAnalyticsDataContext in OMS 1.1. What the DataContext really is, is a LinqToSql datacontext (.dbml). You can create it on your own using Visual Studio.
This seem to be the way to do it from 1.1.

Hope that helps!

Cheers
Jens Mikkelsen
Posted: Sunday, October 31, 2010 10:59 PM
Hui
Hi Jens,

Thanks for your article.
If i want to select most viewed pages by one specific templateId , what should i do ? The page table just has an itemid in it, there is not templateId to relate , and i would not like to use itemId to find the item templateid, because there could be lots of items.
Posted: Thursday, March 24, 2011 7:10 AM
Hui
Hi Jens,

I updated the linq like:
.....
list<guid>.contain(page.itemID)
....
so this operation will filter data by a itemIDs list.
It works now, but i also want to know if it can query by a specific templateId condition, because some requirements could like: most popular pages under specific section. :(

Thanks
Posted: Thursday, March 24, 2011 9:24 AM
Hi Hui,

To my knowledge it is not possible to query on the template ID. :(

Cheers
Jens
Posted: Tuesday, March 29, 2011 11:57 PM
Has anyone come up with a similar code snippet/SQL Query for DMS?
Posted: Monday, May 20, 2013 11:06 PM

Leave a reply


Notify me of follow-up comments via email.
 
 
#nbsp;