In part one I gave you an introduction to MongoDB for Sitecore Developers. In this post I will walk you through the most useful out-of-the-box Sitecore MongoDB databases for xDB analytics. I have found several of these collections useful for many purposes, especially for accessing user contact cards and interaction data.
Sitecore reporting services still run against SQL Server through Sitecore’s data aggregation processes. However, the database of record is now MongoDB.
When you need to access data from the Analytics database in your code, you should use the Sitecore Analytics API. Sometimes the API has no way to reach the data you are looking for, or it retrieves that data inefficiently for your use case. In those cases it helps to understand what data is available, should you decide to query MongoDB directly.
Below is a list of the MongoDB databases and collections in the Sitecore Analytics system. In my experience, the most useful collections are Contacts, Identifiers, and Interactions.
- The Identifiers collection stores the contact id as the document key.
- First, query this collection with the upper-cased Sitecore user name to determine the contact id. Then use this id to query the Contacts or Interactions collections. Looking up a contact card or interaction document this way takes two index scans, which is far faster than a collection scan.
- The Contacts collection stores all of the Sitecore contact cards for both anonymous and identified users.
- Identified users will have an “Identifiers.Identifier” property, while anonymous users will not.
- TIP: Avoid querying the “Identifiers.Identifier” property directly. This property is NOT indexed and will result in a slow collection scan.
- To query a contact card, use the contact id you obtained from the Identifiers collection to query against the “_id” property. Doing so retrieves the contact card document through an index scan (as opposed to a collection scan), which is very fast.
- To view an interaction document in the Interactions collection, use the contact id obtained by first querying the Identifiers collection.
- This collection stores information about a user’s visit, including:
  - Site name
  - Pages viewed (including items and item ids)
  - Analytics goals triggered & value of the visit
  - GeoIP data
  - Browser identifier & operating system
  - User’s screen size
  - User’s language
- This collection maps contacts to GeoIP, location, and user agent data.
- This collection maps a user device to a contact in the Contacts collection.
- This collection contains the user submitted Web Forms for Marketers (WFFM) data.
- This collection serves as a cache of GeoIP lookup data that Sitecore performed with the MaxMind service. This seems to reduce the number of MaxMind lookups that Sitecore performs.
- This collection is a repository of the country and business name pairs of user data obtained by GeoIP lookups.
- This collection holds the user agent strings that have browsed the website.
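
The two-step lookup described above (Identifiers first, then Contacts by “_id”) can be sketched in Python. This is only a minimal sketch under stated assumptions: the `IDENTIFIER_FIELD` name is hypothetical, since which field of an Identifiers document holds the user name is not spelled out here, and the functions accept any object with a pymongo-style `find_one(filter)` method rather than opening a connection themselves.

```python
from typing import Any, Mapping, Optional, Protocol


class Collection(Protocol):
    """Anything with a pymongo-style find_one(filter) method."""

    def find_one(self, filter: Mapping[str, Any]) -> Optional[Mapping[str, Any]]: ...


# Hypothetical field name (an assumption, not confirmed by this post):
# adjust it to whatever field holds the user name in your Identifiers documents.
IDENTIFIER_FIELD = "Identifier"


def find_contact_id(identifiers: Collection, user_name: str) -> Optional[Any]:
    """Step 1: resolve the upper-cased user name to a contact id."""
    doc = identifiers.find_one({IDENTIFIER_FIELD: user_name.upper()})
    # Per the description above, the contact id is the document key.
    return None if doc is None else doc["_id"]


def find_contact_card(
    identifiers: Collection, contacts: Collection, user_name: str
) -> Optional[Mapping[str, Any]]:
    """Step 2: fetch the contact card by _id, which MongoDB always indexes."""
    contact_id = find_contact_id(identifiers, user_name)
    if contact_id is None:
        return None  # unknown user: skip the second query entirely
    return contacts.find_one({"_id": contact_id})
```

With pymongo, the two collection arguments would be objects such as `db["Identifiers"]` and `db["Contacts"]`, where `db` is your analytics database. The same pattern applies to Interactions, except that the filter would use whichever field stores the contact id there instead of “_id”.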
Continued Reading: Contact Tracking (official documentation)
Did you miss part one? Check it out here: Intro to MongoDB for Sitecore Developers
- Useful MongoDB queries of analytics data
- Creating custom indexes
- Maintenance scripts
- Creating custom MongoDB databases, collections, and documents