Keeping Track of Product Use with Logging

At my last job before Smartling – a big global ad agency with a 50-person developer team – I built a time-scheduling application that allowed project managers to schedule the efforts of us code monkeys. One of my great satisfactions was walking around the open floor of the department and seeing the application I had built up on many screens at any given time. It was a kind of poor man’s Google Analytics. If you want to track the use of your product, just hover creepily behind your user’s Aeron and see what he’s doing!

But what about the case where you can’t be in the same room as your users? Event tracking and analytics are not exactly newfangled. Google Analytics, Omniture and their ilk have been around for many years. But one thing that has always frustrated me about these solutions is that while they give a good statistical big picture of the use of your application (a bird’s-eye view), it’s not really possible to drill down into the specifics of each solitary tracked event. As in: give me all the click events and associated parameters that occurred for a specific user during a one-minute stretch of time last Friday night. What language was he translating? If he saved a translation, what did the translation look like before he saved it? How long did the AJAX calls he invoked take to complete? Any errors? How many cans of beer did he have in him at the time?

Logging: Not Just for Lumberjacks

That’s not analytics though. It’s logging. Yes, you want big picture analytics too (total counts for a day, etc.), but being able to rewind time and see the discrete data, line by line, as it rolls in is powerful.

To accomplish this easily, and without the help of our too-busy net-ops team, I’ve been using a service called Loggly. Via their pixel image or a JSONP call, you can send them your events and associated parameters straight from your JavaScript. They index it almost instantly, making it easily searchable. You log in to their site and view your events in what they call their “shell.” It’s browser-based, but looks just like a terminal you would normally use to inspect log files. Not only can you view each discrete log entry, with all its contextual parameters, but you can also generate analytics, counts and even graphs. There’s an API, of course, if you want to get your data out and over to, say, Google Charts. They are working on an alert service called Alert Birds to send you a notification if some event occurs or threshold is reached. And they have a great crazed-beaver mascot.
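The pixel-image trick is just a GET request disguised as an image load. Here’s a minimal sketch of that pattern — the endpoint URL, the input key, and the `plaintext` query parameter below are placeholders, not Loggly’s actual API, so check their docs for the real input URL format:

```javascript
// Build the beacon URL: serialize the event as JSON and URL-encode it
// into a query parameter on a (hypothetical) log-collection endpoint.
function buildBeaconUrl(endpoint, event) {
  var payload = encodeURIComponent(JSON.stringify(event));
  return endpoint + '?plaintext=' + payload;
}

// Fire the event by creating an Image. The browser issues the GET
// request in the background without blocking the page, and we don't
// care about the response -- classic tracking-pixel behavior.
function logEvent(endpoint, event) {
  var img = new Image();
  img.src = buildBeaconUrl(endpoint, event);
}
```

Because the whole payload rides in the URL, keep events small; browsers and servers cap URL length, which is one reason a JSONP or POST-based input is nicer for fat events.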

If you send them your data in the form of JSON, you can then write queries against it as structured data! I like to send them the first name of the user so that when our customer support team comes to me saying that a translator named Jorge filed a support ticket last night, I can write a quick query: search json.firstName:"Jorge" from NOW-1DAY. See what that dude was up to! I always send the user’s current URL (at time of event) as a param so that I can instantly bring it up in a browser and try to reproduce the reported error. You don’t have to rely on Jorge to carefully document what went down at 2012-05-11 10:09:49.236.
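In practice that means wrapping every event in the same envelope of contextual fields before it goes out the door. A small sketch of that idea — the field names (`firstName`, `url`, and so on) are illustrative choices, not a required schema; anything you put in the JSON becomes queryable:

```javascript
// Wrap a raw event in a standard context envelope so every log line
// carries who it was, where they were, and when it happened.
// The user object is passed in here to keep the function testable;
// in the browser you'd typically pull url from window.location.href.
function withContext(eventName, params, user) {
  return {
    event: eventName,
    firstName: user.firstName,   // queryable: json.firstName:"Jorge"
    url: user.url,               // paste into a browser to reproduce
    timestamp: new Date().toISOString(),
    params: params               // event-specific details
  };
}
```

Then every call site just says something like `withContext('saveTranslation', { lang: 'es' }, currentUser)` and you get uniform, searchable records for free.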

What’s really new about all this is that such a massive quantity of data can be stored cheaply and, more importantly, searched almost as soon as it’s written to disk. Data is worthless unless you can search it. This is the phenomenon known as “big data.” In a nutshell, what “big data” means is that the kind of technology Google developed to index the entire web and make it instantly full-text searchable (Hadoop is the best-known open-source descendant of Google’s designs) can be used to search any source of data – your event logs, for example. As a front-end developer, you probably wouldn’t want to set Hadoop up yourself, but that’s what companies like Loggly, Splunk and Mixpanel are doing for you.

About Andrey Akselrod

Andrey comes to Smartling from his role as VP, Technology at SpaFinder, where he developed and maintained the site and eCommerce platform in 6 languages. Previously, he held executive positions at RunTime Technologies and consulted for JP Morgan Chase, where he was responsible for eCommerce, digital asset management, data warehousing and other large-scale projects. He’s a native Russian speaker. He holds a BS in Computer Science from Brooklyn College and lives in New Jersey with his wife and children. He’s passionate about all things technology, and a good cup of coffee. If not in the office, you can probably find him at Stumptown.