Josh Haas's Web Log

Startup Metrics: Key Indicators vs Data Mining


This is coming out of a conversation I had with the CEO of Beer2Buds a few weeks ago. We started talking about the pros and cons of various technologies for reporting on how people are using the site, but it quickly became a conversation about what to measure. I think most people (myself included) start from the wrong end: what can I capture, and how can I see it? But that’s really the tail wagging the dog; the first question should be “what do I need to see?” and only then “how do I capture it?”

A useful distinction to make is between key indicators (I may be using this buzzword incorrectly) and data mining (I’m sure I’m using this buzzword incorrectly). Key indicators are what you want to see at 9 am every morning to tell you if you are on track. They should be as few and as narrow as possible, because ideally you want an unambiguous “you are going in the right direction” green light or “you are going in the wrong direction” red light. If you’re watching 40 different metrics, some will be up and some will be down on any given day, so there’s no way of knowing if you’re getting good news or bad news. In contrast, data mining is what you do when you’re off track and trying to figure out what’s going wrong (or on track and trying to figure out what you’re actually doing right!). You don’t data mine every morning; you do it every week or two, or every month or two, when it’s time to assess tactical or strategic direction.

So how do you get to a key indicator that lets you summarize your business in one number? I’ve found Eric Ries’s thinking in The Lean Startup super-helpful in sorting this out. He makes the point that, counterintuitively, your most important numbers are not bottom-line numbers such as number of users or total revenue. Rather, in a startup, what you’re really trying to figure out is whether you have a sustainably growing business, so the most important number to watch is the sustainable rate of growth of your business.

That takes a bit of explaining. Eric’s theory is that every successful business has a primary growth engine that functions continuously; short-term bumps such as a burst of users from a Super Bowl ad or a special promotion don’t matter in the long run. He breaks growth engines into three categories, each of which has a different key indicator:

  • Sticky: you acquire customers faster than you lose them. This works for businesses where customers stick around for a while, either via subscriptions or via repeat sales. A sticky business is healthy if your rate of user acquisition over time is greater than your rate of attrition over time. So the number you want to watch is (customer growth % – customer attrition %): if that number is positive, you have a healthy business, and if it is negative, you don’t.
  • Paid: the value of a customer is greater than the cost of acquiring that customer. This works for businesses where throwing more money at customer acquisition (i.e., buying advertising or hiring a sales force) increases the rate of customer acquisition without diminishing returns. The key indicator here is: (average expected lifetime value of a customer – cost to acquire a new customer). If this number is positive, it means you can take the revenue from new customers and feed that revenue back into acquiring more new customers in a continuous cycle.
  • Viral: each new customer brings along more new customers. This works for businesses where your customers recruit other customers very heavily, generally as a natural result of using the service. For a viral engine of growth, the key indicator is your viral coefficient: the average number of new users that each user brings to the service. If your viral coefficient is greater than one, you’ll get exponential growth; if it’s less than one, this growth engine will gradually fizzle out.
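
To make the three formulas concrete, here is a minimal Python sketch of each indicator. The function names and the choice to express rates as fractions of the customer base at the start of a period are my own assumptions for illustration, not anything prescribed by The Lean Startup.

```python
def sticky_indicator(customers_at_start: int, gained: int, lost: int) -> float:
    """Customer growth % minus customer attrition % for one period.

    Both rates are fractions of the customer base at the start of the
    period (an assumption; pick whatever base fits your business).
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    growth_rate = gained / customers_at_start
    attrition_rate = lost / customers_at_start
    return growth_rate - attrition_rate


def paid_indicator(lifetime_value: float, acquisition_cost: float) -> float:
    """Average expected lifetime value minus cost to acquire a customer."""
    return lifetime_value - acquisition_cost


def viral_coefficient(existing_users: int, users_they_recruited: int) -> float:
    """Average number of new users brought in by each existing user."""
    if existing_users <= 0:
        raise ValueError("existing_users must be positive")
    return users_they_recruited / existing_users


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(sticky_indicator(1000, gained=80, lost=50))   # positive means healthy
    print(paid_indicator(120.0, 45.0))                  # positive means the loop pays for itself
    print(viral_coefficient(200, 260))                  # above 1 means exponential growth
```

In each case the green light / red light reading is just a comparison against a threshold: zero for the sticky and paid indicators, one for the viral coefficient.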

Identifying your engine of growth is a great exercise because it creates a ton of clarity about your business model: it forces you to spell out how you expect your business to grow and succeed.

In practice, what I’ve found is that it’s pretty messy. For KeywordSmart, we decided that our best bet was to focus on the sticky engine of growth, because we’re a subscription service, and because buying advertising in our space is hard (not many people google for image keywording software). We can measure user acquisition pretty easily, as well as attrition, although it’s not clear to me what the relationship is between our acquisition rate and our number of existing users. So for now what we’ve been watching is user acquisition as a flat amount instead of as a percent of our existing user base. That gives us enough clarity of focus for now, and as KeywordSmart grows bigger, I think it’ll be easier to tell what we should be looking at.

Beer2Buds raised some challenges because, given the nature of the service, a user might not use it at all for a year and then use it again. Given that you generally don’t want to wait a year to measure how your business is doing (at least in startup-land), this is a little difficult. What we ended up deciding was that Beer2Buds was primarily using the viral engine, and that we’d count a user making a repeat purchase of beer as them recruiting an additional user: i.e., measuring how viral each transaction was instead of measuring how viral each user was. Looking at the viral coefficient for any given transaction over a time frame shorter than a year would lead to an underestimate, but we decided it probably still made sense to focus on it as the key indicator: we wouldn’t be able to tell in a binary yes-or-no way whether the business was growing or shrinking, but we’d still be able to measure progress by seeing if the number was going up.
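
Here is a rough sketch of what a per-transaction viral coefficient could look like in Python. Everything about the data model is hypothetical: I am assuming each transaction can be tagged with a `parent_id` pointing at the earlier transaction that led to it (either a recipient’s first purchase or the sender’s repeat purchase), which is not necessarily how Beer2Buds stores anything.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Sequence


@dataclass
class Transaction:
    id: int
    day: date
    # Hypothetical field: the earlier transaction that led to this one.
    parent_id: Optional[int] = None


def transaction_viral_coefficient(
    transactions: Sequence[Transaction], window_days: int
) -> float:
    """Average number of follow-on transactions per transaction,
    counting only follow-ons that happened within the window."""
    if not transactions:
        return 0.0
    by_id = {t.id: t for t in transactions}
    window = timedelta(days=window_days)
    follow_ons = 0
    for t in transactions:
        parent = by_id.get(t.parent_id) if t.parent_id is not None else None
        if parent is not None and t.day - parent.day <= window:
            follow_ons += 1
    return follow_ons / len(transactions)


if __name__ == "__main__":
    # Made-up transactions: two follow-ons from transaction 1,
    # one after 9 days and one after 200 days.
    txs = [
        Transaction(1, date(2011, 1, 1)),
        Transaction(2, date(2011, 1, 10), parent_id=1),
        Transaction(3, date(2011, 7, 20), parent_id=1),
        Transaction(4, date(2011, 8, 1)),
    ]
    print(transaction_viral_coefficient(txs, window_days=30))
    print(transaction_viral_coefficient(txs, window_days=365))
```

The short window misses the follow-on that took 200 days to show up, which is exactly the underestimate described above. The number is still useful for comparing one period against another as long as the window stays the same.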

That leads into the next question, which is: what do you do with your key indicator once you’ve identified it? The really important thing is not where your indicator is, but how it is moving. (In other words, if you’re still following this, the rate of change of the growth rate of your business: the second derivative.) That is what tells you whether you’re learning and making progress, which in turn tells you if you’re going down a good path or if you’re stuck. So what you really want to be able to measure is your key indicator by cohort: for the users who first encountered you last month, what was the key indicator for them? For the month before? For the month before that? (Substitute weeks or days if you have enough users to get meaningful readings at those timescales.) That graph, your key indicator changing over time by user cohort, should be going (more or less) linearly up in a startup that is learning and improving and heading in a good direction.
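
As a sketch of the cohort idea, the Python below groups users by the month they signed up and computes a viral coefficient per cohort. The input shape (a signup date plus a count of users each person recruited) and the function names are assumptions made for the example; swap in whichever key indicator fits your engine of growth.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def viral_coefficient_by_cohort(
    users: Iterable[Tuple[date, int]]
) -> Dict[str, float]:
    """Map each signup month (YYYY-MM) to the average number of
    users recruited by members of that cohort."""
    totals = defaultdict(lambda: [0, 0])  # month -> [recruited, cohort size]
    for signup_day, recruited in users:
        month = signup_day.strftime("%Y-%m")
        totals[month][0] += recruited
        totals[month][1] += 1
    return {m: totals[m][0] / totals[m][1] for m in sorted(totals)}


def is_improving(by_cohort: Dict[str, float]) -> bool:
    """True if every cohort beats the one before it (cohorts in date order)."""
    values = [by_cohort[m] for m in sorted(by_cohort)]
    return all(later > earlier for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    # Made-up users: (signup date, number of users they recruited).
    users = [
        (date(2011, 9, 3), 0), (date(2011, 9, 20), 1),
        (date(2011, 10, 5), 1), (date(2011, 10, 9), 1),
        (date(2011, 11, 2), 2), (date(2011, 11, 15), 1),
    ]
    cohorts = viral_coefficient_by_cohort(users)
    for month, value in cohorts.items():
        print(month, value)
    print("improving:", is_improving(cohorts))
```

A strict cohort-over-cohort increase is a cruder test than eyeballing the graph, and with small cohorts the numbers will bounce around, so treat `is_improving` as a starting point rather than a verdict.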

The other thing that’s useful to be able to measure is your key indicator across different experimental groups. So if you run an A/B test, or roll out a new feature to a fraction of your user base, you want to be able to see how the key indicator performed for each segment of that test. That gives you an easy thumbs-up or thumbs-down on the experiment, by tying the experiment to the things that drive your business.
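
The same grouping trick works for experiments. Below is a small Python sketch that computes the paid-engine indicator (lifetime value minus acquisition cost) separately for each variant of a test. The record layout and variant labels are hypothetical, and the sketch deliberately ignores statistical significance, which you would want to check before acting on a real result.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def paid_indicator_by_variant(
    customers: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Map each experiment variant to the average of
    (lifetime value - acquisition cost) over its customers."""
    margins = defaultdict(list)
    for variant, lifetime_value, acquisition_cost in customers:
        margins[variant].append(lifetime_value - acquisition_cost)
    return {variant: mean(values) for variant, values in margins.items()}


if __name__ == "__main__":
    # Made-up customers: (variant, lifetime value, acquisition cost).
    customers = [
        ("A", 100.0, 60.0), ("A", 80.0, 60.0),
        ("B", 120.0, 60.0), ("B", 110.0, 60.0),
    ]
    for variant, value in paid_indicator_by_variant(customers).items():
        print(variant, value)
```

Because the output is the same number you already watch every morning, the comparison between variants reads directly as “did this change help the business or not”.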

So that’s key indicators. To summarize: figure out how your business is going to sustainably grow. Figure out what to measure to know if that’s actually happening. Build the technical ability to measure that number across your user base, segmented by cohort or by experiment. Profit!

I’ll get into data mining another day because this post is already pretty long. But the two-second version is: a) capture EVERYTHING, b) capture everything in terms of USERS (not page views or hits or clicks or whatever). I’ll go into that more, as well as give more concrete examples of what I actually did with KeywordSmart from a building-this-out point of view, in a future post!

Written by jphaas

December 19th, 2011 at 3:47 pm
