Ease/value ranking

It was early afternoon a few days after I'd started as Chief Product Officer, and I was rubbing my eyes blearily.

We had a list of product ideas as long as my arm - some ideas promised to really make a difference but required major engineering effort, while others looked like quick and easy wins. Their advocates were pointing to various possible benefits. Obviously we couldn’t do them all. We’d been talking for 90 minutes, and we were thrashing around in the face of so many competing possibilities.

I felt that I should somehow be able to offer some clarity, or some insight, but I was new and I didn't trust my own judgment. In a way, I'm glad about this in retrospect - because the best answer is rarely the HiPPO (the highest-paid person's opinion). It usually comes out of multiple perspectives.

So we took a break for 10 minutes. Every time we take a break in a long meeting, you can feel the group’s IQ bobbing back up afterwards.

When we came back, we ran an ease/value ranking:

  1. Start with your longlist of ideas.
  2. Give each idea a low/medium/high score (1-3, higher is better) for how *valuable* it would be.
  3. Give each idea a low/medium/high score (1-3, higher is better) for how *easy* it would be.
  4. Multiply the two scores together, and rank the ideas by the product.
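
If it helps to see the arithmetic spelled out, here's a minimal sketch in Python (the ideas and scores are invented for illustration):

```python
# Minimal sketch of ease/value ranking. Ideas and scores are made up.
ideas = {
    "Revamp onboarding flow":  {"value": 3, "ease": 1},
    "Fix broken signup email": {"value": 2, "ease": 3},
    "Add CSV export":          {"value": 1, "ease": 3},
    "Streamline billing page": {"value": 3, "ease": 2},
}

# Overall score = value * ease; rank from highest to lowest.
ranked = sorted(ideas.items(),
                key=lambda kv: kv[1]["value"] * kv[1]["ease"],
                reverse=True)

for name, s in ranked:
    print(f"{s['value'] * s['ease']:>2}  {name} (value={s['value']}, ease={s['ease']})")
```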

We discussed and fine-tuned the prioritisation of the top handful, and ignored the rest. That made for a focused, efficient discussion, everyone felt heard, and we felt confident that our prioritisation was more than good enough. We then revisited the remaining items at a later date, when updating our prioritised backlog.

It was one of those moments where I realised that my job as a leader wasn't to be the smartest person in the room with all the answers. Instead, as a group, we were smarter than any one of us as individuals. And better still, the process felt inclusive.

Here's a Google Sheets template.

(More detail follows for the high-need-for-cognition folks in the room.)

FAQ

How should I handle things if there are more than a couple of people involved?

This process scales up well with larger groups. To benefit from the wisdom of crowds, we need to aggregate independent, diverse, informed perspectives.

Use one of the more complicated template tabs with separate columns for multiple raters.

Apply the principles from the Idea Stampede, i.e. ask people to work quietly on their own for a while, scoring within their own column of the spreadsheet without looking at other people's, then aggregate and discuss.

It can help to ask people to write a comment for each score as they're scoring. This forces them to think things through, and helps a lot in understanding the sources of disagreement - "ah, Person A is confident because they think the programming side will be easy, but Person B is worried about the information security risks...". Those comments can be really useful as a reference and reminder later. But use your judgment - if there are a *lot* of ideas, it's maybe better to do a quick-and-dirty first pass, and save the fine-grained head-scratching for the medium-length list.
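
For what it's worth, here's a rough sketch of the aggregation step in Python, assuming you average each criterion across raters before multiplying (the raters and scores are invented):

```python
from statistics import mean

# Each rater scores independently (1-3); aggregate per criterion, then multiply.
scores = {
    "Idea 1": {"ease":  {"Person A": 3, "Person B": 1},
               "value": {"Person A": 2, "Person B": 3}},
    "Idea 2": {"ease":  {"Person A": 3, "Person B": 3},
               "value": {"Person A": 1, "Person B": 2}},
}

for idea, crits in scores.items():
    ease = mean(crits["ease"].values())    # average across raters
    value = mean(crits["value"].values())
    print(f"{idea}: ease={ease:.1f}, value={value:.1f}, overall={ease * value:.2f}")
```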

Why does this technique work so well?

Ease-value ranking is such a simple, obvious idea, but it works remarkably well. I think there are a few reasons:

  • Sometimes we do better by first breaking an overall, intuitive, multi-faceted judgment into a few narrower, more cleanly-defined 'structured criteria', rather than trying to judge and weigh all the facets against one another at the same time. For example, [structured criteria work well in hiring assessments](https://www.edbatista.com/2021/02/daniel-kahneman-on-conducting-better-interviews.html).
  • The ease-value scoring is quick and *feels* less effortful, especially if you're just skimming along doing 1-3 scoring.
  • It's fair. No one's voice is overpowering, and no one's views are getting over-represented.
  • We can decompose the scores in a few ways, e.g. to notice disagreement, or to choose a portfolio of both low-hanging fruit and risky, swing-for-the-fences big-bets.

How do you generate the longlist in the first place?

You can do this beforehand asynchronously, or in the room as part of the ease-value ranking.

Apply the principles from the Idea Stampede, i.e. ask people to write down ideas silently on their own, encouraging them to write clearly and provide enough context to be understood. It can help to add one or two obvious ideas at the top as exemplars, to model the style you're looking for.

Once people are starting to run dry, ask the group to scan over the list, merging duplicate ideas together and tidying things up.

Then take a break before starting the ease-value scoring.

Does this go by other names?

Yes, lots of people have suggested this idea before, e.g. cost/benefit, difficulty/value, pain/gain matrix, and ICE (impact, confidence, ease).

I mostly prefer ease-value because:

  • This way, higher is always better, i.e. high ease is good, whereas high difficulty would be bad. That's easier to think about and remember, makes the calculations simpler, and if you plot ideas on a 2x2, we naturally read "up and to the right" as good.
  • In ICE, I can see how a confidence estimate could be useful. But in practice, I find it works well enough to fold confidence into the other scores, e.g. if we're hoping that an idea will be straightforward, but recognise there's a chance of it turning into a complicated nightmare, we can reflect that with a lower Ease score. Confidence is somehow more effortful to judge, and so the process bogs down. And if we're including one confidence metric, shouldn't we have separate confidence scores for both Ease and Value? Does Confidence act as a multiplier somehow? etc. Much better to just stick with two scores, race through the process, and leave more time for discussing and refining the top handful.

But what do 'ease' and 'value' actually mean?

Ah, I was hoping you wouldn't ask.

Value

  • In the context of a product or business decision, 'value' usually means 'business value', i.e. value for customers, or impact on profitability, or progress towards your stated goals/OKRs.
  • Sometimes I tell people that 'high' value means something that the CEO will really care about, or will have a real impact on the bottom line, or will have a transformative effect for the company in a couple of years' time.
  • By contrast, 'low' value would mean something the CEO won't care about, won't really affect the bottom line, and won't have a noticeable effect for more than a small fraction of users or employees.
  • For a narrow domain of ideas, you might be able to categorise by some single quantitative proxy, e.g. effect on revenue, or on number of users. But most difficult rankings involve heterogeneous factors. So then you'd have to estimate a monetary value, say, for each idea - which can invite its own kind of arbitrariness.

Ease

  • For something to be easy, it means we think it'll be quick and relatively painless, we understand the problem well and how to solve it, we're good at it or we've done it before, we don't anticipate major surprises, the error bars on our time estimates are pretty small, it's unlikely to screw things up, etc. In other words, it's not slow, laborious, difficult, complex, novel, or risky.
  • There are a lot of factors there. Maybe this task won't take long, but it'll be very boring. Maybe we think it's like stuff we've done before, but we're aware of ways it could dramatically surprise us. We could decompose all these factors further, but often it's enough to simply label things as 'easy', 'hard', and 'somewhere in the middle'.
  • Sometimes we can use a best-case time estimate as a proxy for 'Ease', e.g. 'at most a couple of hours', 'at most a couple of days', 'at most a month', etc. The realistic case is usually double the best case, and we can assume that all the other dimensions (complexity, laboriousness, uncertainty) correlate with the time estimate, i.e. if the best case is 'more than a month', then there are probably a lot of separate sub-tasks and uncertainty.
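
As a very rough sketch of that proxy (the function name and thresholds here are my own invention, so adjust them to your context):

```python
def ease_from_best_case(best_case_days: float) -> int:
    """Map a best-case time estimate to a 1-3 ease score.

    Thresholds are illustrative only; the realistic estimate is
    assumed to be roughly double the best case.
    """
    if best_case_days <= 2:    # 'at most a couple of days'
        return 3               # easy
    if best_case_days <= 30:   # 'at most a month'
        return 2               # somewhere in the middle
    return 1                   # hard: probably many sub-tasks and uncertainty

# e.g. ease_from_best_case(1.0) -> 3; ease_from_best_case(45) -> 1
```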

In practice, it's helpful to have a very brief up-front discussion (or provide a rubric) with some examples about what makes for low/high ease and value, but it doesn't help to sweat the details (especially if you're only scoring 1-3).

What if you have a high-stakes decision, or a lot of ideas?

Ease-value is a great first step because it's quick. It frees up time, so you can focus a more fine-grained discussion on a narrow range of topics.

Not every decision boils down to 'ease' and 'value'.

  • For a risk assessment, you'll probably use the analogous '*Likelihood*' (probability that this risk will happen) and '*Criticality*' (how bad it would be if it did).
  • For business strategy, you might judge based on the [hedgehog concept](https://www.jimcollins.com/concepts/the-hedgehog-concept.html) (*what can you be the best at*? *what are you passionate about*? and what *will make money*?), aka head/heart/wallet as [Marc](https://marczaosanders.com/) puts it.
  • For hiring, you'll probably use a few structured criteria, e.g. *ability/experience*, *good to work with/culture fit*, and *hunger/passion*, etc.

If it's really high-stakes or nuanced, or you have a long list of ideas, or you need the prioritisation to be more dependable, it sometimes makes sense to use a 1-5 scoring instead of 1-3. This can give a more fine-grained ranking. For example, I usually use a 1-5 score for Risk Assessments, because they're higher-stakes, and there's less discretion to re-rank based on discussion.

But be aware that it is much more effortful to score 1-5 than 1-3 - it just requires a lot more internal deliberation to decide with finer-grained categories. So in practice, I find it's often better to score 1-3, and leave more time for discussion.

Further analysis

For high-stakes decisions with multiple diverse perspectives, it can help to break down the aggregated scoring in a few ways. Use the 'Richer breakdowns' template.

  • If you want to know whether there's disagreement, you could look at the standard deviation (or just the range between lowest and highest scores) between people for each structured criterion for each idea - see the sketch after this list. In other words, if the scores for Ease on Idea 1 range from very low to very high, it's important to notice that discrepancy, and talk it through.
  • Maybe you only want to work on low-hanging fruit (i.e. it has to be at least medium-easy). Or indeed, maybe you're looking to swing for the fences, and so it helps to filter for highly-valuable ideas. Or in practice, perhaps you want a portfolio that balances a few of each.
  • Sometimes it makes sense to multiply rather than average the criteria scores. For example, if you're scoring 1-5, and you have multiple criteria (i.e. not just Ease and Value), then multiplying will penalise ideas that have *any* very-low scores.
  • What about individual variation between raters? Perhaps Person A is much more optimistic than Person B? It helps to have a *short* discussion beforehand about what you mean by 'Ease' and 'Value' - most useful of all, work through a few reference examples to help people calibrate. But beyond that, don't worry too much about noise. Systematic inter-person variability (e.g. one person being more optimistic) will affect the absolute scores, but won't much distort the ranking. And with multiple, *independent* raters (i.e. not being biased by each other's answers), the noisiness will mostly wash out.
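
Here's a sketch of that disagreement check in Python (the scores are invented for illustration), flagging any idea whose raters spread widely on a criterion:

```python
from statistics import pstdev

# Per-rater Ease scores (1-5) for each idea. Invented for illustration.
ease_scores = {
    "Idea 1": [1, 5, 2],   # wide spread: worth talking through
    "Idea 2": [4, 4, 5],   # broad agreement
}

DISAGREEMENT_THRESHOLD = 1.0  # tune to taste

for idea, scores in ease_scores.items():
    spread = max(scores) - min(scores)   # range between lowest and highest
    sd = pstdev(scores)                  # standard deviation across raters
    flag = "discuss" if sd > DISAGREEMENT_THRESHOLD else "ok"
    print(f"{idea}: range={spread}, stdev={sd:.2f} -> {flag}")
```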

How to run the post-ranking discussion?

I don't have such neat, prescriptive advice for how to run the discussion after the ranking.

If it's low-stakes, you could:

  • Just stick with the aggregated rankings.
  • Give a single person the final say to slightly reorder the top handful.
  • Borrow from Planning Poker rules, i.e. ask the person with the highest and the person with the lowest scores to each briefly share their reasons, and then use a simple decision procedure to adjust scores/rankings following that.

If it's high-stakes, maybe the discussion should focus on what further information would inform the decision, and then you can reconvene after gathering that information.

Notion vs Google Sheets

As a rule, I've come to prefer Notion to Google Workspace. But for ease/value rankings, I favour Google Sheets, for a few reasons:

  • Cell-level comments. In Notion, you can only comment at the level of a row. This makes it harder for each person to comment on their particular criterion and score (e.g. my score for Ease for Idea 1).
  • Conditional formatting. Google Sheets can colour cells automatically based on a colour-scale you define, relative to a range or other cells, so that you can [see at a glance](https://en.wikipedia.org/wiki/Parallel_processing_(psychology)) which idea-rows look best, or where there's disagreement.
  • Easier to add new users. The sheets in the template are built for 3 people, but it's fairly easy to add more people in Google Sheets. Just right-click on the columns for Person 2 (the middle of the 3), and insert a new column to one side. That'll automatically update the aggregation functions (they operate over a cell range, so a column inserted inside the range gets picked up). You still have to do this in three places (for Ease, for Value, and for Overall), but it's fairly straightforward. In contrast, I think adding new people in Notion would involve a lot more fiddliness, including updating all the aggregation formulas.

I'm sure Microsoft Excel or similar would be fine too, as long as they support multi-user collaborative editing.

I have a better way of doing things!

If you've found a way to improve things, I would love to hear from you.

In practice, experiment, and use your judgment. Context matters when picking the best approach for a given decision, team, or moment.