Does More Data Mean Better Decisions?

So this post is partly in response to a comment from Roland Harwood asking about 23andMe, but mostly me thinking out loud about data and risks. While this post is in response to the data generated by 23andMe, it is applicable to almost any isolated set of data.

23andMe, if you haven’t heard of it, is a DNA analysis service. They take a DNA sample from saliva, analyse it and, based on an automated database, tell you your risk factors for succumbing to various medical conditions, along with the genetic traits that you carry (whether you are a carrier for a genetic condition, say, or have a genetic indication for an allergy). It is a really interesting proposition, and I have toyed with the idea of sending off for an analysis myself, especially as the price point has fallen to under a hundred dollars.

However, there are a few things that leave me a little disquieted about it, some of which come up frequently in discussions around 23andMe, and others not so much. These are my thoughts as they are now, rather than any conclusive opinions, so they are very much up for comment and debate. None of them are a criticism of the system itself, more an observation that we are only just getting used to dealing with this sort of data.

The first concern is the issue of medical diagnostics being delivered in isolation. If you are going to receive some news that is potentially life changing, then having the right professional help and support on hand is psychologically very important. Perhaps that is there, but I have never heard it mentioned. Data, without the right experts to interpret it, can be a very disturbing thing.

The second is to do with how we process information, and our cognitive biases. I’ve spent much of the last several years studying these biases, trying to design systems for businesses that help to avoid the issues cognitive biases cause. There is a particular set of biases (availability bias, hindsight bias, confirmation bias, …) that we have around assessing risk, which essentially boil down to this: we react disproportionately to perceived risk. If I tell you that there is a 30% chance that you will die if you go to work by your normal route tomorrow, most people would think about changing their route. But that is a meaningless piece of data. Risks, out of context, aren’t helpful. If I tell you that there is also an 80% chance that you will get killed in a car accident on the alternative route, then the normal route is actually the safer one (you might actually decide to stay in bed ;) ). A risk is meaningless unless you balance it against the risks of the alternatives.
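To make the journey example concrete, here is a minimal sketch in Python. The function name `safest_option` and the dictionary layout are my own illustrative choices, and the probabilities are the made-up figures from the example above, not real statistics. The point is only that a risk becomes meaningful once it sits next to the alternatives.

```python
def safest_option(risks):
    """Return the (name, probability) pair with the lowest risk.

    `risks` maps an option name to its estimated probability of harm.
    A hypothetical helper for illustration only.
    """
    if not risks:
        raise ValueError("need at least one option to compare")
    name = min(risks, key=risks.get)
    return name, risks[name]


# Illustrative figures from the journey example, not real data.
route_risks = {"normal route": 0.30, "alternative route": 0.80}

best, risk = safest_option(route_risks)
print(f"Lowest risk: {best} ({risk:.0%})")
```

On its own, 30% looks alarming; compared with 80% it is the better choice. The sketch deliberately leaves out other options (staying in bed, for instance) and how uncertain the estimates themselves are.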

Risk is often presented out of context. Responding to risks kicks off a long chain of causality. If I choose to have surgery to correct or mitigate a genetic defect, then that surgery obviously carries a risk, but downstream from that, I have changed all sorts of other risk factors. It is one of the reasons that John Boyd came up with the OODA loop. Risks (threats) have to be constantly monitored and responded to. It isn’t a one-time event. A one-off diagnostic can give a false sense of security, as much as it can give a false sense of risk.

The next thing, aside from the issues of assessing probabilities and risks, is that we aren’t good at making judgements about events that are a long way in the future – for example diseases that we might succumb to later in life. There is a whole body of research around risk/reward ratios and timing, which again shows that we don’t deal with this sort of data accurately, at least not unaided. The key here is that while the data may be very scientifically valid and sound, it can cause us to do some unsound things, because it is difficult to process unaided.

The last is that 23andMe is based on science-in-progress. We are still learning about genetics, heritability and what happens when we respond to them. A lot of the outputs that I have seen fall into the ‘well duh!’ category of health advice: eat healthily, exercise and so on. All the kinds of things that people who take good care of their bodies tell me that I should do more of, leaving me, rightly, feeling a bit guilty. I don’t need to shell out money for that advice; I can just hang out with some of my healthy friends, and take their advice on the chin.

We have more and more access to data. That doesn’t make us any smarter, and it doesn’t necessarily make us any more likely to make good decisions. The issue is the difference between informed and uninformed decisions. Data can be good, and help us make good decisions, but being misinformed (i.e. being informed by data that is inaccurate or only estimated, or that is misinterpreted or presented out of context) can be worse than being uninformed.

Data doesn’t always help with making better decisions. It is good to be informed; it is not good to be misinformed, especially if that leads you to take riskier decisions. When looking at information:

  • Keep things in context – back to the journey to work example. What are you balancing risks against?
  • Understand the quality of the data – what is the possibility that it is inaccurate or incomplete?
  • Look for counter-indicators – don’t respond to single pieces of data.
  • Compare like with like – risks and issues are different things. Don’t compare the past with the future.

If you want to read more, Noreena Hertz has written a good piece in the NYT, Why We Make Bad Decisions (which is also a good plug for her book).
