From Big Data to Bioinformatics and Genomics

A few people have asked me what areas I am working on currently, so here’s a quick synopsis. I am still speaking regularly (I even got called a “science communicator” recently – I’m not entirely sure about that one!), but the bulk of my time continues to be spent on my executive roles, applying big data skills and psychology to business challenges, and creating and managing new products and services. Since last year I have become involved in the intersection of data science and the world of genomics.

If you started school when my parents did (no offence mum and dad), you probably didn’t learn too much about DNA (the structure of DNA wasn’t understood until 1953), and even if you went to school when I did, you probably wouldn’t have learnt very much about how it worked. These days it is part of the science curriculum, but for those of us who are more mature in years, this BBC Explainer video from 2013 will help get you up to speed (it is already a little out of date, that is how fast the space is moving).



In 1990 the Human Genome Project was launched. It was a massive international project to decode the genetic information in human DNA (a sequence of data over 3 billion items long). The efforts had started many decades earlier, before the structure of DNA was understood, but huge advances in technology had to be made to complete the task. It wasn’t until June 2000 that President Clinton and Tony Blair were able to announce that a draft had been completed, and not until 2003 that the work was declared complete.

Current research into genetics and treating diseases stems from the information produced by the Human Genome Project, and a large part of that sequencing effort was completed on what is now the Wellcome Genome Campus in Cambridge, in the UK. The cost of sequencing human DNA has fallen dramatically since the start of the Human Genome Project. Sequencing a person’s DNA is now an economically viable form of research, and genotyping, the process of determining which genetic variants an individual has, is within the reach of the general population. This means researchers now regularly have petabytes of data to work on – truly Big Data, which has its own unique set of challenges when it comes to analysing and understanding it.

DNA isn’t just data, it is a fully functioning machine. Imagine a computer running a program that could build a copy of itself, as well as being able to build and program thousands upon thousands of different sorts of accessories. Our bodies are astonishing molecular machines, as this talk by Drew Berry at TEDxSydney shows (it is on the same YouTube channel as my TEDx talk):



If you want some heavy biology and are interested in the machines inside our bodies, watch the inner life of the cell. If you want something less biological and more emotional, then this next video is for you. This videomercial from Momondo captures the social side well – We are all related:


Although the participants were sourced via an agency, their stories are real – this is the background. The exploration of human DNA hasn’t just helped us to understand that we are all closer than we might have thought, it has also improved cures for cancer and is transforming the world of medicine. That, however, is just the start. We now have the technology to analyse the DNA from our microbiome (the bacteria, archaea, fungi, protists, viruses and microscopic animals that live on and in our bodies), which creates data that is giving researchers insight into many increasingly common health issues (including asthma, irritable bowel syndrome, obesity, and many others).


Data isn’t just for creating pretty data visualizations (or even creating fantastic holographic experiences), it has the power to transform lives…


