Monday, 27 February 2012

East meets West (Science)

A couple of weeks ago, in sunny Florida at AGBT, recovering from a somewhat too hard evening's ... "discussion", I chatted to someone about the future of sequencing. We ranged across the models of large scale centres vs smaller units; of the never-ending need for bioinformatics and trying to stay on top of everything. "Oh well," my lunch companion continued, "we're all going to be competing with a billion Chinese, so that will solve the informatics problem". I was reminded of this only this weekend when someone from the Oil and Gas industry asked me about the scientific impact of China, citing the opportunity but also the perceived threat to business.

This is a common theme I meet, in particular amongst policy people. They cite the growth of the East, in particular China, and the shift "upwards" in the food chain from unskilled manufacturing to skilled, graduate work, as a major "challenge" for the West. And of course, this being a discussion about sequencing and China, we have to talk about BGI, where sometimes the somewhat tired stories of a thousand bioinformaticians working in an old shoe factory in Shenzhen come out.

But I think this is a profoundly wrong concern. Although we will of course be changing what we do in the next 5 years, this will be far more due to technology and scientific progress than to a change in the geographic spread of scientists. Furthermore, the rise of Chinese/Indian/Brazilian science is far more an opportunity than a threat.

The first thing to realise is that science is a very international process; it is nearly totally conducted in one language (English, thankfully for those people who have English as a mother tongue). Scientists move between countries regularly, conferences draw in speakers from around the world, journals accept publications from any country and collaborations are sparked up between any willing investigators. Perhaps 10 or 15 years ago there was the concept of being a "Nationally" competitive scientist, but this I think has almost completely gone. Being one of the top cod geneticists in Norway, when Norway leads in cod genetics, is relevant whether you are "1st" or "20th" in Norway; in contrast, the "leading" cod geneticist from Brazil (if he or she exists!) I presume is not making a big impact (apologies if she/he is and this is not a good analogy). The question is not your national standing, it is your international standing. Every (good) scientist knows this and judges him or herself by this.

This means that scientists are already being "assessed" on a worldwide stage. I am sure that China and India will continue to produce more and more graduates, some brilliant, many very good, and many more "reasonable" graduates; talent in science is fundamentally a flat, even process given the right institutional environment. Indeed there are already brilliant and great Chinese and Indian (and Brazilian, and Russian and South African...) scientists, some of whom work in their home countries, and many of whom have moved, as scientists do, elsewhere - just as there are great American scientists in the UK, or great French scientists in the US.

Even if one pro-ratas the number of scientists to the population size, 1.4 billion in China, and 1.1 billion in India, we are already dealing with a worldwide baseline "input" population of 300 million US and 700 million Europeans, so this "just" dilutes out the base 2 fold. You might be scared by that, but a scientist who is trying to be top of his or her game should be less so - each scientist is already competing in this large pool, and usually competing by being a bit different from his or her neighbour in how they approach things. And a crude pro-rata view is not right - firstly there are good Chinese and Indian scientists now, and secondly the rise won't happen overnight.

But even this view is far too pessimistic. Science is not a zero-sum game; aspects of it look like a zero-sum game (in particular the grant funding process, in which quite explicitly the amount of money is divided up amongst participating scientists, with some getting zero) but the way science works overwhelmingly favours the scientists who work collaboratively with other groups. The ideas that come out of this, from publications through to patents, gain far more from the joint work than is lost through any small drop in relative competitiveness from being ranked in a bigger pool.

The thought experiment is to imagine that there was an information firewall between the US and Europe over the last fifty years. What would have happened? Well - probably roughly the same scientists would have been funded (all things being equal), and in many cases a similar set of discoveries probably made - but unequally over time, and so for the cases where one needed to discover both A and B to work out C, one would have to wait longer in the separated scenario than in the joint one. In the worst case, working out C was not an "inevitable" part of discovery but needed a rather unique set of people in a unique environment - some meeting, or a workshop or a bar, or an airport lounge (all of which are places where I've struck up lasting collaborations). These would never have been realised. Some - perhaps most - of these discoveries would have been the everyday part of science; things on which scientists make progress but which don't make an impact outside a specific field. But, occasionally, the missing discovery would have been profound, or perhaps more worryingly, the critical bit before a profound discovery. Who knows what it could have been - the development of monoclonal antibodies? The development of Sanger style capillary sequencing? Or of a 2-D surface sequencing chemistry? Or of full bone marrow transplants for leukemia? Or of oncogenes such as RAS? In each of these cases, the scientific progress worldwide wove between the US and Europe, and the output of the joint system is far, far higher than the sum of the two.

Now imagine if we don't exchange information with China/Brazil/Russia. We will be missing a huge opportunity in this synergy. Of course it is fair and sensible for this opportunity (and so the results of this synergy) to be shared equally - from the fjords of Norway through the Cerrado of Brazil to the plains of the US Midwest and the mountains of China. This is not a hard thing to do in Science (it is more complex in the tangled world of business with IP, Trademarks and other things) as for the most part the "rule" is to share (on publication) what you know. You might claim that these countries aim more to consume and parasitise than to contribute, but that's not how science works - a passive, consuming mode means you are always 6 months behind the real work, and almost always incapable of exploiting the work for your own discoveries. Science is an open process, and works in the open.

So, critically, as scientists we should not fear the rise of China, India and Brazil. For every discovery made by a Chinese group, there will be 3 more discoveries made in collaboration with Chinese groups; for the developments made in Brazil based on current science, they will return the favour in 3 or 5 years when we read their papers. Indian scientists visiting Europe will learn from us, but soon enough we will also learn from them. There is too much, not too little, to learn about in this world, and we need more brains on the problems, not fewer.

We should embrace it, enjoy it, and focus (as ever) on doing world-leading science.

Thursday, 2 February 2012

Career path: a bit more detail.

Last month, Rolf Apweiler and I were kindly asked by Nature Jobs to comment on our career paths to becoming Associate Directors at the EBI (which will happen in April). The final piece focuses on just one of our career paths (mine). You can read the article here:

It is very positive about me, about the EBI and about the other institutions I went through. It's a good thing, mainly because it raises the EBI's profile. However, some important things were missed that I would like to mention here - in particular, I think it's important to acknowledge the people I've been lucky enough to work with over the years, and who have made a positive impact on my career.

So, in the manner of "Have I got a bit more news for you", i.e. the slightly more expanded view, here are my additions to the interview.

How did you launch your scientific career?

I was always drawn to biology. After secondary school, I went to Cold Spring Harbor Laboratory (CSHL) in New York for my gap year and really fell in love with science. This was mainly because of the atmosphere, getting stuck into hard problems and knowing that there was no such thing as a stupid question. I worked in Adrian Krainer's laboratory, where I did both experimental and computational work and ended up publishing my very first paper. A year later, when I was at Oxford, Adrian and I published a more substantial paper.

Although Jim Watson at CSHL did have a direct impact on my career, I'd like to emphasise that all the other people at CSHL - Adrian Krainer, Rich Roberts, Winship Herr - had just as much influence, if not more. I spent quite some time with Jim, who is undoubtedly a very interesting character (eccentrically, I lived in his house for most of that year). By all accounts Jim was a big part of CSHL's transformation from a sleepy summer camp for science to a place with a unique environment and attitude that I found really inspiring.

Describe the most significant turning point in your career?

I was a graduate student at the Wellcome Trust Sanger Institute (it was called the Sanger Centre then...) with my mentor Richard Durbin, and was involved in the Human Genome Project. It was a totally crazy time because of the competition between the public project and the private one. Because of this, I effectively skipped the postdoc phase, going straight from being a PhD student to being a Team Leader (PI level) at EBI.

What was your biggest accomplishment there?

Three of us - Michele Clamp, Tim Hubbard and myself - started the Ensembl project to provide annotation of the human genome. All three of us worked like idiots, sometimes having to come up with entirely novel solutions to a problem within weeks.

How did you come to bioinformatics?

Crafting a computer method to do precisely what you want - in a search or analysis - is at the heart of bioinformatics. In general I think about what problem I would like to address, then try to fit a method to it. Ever since I was taught C programming by Sanjay Kumar at CSHL, I've used whatever methods or tools I can find to resolve a biological question. For example, very early I adapted a programme to scan databases of expressed sequence tags, short bits of DNA that help identify genes. I continue to teach myself programming and statistics. Often I find myself rediscovering a more fundamental bit of computer science or statistics. For example, de Bruijn graphs are used by computer scientists to understand concepts in combinatorial mathematics - but I had to play around with another topic (data structure compression) to truly understand their potential applications to assemblies. Once I did, it opened up new opportunities in solving assembly problems, which my PhD student Daniel Zerbino and I then exploited.
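To make the de Bruijn graph idea concrete, here is a minimal Python sketch. It is only an illustration of the general technique, not the method used in any particular assembler: the function name `de_bruijn_graph`, the toy reads and the choice of k = 3 are all assumptions made for this example. Each k-letter substring of a read becomes an edge from its first k-1 letters to its last k-1 letters, so overlapping reads end up sharing nodes.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a toy de Bruijn graph (hypothetical helper, for illustration only).

    Nodes are (k-1)-mers; every k-mer found in a read adds an edge from
    its prefix (first k-1 letters) to its suffix (last k-1 letters).
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    # Sort the successors so the result is deterministic and easy to read.
    return {node: sorted(successors) for node, successors in graph.items()}


# Two made-up overlapping reads; together they spell out "ACGTCA".
reads = ["ACGTC", "CGTCA"]
graph = de_bruijn_graph(reads, k=3)

for node, successors in sorted(graph.items()):
    print(node, "->", ", ".join(successors))
```

Walking the edges from AC through CG, GT and TC to CA recovers the sequence that both reads came from, which is the intuition behind using these graphs for assembly. Real data brings sequencing errors, repeats and reverse complements, none of which this sketch attempts to handle.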

There is this curious interplay between learning statistics and computational methods, and rediscovering them to tackle a specific problem. Very often we use established computational or statistical methods, but they may need a slight twist to make them really fit the biological problem. While I make no claim to being a hard-core statistician, I think that 'slight twist' is perhaps my niche. Bioinformatics is not quite as easy as leafing through a standard stats text and choosing the right published method; an element of creativity to find the way to tweak a method or recast a problem is absolutely critical.

Bioinformatics is part of biology, rather than a stand-alone discipline. The bioinformatics/computational biology revolution happening now is similar to the 'molecular biology revolution' in the 1970s and 80s, which adapted a very physical, protein-oriented biochemistry to one that made use of cloning, expression, Southerns/Northerns &c. to address biological problems. This new methodology does not invalidate the older proteins/columns/kinetics view of biochemistry; it simply gives another dimension to the toolkit. I would argue that bioinformatics has already become an everyday part of modern biology, and that it has begun to filter through the entire system.

What will be your managerial focus at the EBI?

In 2006 I handed off Ensembl to my colleague Paul Flicek on the EBI side, who has developed the resource far better than I would have. It is wrenching to let go of something you've helped to create. But you have to if you want it to flourish.

Hiring Paul was one of the best decisions I've made. In terms of playing a more managerial role, I think the most important things are hiring good people and taking mentoring seriously. The hard part is shifting from making tactical decisions to making strategic ones.

I also want to add that one thing I'll regret as I take on the new co-Associate Director job is no longer working on the kind of big scientific consortia I've been so involved with over the past decade - human, mouse, chicken and above all ENCODE. That work has been a huge part of my career and I'm sure I will miss being so deeply involved in that kind of research.

What is the secret of your success?

I'm very happy with how this turned out in Nature Jobs: "I just really enjoy the science. And that has helped me to get through some difficult times, when I've pushed myself and others perhaps too hard. The other thing that leads to success is trusting collaborators and the people you hire; I think that often we don't put enough trust in the scientists around us and it hinders progress. I am very lucky to be a part of the EBI and to be moving into a more central role. I enjoy nearly everything about it, although there are one or two meetings I could do without."

But I do want to stress that the people in my career have made all the difference. My mentors Adrian Krainer and Richard Durbin; peers like Tim Hubbard, Michele Clamp and Paul Flicek; students like Daniel Zerbino. It is almost painful to read any profile of my career that does not include all of these names. Others to shout from the rooftops would be Janet Thornton, who has had - and continues to have - a big influence on me. Iain Campbell at Oxford and Toby Gibson at EMBL also played pivotal roles in my development. Other big gaps include: peers like Alex Bateman, Jason Stajich and Lincoln Stein, along with the whole ENCODE crew... Students Laurence Ettwiller, Benedict Paten, Michael Hoffman, Daniel Zerbino, Alison Meynert, Dace Ruklisa and Markus Hsi-yang Fritz... But then, to give the journalist credit, I can see how an article with so many names might start to look like a phone book.

A career profile is usually about one person. But I think there was really a missed opportunity here because Rolf and I have had such different career paths and approaches (understatement), and have now ended up with a shared goal but bringing different skills to the task.

Actually, I think a side-by-side Q&A with me and Rolf would be a good idea for a future blog post!