Tuesday, 9 April 2013

Structural Biology - the business end of life.


As part of my Biochemistry degree at Oxford, I had to spend a year focusing on a single research project. My obsession with bioinformatics was already firmly established when Iain Campbell, a leading NMR spectroscopist and structural biologist, took me under his wing. At the time, structural biology was definitely the most computational area of molecular biology, so I was looking forward to getting stuck into a computational project.
[Figure: Type III fibronectin determined by NMR, from I. Campbell's group]
It was great to be immersed in this technical world – the COSY and NOESY spectra, triggered by a series of radio pulses to link up atoms; the rather impressive cooling process, with liquid helium being poured into huge superconducting magnets sunk into the ground; the somewhat scary signs warning people with pacemakers to turn back (the magnetic fields are insanely strong in and around an NMR machine)...
I learned a lot about NMR and structural biology, from the technical aspects of chemical shifts, coupling constants, distance restraints, disordered regions and hydrophobic cores to the elegance of protein structures that manage to fold perfectly to do something so absolutely specific.

Seeing is believing

But … I had already heard the siren call of simpler, linear protein and DNA sequences, and there was this wonderful new institute about to be set up – the Sanger Centre – and I had a chance to work there on sequencing the human genome…
… fast forward 20 years, when I had the pleasure of sitting in the back of the room for the Protein Data Bank in Europe (PDBe) Scientific Advisory Board, now as Associate Director of the EBI. I was still just in awe of the incredible beauty and precision of protein structures, and the skills of structural biologists in uncovering their details.
[Figure: Cryo-electron tomography of sensory cilia]
Some things had not changed in 20 years: dihedral angles are still important, transitions from ordered to disordered are still being explored and the methods are still extremely technically detailed. But other areas have progressed so much they are almost unrecognisable, above all the ability to look at larger complexes with electron microscopy (EM) techniques: single-particle averaging and, even more impressive to me, electron tomography. Electron tomography allows you to reconstruct a single sample in 3D to ~40 Å from images taken at a series of tilt angles – no crystal, no averaging, just this particular sample – like a high-resolution 3D microscopy image. These are spine-tingling images.
So often we have to conceptualise and imagine what is going on in cells. Electron tomography is the closest thing I’ve seen to actually seeing molecular biology in action. One can see little ribosomes, microtubules and proteasomes and complex membrane-associated structures in a bacterial cell, in a single 3D volume.

Keeping up

The wealth of structural data has grown incredibly over the past decade. New techniques such as EM are constantly emerging and structural biology’s workhorse, X-ray crystallography, is continually being refined with better production and crystallisation techniques and tuneable high-energy X-rays from synchrotrons. Light microscopy has also improved vastly, with advances such as super-resolution imaging.
Integrating all the data being produced with these techniques to gain an overall view is an impressive task. It involves fitting X-ray structures into EM maps and then into tomograms, with NMR measurements to provide the dynamics at the atomic level and light microscopy to illuminate the dynamics at the complex and organelle level. There are still so many more protein structures to determine and integrate, and endless discoveries to be made.

Bringing structure to genomics?

All this progress is not just for the benefit of structural biologists. Gerard Kleywegt, who leads PDBe, has a passion for making this information accessible to the broader biology community. Molecular biologists, developmental biologists, geneticists and systems biologists can all make use (or more use...) of structural data.
All too often we forget that linear sequence shows only how information is encoded, not how it is used. Most of what happens outside the nucleus, and certainly the vast majority of the “doing” of life, is executed by either proteins or RNAs folded up into specific structures and collaborating in specific complexes. We know a lot about these structures and complexes – 4,717 proteins (23% of protein-coding genes) have at least one structure (many proteins have far more than one), and these structures cover 42% of the residues in those proteins (around 11% of protein residues overall). When we expand this to things we can confidently model, the coverage goes up.
I am sure that in my own research area – genomics – we’re not taking enough advantage of this information. We might think about structural biology as the final mechanistic determination of why one allele has an effect or not, but can we integrate structural information to make our statistical genetic tests more powerful? Can we use the collection of protein structures of transcription factors (often bound to DNA) to help interpret DNaseI footprinting results? Or use protein-complex information to inform epistasis models, potentially at the level of a residue or a patch of protein, not just at the gene level?
Many fields use structural information in all sorts of ways but I am sure the integration of different structural techniques, and the integration of that structural information with other experiments and knowledge – chemistry, pathways, gene expression, proteomics – is going to be amazing.
Part of me wonders why I chose to stick with the “boring” world of linear, four-base DNA sequence some 20 years ago. I guess there’s always time to learn some new tricks…

Monday, 4 March 2013

The EBI's new websites...


Two years ago a revolution started within the EBI. We took a good, hard look at all of our interfaces, and decided to put our users at the centre of our design. The first fruits of this revolution are out today, as our fleet of websites is re-launched with a unified look and feel and consistent navigation. The "top" page is at www.ebi.ac.uk 

What prompted the change?

Our users, of course! We do a lot of usability testing, but usually one resource at a time. When we began to systematically study the way people consume data from the EBI overall, we could see that both our regular and casual users were dipping in and taking little bits of data, rather than navigating deeper. Most of the people using our services weren’t aware that we have more than one database or tool (and if they were, they normally did not have enough time to go exploring). Our interfaces had been stopping people in their tracks on the way to what they wanted – the very opposite of what we’re trying to do – and at the same time not helping them broaden their perspective when they did have a coffee in their hand and a spare 10 minutes.

The first part of our mission is enabling discovery. One of the most important things that we do is to make public data available to researchers – but it’s almost as important to make it accessible. The redesign hasn’t been just about making things look better – the main reason we did it is to help scientists find the information they want, explore how biological objects are interconnected and take every opportunity to move their science forward.

Who needs philosophy, anyway?

It’s hard to put a face to an ‘EBI user’, because each of our resources – not to mention the global site – aims to serve a different community. So one of the sea changes that occurred at the EBI two years ago was that we adopted a user-centred-design philosophy in our web development.

Yes, a philosophy. Honestly, I was really curmudgeonly about this at first. After all, who needs a fancy name for “listening to users”? And who needs all this wishy-washy stuff like making personas, comics and storyboards?

Well, actually, we did. We needed a single, simple, driving philosophy and every tool available to communicate clearly with one another and achieve consensus. If nothing else, adopting UX methods helped us get our web developers talking to one another – and believe me, that doesn't always happen naturally. We couldn’t even think about giving our users a unified experience until we were all speaking the same visual language.

Test early, and test often

Involving users has to be done right, with observation rather than instruction, and it needs the guidance of user experience (UX) professionals. The kind of testing we did involved carefully setting out specific tasks, building scenarios (e.g., “You’ve just run an RNAi experiment and the top gene on your list is ASF1, and you’d like to discover more about it using EBI services”), and observing what happens, listening as the user talks about what they are seeing and what choices they’re making. This has been very revealing.

We tested many potential designs using paper mock-ups (very low-fi), with someone playing the part of ‘web server’, swapping out drawings or web pages as the user ‘navigates’ by pointing at sketches of buttons or menus. It seems a bit comic at first – certainly the ‘web server’ might feel a bit silly – but it’s effective and … strangely liberating. You can generate and feel happy to throw away lots of different test designs.

Eventually we got to the ‘hi-fi’ testing, using HTML pages, where the details start to become important. This requires a much bigger time investment, so it could be a bit of a flash point. It’s easy to get attached to a design you put a lot of time into, and to cling to elements you particularly like. Testing a lo-fi version is easier for both the developer and the user, because it’s about high-level layout (and things that don’t work are easy to throw out). But at the hi-fi stage the iconography and presentation have to be just right, and the stakes are a bit higher.

So what did the users do? Sometimes they’d jump out to look at Google or Wikipedia (and we didn’t say, “No, look over here! It’s on the left!”), and sometimes they’d go where we’d hoped (e.g., “This looks interesting… I think I’ll click here…”). A lot of their decisions to explore the site further were prompted by visuals (thumbnails, appealing icons). We went through a lot of layouts before we could see a clear path to consistent design.

One useful exercise to come out of this was the development of personas for each of our main areas. You can’t always have a group of users on hand to test things, but having a name to put to a fictional type of user in any given scenario (e.g., “Joe would need more detail there”) gives a much-needed structure to ongoing development processes. It helps people try to see things from the user perspective in a consistent way, any time a change is needed.

Big Brands, Little Brands

Although a lot of the work was focused on the deeper things (layout and content), we really grappled with visual flow. We were hearing criticism – from users and advisors alike – that EBI resources were ‘poorly integrated’. Well, we had actually done a lot of integration between the resources so this was a bit puzzling. But when we dug a bit deeper we could see that the problem was really that the resources didn’t offer a consistent experience – and some of this was the "boring" business of fonts, colours and layout. We know how that came about (many of our resources are developed with their own rhythm, often in collaboration with other institutes), but we don’t need our users to be distracted by it.

We could have gone about changing this in a really naïve way, trying to force everyone to use, say, green, and to put a giant stripe of navigation along the side. But that would be counterproductive, and we’d lose all the ground each of the services had gained by doing their own user outreach. InterPro, ArrayExpress, PDBe: these names mean something to people, and their brands are important to the community they serve. So how to square it all?

The BBC comes to the EBI

The EBI has a huge amount of data to serve to a lot of different communities. Sound familiar? We took great inspiration from the BBC – another ‘multi-brand’ operation serving up different data types (radio, TV programmes, websites...) to different audiences through different channels. Anyone who’s lived in the UK can tell you there is a world of difference between BBC Radio 1 (music radio for teenagers and 'young adults'), BBC Radio 5 (a.k.a. “Radio Bloke” – sports and music) and BBC 4 (more highbrow).

We were very fortunate to have Bronwyn Van Der Merwe come and talk to a big group of EBI developers about how the BBC went about revamping their sprawling website. They first set out to pinpoint how their customers knew that the websites for Doctor Who (a sci-fi TV programme with lots of episodes) and the shipping forecast (a very niche radio programme with a cult following in the UK) belonged to the same company. They wanted people to pay attention to the content but to sense somehow that it was BBC content, without having it shouted at them.

What they achieved was an interesting blend of strong individual brands (e.g. Doctor Who) and a unified, consistent identity. They became a real ‘brand family’. Spend some time wandering around BBC websites and you will get this sense of continuity as you move between distinctly different sub-sites (news, weather, sport, shows, etc.). They don’t look the same at all. They just feel… same-ish.

Guidance

Our redesign, like the BBC’s, is based on very fundamental principles, rather than on trappings: grid, font, navigation, iconography and colour hierarchy. And, like the BBC, these guidelines are being applied to our many, many sites over time, as they implement changes dictated by the needs of their own communities.

In some ways these guidelines have given our resources more freedom: they can choose their colours (but have rules about where to use them), they can keep their logo (but have guidelines about where to put it), and their content remains their own. But also, it is very useful for the developers working on the scientific content of these resources to have guidelines about all the things they don’t really want to spend time thinking about or making. (Honestly, who wants to go and make yet another email icon, and argue about where to put it?)

We are really pleased at the result: our resource brands have actually become even more prominent than before. There is a consistent look and feel throughout the EBI websites, but you definitely know where you are when you’re in Metabolights.

Probably the thing we spent the most time working out, with each other and with users, was how to encourage people to search across the EBI without distracting them from their local (expected) search. I’ll be interested to see what people have to say about this, and what happens in future rounds of testing.

Rollout

You might notice that not all EBI websites have made the big change. We’ve given each of them time to switch over, knowing they have to see to the immediate needs of their communities first. Some have high-priority features to develop; some need to change their internal APIs, etc. These sites are sporting the new global header and footer but, by and large, they are reusing their old framework (compare the compliant Metabolights site to the ‘mitigated’ ChEMBL site).

I think the moment it clicked for me that our strategy had worked was when my colleague Henning Hermjakob, who is responsible for a number of ‘mitigated’ (and politically complex) websites, said that surely we could have a central piece of html for the header and footer that everyone would use straightaway. He didn’t even question that we should share consistent headers and footers, complete with the new EBI logo and black band up top.

I know all the mitigated sites are keen to shift to the new web guidelines, and we aim to have everything switched over within the next year or two.

What makes a good website?

The techniques aren’t really the important thing here (although they’re fun to see in action). What we are really trying to do is to think and develop with the user’s perspective in mind. It’s not as simple as it sounds. The people leading, running and developing websites – from strategy to pixel shifting – are about as far as you can get from a sample of users. Everyone has strong, instinctive feelings about what makes a ‘good’ or ‘bad’ website, but when it’s your website, you are simply too deeply involved to make those calls on behalf of your users. At best you have blind spots; at worst your instincts can be downright misleading.

Probably the nicest thing about UX is that it’s about making data-driven decisions in design. (The reality is a bit messier – there are other, very real drivers to consider in the design of public services.) But at base, the approach is to apply common sense and data to question every detail, right from the start. This appeals to my science mind – I hate having arguments without data.

We hope to gain more than good user interactions from this shift. We’ve put the user front-and-centre in our website design not because we want to look good (although that would be nice, too), but because we want to help people do good science. So take a look at what we’ve done, and don’t compare it to what we used to do. Just keep looking, and finding connections across the world of biological data right here on your doorstep.