This month we’re meeting Carla from our comparative genomics team (which we call compara).
What is your job in Ensembl?
I am a developer in the comparative genomics team. Our job is to compute any resource in Ensembl that involves comparing species. These include whole genome alignments, gene trees and homology predictions.
What do you enjoy about your job?
It’s quite challenging. Most days, it feels like I’m solving a very complex puzzle – one that took many years to learn how to even begin. But I like puzzles!
What are you currently working on?
I’ve just finished writing a pipeline to compute a species tree for all Ensembl species. Previously, we could manually add new species to the tree, but as we import more and more species each release, this has become infeasible. Our new pipeline uses a combination of taxonomic information and Mash distances.
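For readers who haven't come across Mash distances: Mash estimates how similar two genomes are by comparing small MinHash "sketches" of their k-mers rather than aligning them. The snippet below is a toy illustration of that idea only, not the Ensembl pipeline, and the interview doesn't say how the pipeline combines these distances with taxonomy. The function names, the use of `blake2b` as the hash and the parameter values are all assumptions made for the example; the real Mash tool uses canonical k-mers and its own hash function.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sketch(seq, k, size):
    """Bottom-s MinHash sketch: the `size` smallest 64-bit k-mer hashes.

    blake2b is a stand-in hash chosen for this example (an assumption).
    """
    hashes = {
        int.from_bytes(
            hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big"
        )
        for kmer in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def mash_distance(sketch_a, sketch_b, k, size):
    """Estimate the Jaccard index j from two sketches, then convert it
    to a Mash distance: D = -(1/k) * ln(2j / (1 + j))."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # Bottom sketch of the union, then count hashes present in both inputs.
    merged = sorted(set_a | set_b)[:size]
    if not merged:
        return 1.0  # nothing to compare; treat as maximally distant
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    j = shared / len(merged)
    if j == 0:
        return 1.0  # no shared k-mers; distance is capped at 1
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


# Toy usage with two short, made-up sequences that differ in one base.
a = sketch("ACGTTGCAAGGCTA", k=4, size=50)
b = sketch("ACGTTGCAAGGCTT", k=4, size=50)
print(mash_distance(a, b, k=4, size=50))
```

A matrix of pairwise distances like this is the kind of input a tree-building method can work from, which is why sketch-based distances scale to large numbers of genomes.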
What is your typical day?
There are two types of typical day in our team. During the release cycle, running and debugging production pipelines constitutes the majority of the work. The rest of the time, a typical day would include making improvements to our API code and methods.
How did you end up here?
But seriously, I completed both a BSc and a PhD in computational biology back home in Ireland and have been working in the area ever since. It was actually my mother’s idea – she saw my aptitude for biology and computing and suggested the BSc course to me. I really wanted to go to art school and become an animator, but that got shot down by a particularly unenthusiastic careers advisor.
Before working at Ensembl, I worked at the Sanger Institute providing bioinformatics support for the pathogen research team. While there, our team managed to stem a C. difficile outbreak at the local hospital. By sequencing and comparing the genomes of each patient’s infection, we were able to identify the source of infection and, armed with this new information, hospital staff were able to contain and treat it. Turns out the source was the ward’s janitor – and he would have gotten away with it too if it weren’t for us pesky kids...
What surprised you most about Ensembl when you started working here?
The sheer size of the API was quite a shock. Even after 2.5 years of working in the team, I still haven’t used (or even seen) all of it.
What is the coolest tool or data type in Ensembl that you think everybody should know about?
The new species tree pipeline is pretty sweet ;P
We’ve also recently implemented a means to QC our orthologue calls, which gives users some useful extra information about our confidence in any given orthologous pair.