To summarize, I’m interested in looking at interbreeding oaks, using their chloroplast DNA, nuclear DNA, and geographic locations. Since I work with DNA sequences, file sizes can get quite large, quite fast! For example, my chloroplast DNA dataset is the smaller of the two, and it contains 91 different samples, each of which has a sequence about 130,000 bases long. If you do the math, that’s almost 12 million individual bases!
On one hand, most professionals work with much more data than this, but on the other hand, I can’t settle down on the couch with my trusty abacus and a pencil and paper to work, either.
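For the curious, here is that back-of-the-envelope math as a tiny Python sketch. The variable names are just for illustration, and it assumes every sample's sequence is exactly 130,000 bases long, so treat the total as an estimate rather than an exact count.

```python
# Figures from the chloroplast dataset described above.
# Assumption: every sequence is treated as exactly 130,000 bases long.
num_samples = 91
bases_per_sample = 130_000

# Total number of individual bases across all samples.
total_bases = num_samples * bases_per_sample

print(f"{total_bases:,} bases")  # 11,830,000 bases
```

That works out to 11.83 million bases, which is where "almost 12 million" comes from.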
The data in this study was previously generated for other studies that my mentors and collaborators have been working on, so I’ve been working solely on the data analysis. Everything from processing the gene sequences, to building the phylogenetic trees, to modeling the equation takes computational power, anywhere from a few seconds’ worth to a few days’ worth.
This is where the second part of why my research is important, at least to me, comes in. Back in the old days, computer science and biological science were separate fields with little in common. Of course, that’s drastically changed with the shifts that biological science has taken towards doing molecular work. Biological studies can generate so much data, especially using DNA sequences, that there’s no way to handle the volume except by computer. Ecological fields have also benefited from computers, which can help us link geography to years of study data more easily than ever. With computers, we can tackle problems and answer questions that couldn’t be answered just thirty years ago.
I want to work at the interface of computers and biology, so this project is important to me, personally, because it brings me closer to that goal. I’ve learned so much from Andrew and my other collaborators about how we manage biological data and what we can glean from it.
Of course, working with computers isn’t always so glamorous. This is my usual sight for seven and a half hours of my day:
I spend the majority of my time coding (read: hunched over a computer).
My interest in computers, however, is fairly recent. Last summer, I didn’t know how to code at all. Heck, I didn’t know what the purpose of coding was. The extent of my computer knowledge was that if my computer froze, I should turn it off and turn it back on. (I’ll admit, this is still my most important computer skill set.)
Yet I had to totally re-evaluate my goals and interests when Andrew tossed me a project where coding was a necessity. I needed to learn how to code, and fast, if I wanted to finish my project. I wasn’t expecting to find that coding was a puzzle, almost like learning a new language, or that it was actually fun. Not bad for a college student who’d never so much as tried to jailbreak her phone.
My experience taught me that you don’t need to be a “computer person” to code. Until I had no choice, I had vehemently resisted my mother’s encouragement to try computer science, using that very excuse. Yet if I had to choose anything else to study now, it would be computer science for sure. I’d definitely encourage anyone who’s even vaguely curious to give comp sci a chance! You may be pleasantly surprised, not just at how useful coding is, but more importantly, at how fun it can be.
While I may not get the thrill of seeing trees grow before my eyes, or of testing branch strength by tearing branches off dead trees with heavy machinery, I do get to experience the thrill of finally getting a script to work after hours of frustration. I get to see patterns emerge from a sea of data.
And that thrill? It’s totally worth the glowing screen eye strain.