Serendipity 2009: Keep learning! - Elf M. Sternberg

I eventually tired of working at the video streaming service. It wasn’t paying very well at all, and the challenge of bootstrapping a video streaming website soon gave way to the tedium of maintaining a media-centered CMS with a customized catalog. Weekly specials, micro-sites for film festivals, director’s specials, interviews, events and new releases were all bread-and-butter boring.

So when I went looking, I stumbled upon a bioinformatics startup. These guys had one of the cheapest human genome alignment toolkits in the world; it could be run on AWS and would do a human being in less than three days, a miracle at a time when normal sequencing software frequently took weeks. Part of the miracle was figuring out how to map/reduce the problem; I never understood the algorithm, but apparently it was a big freaking deal.

The engine was written in C++ with a Python front-end that spoke REST. It was all they had; you could give it the S3 address of a pool of genetic data and tell it what animal you thought the pool was from, and a short while later it would tell you a lot about the beastie (human, yeast, E. coli, that kind of thing; yeast was popular to test against because yeast have fairly small genomes; humans took a weekend, yeast was done in minutes). And that was all they had.

When I signed on, my duty was two-fold: first, write a front-end that spoke to the sequencing engine and let customers configure and launch processing jobs, manage their data sets, and control jobs that were unproductive; second, write a Django app that took people’s money and gave them access to the sequencing engine. The guy who wrote the REST layer did me two favors: first, he wrote a plug-in that, before a job would launch, would make a query to the Django app asking “Does this guy have enough money?”; second, he made his own endpoint take a list of processing tickets, so data could be sequenced, aligned, filtered, and analyzed all in one command.
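To make the shape of that arrangement concrete, here is a rough Python sketch of a launch endpoint that takes a whole list of tickets and asks the billing side before queuing anything. Everything named here (`launch_pipeline`, `has_funds`, the ticket strings, the response dictionaries) is made up for illustration; the real plug-in made its query to the Django app over the network, and I'm standing in a plain callable for that.

```python
from typing import Callable, List


def launch_pipeline(user_id: str,
                    tickets: List[str],
                    has_funds: Callable[[str, List[str]], bool]) -> dict:
    """Accept or reject a whole list of processing tickets in one request.

    `has_funds` is a hypothetical stand-in for the query the plug-in made
    to the billing app ("Does this guy have enough money?").
    """
    if not tickets:
        return {"status": "rejected", "reason": "no tickets"}
    # Ask the billing side first; nothing is queued if the answer is no.
    if not has_funds(user_id, tickets):
        return {"status": "rejected", "reason": "insufficient credits"}
    # Tickets are queued in the order given, so one request can cover
    # several stages of processing.
    return {"status": "queued", "tickets": list(tickets)}


if __name__ == "__main__":
    steps = ["sequence", "align", "filter", "analyze"]
    print(launch_pipeline("alice", steps, lambda user, t: True))
    print(launch_pipeline("bob", steps, lambda user, t: False))
```

The point of passing the check in as a callable is only that the engine never needs to know how billing works; it just gets a yes or a no before it starts spending compute.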

I had just mastered the art of credit card processing at the video streaming service, as we had pressed out the site’s functionality. I had also recently heard about this “Backbone” thing, which let you handle REST APIs with relative ease. I’d written a little tutorial on it, mostly as an educational thing.

I got the Django end up quickly, then wrote simple front-ends to list out the data, jobs, and processing “pipelines” (that’s what they’re called in bioinformatics), then refined them iteratively to show more and more, and to handle more and more. Along the way, I recalled my university training as an accountant (yes) and looked up how to handle encumbrances; the user’s account could be “encumbered” (temporarily debited) and if a failure occurred that was our fault, the credits could be unencumbered efficiently and returned. The system would report only “credits available” to the engine, to prevent cheating. It was remarkably robust.
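For anyone who hasn't met encumbrances before, a minimal Python sketch of the idea looks something like this. It is not the code from that project; the `Account` class, its method names, and the use of whole-number credits are all assumptions made for the example.

```python
from dataclasses import dataclass, field


class InsufficientCredits(Exception):
    """Raised when an account cannot cover a requested encumbrance."""


@dataclass
class Account:
    # Hypothetical model: whole-number credits the customer has paid for.
    balance: int
    # job_id -> credits temporarily held against that job.
    holds: dict = field(default_factory=dict)

    @property
    def available(self) -> int:
        # The only figure reported outward: balance minus open holds.
        return self.balance - sum(self.holds.values())

    def encumber(self, job_id: str, amount: int) -> None:
        # Place a temporary hold before the job launches.
        if amount > self.available:
            raise InsufficientCredits(job_id)
        self.holds[job_id] = self.holds.get(job_id, 0) + amount

    def settle(self, job_id: str) -> None:
        # Job finished: turn the hold into a real charge.
        self.balance -= self.holds.pop(job_id)

    def release(self, job_id: str) -> None:
        # Job failed through our fault: drop the hold, credits come back.
        self.holds.pop(job_id)


if __name__ == "__main__":
    account = Account(balance=100)
    account.encumber("job-1", 60)
    print(account.available)   # 40 while the hold is open
    account.release("job-1")
    print(account.available)   # 100 again after the release
```

Because a hold never touches `balance` until it is settled, undoing a failed job is just deleting the hold, and a customer can't launch two jobs against the same credits because only `available` is ever checked.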

The serendipity here was (a) being a deep hard SF geek, I knew enough biology not to be utterly, completely lost, (b) I’d just come off a Django assignment that mirrored what they needed, and (c) I’d just run into this Backbone thing and learned it, and it was exactly what they needed.

I also did the site in CoffeeScript, HAML, and Less, which made actually coding the site much faster and easier than doing it in the traditional tools. The lead engineer later said my choices made it “difficult to find people who knew them, but incredibly easy to train people to use,” so it was half a win.
