In this blog post, I’ll be discussing my experience as a software engineering intern at Ginkgo, what I’ve worked on, and what I’ve learned. For perspectives from earlier this summer, see this post.
How did I learn about Ginkgo?
One day last fall, I came across this video of one of Ginkgo’s founders, Reshma, talking about Ginkgo and the story of its growth at the YC Female Founders Conference. The company’s mission of making biology easier to engineer stuck with me, and soon I was down an internet rabbit hole of Ginkgo-related content (this Seeker video and this old-timey Ginkgo promotional video are two I especially recommend!). A biotechnology company like Ginkgo might seem like an unconventional choice for someone who’s interested in computer science, but as you’ll see here and in other GinkgoBits blog posts, software engineering is as central to Ginkgo as it is to traditional tech companies.
Project 1: The Strand Microservice
My project focuses on building the first of a series of services that Ginkgo has planned to handle bioinformatics operations outside of the legacy app where their logic is currently housed. This service, called “Strand,” anneals DNA (joins complementary strands). In my contribution to the earlier internship post, I mentioned that this project consists of two phases:
Using the AWS CDK to stand up the architecture for this service (an ECS Fargate cluster, API Gateway, Lambda, SQS queue, and DynamoDB table).
Connecting these components, incorporating the anneal logic, and ensuring that the service can handle batches of operations (the average batch size is a few hundred, but batches can grow to thousands).
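To make the batching concern in the second phase a little more concrete, here is a minimal sketch in Python. It is not the Strand service’s actual code: the function names are hypothetical, the "anneal" check is a toy that only accepts perfect reverse complements, and it assumes the batch is fanned out through SQS, where a single SendMessageBatch call accepts at most 10 entries.

```python
from typing import Iterator, List, Sequence, Tuple

# All names here are hypothetical illustrations, not Ginkgo's Strand code.
# SQS SendMessageBatch accepts at most 10 entries per call.
SQS_MAX_BATCH_ENTRIES = 10

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

StrandPair = Tuple[str, str]


def reverse_complement(strand: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return strand.upper().translate(_COMPLEMENT)[::-1]


def can_anneal(top: str, bottom: str) -> bool:
    """Toy check: treat two strands as annealable only if they are
    perfect reverse complements (real annealing logic is more involved)."""
    return reverse_complement(top) == bottom.upper()


def chunked(
    pairs: Sequence[StrandPair], size: int = SQS_MAX_BATCH_ENTRIES
) -> Iterator[List[StrandPair]]:
    """Split a large request into queue-sized chunks, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(pairs), size):
        yield list(pairs[start:start + size])


if __name__ == "__main__":
    request = [("ATGC", "GCAT")] * 25  # a small stand-in for a batch of hundreds
    for number, chunk in enumerate(chunked(request), start=1):
        results = [can_anneal(top, bottom) for top, bottom in chunk]
        print(f"chunk {number}: {len(chunk)} pairs, all anneal: {all(results)}")
```

Splitting the request up front keeps each queue message small and lets workers process chunks independently, which matters once a batch grows from hundreds of operations to thousands.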
As I work towards wrapping up this second phase, here are a few lessons that this project has reinforced:
The importance of separation of concerns. As you might imagine, this project involves a lot of code: both code that I wrote and logic that already existed in Ginkgo’s code base. Creating separate Python packages, so that infrastructure code (such as setting up the architecture with the CDK) is isolated from the code containing the anneal logic and all of its dependencies, made the service much easier to maintain.
Trying out any new technology comes with potential issues, but is always really exciting. As I mentioned before, this project involves using the AWS CDK, a relatively new technology (launched in summer 2019) that hadn’t been used in Ginkgo projects before. Aside from a few minor issues I noticed with the CDK documentation, working with this technology has been great: it supports developer-friendly languages (this project is in Python), it’s easy to integrate with CI/CD, and it can be used to quickly provision resources with relatively few lines of code.
Project 2: The Central Event Bus for Concentric
Approximately halfway through my internship, I had the opportunity to pause my project for three weeks and spend that time working with my mentor, Ann, on a project to support Concentric, Ginkgo’s COVID-19 testing service. The Central Event Bus is the backbone for data flow and eventing for the service, and its importance is amplified by Ginkgo’s plan to scale COVID-19 testing to millions of samples per day. Having never worked with eventing systems before, I found this a great opportunity to learn more from my mentor. Some of the most interesting aspects of this project were learning about and working with shard management in Kinesis data streams and managing versioning and schema evolution in Avro.
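For readers who haven’t worked with Kinesis, the sketch below illustrates why shard management matters. Kinesis hashes each record’s partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The helper names and the even split of the hash space are assumptions for illustration only; this is not the Central Event Bus code, and it doesn’t call AWS.

```python
import hashlib
from typing import List, Tuple

# Hypothetical helpers for illustration; no AWS calls are made.
HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value

HashRange = Tuple[int, int]  # inclusive (start, end) hash keys


def even_shard_ranges(shard_count: int) -> List[HashRange]:
    """Divide the 128-bit hash space into contiguous, inclusive ranges."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    step = HASH_SPACE // shard_count
    ranges = []
    for index in range(shard_count):
        start = index * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if index == shard_count - 1 else start + step - 1
        ranges.append((start, end))
    return ranges


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, ranges: List[HashRange]) -> int:
    """Return the index of the shard whose range contains the key's hash."""
    key = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= key <= end:
            return index
    raise ValueError("ranges do not cover the hash space")


if __name__ == "__main__":
    ranges = even_shard_ranges(4)
    for sample_id in ("sample-001", "sample-002", "sample-003"):
        print(sample_id, "-> shard", shard_for(sample_id, ranges))
```

Because routing depends only on the partition key’s hash, a key with disproportionate traffic can overload a single shard, which is one reason splitting and merging shards becomes important as volume grows.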
Final Thoughts
Throughout my internship, I’ve gotten to do more than just code: I’ve been able to serve on Ginkgo’s Diversity Council, tweet for GinkgoBits, participate in escape rooms and trivia tournaments with other members of the Digital Tech team, and much more. Despite the internship being remote, I’ve gotten to connect with and learn from other Bilobans just as well as if it had been in person. Special thanks to everyone who made this internship a great experience!
(Feature photo by Les Triconautes on Unsplash)
Posted by Vidya Raghvendra