Ph.D. Computer Science & Engineering
University of Washington, Seattle, WA
Visiting PhD student, DTU Center for Biosustainability
Technical University of Denmark, Lyngby, Denmark
M.S. Computer Science & Engineering
University of Washington, Seattle, WA
B.A. Computational Biology
Carleton College, Northfield, MN
Scan Design Foundation Fellowship
Technical University of Denmark (DTU) 2022 Support for research and cultural exchange between Danish and American students
NSF Graduate Research Fellow
University of Washington 2019 Three years of research funding from the National Science Foundation
Marilyn Fries Fellow
University of Washington 2017-2018 First year funding from UW CSE
Magna Cum Laude
Carleton College 2014 Received "Distinction" on Senior Thesis
Clare Boothe Luce Scholar
Carleton College Summer 2012 Summer research funding for women in physics and computer science
My early scientific career has focused on metabolic engineering: the science of using microorganisms as tiny biological factories that can more sustainably produce everyday materials. To do this, we can edit microorganism genomes to convert renewable feedstocks (e.g., sugar) or waste streams (e.g., methane) into a desired target molecule such as medicine, biofuel, or theoretically any molecule found in nature.
My PhD research aimed to develop computational methods to better understand the "genetic grammar" underlying how methane-consuming bacteria control gene expression and use these insights to more efficiently engineer them to convert methane waste into useful materials.
Using deep learning approaches to identify regulatory motifs involved in methanotroph gene regulation, with Mary Lidstrom & Dave Beck
Built a framework to apply deep learning models to predict RNA-seq expression levels directly from DNA sequences (upstream promoter regions) in the methanotroph M. buryatense. The goal was to identify the sub-sequences within promoters that are particularly important for influencing expression changes across a variety of growth conditions, as these sequences may be candidates for further development as metabolic switching tools. After initial modeling approaches were unsuccessful, we shifted towards a deeper characterization of deep learning model performance in data-limited regimes. [Workshop Proposal] [Tutorials exploring simple and complex synthetic prediction tasks] [Git Repo]
Predicting iModulon membership from promoter regions, with Mary Lidstrom (UW), Dave Beck (UW) & Lars Nielsen (DTU)
Used Independent Component Analysis (ICA) to identify independently regulated gene modules (iModulons) from a compendium of bulk RNA-seq data in M. buryatense. Subsequently, we are working to build deep learning models to predict each gene's iModulon label from its promoter region in order to learn sequence patterns relevant to iModulons' regulation.
Developed a computational framework to identify strong promoters in non-model organisms, with Mary Lidstrom & Dave Beck
Analyzed genomic sequence and RNA-seq data in the methanotroph Methylotuvimicrobium buryatense 5GB1 to identify promoter sequence patterns that confer constitutive, strong expression. Our computational pipeline may be similarly applied to other non-model organisms that lack extensive genetic characterization to help identify key pieces of their regulatory grammars. [Publication] [Project Page] [Git Repo]
Decoding yeast gene regulation from millions of random sequences, with Georg Seelig
Trained machine learning models on massively parallel reporter data from millions of randomized promoter sequences to characterize gene regulation in yeast.
Understanding gene expression patterns in developing heart tissue, with Georg Seelig
Analyzed single-cell RNA-sequencing data to understand gene expression patterns in differentiating cardiomyocytes. (In collaboration with the Allen Institute for Cell Science)
I began my research journey as a biologist and have since grown into a computer scientist with an interest in understanding biological data. I am excited about opportunities that allow me to span across fields and require computational skillsets to dig into outstanding challenges in biology and global sustainability. While my graduate work focused on decoding the genetic grammar of methane-eating bacteria, increasingly, I'm interested in the broader landscape of solutions for mitigating climate change and fostering a more circular bioeconomy. I am eager to pursue an interdisciplinary career where I can be a part of these solutions by bridging across ideas and skillsets in machine learning, life sciences, and sustainable technology.
University of Washington, Graduate Researcher, Computer Science, Seattle, WA 2017 - 2023
Built computational frameworks to accelerate genetic engineering efforts in methane-consuming bacteria, a promising carbon removal platform; established a suite of new methanotroph promoter tools; characterized the effectiveness of machine learning models for discovering influential genetic patterns from RNA-seq experiments in microorganisms with limited data
Zymergen, Intern, Data Science, Seattle, WA June 2018 - August 2018
Prototyped machine learning and convolutional neural network models for predicting DNA regulatory features in non-standard microbe genomes.
Amyris, Associate Scientist, Scientific Computing, Emeryville, CA July 2014 - July 2017
In my 3 years in the Scientific Computing group at Amyris, I applied my background in genetics and computer science to various computational projects in R&D. My roles ranged from the designated computational resource for a given project, to a member of a team of computational experts, to a communication bridge between software engineers and biologists. Specific projects I worked on include:
- characterizing the genomic impact of chemical mutagens
- maintaining the company's whole genome sequencing pipeline
- developing and training the Amyris community in Genotype Specification Language (a DNA design tool invented at Amyris)
- building a Genotype Generator tool to translate high level designs for metabolic pathways into concrete build instructions for strains that can carry out pathway designs
Amyris, Intern, Scientific Computing, Emeryville, CA December 2013
Coded a data visualization tool to help strain engineers overlay experimental data onto yeast metabolic pathways.
University of Minnesota, Research Assistant, Myers Lab (Computational Biology), Minneapolis, MN June 2013 - August 2013
Used genetic interaction and chemical genetic interaction data to code a target prediction pipeline in Python. Developed a benchmark standard for accurately predicting gene targets for chemicals of interest.
Carleton College, Research Assistant, Goings Lab (Evolutionary Computing), Northfield, MN June 2012 - August 2012
Performed experiments on evolving populations of digital organisms to examine the effects of limited CPU resources on the populations’ ability to evolve complex Boolean logic functions.
UCSF, Research Assistant, Ahituv Lab (Genetics), San Francisco, CA June 2011 - August 2011
Performed chromatin immunoprecipitation sequencing experiments on mouse limb tissue to find enhancer candidates involved in limb patterning and development.
- L. He, J. D. Groom, E. H. Wilson, J. Fernandez, M. C. Konopka, D. A. C. Beck, M. E. Lidstrom. “A methanotrophic bacterium to enable methane removal for climate mitigation.” (2023) PNAS. [Article] [Interactive viz gallery]
- A. H. Singh, B. B. Kaufmann-Malaga, J. A. Lerman, D. P. Dougherty, Y. Zhang, A. L. Kilbo, E. H. Wilson, C. Y. Ng, O. Erbilgin, K. A. Curran, C. D. Reeves, J. E. Hung, S. Mantovani, Z. A. King, M. J. Ayson, J. R. Denery, C. Lu, P. Norton, C. Tran, D. M. Platt, J. R. Cherry, S. S. Chandran, A. L. Meadows. (2023) “An Automated Scientist to Design and Optimize Microbial Strains for the Industrial Production of Small Molecules” [bioRxiv]
- E. H. Wilson, M. E. Lidstrom, and D. A. C. Beck. (2021) "A multi-task learning approach to enhance sustainable biomolecule production in engineered microorganisms." Tackling Climate Change with Machine Learning, workshop at ICML 2021. [Video Recording] [Proposal]
- E. H. Wilson, J. D. Groom, M. C. Sarfatis, S. M. Ford, M. E. Lidstrom, and D. A. C. Beck. (2021) "A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets." ACS Synthetic Biology. [Article]
- E. H. Wilson, C. Macklin, and D. Platt. (2018) "Engineering genomes with Genotype Specification Language." In Methods in Molecular Biology, Synthetic Biology. J.C. Braman, ed. Springer Publishing Company, New York, NY. [PubMed]
- S. W. Simpkins, J. Nelson, R. Deshpande, S.C. Li, J. S. Piotrowski, E. H. Wilson, A. A. Gebre, R. Okamoto, M. Yoshimura, M. Costanzo, Y. Yashiroda, Y. Ohya, H. Osada, M. Yoshida, C. Boone, C. L. Myers. (2018) “Predicting bioprocess targets of chemical compounds through integration of chemical-genetic and genetic interactions.” PLoS Computational Biology. [PubMed]
- E. H. Wilson, S. Sagawa, J. Weis, M. Shubert, M. Bissell, B. Hawthorne, C. Reeves, J. Dean, and D. Platt. (2016) "Genotype Specification Language." ACS Synthetic Biology. 5(6), pp 471-478. [PubMed]
- "Modeling DNA Sequences with PyTorch" (2022) Towards Data Science, Medium.
Presentations & Posters
- E. H. Wilson, M. E. Lidstrom, D. A. C. Beck. “Probing the limits of deep learning methods for predicting gene expression in non-model microbes.” Rapid talk and poster at SBFC. Portland, OR, April 2023.
- E. H. Wilson, M. E. Lidstrom, D. A. C. Beck. “Methane, Microbes, and Machine Learning: Engineering biology to combat climate change.” Poster at Industry Affiliates Research Symposium at the University of Washington, November 2022. [Poster PDF]
- "Using microorganisms to mitigate macro problems." Oral talk at Virtual Women's Research Day, University of Washington, 2020. [Video Recording]
- "Using microorganisms to solve macro problems: untangling the genetic circuitry of methane-eating bacteria." Oral talk at MIDAS Data Science Symposium, University of Michigan, 2019.
- "Can deep learning help us program biology?" Oral talk at Industry Affiliates Research Day, University of Washington, 2018.
- E. H. Wilson, D. Platt. “Genotype Specification Language: Programming in DNA!” Poster at Synthetic Biology, Engineering, Evolution & Design (SEED) conference in Chicago, July 2016.
Science Communication for General Audiences
- "The Light Side of Genetic Engineering." (2019) OneZero, Medium.
- "Genetic Constructor and GSL - Best of Both Worlds." (2016) Autodesk Bionano Research Blog.
Leadership and Volunteering
- Research mentor for undergraduate student (2020-2023)
- Pre-application review mentor for prospective graduate students from diverse backgrounds (2021-2022)
- Peer mentor for groups of incoming graduate students (2018-2022)
- Graduate Peer Mentorship Program Organizer (2019-2020)
- TGIF Social Chair (2018-2019)
- New Grad Orientation Organizer (2018)
- Programming Organisms with DNA Puzzles! - Developed an interactive activity to teach elementary/middle schoolers about genetic engineering.
- Engineering Discovery Days, University of Washington
- Introduce a Girl to CoRDS (Coding, Robotics, and Data Science), University of Washington
Goofy data analysis
- "Mistborn: The Final Eyebrow." (2021) Towards Data Science, Medium. An analysis of social dynamics in the fantasy novel Mistborn: The Final Empire by Brandon Sanderson as conveyed through characters' eyebrow raising behavior. Data Viz Compilation
Puns and Lyrics
Sometimes my brain makes jokes. Often they involve Disney songs and/or science puns. Here are a few of them :)
Inspired by working at Amyris
Inspired by UW classes (ML, NLP, SynBio)
I'm always excited to learn more about how a computer/data scientist can help solve problems in biology and sustainability! Feel free to connect :)
Also, if you're considering exploring the intersection of Biology and Computer Science, I'd be happy to chat about my experience navigating undergrad, working in industry, and transitioning back to grad school.
You can reach me at erinhwilson gmail.com.
I also have a LinkedIn.