Career Opportunities for Physicists
Machine Learning Scientist I - Broad Institute
We seek a Machine Learning Scientist to operate at the interface of Broad’s Optical Profiling Platform (OPP) and Data Sciences Platform (DSP). The Optical Profiling Platform focuses on developing and implementing new imaging techniques at scale, generating high-throughput data on previously inaccessible phenotypes. The Data Sciences Platform is a world leader in developing analysis pipelines and infrastructure for large-scale biological data. In this project, we seek to understand the connection between the genetic and transcriptomic signatures and the electrophysiological phenotypes of cells such as neurons, cardiomyocytes, and beta-cells. This problem has remained intractable for decades, but data generated by cutting-edge all-optical physiology and spatial transcriptomic methods within OPP finally bring it within our grasp.
The Machine Learning Scientist will participate in research and development efforts aimed at solving problems in analyzing massive-scale biological data, in particular high-time-resolution voltage and calcium imaging data combined with in situ genomic and transcriptomic sequencing. Work will be housed primarily within DSP, in close collaboration with OPP and many labs at the Broad, including those of Aviv Regev, Steven McCarroll, Bernardo Sabatini, Patrick Ellinor, Anna Greka, and Bridget Wagner.
The candidate will join a strong team of data scientists, have access to vast amounts of omics and imaging data, and be encouraged to publish new methods and results in academic journals and conferences. This position is suited to a person who is excited by the prospect of learning, adapting, and applying modern machine learning techniques to solve the key challenges of emerging biological data modalities, with revolutionary implications for advancing the state of the art in clinical practice. The ideal candidate has both a theoretical and a practical understanding of machine learning techniques and a proven track record in areas such as computational biology, probability, statistics, complex networks data analysis, statistical physics, or high-performance computing.
- Adapting and applying existing machine learning techniques to imaging, genomic, and transcriptomic datasets
- Developing novel machine learning methods for understanding and organizing unstructured datasets
- Developing robust and generalizable inference algorithms that advance the state-of-the-art
- Writing well-crafted, maintainable, scalable, and performant machine learning code
- Designing, developing, and maintaining test frameworks for machine learning code
- Ph.D. in Computer Science, Electrical Engineering, Physics, Computational Biology, Mathematics, Statistics, or related quantitative fields
- Strong communication skills and ability to collaborate with biologists, computational biologists, data scientists, and software engineers on model requirements and design
- Scientific and numerical programming in Python or R
- Strong bash/shell scripting and proficiency with UNIX operating systems
- Familiarity with at least one object-oriented programming language (e.g. Java, C++, Go)
- 0-2 years post-Ph.D. experience working on machine learning or a related area
- Working knowledge of existing machine learning and probabilistic programming infrastructures (e.g. PyTorch, TensorFlow, Theano)
- Fluency with version control, including distributed version control and Git in particular
EOE / Minorities / Females / Protected Veterans / Disabilities
[posted September 21, 2018]