Senior Machine Learning Scientist, LLMs for Chemistry

  • South San Francisco, CA
  • Posted 19 days ago | Updated 7 hours ago

Overview

On Site
USD 165,200.00 - 306,800.00 per year
Full Time

Skills

Health care
Large Language Models (LLMs)
FOCUS
Design
Collaboration
Publications
Workflow
Open source
Science
Physics
Chemical engineering
Computer science
Statistics
Applied mathematics
Research
Chemistry
Modeling
Fluency
Python
Deep learning
PyTorch
Cloud computing
Amazon Web Services
Google Cloud
Google Cloud Platform
Microsoft Azure
Version control
Git
Continuous integration
Continuous delivery
Algorithms
GitHub
GitLab
Software development
Machine Learning (ML)
GCS

Job Details

The Position

A healthier future. It's what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love. That's what makes us Roche.

At Prescient Design, we are seeking a highly motivated Machine Learning Scientist to help drive research on Machine Learning for Drug Discovery. The successful candidate will collaborate extensively with computational and experimental scientists and researchers across gRED to deploy and deliver machine-learning solutions for small-molecule drug discovery.

The Opportunity

We are seeking a highly motivated Machine Learning Scientist to join Prescient Design within Genentech Research and Early Development (gRED) to help drive research on Machine Learning for Drug Discovery, with a focus on Large Language Models (LLMs) and Chemistry/Small Molecule Drug Discovery (SMDD). The successful candidate will focus on creating novel LLM applications that interface directly with medicinal chemistry and molecular design processes, as well as enhancing predictions of biochemical properties, chemical reactivity, and synthesizability. By collaborating extensively with computational and experimental scientists across gRED, the candidate will develop, deploy, and deliver innovative machine-learning solutions that significantly advance our small-molecule drug discovery pipeline.
  • You will fine-tune modern LLM architectures and implement agentic workflows for applications in computational chemistry and small-molecule drug discovery to deliver technical solutions for molecular design across gRED and pRED (Roche).
  • You will design, build, and maintain robust and scalable ML pipelines and workflows that integrate machine learning solutions with internal and external cheminformatics and computational chemistry tools and APIs.
  • You will closely collaborate with other scientists and researchers within Prescient to build impactful technologies for drug discovery research.
  • You will spearhead the creation and curation of high-quality, diverse datasets that fuel cutting-edge ML models, directly impacting the success of drug discovery projects.
  • You will contribute to and drive publications, present results at internal and external scientific conferences, and help make code and workflows open source.


Who you are
  • You will have a BS/MS with 3-5 years of experience, or a recent PhD, in the physical sciences (e.g., Chemistry, Physics, Chemical Engineering) or a quantitative field (e.g., Computer Science, Statistics, Applied Mathematics), or equivalent industry research experience.
  • You will have experience working with medicinal chemistry or chemical reaction datasets and be familiar with chemistry, cheminformatics, and small-molecule drug discovery concepts. Experience with toolkits for chemical modeling (e.g., RDKit, OpenEye Toolkits) is a plus.
  • You will be fluent in Python and have demonstrated experience with modern Python frameworks for deep learning (e.g., PyTorch, Hugging Face Transformers).
  • You will have experience with cloud platforms (e.g., AWS, Google Cloud Platform, Azure), version control systems (e.g., Git), and CI/CD pipelines.
  • You will have a proven track record in developing and deploying scalable ML models and algorithms.
  • You will have a record of scientific excellence as evidenced by at least one publication in a scientific journal or conference.
  • You will have a public portfolio of projects available on GitHub/GitLab.


Preferred
  • You will have strong experience in scientific software development and ML engineering, including fine-tuning modern LLM architectures and building ML pipelines.


Relocation benefits are available for this posting

The expected salary range for this position, based on the primary location of California or New York, is $165,200 - $306,800. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors.



Genentech is an equal opportunity employer, and we embrace the increasingly diverse world around us. Genentech prohibits unlawful discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin or ancestry, age, disability, marital status and veteran status.