<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>e121da52-304</externalid>
      <Title>Research Engineer, Human Understanding</Title>
      <Description><![CDATA[<p>We are seeking a highly motivated Research Engineer with a strong background in multimodal modelling of humans, with a focus on speech and audio-visual data, to join the effort within Google DeepMind&#39;s Frontier AI unit.</p>
<p>This role is pivotal in developing foundational multimodal AI capabilities to understand, generate, and protect human likeness. As a key contributor, you will design and implement cutting-edge models and frameworks that enable human-centric understanding and generation.</p>
<p>This is a unique opportunity to contribute to impactful research and advance Google DeepMind&#39;s mission towards Artificial General Intelligence (AGI).</p>
<p><strong>Key Responsibilities</strong></p>
<ul>
<li>Advance multimodal human representations &amp; understanding: Research and implement novel models and other multimodal techniques for a more holistic understanding of humans across visual, audio, and textual data.</li>
<li>Conduct applied research: Run full experimental research cycles, from hypothesis to deployment.</li>
<li>Drive technical projects: Take ownership of substantial technical projects within the effort, from ideation and design to implementation and evaluation, often involving cross-functional collaboration.</li>
<li>Contribute to infrastructure: Inform and contribute to the development of scalable and efficient research infrastructure for multimodal human understanding models and datasets.</li>
<li>Tune and adapt foundation models: Design and execute strategies for tuning and adapting vision language models (VLMs) and other foundation models for specific tasks.</li>
</ul>
<p><strong>Requirements</strong></p>
<ul>
<li>PhD degree in Computer Science, Machine Learning, or a related technical field with 3+ years of relevant experience.</li>
<li>Experience in developing machine learning models, such as audio &amp; speech-visual models.</li>
<li>Experience in working with and tuning large-scale vision language models.</li>
<li>Strong programming skills in Python and experience with at least one major deep learning framework (e.g., JAX).</li>
<li>Experience conducting independent research and development, including experimental design, implementation, and analysis.</li>
</ul>
<p><strong>Salary</strong></p>
<p>The US base salary range for this full-time position is $174,000 USD to $252,000 USD, plus bonus, equity, and benefits.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$174,000 USD - $252,000 USD</Salaryrange>
      <Skills>Python, JAX, Machine Learning, Deep Learning, Vision Language Models, Audio &amp; Speech-Visual Models, Generative AI, Reinforcement Learning, Alignment Methods, Multimodal Learning, Privacy-Preserving Machine Learning</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Google DeepMind</Employername>
      <Employerlogo>https://logos.yubhub.co/deepmind.com.png</Employerlogo>
      <Employerdescription>Google DeepMind is a technology company that specializes in artificial intelligence and machine learning.</Employerdescription>
      <Employerwebsite>https://deepmind.com/</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>174000</Compensationmin>
      <Compensationmax>252000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/deepmind/jobs/7669433</Applyto>
      <Location>Los Angeles, California, US; Mountain View, California, US</Location>
      <Country>US</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
  </jobs>
</source>