<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>97212bdf-dd1</externalid>
      <Title>Research Engineer, Interpretability</Title>
      <Description><![CDATA[<p>Job Title: Research Engineer, Interpretability</p>
<p>About the Role:</p>
<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? How can we trust them?&quot; The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>
<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>
<p>More resources to learn about our work:</p>
<ul>
<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>
<li>An Introduction to Interpretability from our research lead, Chris Olah</li>
<li>The Urgency of Interpretability from CEO Dario Amodei</li>
<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>
<li>60 Minutes segment - around 8:07, see a demo of tooling our team built</li>
<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>
</ul>
<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>
<ul>
<li>Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>
<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot; (see the sketch after this list)</li>
<li>Performance: Like all LLM work, we push up against the limits of hardware and software. Rather than squeezing out the last 0.1%, we focus on finding bottlenecks, fixing them, and moving ahead, given our rapidly evolving research agenda and safety mission</li>
</ul>
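<p>As a loose illustration of the mid-forward-pass editing described in the Inference bullet above - emphatically not the team&#39;s actual stack - here is a minimal PyTorch sketch that adds a fixed &quot;steering vector&quot; to one transformer block&#39;s output via a forward hook, using the public gpt2 checkpoint as a stand-in model; the layer index and scale are illustrative assumptions:</p>
<pre><code># Minimal sketch: edit a model's internal activations mid-forward-pass
# by adding a steering vector with a PyTorch forward hook.
# Model choice, layer index, and scale are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

layer = model.transformer.h[6]            # block whose output we edit
steer = torch.randn(model.config.n_embd)  # in practice: a learned feature direction

def add_steering_vector(module, inputs, output):
    hidden = output[0]                    # GPT-2 blocks return a tuple
    return (hidden + 4.0 * steer,) + output[1:]

handle = layer.register_forward_hook(add_steering_vector)
try:
    ids = tok("The bridge is", return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=20, do_sample=False)
    print(tok.decode(out[0]))
finally:
    handle.remove()                       # always detach hooks after use
</code></pre>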
<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>
<p>Responsibilities:</p>
<ul>
<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction (see the sketch after this list), and steering vector application</li>
<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>
<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>
<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>
<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>
</ul>
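<p>The &quot;instrumented forward passes / activation extraction&quot; part of the first responsibility can be sketched the same way - again a toy illustration on a public checkpoint, not Anthropic&#39;s internal tooling, which would stream these tensors to storage rather than hold them in memory:</p>
<pre><code># Sketch: capture intermediate activations during a forward pass.
# A real pipeline streams these to storage instead of caching in RAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

cache = {}

def save_activation(name):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        cache[name] = hidden.detach().cpu()   # move off-device immediately
    return hook

handles = [block.register_forward_hook(save_activation(f"block_{i}"))
           for i, block in enumerate(model.transformer.h)]
with torch.no_grad():
    model(**tok("interpretability", return_tensors="pt"))
for h in handles:
    h.remove()

print({name: tuple(t.shape) for name, t in cache.items()})
</code></pre>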
<p>You may be a good fit if you:</p>
<ul>
<li>Have 5-10+ years of experience building software</li>
<li>Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>
<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. diving into new layers of the stack to find bottlenecks</li>
<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>
<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>
<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>
<li>Care about the societal impacts and ethics of your work</li>
<li>Are comfortable working closely with researchers, translating research needs into engineering solutions</li>
</ul>
<p>Strong candidates may also have experience with:</p>
<ul>
<li>Optimizing the performance of large-scale distributed systems</li>
<li>Language modeling fundamentals with transformers</li>
<li>High-performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>
<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>
<li>Collaborating closely with researchers and building tooling to support research teams; or directly performing research with complex engineering challenges</li>
</ul>
<p>Representative Projects:</p>
<ul>
<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>
<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them (see the sketch after this list)</li>
<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>
<li>Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude, but for safety research)</li>
</ul>
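<p>For a feel of the second project - shuffling activation datasets far too large for memory - one standard approach (sketched here under assumed file layouts, not the actual pipeline) is a two-pass shard shuffle: scatter rows into random shards, then shuffle each shard in memory:</p>
<pre><code># Sketch: two-pass external shuffle for datasets that don't fit in RAM.
# Pass 1 scatters each record to a random shard; pass 2 shuffles each
# shard (now small enough for memory) and writes it out.
import os, random
import numpy as np

def shard_shuffle(records, out_dir, num_shards=64, seed=0):
    rng = random.Random(seed)
    os.makedirs(out_dir, exist_ok=True)
    shards = [[] for _ in range(num_shards)]
    for rec in records:                    # pass 1: scatter
        shards[rng.randrange(num_shards)].append(rec)
    for i, shard in enumerate(shards):     # pass 2: local shuffle
        rng.shuffle(shard)
        if shard:
            np.save(os.path.join(out_dir, f"shard_{i:04d}.npy"), np.stack(shard))

# Toy usage: 10,000 fake activation rows of width 128.
rows = (np.random.randn(128).astype(np.float32) for _ in range(10_000))
shard_shuffle(rows, out_dir="shuffled", num_shards=8)
</code></pre>
<p>At petabyte scale the same idea runs distributed - shards become object-store files and pass 1 is done by streaming writers - but the two-pass structure is the core of it.</p>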
<p>Role Specific Location Policy:</p>
<ul>
<li>This role is based in the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>
</ul>
<p>The annual compensation range for this role is listed below.</p>
<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>
<p>Annual Salary: $315,000-$560,000 USD</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$315,000-$560,000 USD</Salaryrange>
      <Skills>Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, Transformers, High Performance LLM optimization, Memory management, Compute efficiency, Parallelism strategies, Inference throughput optimization, Optimizing the performance of large-scale distributed systems, Language modeling fundamentals, Collaborating closely with researchers and building tooling to support research teams</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.com.png</Employerlogo>
      <Employerdescription>Anthropic is a company that creates reliable, interpretable, and steerable AI systems.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>315000</Compensationmin>
      <Compensationmax>560000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/4980430008</Applyto>
      <Location>San Francisco, CA</Location>
      <Country>United States</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
  </jobs>
</source>