{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/language-modeling-fundamentals"},"x-facet":{"type":"skill","slug":"language-modeling-fundamentals","display":"Language Modeling Fundamentals","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dc17980d-461"},"title":"Research Engineer, Interpretability","description":"<p>JOB TITLE: Research Engineer, Interpretability</p>\n<p>LOCATION: San Francisco, CA</p>\n<p>DEPARTMENT: AI Research &amp; Engineering</p>\n<p>JOB DESCRIPTION:</p>\n<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? How can we trust them?&quot;</p>\n<p>The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>\n<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>\n<p>More resources to learn about our work:</p>\n<ul>\n<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>\n<li>An Introduction to Interpretability from our research lead, Chris Olah</li>\n<li>The Urgency of Interpretability from CEO Dario Amodei</li>\n<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>\n<li>60 Minutes segment - Around 8:07, see a demo of tooling our team built</li>\n<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>\n</ul>\n<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>\n<ul>\n<li>Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>\n<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot;</li>\n<li>Performance: Like all LLM work, we push up against the limits of hardware and software. Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them and moving ahead given rapidly evolving research and safety mission</li>\n</ul>\n<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>\n<p>RESPONSIBILITIES:</p>\n<ul>\n<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application</li>\n<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>\n<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>\n<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>\n<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>\n</ul>\n<p>YOU MAY BE A GOOD FIT IF YOU:</p>\n<ul>\n<li>Have 5-10+ years of experience building software</li>\n<li>Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>\n<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. diving into new layers of the stack to find bottlenecks</li>\n<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>\n<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>\n<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>\n<li>Care about the societal impacts and ethics of your work</li>\n<li>Are comfortable working closely with researchers, translating research needs into engineering solutions.</li>\n</ul>\n<p>STRONG CANDIDATES MAY ALSO HAVE EXPERIENCE WITH:</p>\n<ul>\n<li>Optimizing the performance of large-scale distributed systems</li>\n<li>Language modeling fundamentals with transformers</li>\n<li>High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>\n<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>\n<li>Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges</li>\n</ul>\n<p>REPRESENTATIVE PROJECTS:</p>\n<ul>\n<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>\n<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them</li>\n<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>\n<li>Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research)</li>\n</ul>\n<p>ROLE SPECIFIC LOCATION POLICY:</p>\n<ul>\n<li>This role is based in the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>\n</ul>\n<p>The annual compensation range for this role is listed below.</p>\n<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $315,000-$560,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_dc17980d-461","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","High Performance LLM optimization","memory management","compute efficiency","parallelism strategies","inference throughput optimization"],"x-skills-preferred":["large-scale distributed systems","language modeling fundamentals","transformers","collaborating closely with researchers","building tooling to support research teams"],"datePosted":"2026-04-18T15:53:01.682Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, High Performance LLM optimization, memory management, compute efficiency, parallelism strategies, inference throughput optimization, large-scale distributed systems, language modeling fundamentals, transformers, collaborating closely with researchers, building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_97212bdf-dd1"},"title":"Research Engineer, Interpretability","description":"<p>Job Title: Research Engineer, 
Interpretability</p>\n<p>About the Role:</p>\n<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? How can we trust them?&quot; The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>\n<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>\n<p>More resources to learn about our work:</p>\n<ul>\n<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>\n</ul>\n<ul>\n<li>An Introduction to Interpretability from our research lead, Chris Olah</li>\n</ul>\n<ul>\n<li>The Urgency of Interpretability from CEO Dario Amodei</li>\n</ul>\n<ul>\n<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>\n</ul>\n<ul>\n<li>60 Minutes segment - Around 8:07, see a demo of tooling our team built</li>\n</ul>\n<ul>\n<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>\n</ul>\n<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>\n<ul>\n<li>Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>\n</ul>\n<ul>\n<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot;</li>\n</ul>\n<ul>\n<li>Performance: Like all LLM work, we push up against the limits of hardware and software. 
Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them and moving ahead given rapidly evolving research and safety mission</li>\n</ul>\n<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application</li>\n</ul>\n<ul>\n<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>\n</ul>\n<ul>\n<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>\n</ul>\n<ul>\n<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>\n</ul>\n<ul>\n<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 5-10+ years of experience building software</li>\n</ul>\n<ul>\n<li>Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>\n</ul>\n<ul>\n<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. 
diving into new layers of the stack to find bottlenecks</li>\n</ul>\n<ul>\n<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>\n</ul>\n<ul>\n<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>\n</ul>\n<ul>\n<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>\n</ul>\n<ul>\n<li>Care about the societal impacts and ethics of your work</li>\n</ul>\n<ul>\n<li>Are comfortable working closely with researchers, translating research needs into engineering solutions.</li>\n</ul>\n<p>Strong candidates may also have experience with:</p>\n<ul>\n<li>Optimizing the performance of large-scale distributed systems</li>\n</ul>\n<ul>\n<li>Language modeling fundamentals with transformers</li>\n</ul>\n<ul>\n<li>High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>\n</ul>\n<ul>\n<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>\n</ul>\n<ul>\n<li>Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges</li>\n</ul>\n<p>Representative Projects:</p>\n<ul>\n<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>\n</ul>\n<ul>\n<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them</li>\n</ul>\n<ul>\n<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>\n</ul>\n<ul>\n<li>Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research)</li>\n</ul>\n<p>Role Specific Location Policy:</p>\n<ul>\n<li>This role is based in 
the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>\n</ul>\n<p>The annual compensation range for this role is listed below.</p>\n<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $315,000-$560,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_97212bdf-dd1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","Transformers","High Performance LLM optimization","Memory management","Compute efficiency","Parallelism strategies","Inference throughput optimization"],"x-skills-preferred":["Optimizing the performance of large-scale distributed systems","Language modeling fundamentals","Collaborating closely with researchers and building tooling to support research teams"],"datePosted":"2026-04-18T15:46:01.999Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, Transformers, High Performance LLM optimization, Memory management, Compute efficiency, Parallelism strategies, Inference throughput optimization, Optimizing the performance of large-scale distributed systems, Language modeling 
fundamentals, Collaborating closely with researchers and building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}}]}