{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/adversarial-ml"},"x-facet":{"type":"skill","slug":"adversarial-ml","display":"Adversarial Ml","count":5},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_40d7c45c-7eb"},"title":"Researcher, Loss of Control","description":"<p><strong>Compensation</strong></p>\n<p>Estimated Base Salary $295K – $445K</p>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the team</strong></p>\n<p>The Safety Systems org ensures that OpenAI’s most capable models can be responsibly developed and deployed. 
We build evaluations, safeguards, and safety frameworks that help our models behave as intended in real-world settings.</p>\n<p><strong>About the role</strong></p>\n<p>As frontier AI systems become more capable, they are increasingly able to pursue long-horizon goals, use tools, adapt to feedback, and operate with greater autonomy. These advances create enormous potential benefits, but they also introduce the risk that models may behave in ways that are misaligned, deceptive, or difficult to supervise or contain. Reducing loss of control risk is therefore a core challenge for safely developing and deploying advanced AI systems.</p>\n<p>As a Researcher for loss of control mitigations, you will help design and implement an end-to-end mitigation stack to reduce the risk of intentionally subversive or insufficiently controllable model behavior across OpenAI’s products and internal deployments. This role requires strong technical depth and close cross-functional collaboration to ensure safeguards are enforceable, scalable, and effective. 
You’ll contribute directly to building protections that remain robust as model capabilities, deployment patterns, and threat models evolve.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design and implement mitigation components for loss of control risk (spanning prevention, monitoring, detection, containment, and enforcement) under the guidance of senior technical and risk leadership.</li>\n</ul>\n<ul>\n<li>Integrate safeguards across product and research surfaces in partnership with product, engineering, and research teams, helping ensure protections are consistent, low-latency, and resilient as usage and model autonomy increase.</li>\n</ul>\n<ul>\n<li>Evaluate technical trade-offs within the loss of control domain (coverage, robustness, latency, model utility, and operational complexity) and propose pragmatic, testable solutions.</li>\n</ul>\n<ul>\n<li>Collaborate closely with risk modeling, evaluations, and policy partners to align mitigation design with anticipated failure modes and high-severity threat scenarios, including deceptive alignment, hidden subgoals, reward hacking, and attempts to evade oversight.</li>\n</ul>\n<ul>\n<li>Execute rigorous testing and red-teaming workflows, helping stress-test the mitigation stack against increasingly capable and potentially subversive model behaviors (such as sandbagging, monitor evasion, exploit-seeking, unsafe tool use, or strategic deception) and iterate based on findings.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have a passion for AI safety and are motivated to make cutting-edge AI models safer for real-world use.</li>\n</ul>\n<ul>\n<li>Bring demonstrated experience in deep learning and transformer models.</li>\n</ul>\n<ul>\n<li>Are proficient with frameworks such as PyTorch or TensorFlow.</li>\n</ul>\n<ul>\n<li>Possess a strong foundation in data structures, algorithms, and software engineering principles.</li>\n</ul>\n<ul>\n<li>Are familiar with methods 
for training and fine-tuning large language models, including distillation, supervised fine-tuning, and policy optimization.</li>\n</ul>\n<ul>\n<li>Excel at working collaboratively with cross-functional teams across research, policy, product, and engineering.</li>\n</ul>\n<ul>\n<li>Have significant experience designing and evaluating technical safeguards, control mechanisms, or monitoring systems for advanced AI behavior.</li>\n</ul>\n<ul>\n<li>(Nice to have) Bring background knowledge in alignment, control, interpretability, robustness, adversarial ML, or related fields.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_40d7c45c-7eb","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://openai.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/20d85859-8f7e-4e13-a992-b801a34780e5","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"Full time","x-salary-range":"$295K – $445K","x-skills-required":["deep learning","transformer models","PyTorch","TensorFlow","data structures","algorithms","software engineering principles","large language models","distillation","supervised fine-tuning","policy 
optimization"],"x-skills-preferred":["alignment","control","interpretability","robustness","adversarial ML"],"datePosted":"2026-04-24T12:23:48.372Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"deep learning, transformer models, PyTorch, TensorFlow, data structures, algorithms, software engineering principles, large language models, distillation, supervised fine-tuning, policy optimization, alignment, control, interpretability, robustness, adversarial ML","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":445000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7cc85573-4a2"},"title":"Technical Policy Manager, Cyber Harms","description":"<p>We are seeking a Technical Policy Manager, Cyber Harms to lead our efforts to prevent AI misuse in the cyber domain. As a member of our Safeguards team, you will be responsible for designing and overseeing the execution of capability evaluations to assess the cyber-relevant capabilities of new models. You will also create comprehensive cyber threat models, including attack vectors, exploit chains, precursor identification, and weaponization techniques.</p>\n<p>This is a unique opportunity to shape how frontier AI models handle dual-use cybersecurity knowledge,balancing the tremendous potential of AI to advance legitimate security research and defensive capabilities while preventing misuse by malicious actors.</p>\n<p>In this role, you will lead and grow a team of technical specialists focused on cyber threat modeling and evaluation frameworks. 
You will serve as the primary domain expert on cyber harms, advising cross-functional teams on threat landscapes and mitigation strategies.</p>\n<p>You will collaborate closely with internal and external threat modeling experts to develop training data for safety systems, and with ML engineers to train these systems, optimizing for both robustness against adversarial attacks and low false-positive rates for legitimate security researchers.</p>\n<p>You will also analyze safety system performance in traffic, identifying gaps and proposing improvements. You will conduct regular reviews of existing policies and enforcement systems to identify and address gaps and ambiguities related to cybersecurity risks.</p>\n<p>You will develop rigorous stress-testing of safeguards against evolving cyber threats and product surfaces. You will partner with Research, Product, Policy, Security Team, and Frontier Red Team to ensure cybersecurity safety is embedded throughout the model development lifecycle.</p>\n<p>You will translate cybersecurity domain knowledge into actionable safety requirements and clearly articulated policies. You will contribute to external communications, including model cards, blog posts, and policy documents related to cybersecurity safety.</p>\n<p>You will monitor emerging technologies and threat landscapes for their potential to contribute to new risks and mitigation strategies, and strategically address these.</p>\n<p>You will mentor and develop team members, fostering a culture of technical excellence and responsible AI development.</p>\n<p>To be successful in this role, you will need to have:</p>\n<ul>\n<li>An M.S. 
or PhD in Computer Science, Cybersecurity, or a related technical field, OR equivalent professional experience in offensive or defensive cybersecurity</li>\n<li>5+ years of hands-on experience in cybersecurity, with deep expertise in areas such as vulnerability research, exploit development, network security, malware analysis, or penetration testing</li>\n<li>2+ years of experience managing technical teams or leading complex technical projects with multiple stakeholders</li>\n<li>Experience in scientific computing and data analysis, with proficiency in programming (Python preferred)</li>\n<li>Deep expertise in modern cybersecurity, including both offensive techniques (vulnerability research, exploit development, penetration testing, malware analysis) and defensive measures (detection, monitoring, incident response)</li>\n<li>Demonstrated ability to create threat models and translate technical cyber risks into policy frameworks</li>\n<li>Familiarity with responsible disclosure practices, vulnerability coordination, and cybersecurity frameworks (e.g., MITRE ATT&amp;CK, NIST Cybersecurity Framework, CWE/CVE systems)</li>\n<li>Strong analytical and writing skills, with the ability to navigate ambiguity and explain complex technical concepts to non-technical stakeholders</li>\n<li>Experience developing policies or guidelines at scale, balancing safety concerns with enabling legitimate use cases</li>\n<li>A passion for learning new skills and an ability to rapidly adapt to changing techniques and technologies</li>\n<li>Comfort working in a fast-paced environment where priorities may shift as AI capabilities evolve</li>\n<li>Track record of translating specialized technical knowledge into actionable safety policies or enforcement guidelines</li>\n</ul>\n<p>Preferred qualifications include:</p>\n<ul>\n<li>Background in AI/ML systems, particularly experience with large language models</li>\n<li>Experience developing ML-based security systems or adversarial ML 
research</li>\n<li>Experience working with defense, intelligence, or security organizations (e.g., NSA, CISA, national labs, security contractors)</li>\n<li>Published security research, disclosed vulnerabilities, or participation in bug bounty programs</li>\n<li>Understanding of Trust &amp; Safety operations and content moderation at scale</li>\n<li>Certifications such as OSCP, OSCE, GXPN, or equivalent demonstrating technical depth</li>\n<li>Understanding of dual-use security research concerns and ethical considerations in AI safety</li>\n</ul>\n<p>The annual compensation range for this role is $320,000-$405,000 USD.</p>","url":"https://yubhub.co/jobs/job_7cc85573-4a2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5066981008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["Cybersecurity","Vulnerability research","Exploit development","Network security","Malware analysis","Penetration testing","Detection","Monitoring","Incident response","Scientific computing","Data analysis","Programming (Python)","Responsible disclosure practices","Vulnerability coordination","Cybersecurity frameworks (MITRE ATT&CK, NIST Cybersecurity Framework, CWE/CVE systems)"],"x-skills-preferred":["AI/ML systems","Large language models","ML-based security systems","Adversarial ML research","Defense, intelligence, or security organizations","Published security research","Disclosed vulnerabilities","Bug bounty programs","Trust & Safety operations","Content moderation at scale","Certifications (OSCP, OSCE, 
GXPN)"],"datePosted":"2026-04-18T15:56:47.739Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly (Travel-Required) | San Francisco, CA | Washington, DC"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cybersecurity, Vulnerability research, Exploit development, Network security, Malware analysis, Penetration testing, Detection, Monitoring, Incident response, Scientific computing, Data analysis, Programming (Python), Responsible disclosure practices, Vulnerability coordination, Cybersecurity frameworks (MITRE ATT&CK, NIST Cybersecurity Framework, CWE/CVE systems), AI/ML systems, Large language models, ML-based security systems, Adversarial ML research, Defense, intelligence, or security organizations, Published security research, Disclosed vulnerabilities, Bug bounty programs, Trust & Safety operations, Content moderation at scale, Certifications (OSCP, OSCE, GXPN)","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1e992e68-7cd"},"title":"Staff Engineer, Offensive Security","description":"<p>As a Staff Engineer, Offensive Security at Twilio, you will act as a Technical Lead and design complex attack chains that demonstrate systemic risk. You will spend as much time writing custom code and researching new bypasses as you do executing tests.</p>\n<p>In this role, you will:</p>\n<p>Perform manual and automated testing of web applications, APIs, and mobile apps (iOS/Android). Conduct network and cloud level assessments with various tooling. Triage and validate reports from automated scanners or bug bounty hunters to eliminate false positives and escalate true positives. 
Perform initial prompt injection and jailbreak tests on AI prototypes, services, and applications using established checklists (OWASP Top 10 for LLMs). Draft high-quality reports that detail the &quot;path to compromise&quot; with clear, reproducible steps for developers. Manage and update the team&#39;s testing infrastructure (e.g., Burp Suite and basic C2 listeners). Provide direct technical guidance to engineering teams on how to patch vulnerabilities like XSS, SQLi, and IDOR. Design and lead multi-week Red Team operations that mimic specific threat actors (APTs) to test the SIRT detection capabilities. Build custom payloads, droppers, and obfuscated scripts to bypass EDR/AV and maintain stealth. Build automated testing frameworks for AI systems (e.g., using PyRIT, Promptfoo, or Garak) to test models for risks such as sensitive data leakage. Execute sophisticated attacks against AWS/Azure/K8s, focusing on IAM misconfigurations and container escapes. Collaborate with SIRT and Detection Engineering to tune SIEM alerts based on the techniques used during an engagement. 
Oversee the organization&#39;s bug bounty program, identifying trends in submissions to suggest broad architectural security changes.</p>\n<p>Twilio values diverse experiences from all kinds of industries, and we encourage everyone who meets the required qualifications to apply.</p>","url":"https://yubhub.co/jobs/job_1e992e68-7cd","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Twilio","sameAs":"https://www.twilio.com/","logo":"https://logos.yubhub.co/twilio.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/twilio/jobs/7622285","x-work-arrangement":"remote","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Offensive security","Penetration testing","Bug bounty","AppSec","Vulnerability exploitation","MITRE ATT&CK matrix","OWASP Top 10 for web applications","OWASP Top 10 for LLMs","Post exploitation","Adversarial ML","Burp Suite professional","Nmap","Metasploit","Wireshark","LangChain","TensorFlow","C2 frameworks","Python","Bash","C++"],"x-skills-preferred":["Telecom expertise","Excellent written and verbal communication skills","Ability to influence and build effective working relationships with all levels of the organization","Proficiency in multiple languages applicable to the region"],"datePosted":"2026-04-18T15:49:45.138Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote - Ireland"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Offensive security, Penetration testing, Bug bounty, AppSec, Vulnerability exploitation, MITRE ATT&CK matrix, OWASP Top 10 for web applications, OWASP Top 10 for LLMs, Post exploitation, Adversarial ML, Burp Suite professional, Nmap, Metasploit, Wireshark, LangChain, TensorFlow, C2 frameworks, Python, 
Bash, C++, Telecom expertise, Excellent written and verbal communication skills, Ability to influence and build effective working relationships with all levels of the organization, Proficiency in multiple languages applicable to the region"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c76d0c6d-ec7"},"title":"Technical Policy Manager, Cyber Harms","description":"<p><strong>About the Role:</strong></p>\n<p>We are looking for a cybersecurity expert to lead our efforts to prevent AI misuse in the cyber domain. As a Cyber Harms Technical Policy Manager, you will lead a team applying deep technical expertise to inform the design of safety systems that detect harmful cyber behaviours and prevent misuse by sophisticated threat actors.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Lead and grow a team of technical specialists focused on cyber threat modelling and evaluation frameworks</li>\n<li>Design and oversee execution of capability evaluations (&#39;evals&#39;) to assess the cyber-relevant capabilities of new models</li>\n<li>Create comprehensive cyber threat models, including attack vectors, exploit chains, precursor identification, and weaponization techniques</li>\n<li>Develop and iterate on usage policies that govern responsible use of our models for emerging capabilities and use cases related to cyber harms</li>\n<li>Serve as the primary domain expert on cyber harms, advising cross-functional teams on threat landscapes and mitigation strategies</li>\n<li>Collaborate closely with internal and external threat modelling experts to develop training data for safety systems, and with ML engineers to train these systems, optimising for both robustness against adversarial attacks and low false-positive rates for legitimate security researchers</li>\n<li>Analyse safety system performance in traffic, identifying gaps and proposing improvements</li>\n<li>Conduct regular reviews of 
existing policies and enforcement systems to identify and address gaps and ambiguities related to cybersecurity risks</li>\n<li>Develop rigorous stress-testing of safeguards against evolving cyber threats and product surfaces</li>\n<li>Partner with Research, Product, Policy, Security Team, and Frontier Red Team to ensure cybersecurity safety is embedded throughout the model development lifecycle</li>\n<li>Translate cybersecurity domain knowledge into actionable safety requirements and clearly articulated policies</li>\n<li>Contribute to external communications, including model cards, blog posts, and policy documents related to cybersecurity safety</li>\n<li>Monitor emerging technologies and threat landscapes for their potential to contribute to new risks and mitigation strategies, and strategically address these</li>\n<li>Mentor and develop team members, fostering a culture of technical excellence and responsible AI development</li>\n</ul>\n<p><strong>You may be a good fit if you have:</strong></p>\n<ul>\n<li>An M.S. 
or PhD in Computer Science, Cybersecurity, or a related technical field, OR equivalent professional experience in offensive or defensive cybersecurity</li>\n<li>5+ years of hands-on experience in cybersecurity, with deep expertise in areas such as vulnerability research, exploit development, network security, malware analysis, or penetration testing</li>\n<li>2+ years of experience managing technical teams or leading complex technical projects with multiple stakeholders</li>\n<li>Experience in scientific computing and data analysis, with proficiency in programming (Python preferred)</li>\n<li>Deep expertise in modern cybersecurity, including both offensive techniques (vulnerability research, exploit development, penetration testing, malware analysis) and defensive measures (detection, monitoring, incident response)</li>\n<li>Demonstrated ability to create threat models and translate technical cyber risks into policy frameworks</li>\n<li>Familiarity with responsible disclosure practices, vulnerability coordination, and cybersecurity frameworks (e.g., MITRE ATT&amp;CK, NIST Cybersecurity Framework, CWE/CVE systems)</li>\n<li>Strong analytical and writing skills, with the ability to navigate ambiguity and explain complex technical concepts to non-technical stakeholders</li>\n<li>Experience developing policies or guidelines at scale, balancing safety concerns with enabling legitimate use cases</li>\n<li>A passion for learning new skills and an ability to rapidly adapt to changing techniques and technologies</li>\n<li>Comfort working in a fast-paced environment where priorities may shift as AI capabilities evolve</li>\n<li>Track record of translating specialised technical knowledge into actionable safety policies or enforcement guidelines</li>\n</ul>\n<p><strong>Preferred Qualifications:</strong></p>\n<ul>\n<li>Background in AI/ML systems, particularly experience with large language models</li>\n<li>Experience developing ML-based security systems or adversarial ML 
research</li>\n<li>Experience working with defence, intelligence, or security organisations (e.g., NSA, CISA, national labs, security contractors)</li>\n<li>Published security research, disclosed vulnerabilities, or participation in bug bounty programs</li>\n<li>Understanding of Trust &amp; Safety operations and content moderation at scale</li>\n<li>Certifications such as OSCP, OSCE, GXPN, or equivalent demonstrating technical depth</li>\n<li>Understanding of dual-use security research concerns and ethical considerations in AI safety</li>\n</ul>","url":"https://yubhub.co/jobs/job_c76d0c6d-ec7","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5066981008","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["cybersecurity","vulnerability research","exploit development","network security","malware analysis","penetration testing","scientific computing","data analysis","programming (Python)","threat modelling","policy frameworks","responsible disclosure practices","vulnerability coordination","cybersecurity frameworks (e.g., MITRE ATT&CK, NIST Cybersecurity Framework, CWE/CVE systems)"],"x-skills-preferred":["AI/ML systems","large language models","ML-based security systems","adversarial ML research","defence, intelligence, or security organisations","NSA, CISA, national labs, security contractors","published security research","disclosed vulnerabilities","bug bounty programs","Trust & Safety operations","content moderation at scale","OSCP, OSCE, GXPN, or equivalent certifications","dual-use security 
research concerns","ethical considerations in AI safety"],"datePosted":"2026-03-08T13:50:25.823Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA, Washington, DC"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cybersecurity, vulnerability research, exploit development, network security, malware analysis, penetration testing, scientific computing, data analysis, programming (Python), threat modelling, policy frameworks, responsible disclosure practices, vulnerability coordination, cybersecurity frameworks (e.g., MITRE ATT&CK, NIST Cybersecurity Framework, CWE/CVE systems), AI/ML systems, large language models, ML-based security systems, adversarial ML research, defence, intelligence, or security organisations, NSA, CISA, national labs, security contractors, published security research, disclosed vulnerabilities, bug bounty programs, Trust & Safety operations, content moderation at scale, OSCP, OSCE, GXPN, or equivalent certifications, dual-use security research concerns, ethical considerations in AI safety"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d83abc11-64e"},"title":"Researcher, Misalignment Research","description":"<p><strong>Location</strong></p>\n<p>New York City; San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Safety Systems</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$380K – $445K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong><strong>About the Team</strong></strong></p>\n<p>Safety Systems sits at the forefront of OpenAI’s 
mission to build and deploy safe AGI, ensuring our most capable models can be released responsibly and for the benefit of society. Within Safety Systems, we are building a misalignment research team to focus on the most pressing problems for the future of AGI. Our mandate is to identify, quantify, and understand future AGI misalignment risks far in advance of when they can pose harm.</p>\n<p>The work of this research taskforce spans four pillars:</p>\n<ol>\n<li><strong>Worst‑Case Demonstrations</strong> – Craft compelling, reality‑anchored demos that reveal how AI systems can go wrong. We focus especially on high-importance cases where misaligned AGI could pursue goals at odds with human well-being.</li>\n<li><strong>Adversarial &amp; Frontier Safety Evaluations</strong> – Transform those demos into rigorous, repeatable evaluations that measure dangerous capabilities and residual risks. Topics of interest include deceptive behavior, scheming, reward hacking, deception in reasoning, and power-seeking, along with other related areas.</li>\n<li><strong>System‑Level Stress Testing</strong> – Build automated infrastructure to probe entire product stacks, assessing end‑to‑end robustness under extreme conditions. We treat misalignment as an evolving adversary, escalating tests until we find breaking points even as systems continue to improve.</li>\n<li><strong>Alignment Stress‑Testing Research</strong> – Investigate why mitigations break, publishing insights that shape strategy and next‑generation safeguards. We collaborate with other labs when useful and actively share misalignment findings to accelerate collective progress.</li>\n</ol>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a Senior Researcher who is passionate about red‑teaming and AI safety. 
In this role you will design and execute cutting‑edge attacks, build adversarial evaluations, and advance our understanding of how safety measures can fail—and how to fix them. Your insights will directly influence OpenAI’s product launches and long‑term safety roadmap.</p>\n<p><strong>In this role, you will</strong></p>\n<ul>\n<li>Design and implement worst‑case demonstrations that make AGI alignment risks concrete for stakeholders, focused on the high‑stakes use cases described above.</li>\n<li>Develop adversarial and system‑level evaluations grounded in those demonstrations, driving adoption across OpenAI.</li>\n<li>Create tools and infrastructure to scale automated red‑teaming and stress testing.</li>\n<li>Conduct research on failure modes of alignment techniques and propose improvements.</li>\n<li>Publish influential internal or external papers that shift safety strategy or industry practice. We aim to concretely reduce existential AI risk.</li>\n<li>Partner with engineering, research, policy, and legal teams to integrate findings into product safeguards and governance processes.</li>\n<li>Mentor engineers and researchers, fostering a culture of rigorous, impact‑oriented safety work.</li>\n</ul>\n<p><strong>You might thrive in this role if you</strong></p>\n<ul>\n<li>Are already thinking about these problems night and day, and share our mission to build safe, universally beneficial AGI and align with the OpenAI Charter.</li>\n<li>Have 4+ years of experience in AI red‑teaming, security research, adversarial ML, or related safety fields.</li>\n<li>Possess a strong research track record—publications, open‑source projects, or high‑impact internal work—demonstrating creativity in uncovering and exploiting system weaknesses.</li>\n<li>Are fluent in modern ML / AI techniques and comfortable hacking on large‑scale codebases and evaluation infrastructure.</li>\n<li>Communicate clearly with both technical and non‑technical audiences, translating complex findings into actionable recommendations.</li>\n<li>Enjoy collaboration and can drive cross‑functional projects that span research, engineering, and policy.</li>\n<li>Hold a Ph.D., master’s degree, or equivalent experience in computer science, machine learning, security, or a related field.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d83abc11-64e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/7055f010-99f4-4c76-8361-ba5b5f9af1d0","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$380K – $445K","x-skills-required":["AI red-teaming","security research","adversarial ML","safety fields","modern ML / AI techniques","large-scale codebases","evaluation infrastructure"],"x-skills-preferred":["publications","open-source projects","high-impact internal work","creativity in uncovering and exploiting system weaknesses"],"datePosted":"2026-03-06T18:37:33.422Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City; San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI red-teaming, security research, adversarial ML, safety fields, modern ML / AI techniques, large-scale codebases, evaluation infrastructure, publications, open-source projects, high-impact internal work, creativity in uncovering and exploiting system 
weaknesses","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":380000,"maxValue":445000,"unitText":"YEAR"}}}]}