{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/valuation-methodologies"},"x-facet":{"type":"skill","slug":"valuation-methodologies","display":"Valuation Methodologies","count":13},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_465e2cfb-ddc"},"title":"Staff Machine Learning Research Scientist, LLM Evals","description":"<p>As a Staff Machine Learning Research Scientist on the LLM Evals team, you will lead the development of novel evaluation methodologies, metrics, and benchmarks to measure the capabilities and limitations of frontier LLMs.</p>\n<p>Your primary responsibilities will include:</p>\n<ul>\n<li>Driving research on the effectiveness and limitations of existing LLM evaluation techniques.</li>\n<li>Designing and developing novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.</li>\n<li>Communicating, collaborating, and building relationships with clients and peer teams to facilitate cross-functional projects.</li>\n<li>Collaborating with internal teams and external partners to refine metrics and create standardized evaluation protocols.</li>\n<li>Implementing scalable and reproducible evaluation pipelines using modern ML frameworks.</li>\n<li>Publishing research findings in top-tier AI conferences and contributing to open-source benchmarking initiatives.</li>\n<li>Mentoring and 
guiding research scientists and engineers, providing technical leadership across cross-functional projects.</li>\n<li>Staying deeply engaged with the ML research community, tracking emerging work and contributing to the advancement of LLM evaluation science.</li>\n</ul>\n<p>The ideal candidate will have 5+ years of hands-on experience in large language models, NLP, and Transformer modeling, in both research and engineering settings.</p>\n<p>You will thrive in a high-energy, fast-paced startup environment and be ready to dedicate the time and effort needed to drive impactful results.</p>\n<p>Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_465e2cfb-ddc","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4628044005","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$264,800-$331,000 USD","x-skills-required":["large language model","NLP","Transformer modeling","evaluation methodologies","metrics","benchmarks","instruction following","factuality","robustness","fairness"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:59:31.100Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA; Seattle, WA; New York, 
NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large language model, NLP, Transformer modeling, evaluation methodologies, metrics, benchmarks, instruction following, factuality, robustness, fairness","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":264800,"maxValue":331000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_60a7e1e6-b51"},"title":"Tech Lead/Manager, Machine Learning Research Scientist- LLM Evals","description":"<p>As the leading data and evaluation partner for frontier AI companies, we&#39;re dedicated to advancing the evaluation and benchmarking of large language models (LLMs). Our Research teams work with the industry&#39;s leading AI labs to provide high-quality data and accelerate progress in GenAI research.</p>\n<p>We&#39;re seeking a Tech Lead Manager to lead a talented team of research scientists and research engineers focused on developing and implementing novel evaluation methodologies, metrics, and benchmarks to assess the capabilities and limitations of our cutting-edge LLMs.</p>\n<p>Key responsibilities:</p>\n<ul>\n<li>Lead a team of highly effective research scientists and research engineers on LLM evals.</li>\n<li>Conduct research on the effectiveness and limitations of existing LLM evaluation techniques.</li>\n<li>Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.</li>\n<li>Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.</li>\n<li>Collaborate with internal teams and external partners to refine metrics and create standardized evaluation protocols.</li>\n<li>Implement scalable and reproducible evaluation pipelines using modern ML frameworks.</li>\n<li>Publish 
research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives.</li>\n</ul>\n<p>The ideal candidate has 5+ years of hands-on experience in large language models, NLP, and Transformer modeling, in both research and engineering settings. Experience supporting and leading a team of research scientists and research engineers is also required.</p>","url":"https://yubhub.co/jobs/job_60a7e1e6-b51","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4304790005","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$264,800-$331,000 USD","x-skills-required":["large language model","NLP","Transformer modeling","research and engineering development","team leadership","cross-functional collaboration","evaluation methodologies","metrics and benchmarks","scalable and reproducible evaluation pipelines","modern ML frameworks"],"x-skills-preferred":["published research in top-tier AI conferences","open-source benchmarking initiatives","customer-facing role"],"datePosted":"2026-04-18T15:59:10.794Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA; Seattle, WA; New York, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large language model, NLP, Transformer modeling, research and engineering development, team leadership, cross-functional collaboration, evaluation methodologies, metrics and benchmarks, scalable and reproducible evaluation pipelines, modern ML frameworks, published research in top-tier AI conferences, open-source benchmarking initiatives, customer-facing 
role","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":264800,"maxValue":331000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0fc8ace0-e57"},"title":"Finance Fellow - Human Frontier Collective (US)","description":"<p>This is a fully remote, 1099 independent contractor opportunity with an estimated duration of six months and the potential for extension.</p>\n<p>As an HFC Fellow, you&#39;ll apply your academic and professional expertise to help design, evaluate, and interpret advanced generative AI systems.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Collaborative Work: Engage in high-impact projects with partnered AI labs and platforms, developing and evaluating complex financial scenarios to test and enhance AI accuracy in valuation, forecasting, and analysis.</li>\n<li>HFC Community: Become part of a supportive, interdisciplinary network of innovators and thought leaders committed to advancing frontier AI across domains.</li>\n<li>Contribute to Research Publications: Collaborate with Scale&#39;s research team to co-author technical reports and research papers.</li>\n</ul>\n<p>To be eligible, candidates must be authorised to work in the United States; visa sponsorship is not available for this role.</p>\n<p>Requirements include:</p>\n<ul>\n<li>Education: PhD, Master&#39;s, or BBA in Finance, Economics, or a related degree.</li>\n<li>Certifications: CFA or CPA strongly preferred.</li>\n<li>Professional Background: 5+ years in FP&amp;A, Investment Banking, Portfolio Management, Financial Consulting, or equivalent research experience.</li>\n<li>Skills: Strong proficiency in financial modelling, valuation methodologies, forecasting, asset pricing, DCF analysis, market analysis, and risk assessment.</li>\n</ul>\n<p>Professional mindset: detail-oriented, innovative thinker with a passion for financial technology and a 
commitment to collaboration.</p>\n<p>Why join the HFC?</p>\n<ul>\n<li>Advance Finance AI Solutions: Apply your financial expertise to solve complex, real-world challenges where your judgment shapes advanced AI reasoning and decision-making.</li>\n<li>Professional Development: Expand your influence through review projects, advisory roles, and research.</li>\n<li>Join a Top-Tier Network: Collaborate with a global network of academics and experts to advance responsible AI through impactful, flexible research and training.</li>\n<li>Flexible Schedule: Set your own schedule, with flexible 10–40 hour weeks that fit around your life and other commitments.</li>\n<li>Competitive Pay: Project pay rates vary across platforms and depend on a number of factors, including but not limited to: project, scope, skillset, and location.</li>\n</ul>","url":"https://yubhub.co/jobs/job_0fc8ace0-e57","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Human Frontier Collective","sameAs":"https://humanfrontiercollective.com/","logo":"https://logos.yubhub.co/humanfrontiercollective.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4565836005","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"contract","x-salary-range":null,"x-skills-required":["financial modelling","valuation methodologies","forecasting","asset pricing","DCF analysis","market analysis","risk assessment"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:55:13.359Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"CONTRACTOR","occupationalCategory":"Finance","industry":"Finance","skills":"financial modelling, valuation methodologies, forecasting, asset pricing, DCF analysis, market analysis, 
risk assessment"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_10bf8d86-b30"},"title":"Research Engineer, Safeguards Labs","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re hiring research engineers to define and execute the Labs research agenda. You&#39;ll scope your own projects, run experiments end-to-end, and decide when an idea is ready to hand off to a production team, or when to kill it and move on.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Lead and contribute to research projects investigating new methods for detecting misuse of Claude, identifying malicious organisations and accounts, strengthening model safeguards, and other safety needs.</li>\n</ul>\n<ul>\n<li>Design and run offline analyses over model usage data to surface abuse patterns, build classifiers and detection systems, and evaluate their effectiveness.</li>\n</ul>\n<ul>\n<li>Develop and iterate on prototypes that could eventually feed signals into the real-time safeguards path, partnering with engineers on tech transfer.</li>\n</ul>\n<ul>\n<li>Contribute to a broader research portfolio investigating methods for detecting abusive behaviour in chat-based or agentive workflows, and for training the model to robustly refrain from dangerous responses or behaviours without over-refusing.</li>\n</ul>\n<ul>\n<li>Build evaluations and methodologies for measuring whether safeguards actually work, including in agentic settings.</li>\n</ul>\n<ul>\n<li>Write up findings clearly so they inform decisions across Trust &amp; Safety, research, and product teams.</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have a track record of independently driving research projects from ambiguous problem statements to concrete results, ideally in AI, ML, security, integrity, or a related technical field.</li>\n</ul>\n<ul>\n<li>Are comfortable scoping your own work and switching 
between research, engineering, and analysis as a project demands.</li>\n</ul>\n<ul>\n<li>Have working familiarity with how large language models operate (sampling, prompting, training), even if LLMs aren&#39;t your primary background.</li>\n</ul>\n<ul>\n<li>Are proficient in Python and comfortable working with large datasets.</li>\n</ul>\n<ul>\n<li>Care about the societal impacts of AI and want your work to directly reduce real-world harm.</li>\n</ul>\n<p><strong>Strong candidates may also have:</strong></p>\n<ul>\n<li>Experience building and training machine learning models, including classifiers for abuse, fraud, integrity, or security applications.</li>\n</ul>\n<ul>\n<li>Knowledge of evaluation methodologies for language models and experience designing evals.</li>\n</ul>\n<ul>\n<li>Experience with agentic environments and evaluating model behaviour in them.</li>\n</ul>\n<ul>\n<li>Background in trust and safety, integrity, fraud detection, threat intelligence, or adversarial ML.</li>\n</ul>\n<ul>\n<li>Experience with red teaming, jailbreak research, or interpretability methods like steering vectors.</li>\n</ul>\n<ul>\n<li>A history of taking research prototypes and transferring them into production systems.</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<ul>\n<li>Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience</li>\n</ul>\n<ul>\n<li>Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience</li>\n</ul>\n<ul>\n<li>Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive compensation and benefits</li>\n</ul>\n<ul>\n<li>Optional equity donation matching</li>\n</ul>\n<ul>\n<li>Generous vacation and parental leave</li>\n</ul>\n<ul>\n<li>Flexible working hours</li>\n</ul>\n<ul>\n<li>Lovely office 
space in which to collaborate with colleagues</li>\n</ul>\n<p><strong>Visa Sponsorship</strong></p>\n<ul>\n<li>We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>\n</ul>","url":"https://yubhub.co/jobs/job_10bf8d86-b30","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5191785008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000-$850,000 USD","x-skills-required":["Python","Machine learning","Large language models","Security","Integrity"],"x-skills-preferred":["Experience building and training machine learning models","Knowledge of evaluation methodologies for language models","Experience with agentic environments","Background in trust and safety","Experience with red teaming"],"datePosted":"2026-04-18T15:55:10.055Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Machine learning, Large language models, Security, Integrity, Experience building and training machine learning models, Knowledge of evaluation methodologies for language models, Experience with agentic environments, Background in trust and safety, Experience with red 
teaming","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1d170ed5-b32"},"title":"Finance Fellow - Human Frontier Collective (UK)","description":"<p>This is a fully remote, 1099 independent contractor opportunity with an estimated duration of six months and the potential for extension.</p>\n<p>As a Finance Fellow at the Human Frontier Collective, you will apply your academic and professional expertise to help design, evaluate, and interpret advanced generative AI systems.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Collaborative Work: Engage in high-impact projects with our partnered AI labs and platforms. Develop and evaluate complex financial scenarios to test and enhance AI accuracy in valuation, forecasting, and analysis; collaborate in interactive sessions to assess model outputs for market relevance and compliance; provide detailed feedback to improve AI capabilities; and contribute to specialized projects in financial modeling, risk assessment, and equity research.</li>\n</ul>\n<ul>\n<li>HFC Community: Beyond the work, you will become part of a supportive, interdisciplinary network of innovators and thought leaders committed to advancing frontier AI across domains.</li>\n</ul>\n<ul>\n<li>Contribute to Research Publications: Collaborate with Scale&#39;s research team to co-author technical reports and research papers, boosting your academic visibility and professional recognition.</li>\n</ul>\n<p>To be eligible, candidates must have a PhD, Master&#39;s, or BBA in Finance, Economics, or a related degree, and 5+ years in FP&amp;A, Investment Banking, Portfolio Management, Financial Consulting, or equivalent research experience.</p>\n<p>Key skills include strong proficiency in financial modeling, valuation methodologies, 
forecasting, asset pricing, DCF analysis, market analysis, and risk assessment.</p>\n<p>Project pay rates vary across platforms and depend on a number of factors, including but not limited to: project, scope, skillset, and location.</p>","url":"https://yubhub.co/jobs/job_1d170ed5-b32","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Human Frontier Collective","sameAs":"https://humanfrontiercollective.com/","logo":"https://logos.yubhub.co/humanfrontiercollective.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4613327005","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"contract","x-salary-range":null,"x-skills-required":["financial modeling","valuation methodologies","forecasting","asset pricing","DCF analysis","market analysis","risk assessment"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:54:32.617Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United Kingdom"}},"jobLocationType":"TELECOMMUTE","employmentType":"CONTRACTOR","occupationalCategory":"Finance","industry":"Finance","skills":"financial modeling, valuation methodologies, forecasting, asset pricing, DCF analysis, market analysis, risk assessment"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8ffa2041-8b1"},"title":"Collateral Review Associate (RTL)","description":"<p><strong>About the role</strong></p>\n<p>Behind many of life&#39;s most important transactions (buying a house, applying for a mortgage, getting a small business loan, or refinancing a credit card) is a network of credit relationships. 
Setpoint provides critical infrastructure for relationships between the world&#39;s largest banks, credit funds, and capital markets counterparties.</p>\n<p>We&#39;re looking for a Collateral Review Associate (RTL) to join our team!</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Review and validate loan package documentation and materials related to the purchase of property; documents include but are not limited to: credit, background, appraisal, title, HUD-1, flood cert, mortgage, note, guaranty, repair budgets, and entity documents.</li>\n<li>Update internal web applications related to loan package materials and documentation.</li>\n<li>Validate all materials have been updated in the system of record.</li>\n<li>Perform daily operational audits for documentation requirements and data processing.</li>\n<li>Provide assistance in communication with internal and external clients with questions or concerns on collateral.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>3+ years&#39; experience in title, valuations, or mortgage loan underwriting.</li>\n<li>Strong computer abilities including intermediate to advanced Excel skills, Microsoft Word, and experience with software interfaces.</li>\n<li>Ability to work extended hours based on the flow needs of the client - this includes month end, quarter end, and year end increases in volume.</li>\n<li>Knowledge of mortgage and business loans, including underwriting and documentation standards, valuation methodologies, secondary loan markets, and servicing and monitoring practices.</li>\n<li>Ability to work independently, prioritize, and plan work activities while also being effective in a group setting.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<p>We offer a comprehensive benefits package that includes competitive salaries, stock options, medical, dental, and vision coverage, 401(k), short term and long term disability coverage, and flexible vacation.</p>","url":"https://yubhub.co/jobs/job_8ffa2041-8b1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Setpoint","sameAs":"https://setpoint.com","logo":"https://logos.yubhub.co/setpoint.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/setpoint/jobs/4288575007","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Excel","Microsoft Word","Software interfaces","Mortgage and business loans","Underwriting and documentation standards","Valuation methodologies"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:54:30.571Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Salt Lake City (Hybrid)"}},"employmentType":"FULL_TIME","occupationalCategory":"Finance","industry":"Finance","skills":"Excel, Microsoft Word, Software interfaces, Mortgage and business loans, Underwriting and documentation standards, Valuation methodologies"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_bc3afd53-1f3"},"title":"Valuations Associate","description":"<p><strong>The Role</strong></p>\n<p>Join our Carta Europe Valuations Team as a Valuation Associate. 
You will provide high-quality, defensible valuation reports for our European clients, primarily focused on 409A, EMI, and CSOP valuations.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Prepare financial models and valuation reports for private companies.</li>\n<li>Conduct research and analysis on company financials, industry trends, and market comparables to support valuation conclusions.</li>\n<li>Collaborate with client auditors by providing clear, detailed, and persuasive responses to all valuation-related inquiries.</li>\n<li>Work closely with the Corporations Team and clients to ensure all necessary documentation and information are received in a timely manner.</li>\n<li>Identify and implement enhancements to valuation methodologies and reporting processes to increase efficiency and accuracy.</li>\n<li>Share your deep understanding of Valuations with our Product Team to introduce product improvements;</li>\n<li>Resolve customer inquiries in a timely and professional manner;</li>\n<li>Maintain an in-depth understanding of Carta&#39;s software platform and products;</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Bachelor&#39;s degree in Finance, Accounting, Economics, or a related quantitative field.</li>\n<li>Experience in business valuation, corporate finance, accounting, or a related role.</li>\n<li>Exceptional analytical skills with proficiency in financial modeling using Microsoft Excel or Google Sheets.</li>\n<li>Excellent written and verbal communication skills, with the ability to articulate complex valuation concepts clearly to non-experts and auditors.</li>\n<li>A keen eye for detail and a commitment to producing high-quality, audit-defensible work.</li>\n<li>Comfortable learning quickly and taking on new challenges</li>\n<li>Exhibit diplomacy, tact, and poise under pressure when working through customer issues as well as a strong sense of curiosity to solve problems</li>\n</ul>\n<p><strong>Preferred 
Qualifications</strong></p>\n<ul>\n<li>Progress toward a relevant professional certification (e.g., CFA, ASA, CPA/ACCA).</li>\n<li>Experience with relevant valuation regulations, specifically 409A, EMI, and CSOP.</li>\n<li>Familiarity with the startup and venture capital ecosystem.</li>\n<li>Understanding of valuation methodologies (e.g., DCF, comparable company analysis, transaction multiples).</li>\n</ul>","url":"https://yubhub.co/jobs/job_bc3afd53-1f3","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Carta","sameAs":"https://carta.com/","logo":"https://logos.yubhub.co/carta.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/carta/jobs/7694547003","x-work-arrangement":"onsite","x-experience-level":"entry","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Financial modeling","Business valuation","Corporate finance","Accounting","Excel","Google Sheets","Analytical skills","Communication skills","Attention to detail"],"x-skills-preferred":["CFA","ASA","CPA/ACCA","European valuation regulations","Startup and venture capital ecosystem","Valuation methodologies"],"datePosted":"2026-04-18T15:52:33.666Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, England"}},"employmentType":"FULL_TIME","occupationalCategory":"Finance","industry":"Technology","skills":"Financial modeling, Business valuation, Corporate finance, Accounting, Excel, Google Sheets, Analytical skills, Communication skills, Attention to detail, CFA, ASA, CPA/ACCA, European valuation regulations, Startup and venture capital ecosystem, Valuation methodologies"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a0355e9d-a71"},"title":"Research Lead, Training Insights","description":"<p>As a Research Lead on the 
Training Insights team, you&#39;ll develop the strategy for, and lead execution on, how we measure and characterise model capabilities across training and deployment. This is a hands-on leadership role: you&#39;ll drive original research into new evaluation methodologies while leading a small team of researchers and research engineers doing the same.</p>\n<p>Your work will span the full lifecycle of model development. You&#39;ll research and build new long-horizon evaluations that test the boundaries of what our models can achieve, develop novel approaches to measuring emerging capabilities, and deepen our understanding of how those capabilities develop, both during production RL training and after. You&#39;ll also take a cross-organisational view, working across Reinforcement Learning, Pretraining, Inference, Product, Alignment, Safeguards, and other teams to map the landscape of model evaluations at Anthropic and identify critical gaps in coverage.</p>\n<p>This role carries significant visibility and impact. You&#39;ll help shape the evaluation narrative for model releases, contributing directly to how Anthropic communicates about its models to both internal and external audiences. 
Done well, you will change how the industry measures and understands model capabilities, significantly furthering our safety mission.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build novel, long-horizon evaluations</li>\n<li>Develop novel measurement approaches for understanding how model capabilities emerge and evolve during RL training</li>\n<li>Lead strategic evaluation coverage across the company</li>\n<li>Shape the evaluation narrative for model releases</li>\n<li>Lead and mentor a small team of researchers and research engineers, setting research direction and fostering a culture of rigorous, creative research</li>\n<li>Design evaluation frameworks that balance scientific rigor with the practical demands of production training schedules</li>\n<li>Build and maintain relationships across Anthropic&#39;s research organisation to ensure evaluation insights inform training and deployment decisions</li>\n<li>Contribute to the broader research community through publications, open-source contributions, or external engagement on evaluation best practices</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have significant experience designing and running evaluations for large language models or similar complex ML systems</li>\n<li>Have led technical projects or teams, either formally or through sustained ownership of critical research directions</li>\n<li>Are equally comfortable designing experiments and writing code; you can move between research and implementation fluidly</li>\n<li>Think strategically about what to measure and why, not just how to measure it</li>\n<li>Can synthesise information across multiple teams and workstreams to form a coherent picture of model capabilities</li>\n<li>Communicate complex technical findings clearly to both technical and non-technical audiences</li>\n<li>Are results-oriented and thrive in fast-paced environments where priorities shift based on research findings</li>\n<li>Care deeply about AI safety and want your work 
to directly influence how capable AI systems are developed and deployed</li>\n</ul>\n<p>Strong candidates may also have:</p>\n<ul>\n<li>Experience building evaluations for long-horizon or agentic tasks</li>\n<li>Deep familiarity with Reinforcement Learning training dynamics and how model behaviour changes during training</li>\n<li>Published research in machine learning evaluation, benchmarking, or related areas</li>\n<li>Experience with safety evaluation frameworks and red teaming methodologies</li>\n<li>Background in psychometrics, experimental psychology, or other measurement-focused disciplines</li>\n<li>A track record of communicating evaluation results to inform high-stakes decisions about model development or deployment</li>\n<li>Experience managing or mentoring researchers and engineers</li>\n</ul>\n<p>Representative projects:</p>\n<ul>\n<li>Designing and implementing a suite of long-horizon evaluations that test model capabilities on tasks requiring sustained reasoning, planning, and tool use over extended interactions</li>\n<li>Building systems to track capability development across RL training checkpoints, surfacing insights about when and how specific capabilities emerge</li>\n<li>Conducting a cross-org audit of evaluation coverage, identifying blind spots, and prioritising new evaluations to fill critical gaps across Pretraining, RL, Inference, and Product</li>\n<li>Developing the evaluation methodology and narrative for a major model release, working with research leads and communications to clearly characterise model capabilities and limitations</li>\n<li>Researching and prototyping novel evaluation approaches for capabilities that are difficult to measure with existing benchmarks</li>\n<li>Leading a team effort to build reusable evaluation infrastructure that serves multiple teams across the research organisation</li>\n</ul>\n<p>The annual compensation for this role is $850,000.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML 
job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a0355e9d-a71","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5139654008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$850,000-$850,000 USD","x-skills-required":["AI","Machine Learning","Reinforcement Learning","Evaluation Methodologies","Research Leadership","Team Management","Communication","Results-Oriented","Fast-Paced Environments"],"x-skills-preferred":["Long-Horizon Evaluations","Agentic Tasks","Safety Evaluation Frameworks","Red Teaming Methodologies","Psychometrics","Experimental Psychology","Measurement-Focused Disciplines"],"datePosted":"2026-04-18T15:46:21.084Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly (Travel Required) | San Francisco, CA | New York City, NY"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI, Machine Learning, Reinforcement Learning, Evaluation Methodologies, Research Leadership, Team Management, Communication, Results-Oriented, Fast-Paced Environments, Long-Horizon Evaluations, Agentic Tasks, Safety Evaluation Frameworks, Red Teaming Methodologies, Psychometrics, Experimental Psychology, Measurement-Focused Disciplines","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":850000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_deb98db6-eba"},"title":"Staff Software Engineer, Search Quality","description":"<p>At Databricks, we are enabling data teams to solve the 
world&#39;s toughest problems by building and running the world&#39;s best data and AI infrastructure platform. Search plays a foundational role in this mission, powering everything from Retrieval Augmented Generation (RAG), AI assistants, and recommendation systems to enterprise knowledge management, in-product search, and data exploration.</p>\n<p>As a Staff Software Engineer for Search Quality, you will drive the technical direction of ranking, relevance, evaluation, and quality initiatives across Databricks&#39; next-generation Search product. You&#39;ll design and build the systems, models, and evaluation frameworks that ensure our Search stack delivers accurate, high-quality results across diverse multimodal datasets and query patterns.</p>\n<p>The impact you will have:</p>\n<ul>\n<li>Lead the technical vision for Search Quality, shaping the ranking architecture, relevance modeling stack, and evaluation systems that power Databricks&#39; next-generation retrieval experiences.</li>\n</ul>\n<ul>\n<li>Identify and solve challenges in ranking, query understanding, and hybrid retrieval, advancing state-of-the-art techniques in vector, keyword, and multimodal search.</li>\n</ul>\n<ul>\n<li>Design and train production-ready ranking and reranking models with strong guarantees around quality, latency, and resource efficiency.</li>\n</ul>\n<ul>\n<li>Partner closely with research, product, and infra teams to define metrics, evaluation methodologies, and experimentation strategies for new retrieval features and model architectures.</li>\n</ul>\n<ul>\n<li>Drive end-to-end engineering efforts, from early prototyping to production rollout, ensuring correctness, reliability, and measurable improvements to relevance.</li>\n</ul>\n<ul>\n<li>Build and operate resilient, low-latency services for ranking, evaluation, and relevance signal processing.</li>\n</ul>\n<ul>\n<li>Champion excellence in ML and search engineering, mentoring teammates and elevating design, code quality, 
and scientific rigor across the team.</li>\n</ul>\n<ul>\n<li>Shape Databricks&#39; long-term roadmap for retrieval quality, ranking infrastructure, and the foundations for retrieval-driven AI products.</li>\n</ul>\n<p>What we look for:</p>\n<ul>\n<li>10+ years of experience building large-scale search, ranking, recommendation, or ML-driven relevance systems.</li>\n</ul>\n<ul>\n<li>Deep expertise in Search Quality, including ranking models, signals, query understanding, and evaluation methodologies.</li>\n</ul>\n<ul>\n<li>Strong understanding of relevance metrics and evaluation frameworks.</li>\n</ul>\n<ul>\n<li>Familiarity with vector search, keyword search, hybrid retrieval, and embedding-based semantic retrieval.</li>\n</ul>\n<ul>\n<li>Solid foundation in algorithms, data structures, and system design for performance-critical ranking and retrieval systems.</li>\n</ul>\n<ul>\n<li>Proven ability to deliver high-impact technical initiatives with clear business or product outcomes.</li>\n</ul>\n<ul>\n<li>Strong communication skills and ability to collaborate across teams in fast-moving environments.</li>\n</ul>\n<ul>\n<li>Strategic and product-oriented mindset with the ability to align technical execution with long-term vision.</li>\n</ul>\n<ul>\n<li>Passion for mentoring, growing engineers, and fostering technical excellence.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_deb98db6-eba","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8295792002","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$165,300-$219,675 USD","x-skills-required":["large-scale search","ranking","recommendation","ML-driven 
relevance systems","Search Quality","ranking models","signals","query understanding","evaluation methodologies","relevance metrics","evaluation frameworks","vector search","keyword search","hybrid retrieval","embedding-based semantic retrieval","algorithms","data structures","system design"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:44:36.338Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large-scale search, ranking, recommendation, ML-driven relevance systems, Search Quality, ranking models, signals, query understanding, evaluation methodologies, relevance metrics, evaluation frameworks, vector search, keyword search, hybrid retrieval, embedding-based semantic retrieval, algorithms, data structures, system design","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165300,"maxValue":219675,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f5d92fd6-e21"},"title":"Prompt Engineer, Agent Prompts & Evals","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re looking for prompt and context engineers to join our product engineering team to help build AI-first products, features, and evaluations. 
Your mission will be to bridge the gap between model capabilities and real product experience, working with product teams to build consistent, safe, and beneficial user experiences across all product surfaces.</p>\n<p><strong>Key Responsibilities</strong></p>\n<ul>\n<li>Design, test, and optimize system prompts and feature-specific prompts that shape Claude&#39;s behavior across consumer and API products.</li>\n<li>Build and maintain comprehensive evaluation suites that ensure model quality and consistency across product launches and updates.</li>\n<li>Partner closely with product teams, research teams, and safeguards to ensure new features meet quality and safety standards.</li>\n<li>Play a critical role in model releases, ensuring smooth rollouts and catching regressions before they impact users.</li>\n<li>Help build and improve the frameworks and tools that allow teams to develop and test prompts and features with confidence.</li>\n<li>Mentor product engineers on prompt engineering best practices and help teams build their first evaluations.</li>\n<li>Work in a fast-paced environment where model capabilities advance daily, requiring quick adaptation and creative problem-solving.</li>\n</ul>\n<p><strong>What We&#39;re Looking For</strong></p>\n<ul>\n<li>5+ years of software engineering experience with Python or similar languages.</li>\n<li>Demonstrated experience with LLMs and prompt engineering (through work, research, or significant personal projects).</li>\n<li>Strong understanding of evaluation methodologies and metrics for AI systems.</li>\n<li>Excellent written and verbal communication skills – you&#39;ll need to explain complex model behaviors to diverse stakeholders.</li>\n<li>Ability to manage multiple concurrent projects and prioritize effectively.</li>\n<li>Experience with version control, CI/CD, and modern software development practices.</li>\n</ul>\n<p><strong>You Might Thrive in This Role If You…</strong></p>\n<ul>\n<li>Get excited about the nuances 
of how language models behave and love finding creative ways to improve their outputs.</li>\n<li>Enjoy being at the intersection of research and product, translating cutting-edge capabilities into user value.</li>\n<li>Are comfortable with ambiguity and can define success metrics for novel AI features.</li>\n<li>Have a strong sense of ownership and drive projects from conception to production.</li>\n<li>Are passionate about building AI systems that are helpful, harmless, and honest.</li>\n<li>Thrive in collaborative environments and enjoy teaching others.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f5d92fd6-e21","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5107121008","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["Python","LLMs","Prompt engineering","Evaluation methodologies","Metrics for AI systems","Version control","CI/CD","Modern software development practices"],"x-skills-preferred":["Claude","A/B testing","Experimentation frameworks","AI safety","Alignment considerations","Building tools and infrastructure for ML/AI workflows"],"datePosted":"2026-04-18T15:43:24.370Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, LLMs, Prompt engineering, Evaluation methodologies, Metrics for AI systems, Version control, CI/CD, Modern software development practices, Claude, A/B testing, Experimentation frameworks, AI safety, Alignment considerations, Building tools and 
infrastructure for ML/AI workflows","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_557894f1-074"},"title":"Prompt Engineer, Agent Prompts & Evals","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>We’re looking for prompt and context engineers to join our product engineering team to help build AI-first products, features, and evaluations. Your mission will be to bridge the gap between model capabilities and real product experience, working with product teams to build consistent, safe, and beneficial user experiences across all product surfaces.</p>\n<p>You will be deeply involved in new product feature and model releases at Anthropic, combining engineering expertise with an understanding of frontier AI applications and model quality. You’ll become an expert on Claude’s behavioural quirks and capabilities and apply that knowledge to deliver the best possible user experience across models and domains. 
You’ll be the first resource for product teams working on Claude’s AI infrastructure: system prompts, tool prompts, skills, and evaluations.</p>\n<p>This role requires someone who can effectively balance caring deeply about making Claude the best it can be while also supporting a wide variety of concurrent projects and efforts across many product teams.</p>\n<p><strong>Key Responsibilities</strong></p>\n<ul>\n<li><strong>Prompt Engineering Excellence:</strong> Design, test, and optimise system prompts and feature-specific prompts that shape Claude’s behaviour across consumer and API products.</li>\n</ul>\n<ul>\n<li><strong>Evaluation Development:</strong> Build and maintain comprehensive evaluation suites that ensure model quality and consistency across product launches and updates.</li>\n</ul>\n<ul>\n<li><strong>Cross-functional Collaboration:</strong> Partner closely with product teams, research teams, and safeguards to ensure new features meet quality and safety standards.</li>\n</ul>\n<ul>\n<li><strong>Model Launch Support:</strong> Play a critical role in model releases, ensuring smooth rollouts and catching regressions before they impact users.</li>\n</ul>\n<ul>\n<li><strong>Infrastructure Contribution:</strong> Help build and improve the frameworks and tools that allow teams to develop and test prompts and features with confidence.</li>\n</ul>\n<ul>\n<li><strong>Knowledge Transfer:</strong> Mentor product engineers on prompt engineering best practices and help teams build their first evaluations.</li>\n</ul>\n<ul>\n<li><strong>Rapid Iteration:</strong> Work in a fast-paced environment where model capabilities advance daily, requiring quick adaptation and creative problem-solving.</li>\n</ul>\n<p><strong>What We’re Looking For</strong></p>\n<p><strong>Required Qualifications</strong></p>\n<ul>\n<li>5+ years of software engineering experience with Python or similar languages.</li>\n</ul>\n<ul>\n<li>Demonstrated experience with LLMs and prompt engineering 
(through work, research, or significant personal projects).</li>\n</ul>\n<ul>\n<li>Strong understanding of evaluation methodologies and metrics for AI systems.</li>\n</ul>\n<ul>\n<li>Excellent written and verbal communication skills – you’ll need to explain complex model behaviours to diverse stakeholders.</li>\n</ul>\n<ul>\n<li>Ability to manage multiple concurrent projects and prioritise effectively.</li>\n</ul>\n<ul>\n<li>Experience with version control, CI/CD, and modern software development practices.</li>\n</ul>\n<p><strong>Preferred Qualifications</strong></p>\n<ul>\n<li>Experience with Claude or other frontier AI models in production settings.</li>\n</ul>\n<ul>\n<li>Background in machine learning, NLP, or related fields.</li>\n</ul>\n<ul>\n<li>Experience with A/B testing and experimentation frameworks (e.g., Statsig).</li>\n</ul>\n<ul>\n<li>Familiarity with AI safety and alignment considerations.</li>\n</ul>\n<ul>\n<li>Experience building tools and infrastructure for ML/AI workflows.</li>\n</ul>\n<ul>\n<li>Track record of improving AI system performance through systematic evaluation and iteration.</li>\n</ul>\n<p><strong>You Might Thrive in This Role If You…</strong></p>\n<ul>\n<li>Get excited about the nuances of how language models behave and love finding creative ways to improve their outputs.</li>\n</ul>\n<ul>\n<li>Enjoy being at the intersection of research and product, translating cutting-edge capabilities into user value.</li>\n</ul>\n<ul>\n<li>Are comfortable with ambiguity and can define success metrics for novel AI features.</li>\n</ul>\n<ul>\n<li>Have a strong sense of ownership and drive projects from conception to production.</li>\n</ul>\n<ul>\n<li>Are passionate about building AI systems that are helpful, harmless, and honest.</li>\n</ul>\n<ul>\n<li>Thrive in collaborative environments and enjoy teaching others.</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s 
degree in a related field or equivalent experience.</p>\n<p><strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong> Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you’re interested in this work.</p>\n<p><strong>Your safety matters to us.</strong> To protect yourself from potential scams, we want to remind you that we will never ask you to pay any fees for the hiring process. 
If someone contacts you claiming to be from Anthropic and asks for money, please report it to us immediately.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_557894f1-074","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5107121008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000 - $405,000 USD","x-skills-required":["Python","LLMs","Prompt engineering","Evaluation methodologies","Metrics for AI systems","Version control","CI/CD","Modern software development practices"],"x-skills-preferred":["Claude","Frontier AI models","Machine learning","NLP","A/B testing","Experimentation frameworks","AI safety","Alignment considerations","Tools and infrastructure for ML/AI workflows"],"datePosted":"2026-03-08T13:52:10.785Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, LLMs, Prompt engineering, Evaluation methodologies, Metrics for AI systems, Version control, CI/CD, Modern software development practices, Claude, Frontier AI models, Machine learning, NLP, A/B testing, Experimentation frameworks, AI safety, Alignment considerations, Tools and infrastructure for ML/AI workflows","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_447c26bd-a83"},"title":"Research Engineer, 
Universes","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re looking for Research Engineers to help us build the next generation of training environments for capable and safe agentic AI. This role blends research and engineering responsibilities, requiring you to both implement novel approaches and contribute to research direction.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Build the next generation of agentic environments</li>\n<li>Build rigorous evaluations that measure real capability</li>\n<li>Collaborate across research and infrastructure teams to ship environments into production training</li>\n<li>Debug and iterate rapidly across research and production ML stacks</li>\n<li>Contribute to research culture through technical discussions and collaborative problem-solving</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Are highly impact-driven — you care about outcomes, not activity</li>\n<li>Operate with high agency</li>\n<li>Have good research taste or senior technical experience, demonstrating sound judgment in identifying what actually matters in complex problem spaces</li>\n<li>Can balance research exploration with engineering implementation</li>\n<li>Are passionate about the potential impact of AI and are committed to developing safe and beneficial systems</li>\n<li>Are comfortable with uncertainty and adapt quickly as the landscape shifts</li>\n<li>Have strong software engineering skills and can build robust infrastructure</li>\n<li>Enjoy pair programming (we love to pair!)</li>\n</ul>\n<p><strong>Strong candidates may also have one or more of the following:</strong></p>\n<ul>\n<li>Industry experience with large language model training, fine-tuning, or evaluation</li>\n<li>Industry experience building RL environments, simulation systems, or large-scale ML infrastructure</li>\n<li>Senior experience in a relevant technical field, even if transitioning domains</li>\n<li>Deep expertise in 
sandboxing, containerization, VM infrastructure, or distributed systems</li>\n<li>Published influential work in relevant ML areas</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<ul>\n<li>Education requirements: We require at least a Bachelor&#39;s degree in a related field or equivalent experience.</li>\n<li>Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</li>\n<li>Visa sponsorship: We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>\n</ul>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. 
As such, we greatly value communication skills.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_447c26bd-a83","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5061517008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$500,000 - $850,000 USD","x-skills-required":["reinforcement learning","training environments","evaluation methodologies","software engineering","pair programming"],"x-skills-preferred":["large language model training","RL environments","simulation systems","distributed systems","influential work in ML areas"],"datePosted":"2026-03-08T13:49:07.277Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA, Seattle, WA, New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"reinforcement learning, training environments, evaluation methodologies, software engineering, pair programming, large language model training, RL environments, simulation systems, distributed systems, influential work in ML areas","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":500000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c33b2d78-cc9"},"title":"Research Lead, Training Insights","description":"<p><strong>About the role</strong></p>\n<p>As a Research Lead on the Training Insights team, you&#39;ll develop the strategy for, and lead execution on, how we measure and characterise model capabilities across training and deployment. 
This is a hands-on leadership role: you&#39;ll drive original research into new evaluation methodologies while leading a small team of researchers and research engineers doing the same.</p>\n<p>Your work will span the full lifecycle of model development. You&#39;ll research and build new long-horizon evaluations that test the boundaries of what our models can achieve, develop novel approaches to measuring emerging capabilities, and deepen our understanding of how those capabilities develop — both during production RL training and after. You&#39;ll also take a cross-organisational view, working across Reinforcement Learning, Pretraining, Inference, Product, Alignment, Safeguards, and other teams to map the landscape of model evaluations at Anthropic and identify critical gaps in coverage.</p>\n<p>This role carries significant visibility and impact. You&#39;ll help shape the evaluation narrative for model releases, contributing directly to how Anthropic communicates about its models to both internal and external audiences. 
Done well, you will change how the industry measures and understands model capabilities, significantly furthering our safety mission.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Build novel, long-horizon evaluations</li>\n<li>Develop novel measurement approaches for understanding how model capabilities emerge and evolve during RL training</li>\n<li>Lead strategic evaluation coverage across the company</li>\n<li>Shape the evaluation narrative for model releases</li>\n<li>Lead and mentor a small team of researchers and research engineers, setting research direction and fostering a culture of rigorous, creative research</li>\n<li>Design evaluation frameworks that balance scientific rigor with the practical demands of production training schedules</li>\n<li>Build and maintain relationships across Anthropic&#39;s research organisation to ensure evaluation insights inform training and deployment decisions</li>\n<li>Contribute to the broader research community through publications, open-source contributions, or external engagement on evaluation best practices</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have significant experience designing and running evaluations for large language models or similar complex ML systems</li>\n<li>Have led technical projects or teams, either formally or through sustained ownership of critical research directions</li>\n<li>Are equally comfortable designing experiments and writing code—you can move between research and implementation fluidly</li>\n<li>Think strategically about what to measure and why, not just how to measure it</li>\n<li>Can synthesise information across multiple teams and workstreams to form a coherent picture of model capabilities</li>\n<li>Communicate complex technical findings clearly to both technical and non-technical audiences</li>\n<li>Are results-oriented and thrive in fast-paced environments where priorities shift based on research findings</li>\n<li>Care deeply 
about AI safety and want your work to directly influence how capable AI systems are developed and deployed</li>\n</ul>\n<p><strong>Strong candidates may also have:</strong></p>\n<ul>\n<li>Experience building evaluations for long-horizon or agentic tasks</li>\n<li>Deep familiarity with Reinforcement Learning training dynamics and how model behaviour changes during training</li>\n<li>Published research in machine learning evaluation, benchmarking, or related areas</li>\n<li>Experience with safety evaluation frameworks and red teaming methodologies</li>\n<li>Background in psychometrics, experimental psychology, or other measurement-focused disciplines</li>\n<li>A track record of communicating evaluation results to inform high-stakes decisions about model development or deployment</li>\n<li>Experience managing or mentoring researchers and engineers</li>\n</ul>\n<p><strong>Representative projects:</strong></p>\n<ul>\n<li>Designing and implementing a suite of long-horizon evaluations that test model capabilities on tasks requiring sustained reasoning, planning, and tool use over extended interactions</li>\n<li>Building systems to track capability development across RL training checkpoints, surfacing insights about when and how specific capabilities emerge</li>\n<li>Conducting a cross-org audit of evaluation coverage, identifying blind spots, and prioritising new evaluations to fill critical gaps across Pretraining, RL, Inference, and Product</li>\n<li>Developing the evaluation methodology and narrative for a major model release, working with research leads and communications to clearly characterise model capabilities and limitations</li>\n<li>Researching and prototyping novel evaluation approaches for capabilities that are difficult to measure with existing benchmarks</li>\n<li>Leading a team effort to build reusable evaluation infrastructure that serves multiple teams across the research organisation</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education 
requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas!</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c33b2d78-cc9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5139654008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$850,000 - $850,000 USD","x-skills-required":["machine learning","evaluation methodologies","Reinforcement Learning","Pretraining","Inference","Product","Alignment","Safeguards"],"x-skills-preferred":["psychometrics","experimental psychology","safety evaluation frameworks","red teaming methodologies"],"datePosted":"2026-03-08T13:45:37.187Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"machine learning, evaluation methodologies, Reinforcement Learning, Pretraining, Inference, Product, Alignment, Safeguards, psychometrics, experimental psychology, safety evaluation frameworks, red teaming methodologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":850000,"maxValue":850000,"unitText":"YEAR"}}}]}