Associate Manager, Amazon Insights and Analytics

5f4a61c4-ad0 Associate Manager, Amazon Insights and Analytics

At Bayer, we're seeking an Associate Manager, Amazon Insights and Analytics to join our Consumer Health team. As a key member of our marketing department, you will be responsible for transforming complex Amazon datasets into actionable sales strategies. Your primary goal will be to identify trends in customer behavior, help optimize promotional spend, and provide actionable insights to support revenue growth.

Your tasks and responsibilities will include:

Transforming weekly performance data into concise, executive-ready performance drivers and drags that explain why sales moved and what the immediate next step should be.
Trend identification: highlighting emerging consumer search patterns before they become mainstream.
Analyzing Amazon Brand Analytics, Profitero, and Circana data to understand consumer performance and market share performance.
Conducting post-event deep dives for tentpole events (e.g., Prime Day, Black Friday) to measure ROI and inform future forecasting, promo depth, and participation.
Monitoring competitor pricing moves and in-market performance to highlight any action that needs to be taken.
Conversion optimization: monitoring the Buy Box and Glance Views to alert the Sales team of any traffic drops or conversion leaks that require immediate action.
Consumer sentiment analysis: monitoring review counts and star ratings to provide Sales with feedback on product quality or Frequently Bought Together trends.
Dashboard ownership: designing and maintaining automated sales reporting frameworks using Power BI, Tableau, Excel, and any new reporting tools.
KPI management: defining and monitoring critical metrics, including Topline Sales, Glance Views, Conversion Rate (CR), Subscription, Profitability, and additional priority KPIs.
Standardized reporting: delivering weekly, monthly, and quarterly business reviews to the Sales team and key internal stakeholders, highlighting Wins and Opportunities based on sales volume and market share.
New item launches: creating the Launch Blueprint for new SKUs, using historical category data to set realistic sales targets and advertising benchmarks.
Tracking competitor out-of-stock events, pricing, or promotional changes to provide Sales with real-time opportunities/challenges.

To succeed in this role, you will need:

A bachelor's degree in Business, Finance, Economics, Statistics, or a related analytical field.
2-4+ years of experience in e-commerce analytics, retail, or a CPG environment.
Advanced Excel skills, including pivot tables, VLOOKUP/XLOOKUP, and complex data modeling.
Proven experience building dashboards in Power BI or Tableau.
A sales-first mindset, with the ability to see a data point and immediately translate it into a revenue-generating idea.
Agility, with comfort working in a fast-paced environment where data is needed quickly.
Experience with third-party Amazon tools, such as Pacvue, Helium10, CommerceIQ, Stackline, NielsenIQ, Profitero, or similar tools.
Amazon fluency, with expertise in Vendor Central and familiarity with Amazon Marketing Cloud.

As an Associate Manager, Amazon Insights and Analytics, you can expect to be paid a salary of approximately $115-173k, with additional compensation possible through a bonus or incentive program. Benefits include health care, vision, dental, retirement, PTO, sick leave, and more.

If you're interested in joining our team and contributing to our mission of Health for all, Hunger for none, please apply now.

XML job scraping automation by YubHub

]]> full-time mid remote $115-173k data analysis, Amazon insights, analytics, Power BI, Tableau, Excel, VLOOKUP/XLOOKUP, complex data modeling, sales strategy, trend identification, consumer behavior, promotional spend, ROI analysis, forecasting, advertising benchmarks, third-party Amazon tools, Pacvue, Helium10, CommerceIQ, Stackline, NielsenIQ, Profitero, Amazon fluency, Vendor Central, Amazon Marketing Cloud Marketing Healthcare Bayer https://logos.yubhub.co/talent.bayer.com.png Bayer is a multinational pharmaceutical and life sciences company that develops and manufactures a wide range of healthcare products. https://talent.bayer.com https://talent.bayer.com/careers/job/562949976752151 Whippany 2026-04-18

465e2cfb-ddc Staff Machine Learning Research Scientist, LLM Evals

As a Staff Machine Learning Research Scientist on the LLM Evals team, you will lead the development of novel evaluation methodologies, metrics, and benchmarks to measure the capabilities and limitations of frontier LLMs.

Your primary responsibilities will include:

Driving research on the effectiveness and limitations of existing LLM evaluation techniques.
Designing and developing novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.
Communicating, collaborating, and building relationships with clients and peer teams to facilitate cross-functional projects.
Collaborating with internal teams and external partners to refine metrics and create standardized evaluation protocols.
Implementing scalable and reproducible evaluation pipelines using modern ML frameworks.
Publishing research findings in top-tier AI conferences and contributing to open-source benchmarking initiatives.
Mentoring and guiding research scientists and engineers, providing technical leadership across cross-functional projects.
Staying deeply engaged with the ML research community, tracking emerging work and contributing to the advancement of LLM evaluation science.

The ideal candidate will have 5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings.

You will thrive in a high-energy, fast-paced startup environment and be ready to dedicate the time and effort needed to drive impactful results.

]]> full-time staff onsite $264,800-$331,000 USD large language model, NLP, Transformer modeling, evaluation methodologies, metrics, benchmarks, instruction following, factuality, robustness, fairness Engineering Technology Scale https://logos.yubhub.co/scale.com.png Scale develops reliable AI systems for the world's most important decisions, providing high-quality data and full-stack technologies to power leading models. https://scale.com/ https://job-boards.greenhouse.io/scaleai/jobs/4628044005 San Francisco, CA; Seattle, WA; New York, NY 2026-04-18

60a7e1e6-b51 Tech Lead/Manager, Machine Learning Research Scientist- LLM Evals

As the leading data and evaluation partner for frontier AI companies, we're dedicated to advancing the evaluation and benchmarking of large language models (LLMs). Our Research teams work with the industry's leading AI labs to provide high-quality data and accelerate progress in GenAI research.

We're seeking a Tech Lead Manager to lead a talented team of research scientists and research engineers focused on developing and implementing novel evaluation methodologies, metrics, and benchmarks to assess the capabilities and limitations of our cutting-edge LLMs.

Key responsibilities:

Lead a team of highly effective research scientists and research engineers on LLM evals.
Conduct research on the effectiveness and limitations of existing LLM evaluation techniques.
Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.
Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.
Collaborate with internal teams and external partners to refine metrics and create standardized evaluation protocols.
Implement scalable and reproducible evaluation pipelines using modern ML frameworks.
Publish research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives.

The ideal candidate has 5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings. Experience supporting and leading a team of research scientists and research engineers is also required.

]]> full-time senior onsite $264,800-$331,000 USD large language model, NLP, Transformer modeling, research and engineering development, team leadership, cross-functional collaboration, evaluation methodologies, metrics and benchmarks, scalable and reproducible evaluation pipelines, modern ML frameworks, published research in top-tier AI conferences, open-source benchmarking initiatives, customer-facing role Engineering Technology Scale https://logos.yubhub.co/scale.com.png Scale develops reliable AI systems for the world's most important decisions, providing high-quality data and full-stack technologies. https://scale.com/ https://job-boards.greenhouse.io/scaleai/jobs/4304790005 San Francisco, CA; Seattle, WA; New York, NY 2026-04-18

f931591c-87a Research Scientist, Frontier Risk Evaluations

As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems.

For example, you might do any or all of the following:

Design and build harnesses to test AI models and systems (including agents) for dangerous capabilities such as security vulnerability exploitation, CBRN uplift, and other high-risk activities;

Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems;

Publish evaluation methodologies and write technical reports for policymakers.

We are seeking talented researchers to join us in shaping this vision.

Ideally you'd have:

Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance;

Practical experience conducting technical research collaboratively. You should be comfortable building and instrumenting ML pipelines, writing evaluation harnesses, and quickly turning new ideas from the research literature into working prototypes;

A track record of published research in machine learning, particularly in generative AI;

At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development;

Strong written and verbal communication skills to operate in a cross-functional team.

Nice to have:

Experience in crafting evaluations and benchmarks, or a background in data science roles related to LLM technologies;

Experience with red-teaming or adversarial testing of AI systems;

Familiarity with AI safety policy frameworks (e.g., NIST AI RMF, EU AI Act, Korea AI Basic Act).

Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organisational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement.

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You’ll also receive benefits including, but not limited to: Comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.

]]> full-time senior hybrid $216,000-$270,000 USD machine learning, generative AI, ML pipelines, evaluation harnesses, AI safety policy frameworks, crafting evaluations and benchmarks, data science roles related to LLM technologies, red-teaming or adversarial testing of AI systems Engineering Technology Scale https://logos.yubhub.co/scale.com.png Scale develops reliable AI systems for the world's most important decisions. https://scale.com/ https://job-boards.greenhouse.io/scaleai/jobs/4677657005 San Francisco, CA; New York, NY 2026-04-18

fe04c8cc-782 Forward Deployed Engineering Manager

Shape the Future of AI

At Labelbox, we're building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises. Since 2018, we've been pioneering data-centric approaches that are fundamental to AI development, and our work becomes even more essential as AI capabilities expand exponentially.

We're the only company offering three integrated solutions for frontier AI development:

Enterprise Platform & Tools: Advanced annotation tools, workflow automation, and quality control systems that enable teams to produce high-quality training data at scale

Frontier Data Labeling Service: Specialized data labeling through Alignerr, leveraging subject matter experts for next-generation AI models

Expert Marketplace: Connecting AI teams with highly skilled annotators and domain experts for flexible scaling

Why Join Us

High-Impact Environment: We operate like an early-stage startup, focusing on impact over process. You'll take on expanded responsibilities quickly, with career growth directly tied to your contributions.

Technical Excellence: Work at the cutting edge of AI development, collaborating with industry leaders and shaping the future of artificial intelligence.

Innovation at Speed: We celebrate those who take ownership, move fast, and deliver impact. Our environment rewards high agency and rapid execution.

Continuous Growth: Every role requires continuous learning and evolution. You'll be surrounded by curious minds solving complex problems at the frontier of AI.

Clear Ownership: You'll know exactly what you're responsible for and have the autonomy to execute. We empower people to drive results through clear ownership and metrics.

The role

We’re hiring a Forward Deployed Engineering Manager to lead the design, development, and delivery of reinforcement learning environments for agentic AI systems.

You’ll manage a team responsible for building sandboxed, reproducible environments (terminal-based workflows, browser automation, and computer-use simulations) that power both model training and human-in-the-loop evaluation. This is a hands-on leadership role where you’ll set technical direction, guide execution, and stay close to architecture and critical systems.

What You’ll Do

Lead, hire, and develop a high-performing team of Forward Deployed Engineers, setting a high bar for ownership, velocity, and technical quality

Own the RL environment roadmap, aligning team execution with customer needs and evolving model capabilities

Oversee development of sandboxed environments (terminal, browser, tool-augmented workspaces) that support deterministic execution and multi-step agent interaction

Ensure reliability, observability, and data integrity through strong instrumentation (logging, trajectory capture, state snapshotting)

Drive infrastructure excellence across containerization, sandboxing, CI/CD, automated testing, and monitoring

Partner cross-functionally with data operations, product, and leading AI labs to define task design, evaluation protocols, and environment requirements

Enable rapid prototyping and iteration, helping the team move from ambiguous requirements to production-ready systems quickly

Stay close to the technical details: reviewing architecture, unblocking complex issues, and guiding design decisions

What We’re Looking For

5+ years of software engineering experience (Python)

2+ years of experience managing or leading engineers in fast-paced environments

Strong experience with containerization and sandboxing (Docker, Firecracker, or similar)

Solid understanding of reinforcement learning fundamentals (MDPs, reward design, episode structure, observation/action spaces)

Background in infrastructure, developer tooling, or distributed systems

Strong debugging skills and systems thinking across layered, containerized environments

Ability to operate in ambiguity and translate loosely defined problems into clear execution plans

Excellent communication and stakeholder management skills

Preferred

Experience building or working with RL environments (Gym, PettingZoo) or agent benchmarks (SWE-bench, WebArena, OSWorld, TerminalBench)

Familiarity with cloud infrastructure (GCP or AWS)

Prior experience in AI/ML platforms, data companies, or research environments

Contributions to open-source projects in RL, agents, or developer tooling

Why This Role Matters

RL environment quality is a critical bottleneck in advancing agentic AI. Poorly designed or unreliable environments introduce noise into training loops and directly impact model performance.

In this role, you’ll lead the team building the environments that define how models learn, working across a range of cutting-edge projects with leading AI labs. Alignerr offers the speed and ownership of a startup with the scale and resources of Labelbox, giving you the opportunity to have outsized impact on the future of AI.

About Alignerr

Alignerr is Labelbox’s human data organization, powering next-generation AI through high-quality training data, reinforcement learning environments, and evaluation systems. We partner directly with leading AI labs to build the data and infrastructure that push model capabilities forward.

Life at Labelbox

Location: Join our dedicated tech hubs in San Francisco or Wrocław, Poland

Work Style: Hybrid model with 2 days per week in office, combining collaboration and flexibility

Environment: Fast-paced and high-intensity, perfect for ambitious individuals who thrive on ownership and quick decision-making

Growth: Career advancement opportunities directly tied to your impact

Vision: Be part of building the foundation for humanity's most transformative technology

Our Vision

We believe data will remain crucial in achieving artificial general intelligence. As AI models become more sophisticated, the need for high-quality, specialized training data will only grow. Join us in developing new products and services that enable the next generation of AI breakthroughs.

Labelbox is backed by leading investors including SoftBank, Andreessen Horowitz, B Capital, Gradient Ventures, Databricks Ventures, and Kleiner Perkins. Our customers include Fortune 500 enterprises and leading AI labs.

Any emails from Labelbox team members will originate from a @labelbox.com email address. If you encounter anything that raises suspicions during your interactions, we encourage you to exercise caution and suspend or discontinue communications.

]]> full-time senior hybrid $180,000-$220,000 USD Software engineering experience (Python), Containerization and sandboxing (Docker, Firecracker, or similar), Reinforcement learning fundamentals (MDPs, reward design, episode structure, observation/action spaces), Infrastructure, developer tooling, or distributed systems, Debugging skills and systems thinking, Experience building or working with RL environments (Gym, PettingZoo) or agent benchmarks (SWE-bench, WebArena, OSWorld, TerminalBench), Familiarity with cloud infrastructure (GCP or AWS), Prior experience in AI/ML platforms, data companies, or research environments, Contributions to open-source projects in RL, agents, or developer tooling Engineering Technology Labelbox https://logos.yubhub.co/labelbox.com.png Labelbox is a data-centric AI development company that provides critical infrastructure for breakthrough AI models. https://www.labelbox.com/ https://job-boards.greenhouse.io/labelbox/jobs/5101195007 San Francisco Bay Area 2026-04-18

43ed459a-4da Machine Learning Engineer, Support Experience

Job Role

As a Machine Learning Engineer on the Support Experience team, you'll play a crucial role in enhancing our self-serve support experiences.

About the Team

The Support Experience engineering organization builds and improves Stripe's user support from end to end: how users get help within our products, how they get in touch with us when they have questions, and how our teams use internal tools to answer those questions.

Responsibilities

Design and implement state-of-the-art ML models and large-scale ML systems for enhancing self-serve support capabilities, balancing ML principles, domain knowledge, and engineering constraints
Develop and optimize contextual conversation models and ML-powered resolution flows for common support scenarios, using tools such as PyTorch, TensorFlow, and XGBoost
Create and refine pipelines for training and evaluating models in both offline and online environments, with a focus on improving support quality and user satisfaction
Implement ML features that streamline information collection and processing for support agents, enhancing overall support efficiency
Collaborate with product, strategy, and content teams to propose, prioritize, and implement new AI-driven support features and improve answer capabilities

Requirements

Bachelor's Degree in ML/AI or related field (e.g. math, physics, statistics)
3+ years in AI/ML and backend engineering, including building and operating production ML systems at global scale with stringent SLOs (balancing reliability, latency, and cost), with privacy, security, and compliance by design.
Deep and up-to-date applied LLM experience: RAG/embeddings, tool use/function calling, agentic planning/orchestration architectures, post-training methods, code generation, benchmarks and evaluations, etc.
Familiarity with classical ML methods and common frameworks, e.g. PyTorch, TensorFlow.
Proficient in Python; strong distributed systems and data science fundamentals.
Experience working closely with product management, design, other engineers, and other cross-functional partners.
Strong technical leadership and communication: mentoring and elevating engineers, elevating AI/ML awareness and posture within organizations, setting architectural direction, and driving alignment in ambiguity.

Preferred Qualifications

MS/PhD degree in ML/AI or related field (e.g. math, physics, statistics)
Experience working in Java or Ruby codebases
Experience designing, deploying, and owning Agentic LLM solutions (e.g., multi-step orchestrators, tool use/function calling) specifically for complex customer support or internal workflow automation.
Comfortable working with distributed teams across multiple locations and time zones

]]> full-time senior hybrid ML/AI, Backend Engineering, PyTorch, TensorFlow, Python, Distributed Systems, Data Science, LLM, Agentic Planning, Orchestration Architectures, Post-Training Methods, Code Generation, Benchmarks and Evaluations Engineering Technology Stripe https://logos.yubhub.co/stripe.com.png Stripe is a financial infrastructure platform for businesses, providing payment processing and other financial services. https://stripe.com/ https://job-boards.greenhouse.io/stripe/jobs/7813942 Toronto, Canada 2026-04-18

ed5725bb-311 Applied Research Engineer, Agents

Shape the Future of AI

As an Applied Research Engineer at Labelbox, you'll sit at the junction of advanced AI research and real product impact, with a focus on the data that makes modern agents work: browser interactions, SWE/code traces, GUI sessions, and multi-turn workflows. You'll drive the data landscape required to advance capable, adaptable agents and help shape Labelbox's strategy for collecting, synthesizing, and evaluating it.

Create frameworks and tools to construct, train, benchmark and evaluate autonomous agent capabilities.

Design agent-focused data programs using supervised fine-tuning (SFT) and reinforcement learning (RL) methodologies.

Develop data pipelines from diverse sources like code repositories, web browsers, and computer systems.

Implement and adapt popular open-source agent libraries and benchmarks with proprietary datasets and models.

Engage with research teams in frontier AI labs and the wider AI community to understand evolving agent data needs for frontier models and share best practices.

Collaborate closely with frontier AI lab customers to understand requirements and guide model development.

Publish research findings in academic journals, conferences, and blog posts.

What You Bring

Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or related field.

At least 3 years of experience addressing sophisticated ML problems with successful delivery to customers.

Experience building and training autonomous agents (tool use, structured outputs, multi-step planning) across browsers/GUI, codebases, and databases using SFT and RL.

Experience constructing and evaluating agentic benchmarks (e.g. SWE-bench, WebArena, τ-bench, OSWorld) and reliability/efficiency suites (e.g. WABER).

Adept at interpreting research literature and quickly turning new ideas into prototypes.

Deep understanding of frontier models (autoregressive, diffusion), post-training (SFT, RLVR, RLAIF, RLHF, et al.), and their human data requirements.

Proficient in Python, data science libraries and deep learning frameworks (e.g., PyTorch, JAX, TensorFlow).

Strong analytical and problem-solving abilities in ambiguous situations.

Excellent communication skills.

Track record of publications in top-tier AI/ML venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, etc.).

Labelbox Applied Research

At Labelbox Applied Research, we're committed to pushing the boundaries of AI and data-centric machine learning, with a particular focus on advanced human-AI interaction techniques. We believe that high-quality human data and sophisticated human feedback integration methods are key to unlocking the next generation of AI capabilities. Our research team works at the intersection of machine learning, human-computer interaction, and AI ethics to develop innovative solutions that can be practically applied in real-world scenarios.

Life at Labelbox

Location: Join our dedicated tech hubs in San Francisco or Wrocław, Poland

Work Style: Hybrid model with 2 days per week in office, combining collaboration and flexibility

Environment: Fast-paced and high-intensity, perfect for ambitious individuals who thrive on ownership and quick decision-making

Growth: Career advancement opportunities directly tied to your impact

Vision: Be part of building the foundation for humanity's most transformative technology

]]> full-time senior hybrid $250,000-$300,000 USD Python, data science libraries, deep learning frameworks, PyTorch, JAX, TensorFlow, supervised fine-tuning, reinforcement learning, agent libraries, benchmarks, proprietary datasets, human-AI interaction, AI ethics Engineering Technology Labelbox https://logos.yubhub.co/labelbox.com.png Labelbox is a company that provides critical infrastructure for breakthrough AI models at leading research labs and enterprises. https://www.labelbox.com/ https://job-boards.greenhouse.io/labelbox/jobs/4829775007 San Francisco Bay Area 2026-04-18

ff4d3a91-b20 Principal Engineer - Perf and Benchmarking

We're looking for a Principal Engineer to be the technical lead of CoreWeave's Benchmarking & Performance team. You will be responsible for our planet-scale performance data warehouse: ingesting, storing, transforming, and analyzing performance events in all the data centers across our global infrastructure.

You will also be integral to achieving industry-leading, end-to-end performance benchmarking publications. If MLPerf (Training & Inference), working closely with NVIDIA (Megatron-LM, TensorRT-LLM, and DGX Cloud), and the open-source community (llm-d, vLLM, and all popular ML frameworks) speak to you, come help us demonstrate CoreWeave's performance and reliability leadership in the field.

Responsibilities

Strategy & Leadership - Define the multi-year benchmarking strategy and roadmap; prioritize models/workloads (LLMs, diffusion, vision, speech) and hardware tiers. Build, lead, and mentor a high-performing team of performance engineers and data analysts. Establish governance for claims: documented methodologies, versioning, reproducibility, and audit trails.

Perf Ownership - Lead end-to-end MLPerf Inference and Training submissions: workload selection, cluster planning, runbooks, audits, and result publication. Coordinate optimization tracks with NVIDIA (CUDA, cuDNN, TensorRT/TensorRT-LLM, Triton, NCCL) to hit competitive results; drive upstream fixes where needed.

Internal Latency & Throughput Benchmarks - Design a Kubernetes-native, repeatable benchmarking service that exercises CoreWeave stacks across SUNK (Slurm on Kubernetes), Kueue, and Kubeflow pipelines. Measure and report p50/p95/p99 latency, jitter, tokens/s, time-to-first-token, cold-start/warm-start, and cost-per-token/request across models, precisions (BF16/FP8/FP4), batch sizes, and GPU types. Maintain a corpus of representative scenarios (streaming, batch, multi-tenant) and data sets; automate comparisons across software releases and hardware generations.
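As a rough illustration of the latency and throughput metrics named above, the sketch below summarizes a handful of per-request timings. Everything in it is an assumption made for illustration: the `RequestTrace` record, the `percentile` and `summarize` helpers, and the choice of the nearest-rank percentile method are hypothetical and do not describe CoreWeave's actual benchmarking service.

```python
import math
from dataclasses import dataclass


def percentile(samples, q):
    """Nearest-rank percentile: smallest sample with at least q% of values at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


@dataclass
class RequestTrace:
    # Hypothetical per-request record; all times are seconds on a shared clock.
    start_s: float        # request sent
    first_token_s: float  # first token received
    end_s: float          # last token received
    tokens: int           # tokens generated


def summarize(traces):
    """Aggregate latency percentiles, time-to-first-token, and throughput."""
    latencies = [t.end_s - t.start_s for t in traces]
    ttft = [t.first_token_s - t.start_s for t in traces]
    total_tokens = sum(t.tokens for t in traces)
    # Wall-clock span of the whole run, from first send to last completion.
    wall_s = max(t.end_s for t in traces) - min(t.start_s for t in traces)
    return {
        "p50_latency_s": percentile(latencies, 50),
        "p95_latency_s": percentile(latencies, 95),
        "p99_latency_s": percentile(latencies, 99),
        "p50_ttft_s": percentile(ttft, 50),
        "tokens_per_s": total_tokens / wall_s,
    }


traces = [
    RequestTrace(0.0, 0.1, 1.0, 100),
    RequestTrace(0.0, 0.2, 2.0, 100),
    RequestTrace(1.0, 1.3, 4.0, 200),
    RequestTrace(2.0, 2.4, 6.0, 400),
]
report = summarize(traces)
```

With so few samples, p95 and p99 both collapse to the slowest request, which is one reason tail percentiles are usually reported only over large sample counts.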

Tooling & Automation - Build CI/CD pipelines and K8s controllers/operators to schedule benchmarks at scale; integrate with observability stacks (Prometheus, Grafana, OpenTelemetry) and results warehouses. Implement supply-chain integrity for benchmark artifacts (SBOMs, Cosign signatures).

Cross-functional & Community - Partner with NVIDIA, key ISVs, and OSS projects (vLLM, Triton, KServe, PyTorch/DeepSpeed, ONNX Runtime) to co-develop optimizations and upstream improvements. Support Sales/SEs with authoritative numbers for RFPs and competitive evaluations; brief analysts and press with rigorous, defensible data.

Requirements

10+ years building distributed systems or HPC/cloud services, with deep expertise in large-scale ML training or similar high-performance workloads.

Proven track record of architecting or building planet-scale data systems (e.g., telemetry platforms, observability stacks, cloud data warehouses, large-scale OLAP engines).

Deep understanding of GPU performance (CUDA, NCCL, RDMA, NVLink/PCIe, memory bandwidth), model-server stacks (Triton, vLLM, TensorRT-LLM, TorchServe), and distributed training frameworks (PyTorch FSDP/DeepSpeed/Megatron-LM).

Proficient with Kubernetes and ML control planes; familiarity with SUNK, Kueue, and Kubeflow in production environments.

Excellent communicator able to interface with executives, customers, auditors, and OSS communities.

Nice to have

Experience with time-series databases, log-structured merge trees (LSM), or custom storage engine development.

Experience running MLPerf submissions (Inference and/or Training) or equivalent audited benchmarks at scale.

Contributions to MLPerf, Triton, vLLM, PyTorch, KServe, or similar OSS projects.

Experience benchmarking multi-region fleets and large clusters (thousands of GPUs).

Publications/talks on ML performance, latency engineering, or large-scale benchmarking methodology.

]]> full-time senior hybrid $206,000 to $333,000 Distributed systems, HPC/cloud services, Large-scale ML training, GPU performance, Model-server stacks, Distributed training frameworks, Kubernetes, ML control planes, Time-series databases, Log-structured merge trees, Custom storage engine development, MLPerf submissions, Audited benchmarks, Contributions to OSS projects, Benchmarking multi-region fleets, Large clusters, Publications/talks on ML performance Engineering Technology CoreWeave https://logos.yubhub.co/coreweave.com.png CoreWeave is a cloud-based platform for artificial intelligence that provides technology, tools, and teams to enable innovators to build and scale AI with confidence. https://www.coreweave.com https://job-boards.greenhouse.io/coreweave/jobs/4627302006 Sunnyvale, CA / Bellevue, WA 2026-04-18

ee086b6f-d4e Capital Markets & Investor Relations

As a key member of the Capital Markets & Corporate Development team at Anthropic, you'll play a central role in shaping our financial strategy and capital structure during a critical period of growth.

You'll lead capital raising initiatives, manage investor relationships, and help prepare Anthropic for the next phase of its evolution as a company. Working closely with our leadership team, you'll help ensure Anthropic has the financial resources and strategic partnerships needed to fulfill our mission of building reliable, interpretable, and steerable AI systems.

In this role, you'll leverage your expertise across capital markets and financial strategy to drive fundraising activities, build robust investor relations frameworks, and lay the groundwork for long-term financial flexibility. You'll also support selective corporate development opportunities that align with our strategic priorities.

The ideal candidate brings deep capital markets experience, strong analytical capabilities, and exceptional relationship-building skills to help guide Anthropic through its next phase of growth while maintaining our commitment to responsible AI development.

Responsibilities:

Lead capital raising processes, working with executive leadership to determine timing, structure, and terms for potential financing rounds

Build and maintain relationships with existing and potential investors across institutional, strategic, and financial investor bases

Develop comprehensive investor relations strategies, including communications, reporting frameworks, and engagement plans

Help build financial infrastructure and reporting capabilities to support institutional-grade transparency and governance

Track and analyze market conditions, comparable transactions, and valuation benchmarks to inform capital strategy

Identify and evaluate strategic investment opportunities and M&A transactions aligned with Anthropic's mission

Create detailed financial models, valuation analyses, and market research to support strategic decision-making

Prepare and present recommendations to leadership and the board on capital structure and financing strategies

Collaborate with Finance, Legal, and Comms teams to align financial and strategic initiatives with organizational priorities

You may be a good fit if you:

Have 8+ years of experience in investment banking, equity capital markets, private equity, venture capital, or similar roles with significant capital markets exposure

Possess deep knowledge of capital markets, financial instruments, transaction structures, and institutional investor perspectives

Have a proven track record of successfully executing capital raises or advising on financing transactions

Demonstrate exceptional financial modeling and analytical capabilities

Are a strategic thinker who can connect financial decisions to long-term organizational goals

Have excellent communication skills and can effectively engage with diverse stakeholders including investors, executives, and technical teams

Thrive in fast-paced environments and can manage multiple complex projects simultaneously

Show sound judgment when evaluating risks and opportunities in ambiguous situations

Are passionate about AI safety and align with Anthropic's mission to develop AI systems that are reliable, interpretable, and steerable

Strong candidates may also:

Have experience in technology or AI-related industries

Bring experience from companies that have scaled through major growth transitions or prepared for significant capital markets events

Possess advanced degrees in finance, business, or related fields

Have worked with both private and public companies, understanding the requirements and expectations at different stages

Demonstrate knowledge of AI research and development landscapes

Show intellectual curiosity about the technical aspects of AI safety and alignment

Have a strong professional network in relevant investment communities

Bring experience working in high-growth, mission-driven organizations

Annual compensation range for this role is $250,000-$310,000 USD.

]]> full-time senior hybrid $250,000-$310,000 USD Investment banking, Equity capital markets, Private equity, Venture capital, Financial modeling, Analytical capabilities, Relationship-building skills, Capital raising, Investor relations, Financial infrastructure, Reporting capabilities, Market analysis, Valuation benchmarks, Strategic investment opportunities, M&A transactions, Financial models, Valuation analyses, Market research, AI safety, Responsible AI development, High-growth organizations, Mission-driven organizations, Technology or AI-related industries, Advanced degrees in finance, business, or related fields, Strong professional network in relevant investment communities Finance Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a technology company focused on developing artificial intelligence systems. https://www.anthropic.com/ https://job-boards.greenhouse.io/anthropic/jobs/5116167008 San Francisco, CA 2026-04-18 1d6db969-527 Head of Programmatic Outcomes — Product Job Title: Head of Programmatic Outcomes , Product

About the Role: The Head of Programmatic Outcomes (Product) shapes the product-led growth strategy, in-product activation design, and adoption loops that reach every knowledge worker (not account by account, but at population scale), in deep partnership with Anthropic’s Product, Growth Marketing, and Strategy functions.

Responsibilities:

Drive the PLG activation strategy across Claude Teams, Claude for Enterprise, Claude Code, and API, partnering with Product PMs and Growth Marketing to define plays that drive adoption from first login to habitual daily use

Co-design onboarding sequences, in-product nudges, and activation milestones with Product, then codify them so the Customer Success team and partners execute at scale

Partner with Product and Strategy to define which usage signals indicate healthy vs. at-risk accounts and trigger the right human or automated interventions

Partner closely with Product PMs to ensure the activation motion is built into the product, not bolted on, serving as the dedicated GTM voice advocating for UX patterns and enablement features that reduce time-to-value, and reducing duplication between Product-owned onboarding and CS-owned playbooks

Develop and test hypothesis-driven activation plays across specific knowledge-worker use cases (Finance, Legal, HR, exec workflows), codifying what converts at scale

Hold the connective tissue across the Partners and Growth Marketing capability leads to ensure PLG motions are product-anchored, not just relationship-driven or campaign-driven; this leader is the integration point, not a parallel function

Establish adoption benchmarks, population-level penetration metrics, and health indicators the programmatic team uses to track progress on the knowledge-worker gap

Feed product activation learnings back into the Success motion, creating the intelligence loop between population-level data and account-level execution

Requirements:

8+ years in PLG, growth PM, product marketing, or CS strategy with direct ownership of activation or adoption metrics

Experience designing and running product-led onboarding or activation programs at enterprise SaaS or AI companies, not just contributing to them

Deep understanding of how product telemetry translates into adoption signals and how to trigger the right human and automated interventions

Strong cross-functional fluency: you’ve worked embedded with Product teams while maintaining a GTM orientation and commercial accountability

Track record of building programmatic motions that generate scale across a large customer base, not one-off account customizations

Real perspective on what drives knowledge workers to build habits around new tools, backed by experience

Salary: The annual compensation range for this role is $315,000-$350,000 USD.

Logistics:

Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience

Required field of study: A field relevant to the product management role

Minimum years of experience: 8+ years

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time

Visa sponsorship: We do sponsor visas!

]]> full-time senior hybrid $315,000-$350,000 USD Product Management, Growth Marketing, Strategy, Product Development, Customer Success, Activation Strategy, Onboarding Sequences, In-Product Nudges, Activation Milestones, Usage Signals, Human or Automated Interventions, UX Patterns, Enablement Features, Time-to-Value, Product-Owned Onboarding, CS-Owned Playbooks, Hypothesis-Driven Activation Plays, Knowledge-Worker Use Cases, Finance, Legal, HR, Exec Workflows, Adoption Benchmarks, Population-Level Penetration Metrics, Health Indicators, Programmatic Team, Success Motion, Intelligence Loop, Account-Level Execution Product Management Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a public benefit corporation that creates reliable, interpretable, and steerable AI systems. https://www.anthropic.com/ https://job-boards.greenhouse.io/anthropic/jobs/5153580008 San Francisco, CA 2026-04-18 2999b795-846 Capital Markets & Investor Relations As a key member of the Capital Markets & Corporate Development team at Anthropic, you'll play a central role in shaping our financial strategy and capital structure during a critical period of growth.

Responsibilities:

Lead capital raising processes, working with executive leadership to determine timing, structure, and terms for potential financing rounds

Build and maintain relationships with existing and potential investors across institutional, strategic, and financial investor bases

Develop comprehensive investor relations strategies, including communications, reporting frameworks, and engagement plans

Help build financial infrastructure and reporting capabilities to support institutional-grade transparency and governance

Track and analyze market conditions, comparable transactions, and valuation benchmarks to inform capital strategy

Identify and evaluate strategic investment opportunities and M&A transactions aligned with Anthropic's mission

Create detailed financial models, valuation analyses, and market research to support strategic decision-making

Prepare and present recommendations to leadership and the board on capital structure and financing strategies

Collaborate with Finance, Legal, and Comms teams to align financial and strategic initiatives with organizational priorities

You may be a good fit if you:

Have 8+ years of experience in investment banking, equity capital markets, private equity, venture capital, or similar roles with significant capital markets exposure

Possess deep knowledge of capital markets, financial instruments, transaction structures, and institutional investor perspectives

Have a proven track record of successfully executing capital raises or advising on financing transactions

Demonstrate exceptional financial modeling and analytical capabilities

Are a strategic thinker who can connect financial decisions to long-term organizational goals

Have excellent communication skills and can effectively engage with diverse stakeholders including investors, executives, and technical teams

Thrive in fast-paced environments and can manage multiple complex projects simultaneously

Show sound judgment when evaluating risks and opportunities in ambiguous situations

Are passionate about AI safety and align with Anthropic's mission to develop AI systems that are reliable, interpretable, and steerable

Strong candidates may also:

Have experience in technology or AI-related industries

Bring experience from companies that have scaled through major growth transitions or prepared for significant capital markets events

Possess advanced degrees in finance, business, or related fields

Have worked with both private and public companies, understanding the requirements and expectations at different stages

Demonstrate knowledge of AI research and development landscapes

Show intellectual curiosity about the technical aspects of AI safety and alignment

Have a strong professional network in relevant investment communities

Bring experience working in high-growth, mission-driven organizations

Annual compensation range for this role is $250,000-$310,000 USD.

]]> full-time senior hybrid $250,000-$310,000 USD Investment banking, Equity capital markets, Private equity, Venture capital, Financial modeling, Analytical capabilities, Relationship-building skills, Capital raising, Investor relations, Financial infrastructure, Reporting capabilities, Market analysis, Valuation benchmarks, Strategic investment opportunities, M&A transactions, Financial models, Valuation analyses, Market research, AI safety, Responsible AI development, High-growth industries, Mission-driven organizations, Strong professional network, Intellectual curiosity, Technical aspects of AI safety and alignment Finance Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a technology company focused on developing artificial intelligence systems. https://www.anthropic.com/ https://job-boards.greenhouse.io/anthropic/jobs/5116167008 San Francisco, CA 2026-04-18 540ce49c-271 Member of Technical Staff - Multimodal Understanding About the Role

You will join the multimodal team to push toward superhuman multimodal intelligence. Advance understanding and generation across modalities (image, video, audio, and text), spanning the full stack: data curation/acquisition, tokenizer training, large-scale pre-training, post-training/alignment, infrastructure/scaling, evaluation, tooling/demos, and end-to-end product experiences.

Collaborate cross-functionally with pre-training, post-training, reasoning, data, applied, and product teams to deliver frontier capabilities in multimodal reasoning, world modeling, tool use, agentic behaviors, and interactive human-AI collaboration. Contribute to building models that can see, hear, reason about, and interact with the world in real time at unprecedented levels.

Responsibilities

Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale.
Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text).
Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding & generation, real-time video processing, and noisy data handling.
Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models.
Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy.
Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance.
Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration based on feedback.
Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions.

Basic Qualifications

Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal).
Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA.
Proven track record building or optimizing large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design).
Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data.
Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors).
Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers.
Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences.

Preferred Skills and Experience

Experience leading major improvements in model capabilities through better data, modeling, algorithms, or scaling.
Familiarity with state-of-the-art in multimodal LLMs, scaling laws, tokenizers, compression techniques, reasoning, or agentic systems.
Proficiency in Rust and/or C++ for performance-critical components.
Hands-on work with large-scale orchestration tools such as Spark, Ray, or Kubernetes.
Background building full-stack tooling: performant interfaces, real-time research demos/apps, or end-to-end product ownership.
Passion for end-to-end user experience in interactive, real-time multimodal AI systems.

]]> full-time staff onsite $180,000 - $440,000 USD Multimodal pre-training, Post-training, Fine-tuning, Python, JAX, PyTorch, XLA, Large-scale distributed ML systems, Data pipelines, Evaluation design, Benchmarks, Reward modeling, RL techniques, State-of-the-art in multimodal LLMs, Scaling laws, Tokenizers, Compression techniques, Reasoning, Agentic systems, Rust, C++, Spark, Ray, Kubernetes, Full-stack tooling Engineering Technology xAI https://logos.yubhub.co/xai.com.png xAI creates AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. https://www.xai.com https://job-boards.greenhouse.io/xai/jobs/5111374007 Palo Alto, CA 2026-04-18 d63f049e-ad7 Security Lead, Agentic Red Team Job Title: Security Lead, Agentic Red Team

We're a team of scientists, engineers, and machine learning experts working together to advance the state of the art in artificial intelligence. Our mission is to close the 'Agentic Launch Gap': the critical window where novel AI capabilities outpace traditional security reviews.

As the Security Lead for the Agentic Red Team, you will direct a specialized unit of AI Researchers and Offensive Security Engineers focused on adversarial AI and agentic exploitation. Operating as a technical player-coach, you will architect complex, multi-turn attack scenarios while managing cross-functional partnerships with Product Area leads and Google security to influence launch criteria.

Key Responsibilities:

Direct Agile Offensive Security: Lead a specialized red team focused on rapid, high-impact engagements targeting production-level AI models and systems.
Perform Complex AI Exploitation: Develop and carry out advanced attack sequences that focus on vulnerabilities unique to GenAI, such as escalating privileges through tool usage, poisoning data, and executing multi-turn prompt injections.
Design Automated Validation Systems: Collaborate with Google teams to engineer 'Auto RedTeaming' solutions that transform manual vulnerability discoveries into robust, automated regression testing frameworks.
Engineer Technical Countermeasures: Create innovative defense-in-depth frameworks and control systems to mitigate agentic logic errors and non-deterministic model behaviors.
Manage Threat Intelligence Assets: Develop and oversee an evolving inventory of exploit primitives and agent-specific attack patterns used to establish release criteria and evaluate model security benchmarks.
Establish Security Scope: Collaborate with Google for conventional infrastructure protection, allowing the team to concentrate solely on agentic logic, model inference, and AI-centric exploits.

About You:

Bachelor's degree in Computer Science, Information Security, or equivalent practical experience.
Experience in Red Teaming, Offensive Security, or Adversarial Machine Learning.
Deep technical understanding of LLM architectures and agentic workflows (e.g., chain-of-thought reasoning, tool usage).
Proven ability to work in a consulting capacity with product teams, driving security improvements in fast-paced release cycles.
Experience managing or technically leading small, high-performance engineering teams.

In addition, the following would be an advantage:

Hands-on experience developing exploits for GenAI models (e.g., prompt injection, adversarial examples, training data extraction).
Familiarity with AI safety benchmarks and evaluation frameworks.
Experience writing code (Python, Go, or C++) to build automated security tools or fuzzers.
Ability to communicate complex probabilistic risks to executive stakeholders and engineering teams effectively.

The US base salary range for this full-time position is between $248,000 - $349,000 + bonus + equity + benefits.

]]> full-time senior onsite $248,000 - $349,000 + bonus + equity + benefits Bachelor's degree in Computer Science, Information Security, or equivalent practical experience, Experience in Red Teaming, Offensive Security, or Adversarial Machine Learning, Deep technical understanding of LLM architectures and agentic workflows, Proven ability to work in a consulting capacity with product teams, Experience managing or technically leading small, high-performance engineering teams, Hands-on experience developing exploits for GenAI models, Familiarity with AI safety benchmarks and evaluation frameworks, Experience writing code (Python, Go, or C++) to build automated security tools or fuzzers, Ability to communicate complex probabilistic risks to executive stakeholders and engineering teams effectively Engineering Technology Google DeepMind https://logos.yubhub.co/deepmind.com.png Google DeepMind is a team of scientists, engineers, and machine learning experts working together to advance the state of the art in artificial intelligence. https://deepmind.com/ https://job-boards.greenhouse.io/deepmind/jobs/7560787 Mountain View, California, US; New York City, New York, US 2026-03-16 f73f108d-30a Senior Security Engineer, Agentic Red Team Job Title: Senior Security Engineer, Agentic Red Team

We're a team of scientists, engineers, machine learning experts, and more, working together to advance the state of the art in artificial intelligence.

About Us

The Agentic Red Team is a specialized, high-velocity unit within Google DeepMind Security. Our mission is to close the 'Agentic Launch Gap': the critical window where novel AI capabilities outpace traditional security reviews.

The Role

As a Senior Security Engineer on the Agentic Red Team, you will be the primary technical executor of our adversarial engagements. You will work 'in the room' with product builders, identifying architectural flaws during the design phase long before formal reviews begin.

Key Responsibilities:

Execute Agile Red Teaming: Conduct rapid, high-impact security assessments on agentic services, focusing on vulnerabilities unique to GenAI such as prompt injection, tool-use escalation, and autonomous lateral movement.
Develop Advanced Exploits: Engineer and execute complex attack sequences that exploit non-deterministic model behaviors, agentic logic errors, and data poisoning vectors.
Build Automated Defenses: Write code to transform manual vulnerability discoveries into automated regression testing frameworks ('Auto Red Teaming') that prevent regression in future model versions.
Embed with Product Teams: Partner directly with developers during the design and build phases to provide immediate feedback, effectively shortening the feedback loop between offensive findings and defensive engineering.
Curate Threat Intelligence: Maintain and expand a library of agent-specific attack patterns and exploit primitives to establish robust release criteria for new models.

About You

In order to set you up for success as a Software Engineer at Google DeepMind, we look for the following skills and experience:

Bachelor's degree in Computer Science, Information Security, or equivalent practical experience.
Experience in Red Teaming, Offensive Security, or Adversarial Machine Learning.
Strong coding skills in Python, Go, or C++ with experience building security tools or automation.
Technical understanding of LLM architectures, agentic workflows (e.g., chain-of-thought reasoning), and common AI vulnerability classes.

Preferred Qualifications

Hands-on experience developing exploits for GenAI models (e.g., prompt injection, adversarial examples, training data extraction).
Experience working in a consulting capacity with product teams or in a fast-paced 'startup-like' environment.
Familiarity with AI safety benchmarks, evaluation frameworks, and fuzzing techniques.
Ability to translate complex probabilistic risks into actionable engineering fixes for developers.

Salary & Benefits

The US base salary range for this full-time position is between $166,000 - $244,000 + bonus + equity + benefits.

]]> full-time senior onsite $166,000 - $244,000 + bonus + equity + benefits Python, Go, C++, Red Teaming, Offensive Security, Adversarial Machine Learning, LLM architectures, agentic workflows, chain-of-thought reasoning, AI vulnerability classes, prompt injection, adversarial examples, training data extraction, AI safety benchmarks, evaluation frameworks, fuzzing techniques Engineering Technology Google DeepMind https://logos.yubhub.co/deepmind.com.png Google DeepMind is a technology company that specializes in artificial intelligence research and development. https://deepmind.com/ https://job-boards.greenhouse.io/deepmind/jobs/7596438 Mountain View, California, US; New York City, New York, US; Zurich, Switzerland 2026-03-16 297256fb-31a Applied AI Engineer, Beneficial Deployments About Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About Beneficial Deployments

Beneficial Deployments ensures AI reaches and benefits the communities that need it most. We partner with nonprofits, foundations, and mission-driven organisations to deploy Claude in education, global health, economic mobility, and life sciences, focusing on raising the floor.

About the Role

We're looking for an Applied AI Engineer to join our Beneficial Deployments team. You'll use your deep technical expertise to help partners accelerate their impact through advising on evals, hill-climbing on harnesses, prototyping new agents, etc. You will also work on building ecosystem-level tooling and infrastructure to scale impact beyond individual partnerships.

Responsibilities

Serve as a deep technical partner to mission-driven organisations through advising on evals, agent architectures, context engineering, cost optimisation, and more
Provide hands-on support to partner engineering teams through pair programming, prototyping, and code contributions that accelerate their development
Develop public goods infrastructure that benefits entire ecosystems through benchmarks, MCPs, and Agent Skills
Identify challenges unique to social impact partners, and contribute findings and improvements back to product, engineering, and research
Create technical presentations, demos, and scalable technical content (documentation, tutorials, sample code) to accelerate partner adoption and self-service
Help shape team processes and culture as we scale from 1 to N
Travel occasionally to customer sites for workshops, technical deep dives, and relationship building

You Might Be a Good Fit If You Have

4+ years as a Software Engineer, Forward Deployed Engineer, or technical founder
Production experience building LLM-powered applications, including prompting, context engineering, agent architectures, evaluation frameworks, and deployment at scale
Builder credibility that earns trust with technical founders and engineering teams—you've shipped products and can speak from experience
Experience working in ed-tech, healthcare, scientific research, nonprofit, or other mission-driven organisations, understanding their unique challenges and constraints
A love of teaching, mentoring, and helping others succeed
A scrappy mentality: comfortable wearing multiple hats, building from scratch, driving clarity in ambiguous situations, and doing whatever it takes to further the mission

Logistics

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices
Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work.

Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal

]]> full-time senior hybrid $280,000 - $300,000 USD Software Engineer, Forward Deployed Engineer, Technical founder, LLM-powered applications, Prompting, Context engineering, Agent architectures, Evaluation frameworks, Deployment at scale, Ed-tech, Healthcare, Scientific research, Nonprofit, Mission-driven organisations, Public goods infrastructure, Benchmarks, MCP's, Agent Skills, Technical presentations, Demos, Scalable technical content, Documentation, Tutorials, Sample code, Pair programming, Prototyping, Code contributions, Team processes, Communication skills Engineering Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a company that aims to create reliable, interpretable, and steerable AI systems. It has a quickly growing team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. https://job-boards.greenhouse.io https://job-boards.greenhouse.io/anthropic/jobs/5068226008 San Francisco, CA | New York City, NY 2026-03-08 7d5a8f0f-540 Research Engineer, Agents About Anthropic

About the role:

Agentic systems are becoming an increasingly important part of how AI is deployed. Over the last year, we’ve seen rapid adoption of Claude-powered agentic systems in spaces like coding, research, customer support, network security, and more. We believe this is just the beginning, and we expect Claude to be handling much more complex tasks end-to-end or in cooperation with a human user as time goes on. We have a team striving to make Claude an even more effective agent on longer-time-horizon tasks, and better at coordinating with groups of other agents at many different scales to accomplish large tasks. This team endeavors to maximize agent performance by solving challenges at whatever level is needed, whether it’s novel harness design, improved agent affordances and infrastructure, or finetuning.

Given that this is a nascent field, we ask that you share with us a project built on LLMs that showcases your skill at getting them to do complex tasks. Here are some example projects of interest: design of complex agents, quantitative experiments with prompting, constructing model benchmarks, synthetic data generation, or model finetuning. There is no preferred task; we just want to see what you can build. It’s fine if several people worked on it; simply share which part was your contribution. You can also include a short description of the process you used, or of any roadblocks you hit and how you dealt with them, but this is not a requirement.

Responsibilities:

Ideate, develop, and compare the performance of different agent harnesses (e.g., memory, context compression, communication architectures for agents)
Design and implement rigorous quantitative benchmarks for large scale agentic tasks
Assist with automated evaluation of Claude models and prompts across the training and product lifecycle
Work with our product org to find solutions to our most vexing challenges applying agents to our products
Help create and optimize data mixes for model training that maximize Claude’s performance or ease of use on agentic tasks

You may be a good fit if you:

Have experience developing complex agentic systems using LLMs
Have significant software engineering and ML experience
Have spent time prompting and/or building products with language models
Have good communication skills and an interest in working with other researchers on difficult tasks
Have a passion for making powerful technology safe and societally beneficial
Stay up-to-date and informed by taking an active interest in emerging research and industry trends
Enjoy pair programming (we love to pair!)

Strong candidates may also have experience with:

Large-scale RL on language models
Multi-agent systems

Representative projects:

Design and build a novel agent harness that outperforms existing agents on coding or knowledge work benchmarks
Design and build agent affordances that unlock new capabilities for internal use and deployed products
Design and build a novel eval that measures how many agents interact in groups to solve problems
Build a scaled model evaluation framework driven by model-based evaluation techniques
Build the prompting and model orchestration for a production application backed by a language model
Finetune Claude to maximize its performance using a particular set of agent tools or harness

Logistics

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship: We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you’re interested in this work. We think AI systems like the ones we’re building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you’re ever unsure about a communication, don’t click any links—

]]> full-time senior hybrid $500,000 - $850,000 USD LLMs, agent harnesses, quantitative benchmarks, automated evaluation, data mixes, model training, large-scale RL, multi-agent systems, pair programming Engineering Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a company that aims to create reliable, interpretable, and steerable AI systems. It has a quickly growing team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. https://www.anthropic.com https://job-boards.greenhouse.io/anthropic/jobs/4017544008 San Francisco, CA, Seattle, WA, New York City, NY 2026-03-08 b5d24d69-593 GTM Strategy & Operations (Industries) - EMEA As a Sales Strategy & Operations Partner focused on the Industry segment, you'll be instrumental in driving revenue growth across key industry verticals by partnering with sales leadership to develop, refine, and execute industry-specific go-to-market strategies.

You will serve as a trusted advisor and operational force multiplier, bringing deep B2B SaaS expertise to help teams achieve revenue targets while building scalable frameworks that drive industry-specific customer success.

Responsibilities:

Industry Strategy & Analysis:

Partner with sales and regional leaders to develop go-to-market strategy and motions, country/region prioritisation, and sector focus. Refine industry-specific go-to-market strategies across key verticals (e.g., Financial Services, Healthcare, Technology, Retail, Manufacturing)

Identify industry customer needs and pain points, and share feedback cross-functionally to guide solution building across global teams such as product, engineering, finance, and legal

Build and maintain industry-specific value propositions, use cases, and sales playbooks

Drive territory and account segmentation strategies based on industry characteristics and opportunity

Establish industry benchmarks and best practices to guide sales approach and prioritization

Work with Finance on target setting, planning and bottom-up modeling

Go-to-Market Strategy & Execution:

Develop comprehensive industry frameworks that standardize how customers in each vertical derive value from our solutions

Drive quarterly, semi-annual, and annual planning cycles for industry-focused GTM segments

Analyse industry-specific sales metrics and pipeline dynamics to develop proactive insights

Partner with Account Executives to evolve sales motions and industry best practices based on vertical nuances

Continuously evaluate market opportunities, competitive positioning, and whitespace within target industries

Cross-Functional Leadership:

Lead collaboration across Strategy and RevOps, Sales Enablement, Strategic Finance, Product, and Marketing teams to ensure industry alignment

Work with Product teams to communicate industry-specific feature requirements and market feedback

Prepare executive-level materials including industry performance reviews, strategic planning sessions, and board updates

Drive alignment across teams to ensure consistent execution of industry-specific go-to-market strategies

Partner with Marketing on industry-focused campaigns, events, and thought leadership initiatives

Operational Excellence:

Remove operational bottlenecks specific to complex industry sales cycles and compliance requirements

Execute critical processes including strategic account transitions, industry-specific CRM configurations, and vertical market data management

Optimise sales technology stack and ensure effective adoption of Salesforce and other sales tools

Establish sales process methodology, qualification criteria, and stage definitions for consistent execution

Ensure smooth operations across complex, multi-stakeholder sales motions typical in enterprise B2B software

You may be a good fit if you have:

Required qualifications:

10+ years of experience in B2B SaaS or enterprise software, with focus on sales strategy, revenue operations, industry consulting, or commercial roles

Deep expertise in one or more enterprise industry verticals with proven track record of driving growth in those markets

Strong analytical capabilities with demonstrated ability to convert complex industry data into actionable insights and strategy

Direct experience in commercial GTM roles selling or supporting sales of B2B software solutions to enterprise customers

Proven experience designing territories, setting quotas, and building variable compensation plans for enterprise sales teams

Track record of building sales capacity models, productivity frameworks, and headcount planning processes

Extensive experience with enterprise CRM systems (Salesforce required) and business intelligence tools

Proven ability to influence and partner with C-level stakeholders and senior sales leadership

Track record of developing industry-specific frameworks, sales plays, and value propositions

Bachelor's degree in business, economics, or related field

Strong candidates may also have:

Experience at high-growth B2B SaaS companies, particularly in industry-focused GTM or vertical sales roles

Background in AI/ML, cloud infrastructure, or enterprise software companies serving multiple industries

Track record of scaling sales organisations through 3x+ growth periods in B2B software, including territory expansion and comp plan evolution

MBA or advanced degree

Experience building industry-specific go-to-market motions from scratch

Direct enterprise sales or sales engineering experience in complex, multi-stakeholder environments

Experience with emerging technologies (AI, ML) in enterprise contexts

Management consulting experience with focus on industry strategy or commercial excellence

Deadline to apply: None. Applications will be reviewed on a rolling basis.

]]> full-time senior onsite The annual compensation range for this role is listed below. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and Sales strategy, Revenue operations, Industry consulting, Commercial roles, B2B SaaS, Enterprise software, Salesforce, Business intelligence tools, Analytical capabilities, Industry data, Actionable insights, Strategy, Customer success, Go-to-market strategy, Territory and account segmentation, Industry benchmarks, Best practices, Sales approach, Prioritization, Target setting, Planning, Bottom-up modeling, Comprehensive industry frameworks, Sales metrics, Pipeline dynamics, Proactive insights, Sales motions, Industry best practices, Vertical nuances, Market opportunities, Competitive positioning, Whitespace, Target industries, Cross-functional leadership, Strategy and RevOps, Sales Enablement, Strategic Finance, Product, Marketing, Industry alignment, Feature requirements, Market feedback, Executive-level materials, Industry performance reviews, Strategic planning sessions, Board updates, Alignment, Consistent execution, Industry-specific go-to-market strategies, Industry-focused campaigns, Events, Thought leadership initiatives, Operational excellence, Complex industry sales cycles, Compliance requirements, Strategic account transitions, Industry-specific CRM configurations, Vertical market data management, Sales technology stack, Effective adoption, Sales process methodology, Qualification criteria, Stage definitions, Consistent execution, Smooth operations, Complex sales motions, Enterprise B2B software, High-growth B2B SaaS companies, Industry-focused GTM or vertical sales roles, AI/ML, Cloud infrastructure, Enterprise software companies, Scaling sales organisations, Territory expansion, Comp plan evolution, MBA or advanced degree, Building industry-specific go-to-market motions, Direct enterprise sales or 
sales engineering experience, Emerging technologies, Management consulting experience, Industry strategy or commercial excellence Sales Technology Anthropic https://logos.yubhub.co/anthropic.com.png Anthropic is a company that creates reliable, interpretable, and steerable AI systems. It has a quickly growing team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. https://www.anthropic.com https://job-boards.greenhouse.io/anthropic/jobs/5005519008 London, UK 2026-03-08 b7392e45-d19 Growth Marketing Lead, Paid Acquisition

Job Description

We're looking for a data-driven, strategic and execution-focused Growth Marketing/Performance Marketing Lead to manage Replit's PLG and self-serve paid acquisition channels. We're looking for someone who can inherit an existing program and immediately drive efficiency, optimization, and scale. Your job is to squeeze more signal from what's working, prune what isn't, and bring a relentless experimentation mindset to accelerating our self-serve growth engine.

You'll work directly with our Head of Growth Marketing and collaborate closely with Creative, Product and Data. This role is ideal for someone who thrives in a high-velocity environment, is deeply channel-native, and is obsessed with improving CAC, conversion rates, and LTV across every stage of the self-serve funnel.

Responsibilities

Own and optimize paid acquisition across PLG channels

Take full accountability for self-serve new user acquisition, free-to-paid conversion, and paid channel efficiency, with a primary focus on SEM/paid search and the ability to lean into other channels such as paid social, display, programmatic, and affiliate

Manage multi-million dollar media budgets with a clear focus on CAC efficiency and return on ad spend

Take a full funnel performance approach from impression to activated paying user — including landing pages, onboarding flow touchpoints, and trial conversion

Drive continuous optimization and a rigorous testing culture

Identify performance gaps and inefficiencies across existing campaigns and channels; develop and execute a prioritized roadmap to address them

Design and run structured A/B and multivariate experiments across creative, copy, audiences, landing pages, and bidding strategies

Establish benchmarks, performance baselines, and a regular cadence of insights reporting to stakeholders

Scale what's working — and build what's missing

Lead channel expansion efforts: identify emerging acquisition opportunities in programmatic, connected TV, or new affiliate partnerships

Partner with Creative and Design teams to develop and iterate on high-performing ad assets optimized for each channel and audience

Develop audience segmentation strategies that target the right users — with channel-appropriate messaging

Build measurement infrastructure and attribution rigor

Partner with Data to ensure accurate measurement of paid channel contribution across the self-serve funnel

Develop incrementality testing frameworks to move beyond last-click and measure true paid lift

Track and report on full-funnel metrics

What You'll Bring

7+ years of performance marketing experience, with hands-on expertise specifically in SEM/Paid Search. Experience with other channels such as paid social (Meta, TikTok, LinkedIn), display, programmatic, and affiliate is valuable

Proven track record inheriting and meaningfully improving an existing paid acquisition program — not just maintaining, but finding new efficiency and growth levers

Deep fluency in performance measurement: you understand attribution models, incrementality, MMM basics, and can design experiments that produce actionable signal

Strong analytical skills — you're comfortable pulling your own data, building dashboards, and translating numbers into decisions without waiting for a data team

Experience marketing consumer or prosumer SaaS products with a product-led or freemium growth model at a high-growth tech company

Creative instincts backed by data — you know how to brief creative teams and evaluate assets against performance goals, not just aesthetics

Benefits

Competitive Salary & Equity

401(k) Program with a 4% match

Health, Dental, Vision and Life Insurance

Short Term and Long Term Disability

Paid Parental, Medical, Caregiver Leave

Commuter Benefits

Monthly Wellness Stipend

Autonomous Work Environment

In Office Set-Up Reimbursement

Flexible Time Off (FTO) + Holidays

Quarterly Team Gatherings

In Office Amenities

]]> full-time senior hybrid $165K - $215K performance marketing, paid search, paid social, display, programmatic, affiliate, paid acquisition, SEM, paid search marketing, paid social media marketing, display advertising, programmatic advertising, affiliate marketing, paid channel efficiency, CAC efficiency, return on ad spend, full funnel performance, landing pages, onboarding flow touchpoints, trial conversion, A/B testing, multivariate testing, creative, copy, audiences, landing pages, bidding strategies, benchmarks, performance baselines, insights reporting, incrementality testing, true paid lift, full-funnel metrics, data analysis, data visualization, data-driven decision making, performance measurement, attribution models, incrementality, MMM basics, experiment design, actionable signal, analytical skills, data pulling, dashboard building, number translation, creative instincts, briefing creative teams, evaluating assets against performance goals Marketing Technology Replit https://logos.yubhub.co/replit.com.png Replit is an agentic software creation platform that enables anyone to build applications using natural language. With millions of users worldwide, Replit is democratizing software development by removing traditional barriers to application creation. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/replit/e487e4a9-1248-4954-b311-873d48b79e80 Foster City, CA 2026-03-07 c6d33ad6-f0d Research Scientist, Mathematical Sciences Job Posting

Research Scientist, Mathematical Sciences

Location: San Francisco

Employment Type: Full time

Department: Research

Compensation: $380K – $445K

The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

401(k) retirement plan with employer match

Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

Mental health and wellness support

Employer-paid basic life and disability coverage

Annual learning and development stipend to fuel your professional growth

Daily meals in our offices, and meal delivery credits as eligible

Relocation support for eligible employees

Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the Team

The Strategic Deployment team makes frontier models more capable, reliable, and aligned to transform high-impact domains. On one hand, this involves deploying models in real-world, high-stakes settings to drive AI-driven transformation and elicit insights—training data, evaluation methods, and techniques—to shape our frontier model development. On the other hand, we leverage these learnings to build the science and engineering of impactful frontier model deployment.

As a key element of this effort, OpenAI for Science aims to harness AI to accelerate the process of scientific research. This involves building models and an AI-powered platform that speeds up discovery and helps researchers everywhere do more, faster.

About the Role

As a Research Scientist focused on the mathematical sciences, you will help build models, tools, and workflows that move theoretical research—in fields such as mathematics, theoretical physics, and theoretical computer science—forward. You’ll design domain-specific data and signals, shape training and evaluation, guide how to wire models to scientific tools, and work with the academic community to speed up adoption and impact.

We’re looking for people who…

Hold a current or recent academic position in mathematical sciences (mathematics, theoretical physics, theoretical computer science) or a related field

Regularly use frontier models in their own research

Move easily between theory and code, and are eager to contribute technically as well as academically

Either know or are eager to learn modern AI and run AI experiments end-to-end

Are strong scientific communicators

Care about rigor and reproducibility in scientific results

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will

Assist in designing and building frontier AI models that are great at solving frontier mathematical sciences problems

Build high-quality scientific datasets and synthetic data pipelines (symbolic, numeric, and simulator-based)

Design reinforcement and grading signals for mathematical sciences and run reinforcement learning/optimization loops to improve model reasoning

Define and run evals for scientific reasoning, derivations, simulations, and literature grounding; track progress over time

Partner with research labs and the academic community

Drive adoption of frontier AI within the scientific community

Uphold high standards for safety, data governance, and reproducibility

You might thrive in this role if you

Are passionate about pushing the boundaries of your field using AI

Have used ChatGPT to do calculations and prove or improve lemmas in your field of study

Communicate clearly to both scientists and AI engineers; you like collaborating across teams and with academia

Nice to have

Open-source contributions to mathematical science or AI tooling

Experience building or curating domain datasets and benchmarks

Experience engaging a research community (teaching, workshops, tutorials, standards)

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

]]> full-time senior hybrid $380K – $445K AI, Machine Learning, Deep Learning, Mathematical Sciences, Theoretical Physics, Theoretical Computer Science, Scientific Computing, Data Science, Programming (Python, C++, etc.), Open-source contributions to mathematical science or AI tooling, Experience building or curating domain datasets and benchmarks, Experience engaging a research community (teaching, workshops, tutorials, standards) Engineering Technology OpenAI https://logos.yubhub.co/openai.com.png OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/openai/3f7e0545-a6b4-4108-82ca-653d16262934 San Francisco 2026-03-06 dbcceacb-d90 Model Behavior Architect We're looking for a Model Behavior Architect to help build Perplexity's AI products and evaluations. You'll sit within our AI team and collaborate closely with research and product teams, designing prompt and context engineering strategies to deliver high quality user experiences across multiple domains and models.

What you'll do

Context Engineering: Design, test, and optimize context strategies and system prompts that shape answer engine behavior across products, features, and use cases.
Evaluation Systems: Build automated and semi-automated evaluation pipelines that measure model quality, catch regressions, and scale across product surfaces.
Model Launch Support: Partner with research and engineering to validate model behavior before and during rollouts, ensuring smooth transitions with no degradation.
Research & Analysis: Identify inconsistencies and failure modes in model outputs through well-designed research projects — for both internal and production-facing systems.
Cross-functional Collaboration: Work closely with design, product, and research teams to translate product goals into concrete model behavior requirements.
Knowledge Sharing: Help engineers across teams build intuition for prompt design, context engineering, and evaluation best practices.
Staying Current: Track the latest alignment, evaluation, and prompting techniques from industry and academia, and bring the best ideas back to the team.

What you need

Experience designing evaluations, benchmarks, or metrics for AI systems.
Strong written and verbal communication skills, particularly in explaining complex concepts to diverse stakeholders.
Ability to manage multiple concurrent projects in a fast-moving environment.
Strong experience with Perplexity or other frontier AI models in production settings.
Demonstrated experience with Python — you'll prototype, debug, automate, and build systems at scale.
3+ years of experience working with LLMs in a product or research setting.

]]> full-time senior onsite $180K - $270K experience designing evaluations, benchmarks, or metrics for AI systems, strong written and verbal communication skills, ability to manage multiple concurrent projects, strong experience with Perplexity or other frontier AI models, demonstrated experience with Python, 3+ years of experience working with LLMs, experience with A/B testing or experimentation frameworks, track record of improving AI system performance through systematic evaluation and iteration Engineering Technology Perplexity https://logos.yubhub.co/perplexity.com.png Perplexity is a leading AI company that specializes in building AI products and evaluations. They are known for their cutting-edge technology and innovative approach to AI development. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/perplexity/9904db61-b8ca-4207-8f93-88ab6f0cd3fd San Francisco 2026-03-04