Strategic Finance/FP&A

933df1f3-205 Strategic Finance/FP&A

You'll architect the financial planning backbone for a company scaling faster than traditional frameworks can handle. This isn't about maintaining existing budgets; it's about building planning processes, financial models, and decision frameworks that help leadership allocate resources in a market that's evolving in real time.

You'll be the person who:

Owns and drives the full planning cycle: annual operating plans, quarterly forecasts, rolling projections, and long-range strategic planning tailored to AI research timelines (spoiler: annual budgets don't work)

Builds sophisticated financial models covering multiple revenue streams (API, Enterprise, Playground), GPU compute economics, headcount planning, and cash flow across US and German entities

Delivers monthly and quarterly business reviews with variance analysis, KPIs, and actionable insights that support executive planning and board reporting

Leads scenario planning for decisions most companies don't face: GPU provider contracts, go-to-market expansion, pricing frameworks, and R&D investment allocation where one breakthrough hire might 10x capabilities

Partners with GTM leadership on revenue forecasting when your 'pipeline' includes researchers, startups, and Fortune 500s with completely different economics

Collaborates with Engineering on GPU compute optimization and infrastructure planning where the input is 'we need more GPUs' and your job is figuring out how many, when, and from which providers

Builds executive dashboards tracking what actually matters: ARR growth, customer cohort economics, gross margins, burn rate, and the relationship between model quality and infrastructure cost

Designs scalable FP&A processes and drives automation across financial planning, working with Accounting to ensure data integrity and reporting consistency

Supports high-impact strategic initiatives: pricing optimization, enterprise contract structuring, customer segmentation economics, and fundraising support

Questions We're Wrestling With:

How do you forecast revenue when your models are simultaneously open source and commercial, free tier and enterprise?

What's the right balance between investing in cutting-edge research (expensive, uncertain) and scaling known winners (profitable, competitive)?

What do healthy gross margins look like when your COGS is GPU compute and infrastructure optimization is a competitive advantage?

How do you plan headcount when breakthrough hires in research might exponentially increase capabilities?

How do you structure pricing as customers move from experimentation to production at scale?

What financial frameworks work for a German company scaling globally with US operations?

Who Thrives Here:

You've built FP&A in high-growth environments where 'best practices' are still being written. You're equally comfortable building three-statement models from scratch as you are explaining complex tradeoffs to non-finance stakeholders. You understand that perfect forecasts are impossible, but disciplined planning is essential. You move fast, think in systems, and know when precision matters versus when directional clarity is enough.

You likely have:

6-10 years in FP&A, corporate finance, investment banking, or strategic finance, with at least 4 years hands-on FP&A experience at a high-growth company

Proven track record owning full planning cycles (annual budgeting, quarterly forecasts, long-range planning) at a B2B SaaS, AI, or technology company

Advanced Excel/Google Sheets modeling skills: you build complex financial models from first principles, not templates

Fluency in SaaS metrics (ARR/MRR, NDR, CAC, LTV, payback period, gross margin, Rule of 40) and ideally consumption-based pricing models

Experience with modern finance tech stack and genuine curiosity about AI-powered finance workflows versus legacy systems

Comfort with international operations, multi-entity financial structures, and US GAAP

Ability to work with large datasets, perform deep variance analysis, and build dashboards that executives actually use

We'd be especially excited if you:

Have experience with usage-based or consumption revenue models, API pricing structures, or GPU/cloud infrastructure economics

Understand subscription economics alongside usage-based pricing in technical or developer-focused markets

Bring exposure to enterprise contract structuring and technical sales processes

Have worked somewhere that scaled 3-5x and had to rebuild planning processes mid-flight

Are intellectually honest about uncertainty: you can say 'here's what we don't know yet' without flinching

How We Work Together:

We're a distributed team with real offices that people actually use. Depending on your role, you'll either join us in Freiburg or SF at least 2 days a week (or one full week every other week), or work remotely with a monthly in-person week to stay connected. We'll cover reasonable travel costs to make this possible. We think in-person time matters, and we've structured things to make it accessible to all. We'll discuss what this will look like for the role during our interview process.

Everything we do is grounded in four values:

Obsessed. We are a frontier research lab. The science has to be right, the understanding deep, the product beautiful.

Low Ego. The work speaks. The best idea wins, no matter who said it. Credit is shared. Nobody is above any task.

Bold. We take the ambitious bet. We ship, we do not wait for conditions to be perfect.

Kind. People over politics. We treat each other with genuine warmth. Agency without empathy creates chaos.

What We're Building Toward:

We're not just tracking expenses; we're building the financial architecture that enables a frontier AI company to make disciplined decisions at breakthrough speed. Every model you build gives leadership better visibility. Every process you implement accelerates decision-making. Every scenario you analyze shapes how we allocate resources between growth and efficiency, research investment and commercialization. If that sounds more compelling than optimizing existing budgets, we should talk.

XML job scraping automation by YubHub

full-time senior hybrid $165,000–$210,000 USD
Financial Planning & Analysis, Financial Modeling, Strategic Planning, International Financial Reporting Standards, US GAAP, Excel, Google Sheets, Data Analysis, Variance Analysis, KPIs, Actionable Insights, Executive Planning, Board Reporting, Scenario Planning, GPU Provider Contracts, Go-to-Market Expansion, Pricing Frameworks, R&D Investment Allocation, Revenue Forecasting, GTM Leadership, GPU Compute Optimization, Infrastructure Planning, Executive Dashboards, ARR Growth, Customer Cohort Economics, Gross Margins, Burn Rate, Model Quality, Infrastructure Cost, Scalable FP&A Processes, Automation, Financial Planning, Accounting, Data Integrity, Reporting Consistency, High-Impact Strategic Initiatives, Pricing Optimization, Enterprise Contract Structuring, Customer Segmentation Economics, Fundraising Support, Usage-Based Revenue Models, Consumption-Based Pricing Models, API Pricing Structures, GPU/Cloud Infrastructure Economics, Subscription Economics, Technical Sales Processes
Finance Technology
A frontier research lab pioneering Latent Diffusion and Stable Diffusion - breakthroughs that made generative AI accessible to millions.
https://logos.yubhub.co/example.com.png
The company is a ~50-person team learning what's possible at the edge of generative AI, with models downloaded hundreds of millions of times.
https://www.example.com
https://job-boards.greenhouse.io/blackforestlabs/jobs/5099226008
San Francisco (USA), Remote (EU), London (United Kingdom)
2026-04-24

d08d38d2-b72 Engineering Manager, Agent Prompts & Evals

About the Role

Anthropic is looking for an Engineering Manager to lead the Agent Prompts & Evals team. This team owns the infrastructure that lets Anthropic ship model and prompt changes with confidence: the eval frameworks, system prompt pipelines, and regression-detection systems that every model launch depends on.

When a new Claude model is ready to ship, this team is the one answering “is it actually better in our products?” When a product team wants to change how Claude behaves, this team owns the tooling that tells them whether they broke something. It’s a platform team whose platform is model behavior itself.

The team sits deliberately at the seam between product engineering and research. You’ll partner closely with other evals groups across the company on shared infrastructure and methodology, with product teams who are shipping features on top of Claude, and with the TPMs and research PMs driving model launches. The pace is set by the model release cadence, and the team operates as both a platform owner and a hands-on partner during launch periods.

Responsibilities

Lead and grow a team of prompt engineers and platform software engineers
Own the product-side eval platform: the frameworks, dashboards, bulk runners, and CI integrations that product teams use to measure Claude’s behavior and catch regressions before they ship
Own system prompt infrastructure: versioning, deployment, rollback, and review tooling for the prompts that run in production across claude.ai, the API, and agentic surfaces
Be a steady hand through model launches; these are the team’s highest-stakes operational moments and the EM is the backstop when things get chaotic
Build durable collaboration with other evals groups across the company; this means real work on ownership boundaries, shared roadmaps, and avoiding tragedy-of-the-commons on shared eval infrastructure
Recruit, close, and retain engineers who want to work at the intersection of product engineering and model behavior
Shape where the team invests next: there are credible paths into frontier eval development, model launch automation, and deeper prompt engineering support, and part of the job is sequencing them
Push the team toward measuring things that are hard to measure (behavioral drift, prompt quality, harness parity), not just things that are easy

You May Be a Good Fit If You Have

8+ years in software engineering with 3+ years managing engineering teams, including experience leading a platform, infra, or developer-tooling team where your customers were other engineers
A track record of building “pits of success”: tooling and process that made it easy for other teams to do the right thing without needing to understand all the details
Comfort managing a team with a mixed charter: platform ownership, service-to-other-teams, and a launch-driven operational rhythm, all at once
Enough technical depth to engage on system design, review pipeline architecture, and be credible in debates with strong ICs; you don’t need to be writing code by hand every day, but you should be able to read it, review it, and be comfortable leveraging Claude to understand, design, and occasionally build.
A product mindset and willingness to wear multiple hats when the work calls for it
Demonstrated ability to build and maintain peer relationships with partner orgs that have different cultures and incentives: negotiating ownership, aligning roadmaps, and holding ground when it matters without being territorial about it
Experience recruiting and closing senior ICs in a competitive market

Strong Candidates May Also Have

Prior exposure to LLM evals, ML experimentation platforms, or model quality work, even tangentially
Experience with A/B testing infrastructure, feature flagging, or gradual rollout systems
Background in devtools, CI/CD platforms, or testing infrastructure at scale
A history of managing teams that sit between two larger orgs and making that position an asset rather than a liability
Interest in AI safety and alignment; not required, but it makes the “why” of the work land harder

Logistics

Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
Visa sponsorship: We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

How we’re different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact (advancing our long-term goals of steerable, trustworthy AI) rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We’re an extremely collaborative group, and we host frequent research discussions

full-time senior hybrid $320,000-$405,000 USD
software engineering, team management, platform ownership, service-to-other-teams, launch-driven operational rhythm, system design, pipeline architecture, product mindset, recruiting and closing senior ICs, LLM evals, ML experimentation platforms, model quality work, A/B testing infrastructure, feature flagging, gradual rollout systems, devtools, CI/CD platforms, testing infrastructure at scale
Engineering Technology
Anthropic
https://logos.yubhub.co/anthropic.com.png
Anthropic is a company that creates reliable, interpretable, and steerable AI systems. The company has a team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
https://www.anthropic.com/
https://job-boards.greenhouse.io/anthropic/jobs/5159608008
San Francisco, CA | New York City, NY
2026-04-18

0806749e-694 Engineering Manager, Agent Prompts & Evals

About the Role

Responsibilities

Lead and grow a team of prompt engineers and platform software engineers
Own the product-side eval platform: the frameworks, dashboards, bulk runners, and CI integrations that product teams use to measure Claude’s behavior and catch regressions before they ship
Own system prompt infrastructure: versioning, deployment, rollback, and review tooling for the prompts that run in production across claude.ai, the API, and agentic surfaces
Be a steady hand through model launches; these are the team’s highest-stakes operational moments and the EM is the backstop when things get chaotic
Build durable collaboration with other evals groups across the company; this means real work on ownership boundaries, shared roadmaps, and avoiding tragedy-of-the-commons on shared eval infrastructure
Recruit, close, and retain engineers who want to work at the intersection of product engineering and model behavior
Shape where the team invests next: there are credible paths into frontier eval development, model launch automation, and deeper prompt engineering support, and part of the job is sequencing them
Push the team toward measuring things that are hard to measure (behavioral drift, prompt quality, harness parity), not just things that are easy

Requirements

8+ years in software engineering with 3+ years managing engineering teams, including experience leading a platform, infra, or developer-tooling team where your customers were other engineers
A track record of building “pits of success”: tooling and process that made it easy for other teams to do the right thing without needing to understand all the details
Comfort managing a team with a mixed charter: platform ownership, service-to-other-teams, and a launch-driven operational rhythm, all at once
Enough technical depth to engage on system design, review pipeline architecture, and be credible in debates with strong ICs; you don’t need to be writing code by hand every day, but you should be able to read it, review it, and be comfortable leveraging Claude to understand, design, and occasionally build.
A product mindset and willingness to wear multiple hats when the work calls for it
Demonstrated ability to build and maintain peer relationships with partner orgs that have different cultures and incentives: negotiating ownership, aligning roadmaps, and holding ground when it matters without being territorial about it
Experience recruiting and closing senior ICs in a competitive market

Nice to Have

Prior exposure to LLM evals, ML experimentation platforms, or model quality work, even tangentially
Experience with A/B testing infrastructure, feature flagging, or gradual rollout systems
Background in devtools, CI/CD platforms, or testing infrastructure at scale
A history of managing teams that sit between two larger orgs and making that position an asset rather than a liability
Interest in AI safety and alignment; not required, but it makes the “why” of the work land harder

Logistics

Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
Visa sponsorship: We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

full-time senior hybrid $320,000-$405,000 USD
Software engineering, Team management, Platform ownership, Service-to-other-teams, Launch-driven operational rhythm, System design, Pipeline architecture, Product mindset, Peer relationships, Recruiting and closing senior ICs, LLM evals, ML experimentation platforms, Model quality work, A/B testing infrastructure, Feature flagging, Gradual rollout systems, Devtools, CI/CD platforms, Testing infrastructure, AI safety and alignment
Engineering Technology
Anthropic
https://logos.yubhub.co/anthropic.com.png
Anthropic is a company that creates reliable, interpretable, and steerable AI systems. It has a team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
https://www.anthropic.com/
https://job-boards.greenhouse.io/anthropic/jobs/5159608008
San Francisco, CA | New York City, NY
2026-04-18

c6f5337c-c2f Research Engineer (Scaling Multimodal Data)

We're looking for a research engineer to help improve our in-house world models through better multimodal data. This role is about figuring out what data actually moves model quality, then building the datasets, pipelines, and experiments to prove it.

The best generative models aren’t just a product of model architecture and compute; they are a product of the training data. The model output reflects someone’s obsession over what goes into the data, how it’s processed, and what gets thrown away. We’re looking for the person who does the obsessing and builds the tools to act on it at scale.

This isn’t a role where someone hands you a dataset and asks you to clean it. You will decide what data we need, figure out where to get it, build the processing and curation systems, and close the loop with model training to make sure it actually works.

Responsibilities:

Discover, evaluate, and acquire training data
Build data processing and curation systems
Look at the actual data constantly
Close the data → model → evaluation loop
Deploy ML models for data enrichment
Make systematic, documented decisions

Requirements:

Strong software engineering fundamentals
Deep experience with image and video data at scale
Experience with distributed computing
Experience using ML models as components
A research-oriented approach to data decisions
Familiarity with the model training lifecycle

Nice to Have:

Familiarity with columnar and large-scale data storage formats and libraries
Track record of independently discovering and integrating new data sources into a training pipeline
Direct experience closing the data → model quality loop
Strong visual intuition for data quality and diversity

What This Isn’t:

Not infrastructure
Not pure research
Not a role where you wait for instructions

full-time senior onsite
software engineering fundamentals, image and video data at scale, distributed computing, ML models as components, research-oriented approach to data decisions, model training lifecycle, columnar and large-scale data storage formats and libraries, independently discovering and integrating new data sources, closing the data → model quality loop, visual intuition for data quality and diversity
Engineering Technology
World Labs
https://logos.yubhub.co/world-labs.com.png
World Labs builds foundational world models that can perceive, generate, reason, and interact with the 3D world.
https://world-labs.com/
https://job-boards.greenhouse.io/worldlabs/jobs/4164503009
San Francisco
2026-04-17

ccdfdbc6-9b5 Research Scientist, Gemini Personal Intelligence

At Google DeepMind, we're pushing the boundaries of Large Language Models (LLMs) to build the brain of the world's most helpful personal assistant. As a Research Scientist for Gemini Personal Intelligence, you will advance the state-of-the-art in understanding and reasoning to create an AI that truly understands, remembers, and adapts to the user's unique life and context.

Key responsibilities for this role include:

Driving research on post-training techniques (e.g., RL, SFT, and preference optimisation) specifically tailored for personalisation scenarios.
Developing novel evaluation frameworks and simulation methods to measure model quality against user behaviours / feedback.
Designing and training agents capable of orchestrating tools and APIs to deliver hyper-personalised experiences.

We are seeking a Research Scientist who can drive new research ideas from conception and experimentation through to productionisation. In this rapidly shifting landscape, we regularly invent novel solutions to open-ended problems. You should be flexible, adaptable, and comfortable pivoting when ideas don't work out.

To succeed in this role, you will need:

A PhD in Machine Learning, Computer Science, or a relevant field (or equivalent practical research experience).
A proven track record of research excellence (e.g., publications at top-tier venues like NeurIPS, ICML, ICLR, or significant industry contributions).
Strong software engineering skills to complement your research background.

In addition, hands-on experience with modern post-training methods (SFT, RLHF, etc.) and prior work applying LLMs to personalisation, memory, or agentic workflows would be an advantage.

At Google DeepMind, we want employees and their families to live happier and healthier lives, both in and out of work, and our benefits reflect that. Some select benefits we offer include enhanced maternity, paternity, adoption, and shared parental leave, private medical and dental insurance for yourself and any dependents, and flexible working options.

full-time senior onsite $141,000 USD - 244,000 USD + bonus + equity + benefits
Machine Learning, Computer Science, Software Engineering, Post-training techniques, RL, SFT, Preference optimisation, Evaluation frameworks, Simulation methods, Model quality, User behaviours, Feedback, Agent design, Tool orchestration, API integration, Hyper-personalisation, Modern post-training methods, LLMs, Personalisation, Memory, Agentic workflows
Engineering Technology
Google DeepMind
https://logos.yubhub.co/deepmind.com.png
Google DeepMind is a leading artificial intelligence research organisation.
https://deepmind.com/
https://job-boards.greenhouse.io/deepmind/jobs/7477025
Mountain View, California, US
2026-03-16