<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>58b03260-1e2</externalid>
      <Title>AI Engineer, Product</Title>
      <Description><![CDATA[<p>About Mistral AI</p>
<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>
<p>We are a global company with teams distributed between France, USA, UK, Germany, and Singapore. Our diverse workforce thrives in competitive environments and is committed to driving innovation.</p>
<p>Role Summary</p>
<p>Embedded directly in a product team such as search, chat, documents, or audio, you&#39;ll improve AI-powered features through rigorous evaluation, prompt and orchestration design, and rapid experimentation. You&#39;ll own your domain&#39;s AI quality end-to-end: define what &quot;good&quot; looks like, measure it, run experiments, and ship what works.</p>
<p>Responsibilities</p>
<ul>
<li>Design and run evaluations for your product area: reference tests, heuristics, model-graded checks tailored to search relevance, chat quality, document understanding, or audio performance.</li>
<li>Define and track metrics that matter: task success, helpfulness, hallucination proxies, safety flags, latency, cost.</li>
<li>Own prompt and orchestration design: write, test, and iterate on prompts and system prompts as a core part of your work.</li>
<li>Run A/B tests on prompts, models, and configurations; analyze results; make rollout or rollback decisions from data.</li>
<li>Set up observability for LLM calls: structured logging, tracing, dashboards, alerts.</li>
<li>Operate model releases: canary and shadow traffic, sign-offs, SLO-based rollback criteria, regression detection.</li>
<li>Improve core behaviors in your product area, whether that&#39;s memory policies, intent classification, routing, tool-call reliability, or retrieval quality.</li>
<li>Create templates and documentation so other teams can author evals and ship safely.</li>
<li>Partner with Science to diagnose regressions and lead post-mortems.</li>
</ul>
<p>About you</p>
<ul>
<li>3-4 years of experience; backgrounds that fit well include ML engineers moving closer to product, or software engineers with real AI/ML production experience.</li>
<li>Strong TypeScript or Python skills - we have both tracks depending on team fit.</li>
<li>Production LLM experience: prompts, tool/function calling, system prompts.</li>
<li>Hands-on with evals and A/B testing; you can design metrics, not just run them.</li>
<li>Comfortable implementing directly in product code, not only notebooks.</li>
<li>Observability experience: logging, tracing, dashboards, alerting.</li>
<li>Product mindset: form hypotheses, run experiments, interpret results, ship.</li>
<li>A clear communicator, autonomous, and oriented toward production impact over experimentation for its own sake.</li>
</ul>
<p>It would be ideal if you also have:</p>
<ul>
<li>Safety systems experience: moderation, PII handling/redaction, guardrails.</li>
<li>Release operations: canary/shadowing, automated rollbacks, experiment platforms.</li>
<li>Prior work on search ranking, chat systems, document AI, or audio ML features.</li>
</ul>
<p>Hiring Process</p>
<ul>
<li>Introduction call - 30 min</li>
<li>Hiring Manager interview - 30 min</li>
<li>Technical rounds: Live-coding Interview - 45 min; AI Engineering Interview - 45 min</li>
<li>Culture-fit discussion - 30 min</li>
<li>References</li>
</ul>
<p>By applying, you agree to our Applicant Privacy Policy.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>TypeScript, Python, Production LLM experience, Evals and A/B testing, Observability, Product mindset, Clear communication, Safety systems experience, Release operations, Search ranking, Chat systems, Document AI, Audio ML features</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Mistral AI</Employername>
      <Employerlogo>https://logos.yubhub.co/mistral.ai.png</Employerlogo>
      <Employerdescription>Mistral AI develops high-performance, open-source AI models and solutions for enterprise use. Its comprehensive AI platform meets needs in both on-premises and cloud environments.</Employerdescription>
      <Employerwebsite>https://mistral.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.lever.co/mistral/c79ff8ed-6689-4dda-aec6-979a5dc767d0</Applyto>
      <Location>Paris</Location>
      <Country></Country>
      <Postedate>2026-04-17</Postedate>
    </job>
    <job>
      <externalid>6663d8f4-ea5</externalid>
      <Title>AI Engineer, Product</Title>
      <Description><![CDATA[<p>About Mistral AI</p>
<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>
<p>We are a global company with teams distributed between France, USA, UK, Germany, and Singapore. Our diverse workforce thrives in competitive environments and is committed to driving innovation.</p>
<p>Role Summary</p>
<p>Embedded directly in a product team such as search, chat, documents, or audio, you&#39;ll improve AI-powered features through rigorous evaluation, prompt and orchestration design, and rapid experimentation. You&#39;ll own your domain&#39;s AI quality end-to-end: define what &#39;good&#39; looks like, measure it, run experiments, and ship what works.</p>
<p>Responsibilities</p>
<ul>
<li>Design and run evaluations for your product area: reference tests, heuristics, model-graded checks tailored to search relevance, chat quality, document understanding, or audio performance.</li>
<li>Define and track metrics that matter: task success, helpfulness, hallucination proxies, safety flags, latency, cost.</li>
<li>Own prompt and orchestration design: write, test, and iterate on prompts and system prompts as a core part of your work.</li>
<li>Run A/B tests on prompts, models, and configurations; analyze results; make rollout or rollback decisions from data.</li>
<li>Set up observability for LLM calls: structured logging, tracing, dashboards, alerts.</li>
<li>Operate model releases: canary and shadow traffic, sign-offs, SLO-based rollback criteria, regression detection.</li>
<li>Improve core behaviors in your product area, whether that&#39;s memory policies, intent classification, routing, tool-call reliability, or retrieval quality.</li>
<li>Create templates and documentation so other teams can author evals and ship safely.</li>
<li>Partner with Science to diagnose regressions and lead post-mortems.</li>
</ul>
<p>About You</p>
<ul>
<li>3-4 years of experience; backgrounds that fit well include ML engineers moving closer to product, or software engineers with real AI/ML production experience.</li>
<li>Strong TypeScript or Python skills - we have both tracks depending on team fit.</li>
<li>Production LLM experience: prompts, tool/function calling, system prompts.</li>
<li>Hands-on with evals and A/B testing; you can design metrics, not just run them.</li>
<li>Comfortable implementing directly in product code, not only notebooks.</li>
<li>Observability experience: logging, tracing, dashboards, alerting.</li>
<li>Product mindset: form hypotheses, run experiments, interpret results, ship.</li>
<li>A clear communicator, autonomous, and oriented toward production impact over experimentation for its own sake.</li>
</ul>
<p>Benefits</p>
<ul>
<li>Competitive salary and equity package</li>
<li>Health insurance</li>
<li>Transportation allowance</li>
<li>Sport allowance</li>
<li>Meal vouchers</li>
<li>Private pension plan</li>
<li>Generous parental leave policy</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>TypeScript, Python, Production LLM experience, Evals and A/B testing, Observability, Product mindset, Safety systems experience, Release operations, Search ranking, Chat systems, Document AI, Audio ML features</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Mistral AI</Employername>
      <Employerlogo></Employerlogo>
      <Employerdescription>Mistral AI develops high-performance, open-source AI models and solutions for enterprise use.</Employerdescription>
      <Employerwebsite>https://mistral.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.lever.co/mistral/c79ff8ed-6689-4dda-aec6-979a5dc767d0</Applyto>
      <Location>Paris</Location>
      <Country></Country>
      <Postedate>2026-03-10</Postedate>
    </job>
  </jobs>
</source>