{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/columnar-storage-standards"},"x-facet":{"type":"skill","slug":"columnar-storage-standards","display":"Columnar Storage Standards","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_5bc76aca-281"},"title":"Research Engineer, Data Infrastructure","description":"<p>About Mistral AI</p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>We are a dynamic team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation.</p>\n<p>Our teams are distributed between France, USA, UK, Germany and Singapore. We are creative, low-ego and team-spirited.</p>\n<p>Join us to be part of a pioneering company shaping the future of AI.</p>\n<p>Together, we can make a meaningful impact.</p>\n<p>Role Summary</p>\n<p>Research Engineer, Data Infrastructure</p>\n<p>The Data Infrastructure team at Mistral AI is architecting the backbone of our frontier model training and fine-tuning ecosystem. We are building the specialized compute and data fabrics required to power the development of world-class AI.</p>\n<p>Our vision is to operate some of the largest compute fleets in production and build data lakes and metadata systems with a roadmap toward exabyte-scale architecture.</p>\n<p>We are currently in the process of building a high-performance training platform designed for massive scale across both on-premise and cloud-native Kubernetes environments.</p>\n<p>We are leading a strategic transition from legacy scheduling to modern orchestration.</p>\n<p>With numerous clusters distributed across various regions, we are focussed on implementing sophisticated multi-cluster orchestration and cloud-bursting capabilities to better utilize our global resources and ensure our researchers have seamless access to compute wherever it resides.</p>\n<p>Our mission is to evolve our current systems into a platform that is as durable as it is flexible.</p>\n<p>Location: Paris / London (hybrid) or remote EU/UK with one hub day per month.</p>\n<p>About the Role</p>\n<p>This role focuses on building and operating the next generation of data infrastructure at Mistral AI.</p>\n<p>You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems designed for high performance and scalability.</p>\n<p>You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research.</p>\n<p>You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Build &amp; Scale: Help us reach our goal of operating massive distributed compute and storage systems</li>\n</ul>\n<ul>\n<li>Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.</li>\n</ul>\n<ul>\n<li>Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.</li>\n</ul>\n<ul>\n<li>Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes and SLURM based environments.</li>\n</ul>\n<ul>\n<li>Metadata &amp; Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.</li>\n</ul>\n<ul>\n<li>Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.</li>\n</ul>\n<ul>\n<li>Have experience or a strong interest in supporting foundational compute and storage platforms.</li>\n</ul>\n<ul>\n<li>Are proficient in Python and enjoy solving the &quot;brittle data lake&quot; problem with modern, columnar storage standards.</li>\n</ul>\n<ul>\n<li>Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.</li>\n</ul>\n<ul>\n<li>Take pride in building and operating scalable, reliable, and secure systems from the ground up.</li>\n</ul>\n<ul>\n<li>Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.</li>\n</ul>\n<p>Benefits</p>\n<p>France</p>\n<ul>\n<li>Competitive cash salary and equity</li>\n</ul>\n<ul>\n<li>Food: Daily lunch vouchers</li>\n</ul>\n<ul>\n<li>Sport: Monthly contribution to a Gympass subscription</li>\n</ul>\n<ul>\n<li>Transportation: Monthly contribution to a mobility pass</li>\n</ul>\n<ul>\n<li>Health: Full health insurance for you and your family</li>\n</ul>\n<ul>\n<li>Parental: Generous parental leave policy</li>\n</ul>\n<ul>\n<li>Visa sponsorship</li>\n</ul>\n<p>UK</p>\n<ul>\n<li>Competitive cash salary and equity</li>\n</ul>\n<ul>\n<li>Insurance</li>\n</ul>\n<ul>\n<li>Transportation: Reimburse office parking charges, or £90 per month for public transport</li>\n</ul>\n<ul>\n<li>Sport: £90 per month reimbursement for gym membership</li>\n</ul>\n<ul>\n<li>Meal voucher: £200 monthly allowance for meals</li>\n</ul>\n<ul>\n<li>Pension plan: SmartPension (percentages are 5% Employee &amp; 3% Employer)</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_5bc76aca-281","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/071a5491-ea01-413f-ad78-f85b5e4c2215","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","Kubernetes","Data Infrastructure","MLOps","Infrastructure Engineering","Cloud-Native Deployments","Modern Deployment Workflows","Columnar Storage Standards","Distributed Systems","Multi-Cluster Environments"],"x-skills-preferred":[],"datePosted":"2026-04-24T16:12:09.114Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Kubernetes, Data Infrastructure, MLOps, Infrastructure Engineering, Cloud-Native Deployments, Modern Deployment Workflows, Columnar Storage Standards, Distributed Systems, Multi-Cluster Environments"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dbfbd1d2-0a3"},"title":"Research Engineer, Data Infrastructure","description":"<p>About Mistral AI</p>\n<p>Mistral AI is a pioneering company shaping the future of AI. We believe in the power of AI to simplify tasks, save time, and enhance learning and creativity.</p>\n<p>Role Summary</p>\n<p>The Data Infrastructure team at Mistral AI is architecting the backbone of our frontier model training and fine-tuning ecosystem. We are building the specialized compute and data fabrics required to power the development of world-class AI.</p>\n<p>In this role, you will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems designed for high performance and scalability. You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Build &amp; Scale: Help us reach our goal of operating massive distributed compute and storage systems</li>\n<li>Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.</li>\n<li>Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.</li>\n<li>Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes and SLURM based environments.</li>\n<li>Metadata &amp; Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.</li>\n<li>Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.</li>\n</ul>\n<p>You might thrive in this role if you:</p>\n<ul>\n<li>Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.</li>\n<li>Have experience or a strong interest in supporting foundational compute and storage platforms.</li>\n<li>Are proficient in Python and enjoy solving the &quot;brittle data lake&quot; problem with modern, columnar storage standards.</li>\n<li>Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.</li>\n<li>Take pride in building and operating scalable, reliable, and secure systems from the ground up.</li>\n<li>Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.</li>\n</ul>\n<p>Benefits</p>\n<p>France</p>\n<ul>\n<li>Competitive cash salary and equity</li>\n<li>Food: Daily lunch vouchers</li>\n<li>Sport: Monthly contribution to a Gympass subscription</li>\n<li>Transportation: Monthly contribution to a mobility pass</li>\n<li>Health: Full health insurance for you and your family</li>\n<li>Parental: Generous parental leave policy</li>\n</ul>\n<p>UK</p>\n<ul>\n<li>Competitive cash salary and equity</li>\n<li>Insurance</li>\n<li>Transportation: Reimburse office parking charges, or £90 per month for public transport</li>\n<li>Sport: £90 per month reimbursement for gym membership</li>\n<li>Meal voucher: £200 monthly allowance for meals</li>\n<li>Pension plan: SmartPension (percentages are 5% Employee &amp; 3% Employer)</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_dbfbd1d2-0a3","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/071a5491-ea01-413f-ad78-f85b5e4c2215","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","Kubernetes","Data Infrastructure","MLOps","Infrastructure Engineering","Columnar Storage Standards"],"x-skills-preferred":[],"datePosted":"2026-04-24T13:10:47.945Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Kubernetes, Data Infrastructure, MLOps, Infrastructure Engineering, Columnar Storage Standards"}]}