{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/data-lakehouse-architecture"},"x-facet":{"type":"skill","slug":"data-lakehouse-architecture","display":"Data Lakehouse Architecture","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d5b743bb-d8f"},"title":"Product Manager, AI Platforms","description":"<p>The AI Platform Product Manager will drive the strategy and execution of Shield AI&#39;s next-generation autonomy intelligence stack. This PM owns the product vision and roadmap for the Hivemind AI Platform, ensuring we can manufacture, govern, and field advanced world models, robotics foundation models, and vision-language-action systems safely and at scale.</p>\n<p>This role sits at the intersection of AI/ML, autonomy, model lifecycle, infrastructure, and product strategy. 
The PM partners closely with engineering, AI research, Hivemind Solutions, and field teams to deliver the tooling that enables sovereign autonomy, AI Factories at the edge, and continuous learning, capabilities that are central to Shield AI&#39;s strategic direction.</p>\n<p>This is a high-impact role for an experienced product leader excited to define how foundation models are trained, validated, governed, and deployed across thousands of autonomous systems in highly contested environments.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>AI Model Development &amp; Training Platform</li>\n</ul>\n<p>Own the roadmap for foundation model training workflows, including dataset ingestion, curation, labeling, synthetic data generation, domain model training, and distillation pipelines. Define requirements for world models, robotics models, and VLA-based training, evaluation, and specialization. Lead the evolution of MLOps capabilities in Forge, including data lineage, experiment tracking, model versioning, and scalable evaluation suites.</p>\n<ul>\n<li>Data, Simulation &amp; Synthetic Data Factory</li>\n</ul>\n<p>Define product requirements for synthetic data generation, simulation-integrated data flywheels, and automated scenario generation. Partner with Digital Twin, Simulation, and autonomy teams to convert natural-language mission inputs into data needs, training procedures, and model variants.</p>\n<ul>\n<li>Safe Deployment &amp; Model Governance</li>\n</ul>\n<p>Lead the development of model governance and auditability tooling, including model cards, dataset rights, lineage tracking, safety gates, and compliance evidence. Build guardrails and workflows to safely deploy models onto edge hardware in disconnected, GPS- or comms-denied environments. 
Partner with Safety, Certification, Cyber, and Engineering teams to ensure traceability and evaluation pipelines meet operational and accreditation requirements.</p>\n<ul>\n<li>Edge Deployment &amp; AI Factory Integration</li>\n</ul>\n<p>Partner with Pilot, EdgeOS, and hardware teams to integrate foundation-model-based perception and reasoning into autonomy behaviors. Define requirements for distillation, quantization, and inference tooling as part of the “three-computer” development and deployment model. Ensure closed-loop workflows between cloud model training and edge-native execution.</p>\n<ul>\n<li>Cross-Functional Leadership</li>\n</ul>\n<p>Collaborate with Engineering, Research, Product, Customer Engagement, and Solutions teams to ensure model outputs meet mission and platform constraints. Translate advanced AI capabilities into intuitive workflows that platform OEMs and partner nations can use to build sovereign AI factories. Sequence foundational capabilities that unblock autonomy, simulation, and customer-facing product teams.</p>\n<ul>\n<li>User &amp; Customer Impact</li>\n</ul>\n<p>Develop deep empathy for ML engineers, autonomy developers, and Solutions engineers who rely on the platform. Capture operational data gaps, mission-driven model needs, and domain-specific specialization requirements. 
Lead demos and onboarding for model-development capabilities across internal and external teams.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d5b743bb-d8f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Shield AI","sameAs":"https://www.shield.ai","logo":"https://logos.yubhub.co/shield.ai.png"},"x-apply-url":"https://jobs.lever.co/shieldai/7886f437-2d5e-4616-8dcb-3dc488f1f585","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$190,000 - $290,000 a year","x-skills-required":["AI Model Development & Training Platform","Data, Simulation & Synthetic Data Factory","Safe Deployment & Model Governance","Edge Deployment & AI Factory Integration","Cross-Functional Leadership","User & Customer Impact","Strong engineering background","Deep understanding of foundation models, robotics models, multimodal models, MLOps, and training infrastructure","Experience managing complex products spanning data pipelines, cloud training clusters, model governance, and edge deployments","Proven success partnering with research teams to transition ML innovations into stable, production-grade workflows"],"x-skills-preferred":["Experience working on autonomy, robotics, embedded AI, or mission-critical systems","Hands-on familiarity with GPU infrastructure, distributed training, or data lakehouse architectures","Experience supporting defense, dual-use, or safety-critical AI systems","Background designing or operating AI Factory–style pipelines (data → training → evaluation → distillation → edge deployment)","Advanced degree in engineering, ML/AI, robotics, or a related field"],"datePosted":"2026-04-17T13:02:54.419Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San 
Diego"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI Model Development & Training Platform, Data, Simulation & Synthetic Data Factory, Safe Deployment & Model Governance, Edge Deployment & AI Factory Integration, Cross-Functional Leadership, User & Customer Impact, Strong engineering background, Deep understanding of foundation models, robotics models, multimodal models, MLOps, and training infrastructure, Experience managing complex products spanning data pipelines, cloud training clusters, model governance, and edge deployments, Proven success partnering with research teams to transition ML innovations into stable, production-grade workflows, Experience working on autonomy, robotics, embedded AI, or mission-critical systems, Hands-on familiarity with GPU infrastructure, distributed training, or data lakehouse architectures, Experience supporting defense, dual-use, or safety-critical AI systems, Background designing or operating AI Factory–style pipelines (data → training → evaluation → distillation → edge deployment), Advanced degree in engineering, ML/AI, robotics, or a related field","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":190000,"maxValue":290000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_015e5c6d-a31"},"title":"Senior Data Engineer","description":"<p><strong>Why Valvoline Global Operations?</strong></p>\n<p>At Valvoline Global Operations, we&#39;re proud to be The Original Motor Oil, but we&#39;ve never rested on being first. Founded in 1866, we introduced the world&#39;s first branded motor oil, staking our claim as a pioneer in the automotive and industrial solutions industry.</p>\n<p><strong>Job Purpose</strong></p>\n<p>We are seeking a highly skilled and motivated Data Engineer to join our growing data and analytics team. 
The ideal candidate will have strong experience designing and developing scalable data pipelines, integrating complex systems, and optimizing data workflows. Proficiency in Databricks and SAP Datasphere is preferred, as these platforms are central to our data ecosystem.</p>\n<p><strong>How You Make an Impact (Job Accountabilities)</strong></p>\n<ul>\n<li>Design, build, and maintain robust, scalable, and high-performance data pipelines using Databricks and SAP Datasphere.</li>\n<li>Collaborate with data architects, analysts, data scientists, and business stakeholders to gather requirements and deliver data solutions aligned with stakeholders&#39; goals.</li>\n<li>Integrate diverse data sources (e.g., SAP, APIs, flat files, cloud storage) into the enterprise data platforms.</li>\n<li>Ensure high standards of data quality and implement data governance practices. Stay current with emerging trends and technologies in cloud computing, big data, and data engineering.</li>\n<li>Provide ongoing support for the platform, troubleshoot any issues that arise, and ensure high availability and reliability of data infrastructure.</li>\n<li>Create documentation for the platform infrastructure and processes, and train other team members or users to use the platform effectively.</li>\n</ul>\n<p><strong>What You Bring to the Role (Job Qualifications / Education / Skills / Requirements / Capabilities)</strong></p>\n<ul>\n<li>Bachelor&#39;s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field.</li>\n<li>5-7+ years of experience in a data engineering or related role.</li>\n<li>Strong knowledge of data engineering principles, data warehousing concepts, and modern data architecture.</li>\n<li>Proficiency in SQL and at least one programming language (e.g., Python, Scala).</li>\n<li>Experience with cloud platforms (e.g., Azure, AWS, or GCP), particularly in data services.</li>\n<li>Familiarity with data orchestration tools (e.g., PySpark, Airflow, Azure 
Data Factory) and CI/CD pipelines.</li>\n</ul>\n<p><strong>Competencies Desired</strong></p>\n<ul>\n<li>Hands-on experience with Databricks (including Spark/PySpark, Delta Lake, MLflow, Unity Catalog, etc.).</li>\n<li>Practical experience working with SAP Datasphere (or SAP Data Warehouse Cloud) in data modeling and data integration scenarios.</li>\n<li>SAP BW or SAP HANA experience is a plus.</li>\n<li>Experience with BI tools like Power BI or Tableau.</li>\n<li>Understanding of data governance frameworks and data security best practices.</li>\n<li>Exposure to data lakehouse architecture and real-time streaming data pipelines.</li>\n<li>Certifications in Databricks, SAP, or cloud platforms are advantageous.</li>\n</ul>\n<p><strong>Working Conditions / Physical Requirements / Travel Requirements</strong></p>\n<ul>\n<li>Normal Office environment.</li>\n<li>Prolonged periods of computer use and frequent participation in meetings</li>\n<li>Occasional walking, standing, and light lifting (up to 10 lbs)</li>\n</ul>\n<ul>\n<li>Minimal travel required.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_015e5c6d-a31","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Valvoline Global Operations","sameAs":"https://jobs.valvolineglobal.com","logo":"https://logos.yubhub.co/jobs.valvolineglobal.com.png"},"x-apply-url":"https://jobs.valvolineglobal.com/job/Senior-Data-Engineer/1316654400/","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["data engineering","Databricks","SAP Datasphere","SQL","Python","Scala","cloud platforms","data orchestration tools","CI/CD pipelines"],"x-skills-preferred":["Databricks","SAP Datasphere","SAP BW","SAP HANA","Power BI","Tableau","data governance frameworks","data security best practices","data lakehouse 
architecture","real-time streaming data pipelines"],"datePosted":"2026-03-08T22:14:37.507Z","jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Automotive","skills":"data engineering, Databricks, SAP Datasphere, SQL, Python, Scala, cloud platforms, data orchestration tools, CI/CD pipelines, Databricks, SAP Datasphere, SAP BW, SAP HANA, Power BI, Tableau, data governance frameworks, data security best practices, data lakehouse architecture, real-time streaming data pipelines"}]}