{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/onnx"},"x-facet":{"type":"skill","slug":"onnx","display":"Onnx","count":3},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_586b9fef-509"},"title":"Senior Software Engineer - Network Enablement (Applied ML)","description":"<p>We believe that the way people interact with their finances will drastically improve in the next few years. We&#39;re dedicated to empowering this transformation by building the tools and experiences that thousands of developers use to create their own products.</p>\n<p>On this team, you will build and operate the ML infrastructure and product services that enable trust and intelligence across Plaid&#39;s network. 
You&#39;ll own feature engineering, offline training and batch scoring, online feature serving, and real-time inference so model outputs directly power partner-facing fraud &amp; trust products and bank intelligence features.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows).</li>\n<li>Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact).</li>\n<li>Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses.</li>\n<li>Build and operate offline training pipelines and production batch scoring for bank intelligence products.</li>\n<li>Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring.</li>\n<li>Implement model CI/CD, model/version registry, and safe rollout/rollback strategies.</li>\n<li>Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs.</li>\n<li>Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions.</li>\n<li>Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection).</li>\n<li>Ensure fairness, explainability and PII-aware handling for partner-facing ML features; maintain auditability for compliance.</li>\n<li>Partner with platform and cross-functional teams to scale the ML/data foundation (graph features, sequence embeddings, unified pipelines).</li>\n<li>Mentor engineers and document team standards for ML productization and operations.</li>\n</ul>\n<p><strong>Qualifications</strong></p>\n<ul>\n<li>Must-haves:</li>\n<li>Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python 
preferred).</li>\n<li>Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark.</li>\n<li>Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference.</li>\n<li>Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics.</li>\n<li>Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline &amp; online parity, monitoring and incident response.</li>\n<li>Nice to have:</li>\n<li>Experience in fraud, risk, or marketing intelligence domains.</li>\n<li>Experience with feature-store products (Tecton / Chronon / Feast / internal) and unified pipelines.</li>\n<li>Experience with graph frameworks, graph feature engineering, or sequence embeddings.</li>\n<li>Experience optimizing inference at scale (Triton/ONNX/quantization, batching, caching).</li>\n</ul>\n<p><strong>Additional Information</strong></p>\n<p>Our mission at Plaid is to unlock financial freedom for everyone. 
To support that mission, we seek to build a diverse team of driven individuals who care deeply about making the financial ecosystem more equitable.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_586b9fef-509","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Plaid","sameAs":"https://plaid.com/","logo":"https://logos.yubhub.co/plaid.com.png"},"x-apply-url":"https://jobs.lever.co/plaid/43b1374d-5c5e-4b63-b710-a95e3cb76bbe","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$190,800-$286,800 per year","x-skills-required":["software engineering","systems design","APIs","backend services","Go","Python","batch and streaming data pipelines","orchestration tools","Airflow","Spark","real-time scoring","online feature-serving systems","feature stores","low-latency model inference","model outputs","product flows","experiments","product metrics","model lifecycle","operations","model registries","CI/CD","reproducible training","offline & online parity","monitoring","incident response"],"x-skills-preferred":["fraud","risk","marketing intelligence","feature-store products","unified pipelines","graph frameworks","graph feature engineering","sequence embeddings","inference at scale","Triton","ONNX","quantization","batching","caching"],"datePosted":"2026-04-17T12:51:26.228Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, systems design, APIs, backend services, Go, Python, batch and streaming data pipelines, orchestration tools, Airflow, Spark, real-time scoring, online feature-serving systems, feature stores, low-latency model inference, model outputs, product flows, experiments, product metrics, model lifecycle, 
operations, model registries, CI/CD, reproducible training, offline & online parity, monitoring, incident response, fraud, risk, marketing intelligence, feature-store products, unified pipelines, graph frameworks, graph feature engineering, sequence embeddings, inference at scale, Triton, ONNX, quantization, batching, caching","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":190800,"maxValue":286800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_e37be4c0-4be"},"title":"AI Inference Engineer","description":"<p>Perplexity is looking for an AI Inference Engineer to join their team. The successful candidate will be responsible for developing APIs for AI inference, benchmarking and addressing bottlenecks throughout the inference stack, improving the reliability and observability of systems, and exploring novel research and implementing LLM inference optimisations.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<p>As an AI Inference Engineer at Perplexity, you will have the opportunity to work on large-scale deployment of machine learning models for real-time inference. You will be responsible for developing APIs for AI inference that will be used by both internal and external customers.</p>\n<ul>\n<li>Develop APIs for AI inference that will be used by both internal and external customers</li>\n<li>Benchmark and address bottlenecks throughout our inference stack</li>\n<li>Improve the reliability and observability of our systems and respond to system outages</li>\n<li>Explore novel research and implement LLM inference optimisations</li>\n</ul>\n<p><strong>What you need</strong></p>\n<p>To be successful in this role, you will need to have experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX), familiarity with common LLM architectures and inference optimisation techniques (e.g. 
continuous batching, quantisation, etc.), and understanding of GPU architectures or experience with GPU kernel programming using CUDA.</p>\n<ul>\n<li>Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)</li>\n<li>Familiarity with common LLM architectures and inference optimisation techniques (e.g. continuous batching, quantisation, etc.)</li>\n<li>Understanding of GPU architectures or experience with GPU kernel programming using CUDA</li>\n</ul>","url":"https://yubhub.co/jobs/job_e37be4c0-4be","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Perplexity","sameAs":"https://www.perplexity.ai/","logo":"https://logos.yubhub.co/perplexity.ai.png"},"x-apply-url":"https://jobs.ashbyhq.com/perplexity/8a976851-9bef-4b07-8d36-567fa9540aef","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$220K – $405K","x-skills-required":["ML systems","deep learning frameworks","LLM architectures","inference optimisation techniques","GPU architectures","GPU kernel programming"],"x-skills-preferred":["continuous batching","quantisation","PyTorch","TensorFlow","ONNX"],"datePosted":"2026-03-04T12:24:24.046Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, New York City, Palo Alto"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"ML systems, deep learning frameworks, LLM architectures, inference optimisation techniques, GPU architectures, GPU kernel programming, continuous batching, quantisation, PyTorch, TensorFlow, 
ONNX","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":220000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_5399cdab-244"},"title":"Senior Software Engineer - On Device Machine Learning","description":"<p>We are looking for a Senior Software Engineer with expertise in software optimisation for gaming consoles and CPU/GPU architectures to join our Machine Learning team. You&#39;ll report to a Leader of Engine Development and collaborate with both game and central technology engineers and researchers to bring ML models into the hands of our players by deploying them directly into EA&#39;s games.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<ul>\n<li>Design, build, and maintain robust end-to-end solutions for running machine learning models efficiently on a variety of devices.</li>\n<li>Partner with ML experts across EA to help adopt and scale new models and architectures optimised for on-device performance.</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>7+ years of hands-on software engineering experience with C++, including expertise in multithreading and low-level/near-hardware optimisations.</li>\n<li>Good knowledge of GPU programming.</li>\n</ul>","url":"https://yubhub.co/jobs/job_5399cdab-244","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Electronic Arts","sameAs":"https://jobs.ea.com","logo":"https://logos.yubhub.co/jobs.ea.com.png"},"x-apply-url":"https://jobs.ea.com/en_US/careers/JobDetail/Senior-Software-Engineer-On-Device-Machine-Learning/212350","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C++","GPU 
programming"],"x-skills-preferred":["ML frameworks such as PyTorch or TensorFlow","Knowledge of the ONNX format"],"datePosted":"2026-02-05T13:04:30.445Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Guildford, Surrey, United Kingdom"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, GPU programming, ML frameworks such as PyTorch or TensorFlow, Knowledge of the ONNX format"}]}