{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/dagster"},"x-facet":{"type":"skill","slug":"dagster","display":"Dagster","count":11},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_21f5f6c3-734"},"title":"Data Engineer","description":"<p>About the Role We are at a pivotal scaling point where our data ambitions have outpaced our current setup, and we need a Data Engineer to architect the professional-grade foundations of our platform.</p>\n<p>This role exists to bridge the gap between &quot;getting data&quot; and &quot;engineering data,&quot; moving us from manual syncs to a fully automated ecosystem. By building custom pipelines and implementing a robust orchestration layer, you will directly enable our Operations teams and leadership to transition from basic reporting to sophisticated, AI-ready data products.</p>\n<p>Your primary focus will be on Infrastructure-as-Code, orchestration, and building a resilient &quot;plumbing&quot; system that serves as the backbone for our entire Product and GTM strategy.</p>\n<p>Your 12-Month Journey During the first 3 months: you will learn about our existing stack (GCP, BigQuery, Airbyte, dbt) and understand the current pain points in our data flow. You will identify and execute &quot;low-hanging fruit&quot; improvements to our product usage analytics, providing immediate value to the Product and GTM teams. 
You’ll begin designing the blueprint for our custom data pipelines and the migration strategy for moving our infrastructure into Terraform.</p>\n<p>Within 6 months: You will have deployed our new orchestration layer (e.g., Airflow or Dagster) and successfully transitioned our first set of custom pipelines to production. Collaborating with the Analytics Engineer, you will enable a unified view of our customer journey by successfully merging product usage data with CRM and billing data. At this point, a significant portion of our data infrastructure will be defined as code, reducing manual overhead and increasing deployment reliability.</p>\n<p>After 1 year: you will take full strategic ownership of the data platform and its long-term architecture. You will act as the go-to technical expert for the leadership team, advising on the scalability of new data-driven features. You will lay the groundwork for AI and Machine Learning initiatives by ensuring our data warehouse has the right quality controls, governance, and low-latency access patterns in place.</p>\n<p>What You’ll Be Doing Architect Scalable Infrastructure-as-Code: Take our existing foundations to the next level by migrating all GCP and BigQuery resources into Terraform. You will establish automated CI/CD patterns to ensure our entire data environment is reproducible, version-controlled, and enterprise-ready.</p>\n<p>Deploy State-of-the-Art Pipelines: Design, deploy, and operate high-quality production ELT pipelines. You will implement a modern orchestration layer (e.g., Airflow or Dagster) to build custom Python-based integrations while maintaining and optimizing our existing syncs.</p>\n<p>Champion Data Quality &amp; Performance: Act as the guardian of our data platform. You will implement rigorous testing and monitoring protocols to ensure data is accurate and timely. 
You will proactively identify BigQuery bottlenecks, optimizing query performance and resource utilization.</p>\n<p>Technical Roadmap &amp; Ownership: scope and architect end-to-end data flows from production source to warehouse. Manage your own technical backlog, prioritizing infrastructure stability over technical debt. You will ensure platform security and SOC2 compliance through PII masking, data contracts, and robust access controls.</p>\n<p>Collaboration: You will work in a tight loop with the Analytics Engineer to turn raw data into actionable products. You will partner daily with DataOps and RevOps to understand business requirements, with occasional strategic syncs with DevOps and R&amp;D to align on production schema changes and global infrastructure standards.</p>\n<p>What You Bring Solid experience in Data Engineering, with a track record of building and evolving data ingestion infrastructure in cloud environments. The Modern Data Stack: Familiarity with dbt and Airbyte/Fivetran. You understand how these tools fit into a broader ecosystem. Expertise in BigQuery (partitioning, clustering, IAM) and the broader GCP ecosystem; Infrastructure-as-Code (Terraform). Hands-on experience with Airflow, Dagster, or similar orchestration tools. You know how to design DAGs that are resilient and easy to debug. DevOps practices in the data context: familiarity with CI/CD best practices as they apply to data (data testing, automated deployments). Programming: Expert-level Python and advanced SQL. You are comfortable writing clean, testable, and modular code. Comfortable in a fast-paced environment Project management skills: capable of managing stakeholders, explaining complicated technical trade-offs to non-technical users, and taking care of own project scoping and backlog management. 
Fluency in English, both written and spoken, at a minimum C1 level</p>\n<p>What We Offer Flexibility to work from home in the Netherlands and from our beautiful canal-side office in Amsterdam A chance to be part of and shape one of the most ambitious scale-ups in Europe Work in a diverse and multicultural team €1,500 annual training budget plus internal training Pension plan, travel reimbursement, and wellness perks 28 paid holiday days + 2 additional days to relax in 2026 Work from anywhere for 4 weeks/year An inclusive and international work environment with a whole lot of fun thrown in! Apple MacBook and tools €200 Home Office budget</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_21f5f6c3-734","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Tellent","sameAs":"https://careers.tellent.com","logo":"https://logos.yubhub.co/careers.tellent.com.png"},"x-apply-url":"https://careers.tellent.com/o/data-engineer","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"EUR 70000–90000 / year","x-skills-required":["Data Engineering","Cloud environments","dbt","Airbyte/Fivetran","BigQuery","GCP ecosystem","Infrastructure-as-Code","Terraform","Airflow","Dagster","Python","SQL","CI/CD best practices","DevOps practices"],"x-skills-preferred":[],"datePosted":"2026-04-18T22:12:06.548Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Amsterdam"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Data Engineering, Cloud environments, dbt, Airbyte/Fivetran, BigQuery, GCP ecosystem, Infrastructure-as-Code, Terraform, Airflow, Dagster, Python, SQL, CI/CD best practices, DevOps 
practices","baseSalary":{"@type":"MonetaryAmount","currency":"EUR","value":{"@type":"QuantitativeValue","minValue":70000,"maxValue":90000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_09a4d1ce-cde"},"title":"Data Engineer","description":"<p>We are looking for an experienced Data Engineer to partner with our Data Science and Data Infrastructure teams to own and scale our data pipelines. You&#39;ll also work closely with stakeholders across business teams including sales, marketing, and finance to ensure that the data they need arrives promptly and reliably.</p>\n<p>As a Data Engineer at Figma, you will be responsible for building and maintaining scalable data pipelines that connect various cloud data sources. You will develop a deep understanding of Figma&#39;s core data models and optimize data pipelines for scale. You will partner with the Data Science and Data Infrastructure teams to build new foundational data sets that are trusted, well understood, and enable self-service.</p>\n<p>You will work with a wide range of cross-functional stakeholders to derive requirements and architect shared datasets, with the ability to document, simplify, and explain complex problems to different types of audiences. 
You will establish best practices for the development of specialized data sets for analytics and modeling.</p>\n<p>We&#39;d love to hear from you if you have:</p>\n<ul>\n<li>4+ years in a relevant field.</li>\n<li>Fluency with both SQL and Python.</li>\n<li>Familiarity with Snowflake, dbt, Dagster, and ETL/reverse ETL tools.</li>\n<li>Excellent judgment and creative problem-solving skills.</li>\n<li>A self-starting mindset along with strong communication and collaboration skills.</li>\n</ul>\n<p>While not required, it&#39;s an added plus if you also have:</p>\n<ul>\n<li>Knowledge of data modeling methodologies to design and build robust data architectures for insightful analytics.</li>\n<li>Experience with business systems such as Salesforce, Customer IO, Stripe, and NetSuite.</li>\n</ul>\n<p>At Figma, one of our values is Grow as you go. We believe in hiring smart, curious people who are excited to learn and develop their skills. If you&#39;re excited about this role but your past experience doesn&#39;t align perfectly with the points outlined in the job description, we encourage you to apply anyway. 
You may be just the right candidate for this or other roles.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_09a4d1ce-cde","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Figma","sameAs":"https://www.figma.com/","logo":"https://logos.yubhub.co/figma.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/figma/jobs/5220003004","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$140,000-$348,000 USD","x-skills-required":["SQL","Python","Snowflake","dbt","Dagster","ETL/reverse ETL tools"],"x-skills-preferred":["data modeling methodologies","business systems such as Salesforce, Customer IO, Stripe, NetSuite"],"datePosted":"2026-04-18T15:51:04.727Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA • New York, NY • United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"SQL, Python, Snowflake, dbt, Dagster, ETL/reverse ETL tools, data modeling methodologies, business systems such as Salesforce, Customer IO, Stripe, NetSuite","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":140000,"maxValue":348000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_01845b18-90a"},"title":"Tech Lead (CI & Test Data Platform)","description":"<p>At Trunk, our mission is to help teams create high-quality software quickly. We&#39;ve helped engineering teams at Google X, Zillow, and Brex to understand why their builds fail, which tests are flaky, and how to ship code faster without sacrificing reliability. AI has made writing code 10x faster, but shipping is still painfully slow. 
The bottleneck has shifted downstream - to merge conflicts, flaky tests, inconsistent code quality, and dozens of other frictions that drain productivity and morale. Engineering teams that can stay focused on designing, implementing, and delivering software will build magical, high-quality projects - and they&#39;ll be happier doing it. We&#39;re building a CI Reliability Platform that empowers teams to land code faster and develop happier.</p>\n<p>Our founders launched Trunk in 2021 after designing, delivering, and scaling software at Uber, Google, YouTube, and Microsoft. We raised a $25M Series A led by Initialized Capital (Garry Tan) and a16z (Peter Levine), with investments from Haystack Ventures, Garage VC, and the founders of GitHub (Tom Preston-Werner), Apollo GraphQL (Geoff Schmidt), Algolia (Nicolas Dessaigne), and Peopl.ai (Oleg Rogynsky).</p>\n<p>CI pipelines are black boxes. Engineers waste hours debugging failures that turn out to be flaky tests or infrastructure noise. Trunk makes this visible: what failed, why, and whether it&#39;s worth fixing.</p>\n<p>The next wave is agentic. AI tools today hit a wall when code leaves the local environment. We&#39;re building the data layer that lets AI agents actually reason about CI: diagnosing failures, suggesting fixes, and eventually shipping code autonomously.</p>\n<p>We&#39;re looking for a Tech Lead to own the data platform that powers Trunk&#39;s flaky test detection and CI analytics products. You&#39;ll design and build the systems that ingest millions of test runs per hour, surface actionable insights, and lay the foundation for AI-driven CI workflows.</p>\n<p>We&#39;re at an inflection point. The scale challenges are real and growing. The AI/agentic future of development tooling is taking shape, and we&#39;re building the data infrastructure that makes it possible. 
If you want to work on hard systems problems with direct customer impact, this is the role.</p>\n<p>As a Tech Lead, you will:</p>\n<ul>\n<li>Design and build the data pipelines, storage systems, and backend services that power Trunk&#39;s flaky test and CI products</li>\n<li>Lead a team of engineers through complex distributed systems and data infrastructure challenges</li>\n<li>Work directly with customers to understand their pain points and translate them into robust technical solutions</li>\n<li>Drive architectural decisions for scale, reliability, and future AI/agentic integrations (MCP, semantic failure clustering, automated remediation)</li>\n<li>Ship independently with high autonomy. We&#39;re a small team solving hard problems, and you&#39;ll have significant ownership</li>\n</ul>\n<p>We&#39;re looking for someone with:</p>\n<ul>\n<li>7+ years of backend/infrastructure engineering experience, with a focus on data processing pipelines and distributed systems</li>\n<li>Experience leading teams of 2+ engineers on complex technical projects</li>\n<li>Track record of building and operating systems at scale</li>\n<li>Strong proficiency in Rust and Python; familiarity with TypeScript</li>\n<li>Experience with our stack: PostgreSQL, ClickHouse, AWS, Kubernetes, Dagster</li>\n<li>Comfort with monitoring, observability, and debugging in distributed environments</li>\n<li>Previous experience at a high-growth startup</li>\n</ul>\n<p>You&#39;re a good fit if:</p>\n<ul>\n<li>You&#39;re passionate about building high-quality, scalable systems and take pride in clean, maintainable code</li>\n<li>You have deep experience with distributed systems, databases, and performance optimization</li>\n<li>You&#39;re comfortable navigating large codebases and can ramp quickly on complex systems</li>\n<li>You enjoy mentoring engineers and thrive in collaborative environments</li>\n<li>Experience and intuition to zero in on root causes for bugs that can leave others 
stumped</li>\n<li>You&#39;re self-directed, making sound technical decisions without waiting for detailed specs</li>\n</ul>\n<p>Our tech stack includes:</p>\n<ul>\n<li>Frontend: Typescript, React, Next.js, AWS</li>\n<li>Backend: Typescript, Node, AWS</li>\n<li>Data pipelines: Dagster, python, polars</li>\n<li>CI/CD: GitHub Actions</li>\n</ul>\n<p>We offer:</p>\n<ul>\n<li>Unlimited PTO</li>\n<li>Competitive salary and equity</li>\n<li>Work-life balance</li>\n<li>Lunch ordered in on us at the office on Wednesdays and Thursdays</li>\n<li>Few meetings, so you can ship fast and focus on building</li>\n<li>One Medical membership on us!</li>\n<li>Top-notch medical, dental, vision, short-term disability, long-term disability, and life insurance</li>\n<li>All insurance is 100% company-paid ($0 premiums) for employees and highly subsidized for dependents</li>\n<li>FSA, HSA with company contributions, and pre-tax commuter benefits</li>\n<li>401(k) plan</li>\n<li>Paid parental leave (up to 12 weeks)</li>\n</ul>\n<p>The salary and equity ranges for this role are $200K-$245K and 0.3%-0.5%.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_01845b18-90a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Trunk","sameAs":"https://trunk.io","logo":"https://logos.yubhub.co/trunk.io.png"},"x-apply-url":"https://jobs.lever.co/trunkio/32921dae-d3b1-4771-bb09-cac8a3b14d0c","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$200K-$245K","x-skills-required":["Rust","Python","Typescript","PostgreSQL","ClickHouse","AWS","Kubernetes","Dagster"],"x-skills-preferred":[],"datePosted":"2026-04-17T13:07:07.005Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San 
Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Rust, Python, Typescript, PostgreSQL, ClickHouse, AWS, Kubernetes, Dagster","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":200000,"maxValue":245000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d3dad1eb-2a2"},"title":"Senior Software Engineer (Platform)","description":"<p>At Trunk, our mission is to help teams create high-quality software quickly. We&#39;ve helped engineering teams at Google X, Zillow, and Brex to understand why their builds fail, which tests are flaky, and how to ship code faster without sacrificing reliability.</p>\n<p>The bottleneck has shifted downstream - to merge conflicts, flaky tests, inconsistent code quality, and dozens of other frictions that drain productivity and morale. Engineering teams that can stay focused on designing, implementing, and delivering software will build magical, high-quality projects - and they&#39;ll be happier doing it.</p>\n<p>We&#39;re building a CI Reliability Platform that empowers teams to land code faster and develop happier.</p>\n<p>We are looking for a motivated and experienced Senior Software Engineer to join our Platform/Data Engineering team. 
In this role, you will be responsible for developing and optimizing data ingestion pipelines that can handle vast amounts of real-time and batch data from various sources.</p>\n<p>Your focus will be on designing systems that are scalable, reliable, and performant, as well as ensuring the proper integration of data across our entire ecosystem.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design, build, and maintain scalable data ingestion pipelines to handle large volumes of structured and unstructured data.</li>\n<li>Optimize and improve the efficiency of existing data processing workflows, ensuring they can scale as the data grows.</li>\n<li>Collaborate with cross-functional teams to gather data requirements and ensure seamless integration with various data sources.</li>\n<li>Implement real-time and batch processing systems for ingesting data from APIs and webhooks.</li>\n<li>Ensure data quality, consistency, and integrity across all data pipelines.</li>\n<li>Troubleshoot and resolve performance bottlenecks and data-related issues in the ingestion pipeline.</li>\n<li>Develop monitoring and alerting systems to proactively manage the health of data pipelines.</li>\n<li>Continuously evaluate and adopt new technologies and tools to improve the scalability and performance of our systems.</li>\n<li>Document the design, implementation, and operations of data pipelines for knowledge sharing within the team.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>4-5+ years of professional software engineering experience</li>\n<li>You&#39;re located within commute distance of San Francisco and are willing to work in office at least 8 days per month.</li>\n<li>You have experience in areas such as databases, distributed systems, service-oriented architectures, and data infrastructure</li>\n<li>You derive joy from refactoring and building clean abstractions in order to make complex systems fun to develop on and easy to understand</li>\n<li>Excellent debugging 
and troubleshooting skills and the tenacity to drive a solution to a conclusion</li>\n<li>Experience and intuition to zero in on root causes for bugs that can leave others stumped</li>\n<li>The ability to operate independently, but know when you are in too deep and need to ask for help</li>\n<li>Ability to collaborate with colleagues to plan and execute the best solution</li>\n</ul>\n<p><strong>Tech Stack</strong></p>\n<ul>\n<li>Frontend: Typescript, React, Next.js, AWS</li>\n<li>Backend: Typescript, Node, AWS</li>\n<li>Data pipelines: Dagster, python, polars</li>\n<li>CI/CD: GitHub Actions</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Unlimited PTO</li>\n<li>Competitive salary and equity</li>\n<li>Work-life balance</li>\n<li>Lunch ordered in on us at the office on Wednesdays and Thursdays</li>\n<li>Few meetings, so you can ship fast and focus on building</li>\n<li>One Medical membership on us!</li>\n<li>Top-notch medical, dental, vision, short-term disability, long-term disability, and life insurance</li>\n<li>All insurance is 100% company-paid ($0 premiums) for employees and highly subsidized for dependents</li>\n<li>FSA, HSA with company contributions, and pre-tax commuter benefits</li>\n<li>401(k) plan</li>\n<li>Paid parental leave (up to 12 weeks)</li>\n</ul>\n<p>The salary and equity ranges for this role are $170K - $210K and 0.15% - 0.35%.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d3dad1eb-2a2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Trunk","sameAs":"https://trunk.io","logo":"https://logos.yubhub.co/trunk.io.png"},"x-apply-url":"https://jobs.lever.co/trunkio/43b778ae-e2b0-472c-8316-a079da4e54da","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$170K - $210K","x-skills-required":["databases","distributed systems","service-oriented 
architectures","data infrastructure","typescript","react","next.js","aws","node","python","polars","dagster","github actions"],"x-skills-preferred":[],"datePosted":"2026-04-17T13:06:44.831Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"databases, distributed systems, service-oriented architectures, data infrastructure, typescript, react, next.js, aws, node, python, polars, dagster, github actions","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":170000,"maxValue":210000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_e503559e-cf7"},"title":"Senior Machine Learning Engineer","description":"<p><strong>Job Title: Senior Machine Learning Engineer</strong></p>\n<p><strong>Job Description:</strong></p>\n<p>Before 1965, it was extremely difficult and time-consuming to analyze complicated signals, like radio or images. You could solve it, but you had to throw a ton of compute at it. That all changed with the invention of the Fast Fourier transform, which could efficiently break that signal down into the frequencies that are a part of it.</p>\n<p>The Risk Onboarding team is working on efficiently reviewing customers’ applications without compromising on quality. We are the front line of defense for preventing money laundering and financial crimes, building systems to verify that someone is who they say they are and that we are allowed to do business with them.</p>\n<p><strong>About Us:</strong></p>\n<p>At Mercury, we craft an exceptional banking experience for startups. 
Our team is focused on ensuring our products create a safe environment that meets the needs of our customers, administrators, and regulators.</p>\n<p><strong>Job Responsibilities:</strong></p>\n<p>As part of this role, you will:</p>\n<ul>\n<li>Partner with data science &amp; engineering teams to design and deploy ML &amp; Gen AI microservices, primarily focusing on automating reviews</li>\n<li>Work with a full-stack engineering team to embed these services into the overall review experience, including human in the loop, escalations, and feeding human decisions back into the service</li>\n<li>Implement testing, observability, alerting, and disaster recovery for all services</li>\n<li>Implement tracing, performance, and regression testing</li>\n<li>Feel a strong sense of product ownership and actively seek responsibility – we often self-organize on small/medium projects, and we want someone who’s excited to help shape and build Mercury’s future</li>\n</ul>\n<p><strong>Ideal Candidate:</strong></p>\n<p>The ideal candidate for the role has:</p>\n<ul>\n<li>7+ years of experience in roles like machine learning engineering, data engineering, backend software engineering, and/or devops</li>\n<li>Expertise with:</li>\n</ul>\n<ul>\n<li>A full modern data stack: Snowflake, dbt, Fivetran, Airbyte, Dagster, Airflow</li>\n<li>SQL, dbt, Python</li>\n<li>OLAP / OLTP data modelling and architecture</li>\n<li>Key-value stores: Redis, dynamoDB, or equivalent</li>\n<li>Streaming / real-time data pipelines: Kinesis, Kafka, Redpanda</li>\n<li>API frameworks: FastAPI, Flask, etc.</li>\n<li>Production ML Service experience</li>\n<li>Working across full-stack development environment, with experience transferable to Haskell, React, and TypeScript</li>\n</ul>\n<p><strong>Total Rewards Package:</strong></p>\n<p>The total rewards package at Mercury includes base salary, equity (stock options/RSUs), and benefits. 
Our salary and equity ranges are highly competitive within the SaaS and fintech industry and are updated regularly using the most reliable compensation survey data for our industry. New hire offers are made based on a candidate’s experience, expertise, geographic location, and internal pay equity relative to peers.</p>\n<p><strong>Salary Range:</strong></p>\n<p>Our target new hire base salary ranges for this role are the following:</p>\n<ul>\n<li>US employees (any location): $200,700 - $250,900</li>\n<li>Canadian employees (any location): CAD 189,700 - 237,100</li>\n</ul>\n<p><strong>Diversity &amp; Belonging:</strong></p>\n<p>Mercury values diversity &amp; belonging and is proud to be an Equal Employment Opportunity employer. All individuals seeking employment at Mercury are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, sexual orientation, or any other legally protected characteristic.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_e503559e-cf7","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mercury","sameAs":"https://www.mercury.com/","logo":"https://logos.yubhub.co/mercury.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/mercury/jobs/5639559004","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$200,700 - $250,900 (US) | CAD 189,700 - 237,100 (Canada)","x-skills-required":["Snowflake","dbt","Fivetran","Airbyte","Dagster","Airflow","SQL","Python","OLAP / OLTP data modelling and architecture","Redis","dynamoDB","Kinesis","Kafka","Redpanda","FastAPI","Flask","Production ML Service 
experience","Haskell","React","TypeScript"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:45:16.566Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA, New York, NY, Portland, OR, or Remote within Canada or United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Finance","skills":"Snowflake, dbt, Fivetran, Airbyte, Dagster, Airflow, SQL, Python, OLAP / OLTP data modelling and architecture, Redis, dynamoDB, Kinesis, Kafka, Redpanda, FastAPI, Flask, Production ML Service experience, Haskell, React, TypeScript","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":200700,"maxValue":250900,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f0f321c2-15d"},"title":"Data Platform Engineer","description":"<p>At Anchorage Digital, we are building the world&#39;s most advanced digital asset platform for institutions to participate in crypto. Join the Data Platform team and build the Trusted Data Platform that powers Anchorage&#39;s transition to Data 3.0.</p>\n<p>You&#39;ll help shape the unified orchestration foundation, collaborate on governance-as-code patterns, and contribute to self-service frameworks that make quality and compliance automatic. 
We&#39;re moving from manual spreadsheets and theoretical architectures to automated control planes where every dataset is trusted, monitored, and traceable by default.</p>\n<p><strong>Technical Skills:</strong></p>\n<ul>\n<li>Collaborate on designing and implementing unified orchestration patterns (Dagster/Airflow) to replace legacy and fragmented scheduling</li>\n<li>Develop governance-as-code systems in partnership with the team that automatically apply policy tags, RLS, and access controls through an active control plane</li>\n</ul>\n<p><strong>Complexity and Impact of Work:</strong></p>\n<ul>\n<li>Help guide the technical design for platform capabilities like data contracts, automated quality gating, observability, and cost visibility</li>\n<li>Support the migration of workloads from legacy patterns to the modern platform, ensuring domain teams have clear paths and golden templates</li>\n</ul>\n<p><strong>Organizational Knowledge:</strong></p>\n<ul>\n<li>Partner with domain teams (Asset Data, Reporting &amp; Statements, Product teams) to understand their needs and design platform capabilities that enable their success</li>\n<li>Promote and support data mesh principles and dbt best practices, helping domain owners build and own their data products while platform ensures quality</li>\n</ul>\n<p><strong>Communication and Influence:</strong></p>\n<ul>\n<li>Promote data platform engineering best practices, developer experience, and &#39;Data as a Product&#39; principles across the engineering organization</li>\n<li>Contribute to architectural decisions and help establish engineering culture around reliability, cost efficiency, and operational excellence</li>\n</ul>\n<p><strong>You may be a fit for this role if you:</strong></p>\n<ul>\n<li>5-7+ years building data platforms or infrastructure: You bring experience helping design and operate modern data platforms that handle enterprise-scale workloads with quality, governance, and cost controls</li>\n<li>Strong dbt 
and SQL expertise: You&#39;re proficient with dbt and SQL, understand dbt Mesh, and have strong opinions on data modeling, testing, and documentation best practices</li>\n<li>Orchestration experience: You&#39;ve implemented production data orchestration with Airflow, Dagster, Prefect, or similar tools, and understand the trade-offs between different orchestration patterns</li>\n<li>Cloud data warehouse proficiency: You have strong experience with BigQuery, Snowflake, or Redshift, including query optimization, cost management, and security configurations</li>\n<li>Platform mindset: You think in terms of golden paths, reusable abstractions, and developer experience - you build systems that let others move fast safely</li>\n</ul>\n<p><strong>Although not a requirement, bonus points if:</strong></p>\n<ul>\n<li>Metadata and catalog experience: You&#39;ve worked with Atlan, Collibra, DataHub, or similar metadata platforms and understand active governance patterns</li>\n<li>Data observability tools: You&#39;ve implemented data quality monitoring with Great Expectations, Monte Carlo, Soda, or similar tools</li>\n<li>Infrastructure as code: You have experience with Terraform, Kubernetes, and modern DevOps practices for data infrastructure</li>\n<li>You&#39;re the kind of person who gets excited about declarative config, immutable infrastructure, and metrics dashboards showing cost-per-query trending down</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f0f321c2-15d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anchorage 
Digital","sameAs":"https://www.anchorage.co/","logo":"https://logos.yubhub.co/anchorage.co.png"},"x-apply-url":"https://jobs.lever.co/anchorage/8a325cd5-ef99-4f1e-bba8-7bb1fca64f12","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["dbt","SQL","Airflow","Dagster","Prefect","BigQuery","Snowflake","Redshift"],"x-skills-preferred":["Metadata and catalog experience","Data observability tools","Infrastructure as code"],"datePosted":"2026-04-17T12:24:40.602Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"dbt, SQL, Airflow, Dagster, Prefect, BigQuery, Snowflake, Redshift, Metadata and catalog experience, Data observability tools, Infrastructure as code"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_72eaaa6e-3c0"},"title":"Founding Engineer - Reporting & Statements","description":"<p>Join us as a founding engineer on our Reporting &amp; Statements team. You&#39;ll design the systems that power every financial report and statement we deliver, from monthly reports to daily statements to custom client requests. 
We&#39;re building automated frameworks that guarantee accuracy and consistency for every number we send to clients.</p>\n<p><strong>Technical Skills:</strong></p>\n<ul>\n<li>Evolve our architecture from decentralized reporting scripts to a centralized, framework-based delivery system</li>\n<li>Build automated validation and reconciliation that lets us scale without adding manual oversight</li>\n</ul>\n<p><strong>Complexity and Impact of Work:</strong></p>\n<ul>\n<li>Design data models that become a trusted, shared source of truth for downstream product teams and external APIs</li>\n<li>Navigate complexity across multiple product data streams, applying consistent logic to all financial statements</li>\n</ul>\n<p><strong>Organizational Knowledge:</strong></p>\n<ul>\n<li>Work with Product and Foundations teams to standardize how we capture and represent financial data</li>\n<li>Create self-service frameworks so other teams can add new report types through configuration instead of code</li>\n</ul>\n<p><strong>Communication and Influence:</strong></p>\n<ul>\n<li>Listen to product stakeholders to stay ahead of scaling needs for client-facing data</li>\n<li>Help mature our engineering culture by advocating for and modeling &#39;Data as a Product&#39; principles and high-quality engineering standards</li>\n</ul>\n<p><strong>You may be a fit for this role if you:</strong></p>\n<ul>\n<li>7+ years building data systems: You have experience creating internal tools, frameworks, or engines that handle 10x scale</li>\n<li>Financial domain experience: You&#39;ve worked in fintech, banking, or other environments where numbers matter. You understand what a &#39;Statement of Record&#39; means and the precision it demands.</li>\n<li>Systems thinking: You consider the next 100 products, not just the current one. 
You value extensible systems over one-off pipelines.</li>\n<li>Solid technical foundation: You&#39;re proficient with Python (Pandas/Polars/Arrow) and SQL, with experience in BigQuery or similar cloud warehouses and modern orchestration tools like Airflow or Dagster.</li>\n</ul>\n<p><strong>Although not a requirement, bonus points if:</strong></p>\n<ul>\n<li>You&#39;ve been a data consumer: Prior experience as a financial or business analyst gives you the perspective to design truly usable data models.</li>\n<li>You care about performance: You enjoy making data move faster and cheaper, whether through ADBC, multiprocessing, vectorized operations, or other optimizations.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_72eaaa6e-3c0","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anchorage Digital","sameAs":"https://www.anchorage.co/","logo":"https://logos.yubhub.co/anchorage.co.png"},"x-apply-url":"https://jobs.lever.co/anchorage/5bcfc8f2-5f26-4f72-8ca7-f4b20ee7f7db","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","SQL","BigQuery","Airflow","Dagster","Pandas","Polars","Arrow"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:23:54.465Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, SQL, BigQuery, Airflow, Dagster, Pandas, Polars, Arrow"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b1d4c773-5c5"},"title":"Analytics Engineer, Finance","description":"<p><strong>Compensation</strong></p>\n<p>The base pay offered may vary depending on multiple individualized 
factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>The Finance Data team is embedded within the CFO Org and is responsible for building internal data products that scale analytics across business teams and drive efficiencies in our daily 
operations. This team provides technical guidance on high-impact, scalable projects across Finance, and is the subject-matter expert in financial and transactional data that supports our Finance day-to-day operations.</p>\n<p><strong>About the Role</strong></p>\n<p>As an Analytics Engineer, you will be setting the foundation to scale analytics across our business functions and impart best data practices for a rapidly growing organization. We aspire to build the Finance team of the future.</p>\n<p>In addition, you will work collaboratively with key stakeholders in Finance and other business teams to understand their pain points and take the lead in proposing viable, future-proof solutions to resolve them. You will also autonomously lead your own projects that deliver business impact and help cultivate a mature data culture among Finance teams.</p>\n<p>We are looking for a seasoned engineer who has a proven track record of owning the entire data stack at high-transaction-volume companies, managing business-critical ETL pipelines consumed by non-technical teams. As a generalist “fixer”, you may be deployed across several different Finance domains (e.g. Tax datamart, ERP migration, Procurement automation). For this role, we need someone who excels in dynamic environments, adapts quickly to changing needs, and confidently navigates ambiguous or evolving requirements. If you&#39;re energized by solving technical problems without a playbook and comfortable wearing multiple hats, this role is for you! To clarify, you will <strong>not</strong> be responsible for training ML models, nor would we describe this role as ‘product analytics’.</p>\n<p>This role is based in San Francisco, CA. 
We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Understand the data needs of Finance teams, including Revenue, Tax, Procurement, Compute &amp; Infrastructure Accounting, and Strategic Finance, and translate that scope into technical requirements</li>\n</ul>\n<ul>\n<li>Facilitate the development of data products and tools for stakeholders to self-serve, enabling analytics to scale across the company</li>\n</ul>\n<ul>\n<li>Lead dimensional design: define, own, and maintain business-facing data marts</li>\n</ul>\n<ul>\n<li>Be a cross-functional champion at upholding high data integrity standards and SLAs for the timely delivery of data</li>\n</ul>\n<ul>\n<li>Build and maintain insightful and reliable dashboards to track both operational and financial metrics for the Executive team</li>\n</ul>\n<ul>\n<li>Contribute to the future roadmap of the Finance team from a data systems perspective</li>\n</ul>\n<ul>\n<li>Grow to be an expert in Finance Data and OpenAI’s data architecture</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>7+ years of experience as an Analytics Engineer or in a similar role (Data Analyst or Data Engineer) with a proven track record in shipping canonical datasets</li>\n</ul>\n<ul>\n<li>Empathy towards non-developer stakeholders and their day-to-day pain points</li>\n</ul>\n<ul>\n<li>Strong proficiency in SQL for data transformation, and comfort in at least one functional/OOP language such as Python or R</li>\n</ul>\n<ul>\n<li>Familiarity with managing distributed data stores (e.g. S3, Trino, Hive, Spark), and experience building multi-step ETL jobs coupled with orchestrating workflows (e.g. Airflow, Dagster)</li>\n</ul>\n<ul>\n<li>Experience in writing unit tests to validate data products and version control (e.g. 
GitHub, Stash)</li>\n</ul>\n<ul>\n<li>Expert at creating compelling data visualizations with dashboarding tools (e.g. Tableau, Looker or similar)</li>\n</ul>\n<ul>\n<li>Excellent communication skills and ability to present data-driven narratives in both verbal and written form to a non-technical audience</li>\n</ul>\n<ul>\n<li>Experience solving ambiguous problem statements in an early stage environment</li>\n</ul>\n<p><strong>You could be an especially great fit if you have:</strong></p>\n<ul>\n<li>Prior experience leading the development of an internal production tool, serving hundreds of cross-functional customers such as Billing Operations, Deal Desk or Go-to-Market teams</li>\n</ul>\n<ul>\n<li>Some frontend experience with React, TypeScript, Retool, Streamlit, or building web apps</li>\n</ul>\n<ul>\n<li>Good understanding of Spark and ability to write, debug, and optimize Spark jobs</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p>We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.</p>\n<p>For additional information, please see [OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement](https://cdn.openai.com/policies/eeo-policy-statement.pdf).</p>\n<p>Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. 
For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b1d4c773-5c5","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/7cd50a19-65f2-4a52-89a2-512130e58c5c","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"Full time","x-salary-range":"$198K – $260K • Offers Equity","x-skills-required":["SQL","Python","R","S3","Trino","Hive","Spark","Airflow","Dagster","GitHub","Stash","Tableau","Looker"],"x-skills-preferred":["React","TypeScript","ReTool","Streamlit","Web development"],"datePosted":"2026-03-08T22:16:37.388Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"SQL, Python, R, S3, Trino, Hive, Spark, Airflow, Dagster, GitHub, Stash, Tableau, Looker, React, TypeScript, ReTool, Streamlit, Web development","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":198000,"maxValue":260000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d6450ee6-847"},"title":"Data Infrastructure Engineer","description":"<p><strong>About the Role</strong></p>\n<p>Cursor ships daily. 
Every release leaves signals behind: telemetry, prompts, completions, agent runs, sessions. Those signals power model improvement, evals, and experimentation. Data infrastructure is what turns them into something teams can trust.</p>\n<p>A lot of systems here started simple so we could move fast. Over time, the constraints change and the “good enough” version becomes the bottleneck. This role owns the full ladder: patch what should be patched, redesign what should be redesigned, ship the replacement, and operate it.</p>\n<p>Privacy guarantees are part of correctness. What we can retain and use depends on Privacy Mode and org configuration, and getting that wrong breaks a product promise. We choose work by business impact: what blocks product and model teams today, and what will block them next month.</p>\n<p><strong>Sample projects include...</strong></p>\n<ul>\n<li>A core pipeline started as a pragmatic reuse of infrastructure built for something else. It works, but it cannot guarantee properties downstream consumers now need (for example, point-in-time consistency). You design and ship the replacement while keeping the existing system running.</li>\n</ul>\n<ul>\n<li>A new product surface ships without instrumentation. You talk to the team, define what needs to be captured, and wire it through before the absence becomes anyone else’s problem.</li>\n</ul>\n<ul>\n<li>Eval coverage drops. You trace it to an instrumentation gap introduced weeks ago by a product change nobody flagged. You fix the gap, add a contract so it cannot recur, and ship the dashboard that would have caught it earlier.</li>\n</ul>\n<ul>\n<li>Multiple consumers depend on overlapping data. You design schema evolution and validation so changes in one place do not silently degrade the others.</li>\n</ul>\n<ul>\n<li>Storage costs rise faster than usage. 
You decide what is worth keeping, implement retention and compression, and delete what is not.</li>\n</ul>\n<p><strong>What we&#39;re looking for</strong></p>\n<p>We’re looking for someone who has built real systems at scale and cares about correctness, cost, and ergonomics.</p>\n<p>Strong signals include:</p>\n<ul>\n<li>Deep experience with Spark (Databricks or open-source Spark both count)</li>\n</ul>\n<ul>\n<li>Production experience with Ray Data</li>\n</ul>\n<ul>\n<li>Hands-on ownership of large data pipelines and storage systems</li>\n</ul>\n<ul>\n<li>Comfort debugging performance issues across client instrumentation, streaming, storage, and model-facing workflows, as well as compute, storage, and networking layers</li>\n</ul>\n<ul>\n<li>Clear thinking about data modeling and long-term maintainability</li>\n</ul>\n<ul>\n<li>Good judgment about when to patch and when to rebuild</li>\n</ul>\n<p><strong>Nice to have</strong></p>\n<ul>\n<li>Experience running or scaling ClickHouse</li>\n</ul>\n<ul>\n<li>Familiarity with dbt, Dagster, or similar orchestration and modeling tools</li>\n</ul>\n<p>We&#39;re in-person with cozy offices in North Beach, San Francisco and Manhattan, New York, replete with well-stocked libraries.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d6450ee6-847","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cursor","sameAs":"https://cursor.com","logo":"https://logos.yubhub.co/cursor.com.png"},"x-apply-url":"https://cursor.com/careers/software-engineer-data-infrastructure","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Spark","Ray Data","data pipelines","storage systems","debugging performance issues","data modeling","long-term 
maintainability"],"x-skills-preferred":["ClickHouse","dbt","Dagster"],"datePosted":"2026-03-08T00:17:58.290Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Spark, Ray Data, data pipelines, storage systems, debugging performance issues, data modeling, long-term maintainability, ClickHouse, dbt, Dagster"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c873a489-0dc"},"title":"Data Engineer, Analytics","description":"<p><strong>Data Engineer, Analytics</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Applied AI</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$230K – $385K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the team</strong></p>\n<p>The Applied team works across research, engineering, product, and design to bring OpenAI’s technology to consumers and businesses.</p>\n<p>We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. 
Safety is more important to us than unfettered growth.</p>\n<p><strong>About the role</strong></p>\n<p>We&#39;re seeking a Data Engineer to take the lead in building our data pipelines and core tables for OpenAI. These pipelines are crucial for powering the analyses and safety systems that guide business decisions, drive product growth, and prevent bad actors. If you&#39;re passionate about working with data and are eager to create solutions with significant impact, we&#39;d love to hear from you. This role also provides the opportunity to collaborate closely with the researchers behind ChatGPT and help them train new models to deliver to users. As we continue our rapid growth, we value data-driven insights, and your contributions will play a pivotal role in our trajectory. Join us in shaping the future of OpenAI!</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design, build, and manage our data pipelines, ensuring all user event data is seamlessly integrated into our data warehouse.</li>\n</ul>\n<ul>\n<li>Develop canonical datasets to track key product metrics, including user growth, engagement, and revenue.</li>\n</ul>\n<ul>\n<li>Work collaboratively with various teams, including Infrastructure, Data Science, Product, Marketing, Finance, and Research, to understand their data needs and provide solutions.</li>\n</ul>\n<ul>\n<li>Implement robust and fault-tolerant systems for data ingestion and processing.</li>\n</ul>\n<ul>\n<li>Participate in data architecture and engineering decisions, bringing your strong experience and knowledge to bear.</li>\n</ul>\n<ul>\n<li>Ensure the security, integrity, and compliance of data according to industry and company standards.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have 3+ years of experience as a data engineer and 8+ years of software engineering experience overall (including data engineering).</li>\n</ul>\n<ul>\n<li>Are proficient in at least one programming language commonly used within 
data engineering, such as Python, Scala, or Java.</li>\n</ul>\n<ul>\n<li>Have experience with distributed processing technologies and frameworks such as Hadoop and Flink, and with distributed storage systems (e.g., HDFS, S3).</li>\n</ul>\n<ul>\n<li>Have expertise with ETL schedulers such as Airflow, Dagster, Prefect, or similar frameworks.</li>\n</ul>\n<ul>\n<li>Have a solid understanding of Spark and the ability to write, debug, and optimize Spark code.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c873a489-0dc","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/fc5bbc77-a30c-4e7a-9acc-8a2e748545b4","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$230K – $385K • Offers Equity","x-skills-required":["Python","Scala","Java","Hadoop","Flink","HDFS","S3","Airflow","Dagster","Prefect","Spark"],"x-skills-preferred":[],"datePosted":"2026-03-06T18:20:01.101Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Scala, Java, Hadoop, Flink, HDFS, S3, 
Airflow, Dagster, Prefect, Spark","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":230000,"maxValue":385000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2902359a-64d"},"title":"Member of Technical Staff, Infrastructure Data & Analytics","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI is looking for a talented Member of Technical Staff, Infrastructure Data &amp; Analytics to join its MAI SuperIntelligence Team. This role sits at the heart of strategic decision-making, turning raw telemetry into trusted, decision-quality insights on utilization, capacity, readiness, and efficiency. You&#39;ll work directly with leadership to shape the company&#39;s direction in the Superintelligence space.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Member of Technical Staff, Infrastructure Data &amp; Analytics, you will act as the technical lead and owner for infrastructure analytics across compute, storage, and networking. You will design and build durable, scalable data pipelines that ingest telemetry from clusters, schedulers, health systems, and capacity trackers into the data warehouse. You will define and standardize core metrics and semantics (e.g., utilization, occupancy, MFU, goodput, capacity readiness, delivery-to-production). You will architect and maintain self-service dashboards and APIs for fleet, cluster, and squad-level visibility. 
You will partner closely with stakeholders across Supercomputing Infra, Researchers, Strategy, and Executives to ensure metrics reflect operational and business reality.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Act as the technical lead and owner for infrastructure analytics across compute, storage, and networking.</li>\n<li>Design and build durable, scalable data pipelines that ingest telemetry from clusters, schedulers, health systems, and capacity trackers into the data warehouse.</li>\n</ul>\n<p><strong>The candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>8+ years of technical engineering experience in data engineering, analytics, or data science, with increasing technical ownership in a startup environment.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Distributed data processing frameworks and large-scale data systems.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Strong communication skills; can explain complex systems clearly to senior leaders.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Software Engineering IC5 – The typical base pay range for this role across the U.S. 
is USD $139,900 – $274,800 per year.</li>\n<li>Certain roles may be eligible for benefits and other compensation.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2902359a-64d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-infrastructure-data-analytics-mai-superintelligence-team/","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["data engineering","analytics","data science","distributed data processing frameworks","large-scale data systems"],"x-skills-preferred":["ETL orchestration frameworks","Airflow","Dagster"],"datePosted":"2026-03-06T07:29:22.881Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Multiple Locations, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data engineering, analytics, data science, distributed data processing frameworks, large-scale data systems, ETL orchestration frameworks, Airflow, Dagster","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}}]}