{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/sharding"},"x-facet":{"type":"skill","slug":"sharding","display":"Sharding","count":8},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_5558189c-8cd"},"title":"Software Engineer","description":"<p><strong>About the Role</strong></p>\n<p>As a Software Engineer on the Storage team at Cursor, you&#39;ll own the data layer that underpins every product surface: the databases, caches, and the strategy for how teams provision, query, and scale their data stores.</p>\n<p>Millions of developers depend on Cursor every day, and the future of our storage architecture is one of the highest-leverage problems at the company: get it right, and every team ships faster, every product surface gets more reliable, and Cursor can scale to meet explosive demand. 
You&#39;ll design and execute the path to a robust, multi-database topology built for that growth.</p>\n<p><strong>Example projects include...</strong></p>\n<ul>\n<li>Designing the next-generation data architecture: evolving our storage layer into a partitioned, resilient topology that keeps pace with Cursor&#39;s rapid growth.</li>\n<li>Building query attribution and guardrails: instrumenting every database query by service, catching bad patterns before they hit production, and making it impossible to ship problematic queries without review.</li>\n<li>Defining the &#39;when to use what&#39; strategy for data stores: creating clear guidance and golden pathways so every team picks the right engine for their workload without second-guessing.</li>\n<li>Owning cache infrastructure end-to-end: reliability, capacity planning, and patterns that let product teams move fast without worrying about cache correctness.</li>\n</ul>\n<p><strong>You may be a fit if</strong></p>\n<ul>\n<li>You have deep experience with relational databases at scale, especially Postgres, MySQL, or similar OLTP systems.</li>\n<li>You&#39;ve tackled database sharding, migration, or decomposition problems in production environments.</li>\n<li>You understand the tradeoffs between different storage engines and can help teams make the right choices for their workloads.</li>\n<li>You care about operational excellence: backups, monitoring, query performance, and capacity planning are things you think about proactively.</li>\n<li>You have strong software engineering fundamentals and enjoy building systems that other engineers depend on.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_5558189c-8cd","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cursor","sameAs":"https://cursor.com","logo":"https://logos.yubhub.co/cursor.com.png"},"x-apply-url":"https://cursor.com/careers/software-engineer-storage","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Postgres","MySQL","relational databases","database sharding","migration","decomposition","storage engines","operational excellence","backups","monitoring","query performance","capacity planning"],"x-skills-preferred":[],"datePosted":"2026-04-24T14:09:43.224Z","jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Postgres, MySQL, relational databases, database sharding, migration, decomposition, storage engines, operational excellence, backups, monitoring, query performance, capacity planning"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_377e69db-df1"},"title":"Database Engineer","description":"<p>We&#39;re looking for a database engineer with deep experience building and scaling both structured and unstructured database platforms supporting distributed systems, data-intensive applications, and machine learning infrastructure.</p>\n<p>As a member of the Platform team, you will build and mature database foundations for Scale, leveraging industry-standard platforms. 
You will collaborate with stakeholders across the organisation, including software developers, platform engineers, machine learning scientists, and customer operations.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Building and maintaining high-performance database systems</li>\n<li>Collaborating with cross-functional teams to design and implement scalable database solutions</li>\n<li>Developing and optimising database queries and indexing strategies</li>\n<li>Ensuring data consistency and integrity across multiple systems</li>\n<li>Mentoring junior engineers and contributing to the growth of the team</li>\n<li>Improving engineering standards, tooling, and processes</li>\n<li>Working directly with engineering and sales teams to create backend database solutions that meet their challenging data and security needs</li>\n<li>Working with the Security Team on security compliance, pen tests, and mitigations that improve security across Scale</li>\n<li>Building systems capable of handling millions of frames of data every day and making it available to both our workforce and our internal teams with high availability</li>\n</ul>\n<p>This role requires:</p>\n<ul>\n<li>5+ years of industry experience as a database engineer post-graduation</li>\n<li>Engineering experience with building real-time and distributed system architecture</li>\n<li>Experience designing and self-hosting databases on industry-standard public cloud platforms</li>\n<li>Deep familiarity with the design, architecture, optimisation, and tuning of multiple database platforms such as MongoDB, Postgres, MySQL, DynamoDB, and Redis</li>\n<li>Deep familiarity with SQL query optimisation, database indexing, scalability (partitioning/sharding), and replication</li>\n<li>Experience developing and optimising backup and restore functionality to meet RTO goals</li>\n<li>Intermediate experience in at least one coding language: TypeScript, Python, Go, Java, or C++</li>\n<li>Experience working with Docker, Kubernetes, and Infra-as-Code (e.g.
Terraform); bonus points for experience supporting GPU/ML workloads</li>\n</ul>\n<p>Nice to haves:</p>\n<ul>\n<li>Prior startup experience to help us grow responsibly</li>\n<li>Experience with AWS, Datadog, and ElasticSearch</li>\n<li>Experience with cloud-based data warehouse solutions like Snowflake or Databricks</li>\n<li>Experience with cost optimisation strategies and techniques for database platforms</li>\n<li>Experience developing and designing intermediary data abstraction layers</li>\n<li>Having mentored and grown members of your team, or been a tech lead on large projects</li>\n</ul>\n<p>Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process and confirm whether the hired role will be eligible for an equity grant. You&#39;ll also receive benefits including, but not limited to: comprehensive health, dental, and vision coverage, retirement benefits, a learning and development stipend, and generous PTO.
Additionally, this role may be eligible for additional benefits such as a commuter stipend.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_377e69db-df1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4688489005","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$162,400-$203,000 USD","x-skills-required":["database engineering","distributed systems","data-intensive applications","machine learning infrastructure","SQL query optimisation","database indexing","scalability","partitioning","sharding","replication","backup and restore functionality","Docker","Kubernetes","Infra-as-Code","Terraform","GPU/ML workloads"],"x-skills-preferred":["prior startup experience","AWS","Datadog","ElasticSearch","cloud-based data warehouse solutions","cost optimisation strategies","intermediary data abstraction layers"],"datePosted":"2026-04-24T13:03:06.404Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA; New York, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"database engineering, distributed systems, data-intensive applications, machine learning infrastructure, SQL query optimisation, database indexing, scalability, partitioning, sharding, replication, backup and restore functionality, Docker, Kubernetes, Infra-as-Code, Terraform, GPU/ML workloads, prior startup experience, AWS, Datadog, ElasticSearch, cloud-based data warehouse solutions, cost optimisation strategies, intermediary data abstraction 
layers","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":162400,"maxValue":203000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3513ac8f-9c4"},"title":"Staff Software Engineer, PostgreSQL","description":"<p>You&#39;ll own Gamma&#39;s PostgreSQL infrastructure as we scale from 70 million users to hundreds of millions, and from terabytes of data to hundreds of terabytes. Your job is to make sure our database can handle orders of magnitude more usage without compromising performance.</p>\n<p>This is a deeply technical, hands-on role. You&#39;ll read and write code daily, dig into low-level systems, debug complex issues across massive datasets, and work on both core database scaling projects and application features. You&#39;ll collaborate closely with backend engineers, data engineers, and infrastructure teams to ensure our database architecture keeps pace with Gamma&#39;s growth.</p>\n<p>Our team has a strong in-office culture and works in person 4–5 days per week in San Francisco. 
We love working together to stay creative and connected, with flexibility to work from home when focus matters most.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Architect and implement solutions for horizontally scaling PostgreSQL to hundreds of millions of users and hundreds of terabytes of data</li>\n<li>Own database performance, availability, and reliability as usage grows by orders of magnitude</li>\n<li>Debug complex issues across very large datasets and optimize query performance at scale</li>\n<li>Establish best practices for database design, query optimization, and data modeling across engineering</li>\n<li>Work across core infrastructure and application features that depend on database architecture</li>\n<li>Collaborate with backend, data, and infrastructure engineers to align database strategy with product needs</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>10+ years of software engineering experience with deep expertise in large-scale relational database systems, including hands-on experience managing hundreds of terabytes of data in production</li>\n<li>Expert-level understanding of PostgreSQL (or comparable relational databases), horizontal scaling techniques such as sharding and partitioning, and complex query tuning</li>\n<li>Strong programming skills in at least one backend language, with experience writing and maintaining highly available web APIs</li>\n<li>Experience with large-scale event streaming systems, preferably Apache Kafka</li>\n<li>Ability to explain complex technical concepts clearly to engineers across teams</li>\n<li>Familiarity with TypeScript, Prisma, Apollo GraphQL, Terraform, AWS, or AI/LLM tooling (Nice to have)</li>\n</ul>\n<p><strong>Compensation</strong></p>\n<p>The base salary for this full-time position, which spans multiple internal levels depending on qualifications,
ranges between $230K - $310K plus benefits &amp; equity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3513ac8f-9c4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Gamma","sameAs":"https://gamma.com","logo":"https://logos.yubhub.co/gamma.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/gamma/f672c729-457f-4143-80e9-363ddf8a0870","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"Full time","x-salary-range":"$230K - $310K","x-skills-required":["PostgreSQL","horizontal scaling","sharding","partitioning","complex query tuning","backend language","web APIs","Apache Kafka"],"x-skills-preferred":["TypeScript","Prisma","Apollo GraphQL","Terraform","AWS","AI/LLM tooling"],"datePosted":"2026-04-24T12:16:45.597Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PostgreSQL, horizontal scaling, sharding, partitioning, complex query tuning, backend language, web APIs, Apache Kafka, TypeScript, Prisma, Apollo GraphQL, Terraform, AWS, AI/LLM tooling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":230000,"maxValue":310000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2c095439-13b"},"title":"Principal Software Engineer","description":"<p>Microsoft Advertising is seeking a Principal Software Engineer to join our Ads Engineering Platform team and advance the core capabilities of our ad-serving infrastructure, the engine that powers advertising across Bing Search, MSN, Microsoft Start, and shopping experiences in the Edge browser.</p>\n<p>Our serving stack operates at massive global scale, delivering millions of ad
requests per second through a geo-distributed, low-latency system that combines large-scale GPU/CPU inference, real-time bidding, and intelligent ranking pipelines.</p>\n<p>This role focuses on advancing the performance, efficiency, and scalability of the next generation of model serving and inference platforms for Ads.</p>\n<p>As a senior technical leader, you’ll design and optimize high-performance serving systems and GPU inference frameworks that drive measurable latency improvements and cost efficiency across Microsoft’s ad ecosystem.</p>\n<p>You’ll work across the stack, from CUDA kernel tuning and NUMA-aware threading to large-scale distributed orchestration and model deployment for deep learning and LLM workloads.</p>\n<p>This is a rare opportunity to shape the architecture of one of the world’s most advanced, mission-critical online serving platforms, collaborating with world-class engineers to deliver innovation at Internet scale.</p>\n<p>Microsoft’s mission is to empower every person and every organization on the planet to achieve more.</p>\n<p>As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.</p>\n<p>Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>Starting January 26, 2026, Microsoft AI (MAI) employees who live within a 50-mile commute of a designated Microsoft office in the U.S.
or 25-mile commute of a non-U.S., country-specific location are expected to work from the office at least four days per week.</p>\n<p>This expectation is subject to local law and may vary by jurisdiction.</p>\n<p>Responsibilities:</p>\n<p>Design and lead the development of large-scale, distributed online serving systems, including GPU-accelerated and CPU-based ranking/inference pipelines, to process millions of ad requests per second with ultra-low latency, high throughput, and solid reliability.</p>\n<p>Architect and optimize end-to-end inference infrastructure, including model serving, batching/streaming, caching, scheduling, and resource orchestration across heterogeneous hardware (GPU, CPU, and memory tiers).</p>\n<p>Profile and optimize performance across the full stack, from CUDA kernels and GPU pipelines to CPU threads and OS-level scheduling, identifying bottlenecks, tuning latency tails, and improving cost efficiency through advanced profiling and instrumentation.</p>\n<p>Own live-site reliability as a DRI: design telemetry, alerting, and fault-tolerance mechanisms; drive rapid diagnosis and mitigation of performance regressions or outages in globally distributed systems.</p>\n<p>Collaborate and mentor across teams, driving architecture reviews, enforcing engineering excellence, promoting system-level optimization practices, and mentoring others in deep debugging, profiling, and performance engineering.</p>\n<p>Qualifications:</p>\n<p>Required Qualifications:</p>\n<p>Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Preferred Qualifications:</p>\n<p>Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in
Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Industry experience in advertising or search engine backend systems, such as large-scale ad ranking, real-time bidding (RTB), or relevance-serving infrastructure.</p>\n<p>Hands-on experience with real-time data streaming systems (Kafka, Flink, Spark Streaming), feature-store integration, and multi-region deployment for low-latency, globally distributed services.</p>\n<p>Familiarity with LLM inference optimization: model sharding, tensor/kv-cache parallelism, paged attention, continuous batching, quantization (AWQ/FP8), and hybrid CPU–GPU orchestration.</p>\n<p>Demonstrated success operating large-scale systems with SLA-based capacity forecasting, autoscaling, and performance telemetry; proven leadership in cross-functional architecture initiatives and technical mentorship.</p>\n<p>Passion for performance engineering, observability, and deep systems debugging, with a solid drive to push the limits of serving infrastructure for the next generation of ads and AI models.</p>\n<p>Deep expertise in GPU inference frameworks such as NVIDIA Triton Inference Server, CUDA, and TensorRT, including hands-on experience implementing custom CUDA kernels, optimizing memory movement (H2D/D2H), overlapping compute and I/O, and maximizing GPU occupancy and kernel fusion for deep learning and LLM workloads.</p>\n<p>Solid understanding of model-serving trade-offs: batching vs. streaming, latency vs.
throughput, quantization (FP16/BF16/INT8), dynamic batching, continuous model rollout, and adaptive inference scheduling across CPU/GPU tiers.</p>\n<p>Proven ability to profile and optimize GPU and system workloads, including tensor/memory alignment, compute–memory balancing, embedding table management, parameter servers, hierarchical caching, and vectorized inference for transformer/LLM architectures.</p>\n<p>Expertise in low-level system and OS internals, including multi-threading, process scheduling, NUMA-aware memory allocation, lock-free data structures, context switching, I/O stack tuning (NVMe, RDMA), kernel bypass (DPDK, io_uring), and CPU/GPU affinity optimization for large-scale serving pipelines.</p>\n<p>#MicrosoftAI Software Engineering IC5 – The typical base pay range for this role across the U.S. is USD $139,900 – $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 – $304,200 per year.</p>\n<p>Certain roles may be eligible for benefits and other compensation.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2c095439-13b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-41/","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,900 - $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","NVIDIA Triton Inference Server","CUDA","TensorRT","Kafka","Flink","Spark Streaming","GPU inference frameworks","LLM inference optimization","model sharding","tensor/kv-cache parallelism","paged
attention","continuous batching","quantization","AWQ/FP8","hybrid CPU–GPU orchestration","SLA-based capacity forecasting","autoscaling","performance telemetry","cross-functional architecture initiatives","technical mentorship","performance engineering","observability","deep systems debugging","low-level system and OS internals","multi-threading","process scheduling","NUMA-aware memory allocation","lock-free data structures","context switching","I/O stack tuning","kernel bypass","CPU/GPU affinity optimization"],"x-skills-preferred":[],"datePosted":"2026-04-24T12:12:57.301Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, NVIDIA Triton Inference Server, CUDA, TensorRT, Kafka, Flink, Spark Streaming, GPU inference frameworks, LLM inference optimization, model sharding, tensor/kv-cache parallelism, paged attention, continuous batching, quantization, AWQ/FP8, hybrid CPU–GPU orchestration, SLA-based capacity forecasting, autoscaling, performance telemetry, cross-functional architecture initiatives, technical mentorship, performance engineering, observability, deep systems debugging, low-level system and OS internals, multi-threading, process scheduling, NUMA-aware memory allocation, lock-free data structures, context switching, I/O stack tuning, kernel bypass, CPU/GPU affinity optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f2196e99-854"},"title":"Software Engineer - GenAI inference","description":"<p>As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks&#39; Foundation Model 
API. You&#39;ll work at the intersection of research and production, ensuring our large language model (LLM) serving systems are fast, scalable, and efficient.</p>\n<p>Your work will touch the full GenAI inference stack, from kernels and runtimes to orchestration and memory management. You will contribute to the design and implementation of the inference engine, and collaborate on a model-serving stack optimized for large-scale LLM inference.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Collaborating with researchers to bring new model architectures or features (sparsity, activation compression, mixture-of-experts) into the engine</li>\n<li>Optimizing for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators</li>\n<li>Building and maintaining instrumentation, profiling, and tracing tooling to uncover bottlenecks and guide optimizations</li>\n<li>Developing and enhancing scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads</li>\n<li>Supporting reliability, reproducibility, and fault tolerance in the inference pipelines, including A/B launches, rollback, and model versioning</li>\n<li>Integrating with federated, distributed inference infrastructure – orchestrate across nodes, balance load, handle communication overhead</li>\n<li>Collaborating cross-functionally with platform engineers, cloud infrastructure, and security/compliance teams</li>\n<li>Documenting and sharing learnings, contributing to internal best practices and open-source efforts when possible</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>BS/MS/PhD in Computer Science, or a related field</li>\n<li>Strong software engineering background (3+ years or equivalent) in performance-critical systems</li>\n<li>Solid understanding of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc.</li>\n<li>Hands-on experience with CUDA, GPU programming, and key
libraries (cuBLAS, cuDNN, NCCL, etc.)</li>\n<li>Comfortable designing and operating distributed systems, including RPC frameworks, queuing, RPC batching, sharding, memory partitioning</li>\n<li>Demonstrated ability to uncover and solve performance bottlenecks across layers (kernel, memory, networking, scheduler)</li>\n<li>Experience building instrumentation, tracing, and profiling tools for ML models</li>\n<li>Ability to work closely with ML researchers, translate novel model ideas into production systems</li>\n<li>Ownership mindset and eagerness to dive deep into complex system challenges</li>\n<li>Bonus: published research or open-source contributions in ML systems, inference optimization, or model serving</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f2196e99-854","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8202670002","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$142,200-$204,600 USD","x-skills-required":["software engineering","performance-critical systems","ML inference internals","CUDA","GPU programming","distributed systems","RPC frameworks","queuing","RPC batching","sharding","memory partitioning","instrumentation","tracing","profiling tools","ML researchers","complex system challenges"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:54:17.777Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, performance-critical systems, ML inference internals, CUDA, GPU programming, distributed systems, 
RPC frameworks, queuing, RPC batching, sharding, memory partitioning, instrumentation, tracing, profiling tools, ML researchers, complex system challenges","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":142200,"maxValue":204600,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_58d220e6-02a"},"title":"Senior Site Reliability Engineer, Tenant Services: Geo","description":"<p>Job Title: Senior Site Reliability Engineer, Tenant Services: Geo</p>\n<p>We are looking for a skilled Senior Site Reliability Engineer to join our Tenant Services, Geo team. As a Senior Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our user-facing services and production systems.</p>\n<p>About Us</p>\n<p>GitLab is the intelligent orchestration platform for DevSecOps. It enables organisations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Execute Dedicated Geo migrations and cutovers end-to-end, including planning, pre-cutover validation, execution, and post-cutover verification and cleanup.</li>\n<li>Join the team&#39;s shift and weekend coverage rotation for Dedicated cutovers across EMEA and US hours, and participate in the SaaS Site Reliability Engineering (SRE) on-call rotation to respond to incidents that impact GitLab.com availability.</li>\n<li>Operate and improve the Geo operational surface for Dedicated, including:\n<ul>\n<li>Environment preparation and data hygiene checks prior to migrations.</li>\n<li>Execution of replication, validation, and cutover procedures.</li>\n<li>Handling Geo-related escalations from Support and internal partners.</li>\n</ul>\n</li>\n<li>Design, build, and maintain automation, tooling, and runbooks that make migrations, cutovers, and Geo escalations as
&#39;boring&#39; and repeatable as possible.</li>\n<li>Run our infrastructure with tools such as Ansible, Chef, Terraform, GitLab CI/CD, and Kubernetes; contribute improvements back to GitLab&#39;s product and infrastructure where appropriate.</li>\n<li>Build and maintain monitoring, alerting, and dashboards that:\n<ul>\n<li>Detect symptoms early, not just outages.</li>\n<li>Track migration and cutover success rates, duration, rollback frequency, and related SLOs.</li>\n</ul>\n</li>\n<li>Collaborate closely with:\n<ul>\n<li>The core Geo team on improving Geo features and operability.</li>\n<li>Dedicated migrations and Support on migration planning, customer communications, and escalation handling.</li>\n<li>Other Infrastructure teams on capacity planning, disaster recovery, and reliability improvements.</li>\n</ul>\n</li>\n<li>Contribute to readiness reviews, incident reviews, and root cause analyses, turning learnings into changes in automation, process, or product.</li>\n<li>Document every action, including runbooks, architecture decisions, and post-incident reviews, so your findings turn into repeatable practices and automation.</li>\n<li>Proactively identify and reduce toil by automating repetitive operational work and simplifying migration workflows.</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>Experience operating highly-available distributed systems at scale, ideally in a SaaS environment with customer-facing SLAs.</li>\n<li>Hands-on experience with at least one major cloud provider (e.g., Google Cloud Platform or Amazon Web Services), including networking, storage, and managed services.</li>\n<li>Experience with Kubernetes and its ecosystem (e.g., Helm), including deploying and troubleshooting workloads.</li>\n<li>Experience with infrastructure as code and configuration management tools such as Terraform, Ansible, or Chef.</li>\n<li>Strong programming skills in at least one general-purpose language (preferably Go or Ruby) and proficiency with scripting (e.g., Shell,
Python).</li>\n<li>Experience with observability systems (e.g., Prometheus, Grafana, logging stacks) and using metrics and logs to troubleshoot performance and reliability issues.</li>\n<li>Practical exposure to data replication, backup/restore, or migration scenarios (e.g., database replication, storage replication, or Geo-like technologies) where data integrity and downtime risk must be carefully managed.</li>\n<li>Comfort participating in an on-call rotation, investigating incidents across the stack, and driving follow-through on corrective actions.</li>\n<li>Ability to engage directly with enterprise customers during migrations and incidents, including on live calls and through clear written updates.</li>\n<li>Ability to clearly define problems, propose options, and think beyond immediate fixes to improve systems and processes over time.</li>\n<li>Ability to be a &#39;manager of one&#39;: self-directed, organized, and able to drive work to completion in a remote, asynchronous environment.</li>\n<li>Strong written and verbal communication skills, with a bias toward clear, asynchronous documentation and collaboration.</li>\n<li>Alignment with our company values and a commitment to working in accordance with those values.</li>\n</ul>\n<p>Nice to Have</p>\n<ul>\n<li>Experience working with disaster recovery technologies.</li>\n<li>Experience with managed/hosted environments similar to GitLab Dedicated, including regulated or compliance-sensitive customers (e.g., SOC2, ISO).</li>\n<li>Prior work on large-scale data migrations or cutovers where customer data integrity, performance, and downtime risk had to be carefully balanced.</li>\n<li>Hands-on experience designing and operating database replication, backup/restore, and cutover workflows (for example, PostgreSQL or cloud-managed equivalents such as AWS RDS), including planning and executing low-risk migrations for large datasets.</li>\n<li>Experience with multi-tenant architectures, sharding, or routing strategies 
in high-traffic SaaS platforms.</li>\n<li>Familiarity with GitLab (self-managed or SaaS), and/or contributions to open source projects.</li>\n</ul>\n<p>Benefits</p>\n<ul>\n<li>Benefits to support your health, finances, and well-being</li>\n<li>Flexible Paid Time Off</li>\n<li>Team Member Resource Groups</li>\n<li>Equity Compensation &amp; Employee Stock Purchase Plan</li>\n<li>Growth and Development Fund</li>\n<li>Parental leave</li>\n<li>Home office support</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_58d220e6-02a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"GitLab","sameAs":"https://about.gitlab.com/","logo":"https://logos.yubhub.co/about.gitlab.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/gitlab/jobs/8490453002","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Experience operating highly-available distributed systems at scale","Hands-on experience with at least one major cloud provider","Experience with Kubernetes and its ecosystem","Experience with infrastructure as code and configuration management tools","Strong programming skills in at least one general-purpose language"],"x-skills-preferred":["Experience working with disaster recovery technologies","Experience with managed/hosted environments similar to GitLab Dedicated","Prior work on large-scale data migrations or cutovers","Hands-on experience designing and operating database replication, backup/restore, and cutover workflows","Experience with multi-tenant architectures, sharding, or routing strategies in high-traffic SaaS platforms"],"datePosted":"2026-04-18T15:51:05.184Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote, 
India"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Experience operating highly-available distributed systems at scale, Hands-on experience with at least one major cloud provider, Experience with Kubernetes and its ecosystem, Experience with infrastructure as code and configuration management tools, Strong programming skills in at least one general-purpose language, Experience working with disaster recovery technologies, Experience with managed/hosted environments similar to GitLab Dedicated, Prior work on large-scale data migrations or cutovers, Hands-on experience designing and operating database replication, backup/restore, and cutover workflows, Experience with multi-tenant architectures, sharding, or routing strategies in high-traffic SaaS platforms"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dd6ebd20-17d"},"title":"Research Scientist, Gemini Diffusion","description":"<p>We&#39;re looking for a Research Scientist to join our team in London and help us accelerate our mission. As a Research Scientist, you will apply your deep scientific knowledge and research skills to advance paradigm-shifting research at a large scale. You will be at the heart of our efforts to deliver step-changes in the capabilities of our frontier models, with a significant focus on our Gemini Diffusion project.</p>\n<p>Your work may involve brainstorming new disruptive ideas that could become the next generation of frontier AI models, particularly within the text diffusion space. You will prototype and develop these ideas with the rest of the team, contributing directly to Gemini Diffusion research. You will solve key research challenges by designing and executing experimental research on text diffusion models, sharing analyses, and proposing next steps. 
You will rigorously validate the theoretical and practical impact of our work at a large scale. You will work collaboratively with other Generative AI teams to move the technologies we develop out of the lab and into production. You will advance the fundamental architecture, algorithmic design, and capabilities of large-scale diffusion models. You will bring deep scientific expertise into our projects, sharing your insights and knowledge with other researchers and engineers.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_dd6ebd20-17d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Google DeepMind","sameAs":"https://deepmind.com/","logo":"https://logos.yubhub.co/deepmind.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/deepmind/jobs/7700399","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Advanced degree in computer science, electrical engineering, science, mathematics, or equivalent experience","Academic research experience in machine learning, publications, or research experience in related fields","Experience with some or all LLMs, Transformers, Diffusion models, Text diffusion, Large-scale distributed training","Strong communication skills (via discussion, presentation, technical and research writing, whiteboarding, etc.)","Programming experience, particularly with Python-based scientific libraries such as Numpy, Scipy, JAX, PyTorch, or TensorFlow"],"x-skills-preferred":["A track record of building software, either in open source or as part of a company product or research papers","Large-scale system design, distributed systems","Distributed computation for ML, especially in the context of accelerators (e.g., sharding, multi-host computation)","C++ or broader programming experience","Data engineering and 
visualisation"],"datePosted":"2026-03-16T14:39:03.737Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, UK"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Advanced degree in computer science, electrical engineering, science, mathematics, or equivalent experience, Academic research experience in machine learning, publications, or research experience in related fields, Experience with some or all LLMs, Transformers, Diffusion models, Text diffusion, Large-scale distributed training, Strong communication skills (via discussion, presentation, technical and research writing, whiteboarding, etc.), Programming experience, particularly with Python-based scientific libraries such as Numpy, Scipy, JAX, PyTorch, or TensorFlow, A track record of building software, either in open source or as part of a company product or research papers, Large-scale system design, distributed systems, Distributed computation for ML, especially in the context of accelerators (e.g., sharding, multi-host computation), C++ or broader programming experience, Data engineering and visualisation"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_4e51470c-8f1"},"title":"Software Engineer, Accelerators","description":"<p><strong>Software Engineer, Accelerators</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $380K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>The Kernels team at OpenAI builds the low-level software that accelerates our most ambitious AI research.</p>\n<p>We work at the boundary of hardware and software, developing high-performance kernels, distributed system optimizations, and runtime improvements to make large-scale training and inference more efficient.</p>\n<p>Our work 
enables OpenAI to push the limits by ensuring models - from LLMs to recommender systems - run reliably on advanced supercomputing platforms. That includes adapting our software stack to new types of accelerators, tuning system performance end-to-end, and removing bottlenecks across every layer of the stack.</p>\n<p><strong>About the Role</strong></p>\n<p>On the Accelerators team, you will help OpenAI evaluate and bring up new compute platforms that can support large-scale AI training and inference.</p>\n<p>Your work will range from prototyping system software on new accelerators to enabling performance optimizations across our AI workloads.</p>\n<p>You’ll work across the stack, spanning both hardware and software - working on kernels, sharding strategies, scaling across distributed systems, and performance modeling.</p>\n<p>You&#39;ll help adapt OpenAI&#39;s software stack to non-traditional hardware and drive efficiency improvements in core AI workloads. This is not a compiler-focused role; rather, it bridges ML algorithms with system performance - especially at scale.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Prototype and enable OpenAI&#39;s AI software stack on new, exploratory accelerator platforms.</li>\n</ul>\n<ul>\n<li>Optimize large-scale model performance (LLMs, recommender systems, distributed AI workloads) for diverse hardware environments.</li>\n</ul>\n<ul>\n<li>Develop kernels, sharding mechanisms, and system scaling strategies tailored to emerging accelerators.</li>\n</ul>\n<ul>\n<li>Collaborate on optimizations at the model code level (e.g.
PyTorch) and below to enhance performance on non-traditional hardware.</li>\n</ul>\n<ul>\n<li>Perform system-level performance modeling, debug bottlenecks, and drive end-to-end optimization.</li>\n</ul>\n<ul>\n<li>Work with hardware teams and vendors to evaluate alternatives to existing platforms and adapt the software stack to their architectures.</li>\n</ul>\n<ul>\n<li>Contribute to runtime improvements, compute/communication overlapping, and scaling efforts for frontier AI workloads.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>3+ years of experience working on AI infrastructure, including kernels, systems, or hardware-software co-design.</li>\n</ul>\n<ul>\n<li>Hands-on experience with accelerator platforms for AI at data center scale (e.g., TPUs, custom silicon, exploratory architectures).</li>\n</ul>\n<ul>\n<li>Strong understanding of kernels, sharding, runtime systems, or distributed scaling techniques.</li>\n</ul>\n<ul>\n<li>Familiarity with optimizing LLMs, CNNs, or recommender models for hardware efficiency.</li>\n</ul>\n<ul>\n<li>Experience with performance modeling, system debugging, and software stack adaptation for novel architectures.</li>\n</ul>\n<ul>\n<li>Exposure to mobile accelerators is welcome, but experience enabling data center-scale AI hardware is preferred.</li>\n</ul>\n<ul>\n<li>Ability to operate across multiple levels of the stack, rapidly prototype solutions, and navigate ambiguity in early hardware bring-up phases.</li>\n</ul>\n<ul>\n<li>Interest in shaping the future of AI compute through exploration of alternatives to mainstream accelerators.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_4e51470c-8f1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/f386b209-1259-4b79-bf5a-aa97fc7ce77b","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$295K – $380K • Offers Equity","x-skills-required":["AI infrastructure","kernels","systems","hardware-software co-design","accelerator platforms","TPUs","custom silicon","exploratory architectures","kernels","sharding","runtime systems","distributed scaling techniques","LLMs","CNNs","recommender models","hardware efficiency","performance modeling","system debugging","software stack adaptation","novel architectures"],"x-skills-preferred":["mobile accelerators","data center-scale AI hardware"],"datePosted":"2026-03-06T18:27:12.141Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI infrastructure, kernels, systems, hardware-software co-design, accelerator platforms, TPUs, custom silicon, exploratory architectures, kernels, sharding, runtime systems, distributed scaling techniques, LLMs, CNNs, recommender models, hardware efficiency, performance modeling, system debugging, software stack adaptation, novel architectures, mobile accelerators, data center-scale AI hardware","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":380000,"unitText":"YEAR"}}}]}