{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/tokenizers"},"x-facet":{"type":"skill","slug":"tokenizers","display":"Tokenizers","count":1},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_540ce49c-271"},"title":"Member of Technical Staff - Multimodal Understanding","description":"<p><strong>About the Role</strong></p>\n<p>You will join the multimodal team to push toward superhuman multimodal intelligence. Advance understanding and generation across modalities (image, video, audio, and text), spanning the full stack: data curation/acquisition, tokenizer training, large-scale pre-training, post-training/alignment, infrastructure/scaling, evaluation, tooling/demos, and end-to-end product experiences.</p>\n<p>Collaborate cross-functionally with pre-training, post-training, reasoning, data, applied, and product teams to deliver frontier capabilities in multimodal reasoning, world modeling, tool use, agentic behaviors, and interactive human-AI collaboration. 
Contribute to building models that can see, hear, reason about, and interact with the world in real time at unprecedented levels.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale.</li>\n<li>Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text).</li>\n<li>Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding &amp; generation, real-time video processing, and noisy data handling.</li>\n<li>Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models.</li>\n<li>Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy.</li>\n<li>Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance.</li>\n<li>Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration based on feedback.</li>\n<li>Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions.</li>\n</ul>\n<p><strong>Basic Qualifications</strong></p>\n<ul>\n<li>Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal).</li>\n<li>Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA.</li>\n<li>Proven track record building or optimizing 
large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design).</li>\n<li>Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data.</li>\n<li>Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors).</li>\n<li>Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers.</li>\n<li>Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences.</li>\n</ul>\n<p><strong>Preferred Skills and Experience</strong></p>\n<ul>\n<li>Experience leading major improvements in model capabilities through better data, modeling, algorithms, or scaling.</li>\n<li>Familiarity with state-of-the-art in multimodal LLMs, scaling laws, tokenizers, compression techniques, reasoning, or agentic systems.</li>\n<li>Proficiency in Rust and/or C++ for performance-critical components.</li>\n<li>Hands-on work with large-scale orchestration tools such as Spark, Ray, or Kubernetes.</li>\n<li>Background building full-stack tooling: performant interfaces, real-time research demos/apps, or end-to-end product ownership.</li>\n<li>Passion for end-to-end user experience in interactive, real-time multimodal AI systems.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_540ce49c-271","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/5111374007","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000 
USD","x-skills-required":["Multimodal pre-training","Post-training","Fine-tuning","Python","JAX","PyTorch","XLA","Large-scale distributed ML systems","Data pipelines","Evaluation design","Benchmarks","Reward modeling","RL techniques"],"x-skills-preferred":["State-of-the-art in multimodal LLMs","Scaling laws","Tokenizers","Compression techniques","Reasoning","Agentic systems","Rust","C++","Spark","Ray","Kubernetes","Full-stack tooling"],"datePosted":"2026-04-18T15:23:05.119Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Multimodal pre-training, Post-training, Fine-tuning, Python, JAX, PyTorch, XLA, Large-scale distributed ML systems, Data pipelines, Evaluation design, Benchmarks, Reward modeling, RL techniques, State-of-the-art in multimodal LLMs, Scaling laws, Tokenizers, Compression techniques, Reasoning, Agentic systems, Rust, C++, Spark, Ray, Kubernetes, Full-stack tooling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}}]}