{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/entity-recognition"},"x-facet":{"type":"skill","slug":"entity-recognition","display":"Entity Recognition","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_6aab7ed8-23a"},"title":"Senior Software Engineer - Data","description":"<p>We are seeking an experienced Senior Software Engineer (Data) to join our fast-paced, collaborative data team. 
In this role, you will have broad authority to drive the direction of our technographic data services, building world-class data pipelines and systems to process billions of signals and data points.</p>\n<p>This is an exciting opportunity to solve challenging problems and make a big impact as we invest in making technographics a first-class offering.</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li>Build and optimize big data pipelines to extract and process signals from the web, job postings, and other sources</li>\n<li>Design and implement data architectures and storage solutions to efficiently handle massive data volumes</li>\n<li>Collaborate closely with data scientists to support and integrate ML models into data workflows</li>\n<li>Continuously improve data quality, performance, and scalability of our technographic data platform</li>\n<li>Drive technical strategy and roadmap for the data processing infrastructure</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Extensive experience building and scaling big data pipelines and architectures from scratch</li>\n<li>Deep expertise in big data frameworks (Hadoop, Spark) and the JVM stack (Java, Scala)</li>\n<li>Strong software engineering fundamentals and ability to write efficient, high-quality code</li>\n<li>Experience with entity recognition and NLP techniques a plus</li>\n<li>Proven track record delivering results and driving projects in a fast-paced environment</li>\n<li>Excellent collaboration and communication skills to work with data scientists, analysts and product teams</li>\n<li>Passion for leveraging huge datasets to power valuable insights</li>\n</ul>\n<p>Ideal Background:</p>\n<ul>\n<li>8+ years of experience in software engineering roles</li>\n<li>Experience working with very large datasets and distributed systems</li>\n<li>Familiarity building data pipelines at large tech companies or data-driven organisations</li>\n<li>Bachelor&#39;s or advanced degree in Computer Science, Engineering or related technical 
field</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_6aab7ed8-23a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"ZoomInfo","sameAs":"https://www.zoominfo.com/","logo":"https://logos.yubhub.co/zoominfo.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/zoominfo/jobs/8486808002","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$140,000-$220,000 USD","x-skills-required":["big data pipelines","data architectures","storage solutions","ML models","data quality","performance","scalability","data processing infrastructure","Hadoop","Spark","Java","Scala","entity recognition","NLP techniques"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:49:24.766Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bethesda, Maryland, United States; Waltham, Massachusetts, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"big data pipelines, data architectures, storage solutions, ML models, data quality, performance, scalability, data processing infrastructure, Hadoop, Spark, Java, Scala, entity recognition, NLP techniques","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":140000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_601ca6bf-9b1"},"title":"Senior Machine Learning Engineer, Natural Language Processing - PhD Early Career","description":"<p><strong>[2026] Senior Machine Learning Engineer, Natural Language Processing - PhD Early Career</strong></p>\n<p>San Mateo, CA, United States</p>\n<p>Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends 
in 3D immersive digital experiences, all created by our global community of developers and creators.</p>\n<p>At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device.</p>\n<p>A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.</p>\n<p>Natural Language Processing (NLP) is central to enabling massive-scale communication, creation, and safety across the Roblox platform. This role offers the unique opportunity to build and deploy cutting-edge <strong>NLP, speech, and generative AI models</strong> that operate at an unprecedented scale, impacting hundreds of millions of daily users.</p>\n<p>You will solve an extremely diverse range of high-scale language-related problems, from <strong>real-time moderation of voice and text</strong> to <strong>automatically localizing experiences</strong> and empowering users through <strong>LLM-driven creation tools</strong>. We combine cutting-edge research with large-scale engineering to bridge experimentation and production, designing algorithms that shape the next generation of language services for our immersive, user-generated content platform.</p>\n<p><strong>Teams Hiring for This Role</strong></p>\n<ul>\n<li><strong>Safety AI Systems:</strong> Dedicated to building end-to-end ML systems for maintaining civility and safety across the platform, operating at massive scale. 
This includes:</li>\n</ul>\n<ul>\n<li><strong>Real-time Moderation:</strong> Building world-class NLP and speech models for <strong>real-time moderation of voice and text</strong> (processing over 6 billion messages daily) and advanced interventions that measurably improve user civility.</li>\n</ul>\n<ul>\n<li><strong>Critical Harms &amp; Advanced Detection:</strong> Developing specialized LLM agents, behavioral analysis, and graph systems for detecting and preventing rare, high-risk scenarios (e.g., child safety, terrorism), requiring adversarial thinking and multi-step reasoning.</li>\n</ul>\n<ul>\n<li><strong>Safety Data Quality:</strong> Ensuring all Safety AI systems are robust by managing the core data infrastructure, MLOps, and Active Learning initiatives for continuous model improvement.</li>\n</ul>\n<p><strong>You Will</strong></p>\n<ul>\n<li>Design and implement <strong>deep learning-based NLP and speech solutions</strong> that address problems across Roblox, from creation to safety.</li>\n</ul>\n<ul>\n<li>Develop advanced models, including <strong>Large Language Models (LLMs), machine translation, and generative AI</strong>, for user interactions, content creation, and moderation.</li>\n</ul>\n<ul>\n<li>Have the independence and <strong>end-to-end responsibility</strong> to develop NLP/ML-based services that are scalable and resilient.</li>\n</ul>\n<ul>\n<li>Be a <strong>technical bar-raiser</strong> for cutting-edge ML technology, high code quality, and architectural designs.</li>\n</ul>\n<ul>\n<li>Work backward from user and product needs to deliver ML solutions that drive engagement, safety, and ecosystem growth.</li>\n</ul>\n<p><strong>You Have</strong></p>\n<ul>\n<li>Possessing or pursuing a Ph.D. 
in Computer Science, Computer Engineering, Mathematics, Statistics, or a related technical field, with a thesis aligned to Roblox’s research areas.</li>\n</ul>\n<ul>\n<li>Expertise in one or more areas: NLP, Speech Models, Large Language Models, Machine Translation, or Generative AI (including diffusion models).</li>\n</ul>\n<ul>\n<li>Experience with transformer-based model design, training, serving, and product integration.</li>\n</ul>\n<ul>\n<li>A strong research track record, evidenced by multiple publications and presentations in top-tier, peer-reviewed venues (e.g., ACL, EMNLP, Interspeech, ICML, NeurIPS).</li>\n</ul>\n<ul>\n<li>Proficiency in one or more programming languages (e.g., Python, C++, Go, Java) and experience building and optimizing large-scale systems.</li>\n</ul>\n<p>You may redact age, date of birth, and dates of attendance/graduation from your resume if you prefer.</p>\n<p>As you apply, you can find more information about our process by signing up for Speak_. You&#39;ll gain access to our practice assessment, comprehensive guides, FAQs, and modules designed to help you ace the hiring process.</p>\n<p>For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. 
All full-time employees are also eligible for equity compensation and for benefits as described on <strong>this page</strong>.</p>\n<p>Annual Salary Range</p>\n<p>$195,780—$242,100 USD</p>\n<p>Roles that are based in an office are onsite Tuesday, Wednesday, and Thursday, with optional presence on Monday and Friday (unless otherwise noted).</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_601ca6bf-9b1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Roblox","sameAs":"https://careers.roblox.com","logo":"https://logos.yubhub.co/careers.roblox.com.png"},"x-apply-url":"https://careers.roblox.com/jobs/7324377","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$195,780—$242,100 USD","x-skills-required":["NLP","Speech Models","Large Language Models","Machine Translation","Generative AI","Python","C++","Go","Java","Transformer-based model design","Training","Serving","Product integration"],"x-skills-preferred":["Deep learning","Computer vision","Natural language processing","Speech recognition","Text analysis","Sentiment analysis","Named entity recognition","Part-of-speech tagging","Dependency parsing","Semantic role labeling"],"datePosted":"2026-03-06T14:18:50.958Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Mateo, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"NLP, Speech Models, Large Language Models, Machine Translation, Generative AI, Python, C++, Go, Java, Transformer-based model design, Training, Serving, Product integration, Deep learning, Computer vision, Natural language processing, Speech recognition, Text analysis, Sentiment analysis, Named entity recognition, Part-of-speech tagging, Dependency parsing, Semantic role 
labeling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":195780,"maxValue":242100,"unitText":"YEAR"}}}]}