{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/data-science-libraries"},"x-facet":{"type":"skill","slug":"data-science-libraries","display":"Data Science Libraries","count":4},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ed5725bb-311"},"title":"Applied Research Engineer, Agents","description":"<p>Shape the Future of AI</p>\n<p>At Labelbox, we&#39;re building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises. Since 2018, we&#39;ve been pioneering data-centric approaches that are fundamental to AI development, and our work becomes even more essential as AI capabilities expand exponentially.</p>\n<p>As an Applied Research Engineer at Labelbox, you&#39;ll sit at the junction of advanced AI research and real product impact, with a focus on the data that makes modern agents work: browser interactions, SWE/code traces, GUI sessions, and multi-turn workflows. 
You&#39;ll drive the data landscape required to advance capable, adaptable agents and help shape Labelbox&#39;s strategy for collecting, synthesizing, and evaluating it.</p>\n<p>Create frameworks and tools to construct, train, benchmark and evaluate autonomous agent capabilities.</p>\n<p>Design agent-focused data programs using supervised fine-tuning (SFT) and reinforcement learning (RL) methodologies.</p>\n<p>Develop data pipelines from diverse sources like code repositories, web browsers, and computer systems.</p>\n<p>Implement and adapt popular open-source agent libraries and benchmarks with proprietary datasets and models.</p>\n<p>Engage with research teams in frontier AI labs and the wider AI community to understand evolving agent data needs for frontier models and share best practices.</p>\n<p>Collaborate closely with frontier AI lab customers to understand requirements and guide model development.</p>\n<p>Publish research findings in academic journals, conferences, and blog posts.</p>\n<p>What You Bring</p>\n<p>Ph.D. or Master&#39;s degree in Computer Science, Machine Learning, AI, or related field.</p>\n<p>At least 3 years of experience addressing sophisticated ML problems with successful delivery to customers.</p>\n<p>Experience building and training autonomous agents (tool use, structured outputs, multi-step planning) across browsers/GUI, codebases, and databases using SFT and RL.</p>\n<p>Experience constructing and evaluating agentic benchmarks (e.g. SWE-bench, WebArena, τ-bench, OSWorld) and reliability/efficiency suites (e.g. 
WABER).</p>\n<p>Adept at interpreting research literature and quickly turning new ideas into prototypes.</p>\n<p>Deep understanding of frontier models (autoregressive, diffusion), post-training (SFT, RLVR, RLAIF, RLHF, et al.), and their human data requirements.</p>\n<p>Proficient in Python, data science libraries and deep learning frameworks (e.g., PyTorch, JAX, TensorFlow).</p>\n<p>Strong analytical and problem-solving abilities in ambiguous situations.</p>\n<p>Excellent communication skills.</p>\n<p>Track record of publications in top-tier AI/ML venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, etc.).</p>\n<p>Labelbox Applied Research</p>\n<p>At Labelbox Applied Research, we&#39;re committed to pushing the boundaries of AI and data-centric machine learning, with a particular focus on advanced human-AI interaction techniques. We believe that high-quality human data and sophisticated human feedback integration methods are key to unlocking the next generation of AI capabilities. Our research team works at the intersection of machine learning, human-computer interaction, and AI ethics to develop innovative solutions that can be practically applied in real-world scenarios.</p>\n<p>Life at Labelbox</p>\n<p>Location: Join our dedicated tech hubs in San Francisco or Wrocław, Poland</p>\n<p>Work Style: Hybrid model with 2 days per week in office, combining collaboration and flexibility</p>\n<p>Environment: Fast-paced and high-intensity, perfect for ambitious individuals who thrive on ownership and quick decision-making</p>\n<p>Growth: Career advancement opportunities directly tied to your impact</p>\n<p>Vision: Be part of building the foundation for humanity&#39;s most transformative technology</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ed5725bb-311","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Labelbox","sameAs":"https://www.labelbox.com/","logo":"https://logos.yubhub.co/labelbox.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/labelbox/jobs/4829775007","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$250,000-$300,000 USD","x-skills-required":["Python","data science libraries","deep learning frameworks","PyTorch","JAX","TensorFlow","supervised fine-tuning","reinforcement learning","agent libraries","benchmarks","proprietary datasets","human-AI interaction","AI ethics"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:52:38.777Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco Bay Area"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, data science libraries, deep learning frameworks, PyTorch, JAX, TensorFlow, supervised fine-tuning, reinforcement learning, agent libraries, benchmarks, proprietary datasets, human-AI interaction, AI ethics","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":250000,"maxValue":300000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c8eee058-0f9"},"title":"Data Analyst: Personalization","description":"<p><strong>Role Overview</strong></p>\n<p>The Personalization team, within the Machine Learning Chapter and Engineering Department, plays a central role in implementing algorithms that utilize personalization signals to optimize for business KPIs like revenue &amp; conversions.</p>\n<p>You will work closely with engineers and product managers to define success, uncover insights, and influence roadmap decisions through data. 
Your work will help shape innovative user experiences that drive business KPIs, interpret user behavior, and drive product and algorithm improvements.</p>\n<p><strong>Challenges you will tackle</strong></p>\n<ul>\n<li>Understand Shopper Behavior: Investigate how product changes affect user behavior and conversion metrics. Use SQL, Python, and Spark to uncover usage patterns, anomalies, and opportunities for optimization.</li>\n<li>Design &amp; Validate Metrics: Define new metrics to measure personalization and model performance. Ensure metrics align with user experience and business goals through rigorous validation.</li>\n<li>Build Analytics Infrastructure: Create scalable dashboards and reporting tools for product, engineering, and leadership teams. Develop debugging tools to explain ranking decisions and identify performance issues.</li>\n<li>Drive Data-Informed Decisions: Partner cross-functionally to design experiments, validate hypotheses, and communicate insights that directly influence product roadmap and ML strategy.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>3+ years analyzing complex experiments and extracting actionable insights from large, noisy datasets. Experience with statistical testing and practical experiment design.</li>\n<li>Write optimized SQL queries for terabyte-scale data extraction and transformation. Proficiency with distributed systems like Spark for large-scale data processing.</li>\n<li>Strong skills in exploratory analysis and building internal tools. Experience with data science libraries and automation.</li>\n<li>Understanding of ML pipelines, training data quality, and ranking/recommendation metrics. Familiarity with search relevance and personalization concepts.</li>\n<li>Design metrics that accurately reflect model and product performance. Ensure alignment between technical metrics and business outcomes.</li>\n<li>Create compelling dashboards using Tableau, Looker, or custom dashboards in Python. 
Present complex findings clearly to both technical and executive audiences.</li>\n<li>Influence product and engineering decisions through data storytelling. Collaborate effectively across teams to drive ML and product improvements.</li>\n<li>Deep curiosity about user behavior and business impact. Connect algorithm changes to real-world customer outcomes.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Unlimited vacation time - we strongly encourage all of our employees to take at least 3 weeks per year</li>\n<li>Fully remote team - choose where you live</li>\n<li>Work from home stipend! We want you to have the resources you need to set up your home office</li>\n<li>Apple laptops provided for new employees</li>\n<li>Training and development budget for every employee, refreshed each year</li>\n<li>Maternity &amp; Paternity leave for qualified employees</li>\n<li>Base salary: $80k–$120k USD, depending on knowledge, skills, experience, and interview results</li>\n<li>Stock options - offered in addition to the base salary</li>\n<li>Regular team offsites to connect and collaborate</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c8eee058-0f9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Constructor","sameAs":"https://apply.workable.com","logo":"https://logos.yubhub.co/j.com.png"},"x-apply-url":"https://apply.workable.com/j/DA5744F6E7","x-work-arrangement":"remote","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$80k–$120k USD","x-skills-required":["SQL","Python","Spark","statistical testing","practical experiment design","data science libraries","automation","ML pipelines","training data quality","ranking/recommendation metrics","search relevance","personalization concepts"],"x-skills-preferred":["Tableau","Looker","custom dashboards in 
Python"],"datePosted":"2026-03-09T10:58:06.276Z","jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"SQL, Python, Spark, statistical testing, practical experiment design, data science libraries, automation, ML pipelines, training data quality, ranking/recommendation metrics, search relevance, personalization concepts, Tableau, Looker, custom dashboards in Python","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":80000,"maxValue":120000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fb622500-15e"},"title":"Data Scientist, Marketing","description":"<p>You will directly impact Replit&#39;s growth by turning user behavior into actionable insights that optimize our marketing efforts, improve conversion funnels, and drive sustainable revenue growth across our self-serve and enterprise segments.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design and analyse marketing experiments to optimise campaigns, messaging, and channel performance across email, paid ads, social, and content marketing.</li>\n<li>Build attribution models and multi-touch conversion funnels to understand the customer journey from first touch to paid conversion.</li>\n<li>Develop predictive models to identify high-intent prospects, optimise lead scoring, and improve targeting for paid acquisition campaigns.</li>\n<li>Partner with marketing, growth, and revenue teams to translate business questions into rigorous analysis and clear recommendations.</li>\n<li>Create self-service dashboards and automated reporting that surface key marketing metrics (CAC, LTV, ROAS, conversion rates) for go-to-market teams.</li>\n<li>Build and maintain data pipelines that integrate marketing platforms (Google Ads, Meta, Iterable, Segment, etc.) 
with our product analytics.</li>\n</ul>\n<p><strong>Examples of what you could do</strong></p>\n<ul>\n<li>Build propensity models to identify which free users are most likely to convert to plans based on usage patterns and engagement signals.</li>\n<li>Analyse cohort behaviour and retention patterns to optimise lifecycle marketing campaigns and reduce churn.</li>\n<li>Develop segmentation models to personalise messaging and targeting for different user personas (students, hobbyists, professional developers, enterprise teams).</li>\n<li>Build real-time alerting systems to flag anomalies in campaign performance or conversion metrics and automate bidding adjustments across platforms.</li>\n</ul>\n<p><strong>Required skills and experience</strong></p>\n<ul>\n<li>Bachelor&#39;s degree in Computer Science, Statistics, Mathematics, Economics, or related field, OR equivalent real-world experience in data roles.</li>\n<li>4+ years of experience in data science or related roles with a focus on marketing, growth, or business analytics.</li>\n<li>Strong SQL skills and experience working with large datasets, particularly event-level user behaviour data, and designing ETL workflows using dbt.</li>\n<li>Proficiency in Python and data science libraries (pandas, scikit-learn, statsmodels, etc.).</li>\n<li>Experience designing and analysing A/B tests and experiments, including statistical rigor around sample sizing, significance testing, and causal inference.</li>\n<li>Experience building dashboards and visualisations (Looker, Tableau, Mode, or similar tools).</li>\n<li>Ability to translate ambiguous business questions into structured analysis and communicate findings clearly to non-technical stakeholders.</li>\n</ul>\n<p><strong>Preferred Qualifications</strong></p>\n<ul>\n<li>Experience with modern data stack (dbt, BigQuery, Snowflake, Fivetran, etc.).</li>\n<li>Background in growth analytics, marketing analytics, or conversion rate optimisation at a SaaS or PLG 
company.</li>\n<li>Familiarity with marketing technology platforms (Google Analytics, Segment, Iterable, Marketo, HubSpot, etc.).</li>\n<li>Experience with attribution modelling, marketing mix modelling, or incrementality testing.</li>\n<li>Understanding of PLG (product-led growth) motions and self-serve conversion funnels.</li>\n</ul>\n<p><strong>Bonus Points</strong></p>\n<ul>\n<li>Experience analysing freemium or usage-based pricing models.</li>\n<li>Understanding of developer tools, collaborative coding environments, or technical products.</li>\n<li>Experience with causal inference methods (difference-in-differences, synthetic control, propensity score matching).</li>\n<li>Familiarity with customer data platforms (CDPs) and event tracking implementation.</li>\n<li>Experience working with sales and customer success data to analyse expansion revenue and upsell opportunities.</li>\n</ul>\n<p><strong>Full-Time Employee Benefits Include</strong></p>\n<ul>\n<li>Competitive Salary &amp; Equity</li>\n<li>401(k) Program with a 4% match</li>\n<li>Health, Dental, Vision and Life Insurance</li>\n<li>Short Term and Long Term Disability</li>\n<li>Paid Parental, Medical, Caregiver Leave</li>\n<li>Commuter Benefits</li>\n<li>Monthly Wellness Stipend</li>\n<li>Autonomous Work Environment</li>\n<li>In Office Set-Up Reimbursement</li>\n<li>Flexible Time Off (FTO) + Holidays</li>\n<li>Quarterly Team Gatherings</li>\n<li>In Office Amenities</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_fb622500-15e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Replit","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/replit.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/replit/c05749db-f413-4091-a95c-c8e0aa1b5630","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$180K - 
$250K","x-skills-required":["SQL","Python","data science libraries (pandas, scikit-learn, statsmodels, etc.)","ETL workflows using dbt","A/B tests and experiments","dashboard and visualisation tools (Looker, Tableau, Mode, etc.)"],"x-skills-preferred":["modern data stack (dbt, BigQuery, Snowflake, Fivetran, etc.)","growth analytics, marketing analytics, or conversion rate optimisation","marketing technology platforms (Google Analytics, Segment, Iterable, etc.)","attribution modelling, marketing mix modelling, or incrementality testing"],"datePosted":"2026-03-07T15:20:03.203Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Foster City, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"SQL, Python, data science libraries (pandas, scikit-learn, statsmodels, etc.), ETL workflows using dbt, A/B tests and experiments, dashboard and visualisation tools (Looker, Tableau, Mode, etc.), modern data stack (dbt, BigQuery, Snowflake, Fivetran, etc.), growth analytics, marketing analytics, or conversion rate optimisation, marketing technology platforms (Google Analytics, Segment, Iterable, etc.), attribution modelling, marketing mix modelling, or incrementality testing","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_afd939a6-f8c"},"title":"Data Scientist - Prompt","description":"<p>We are hiring a Data Scientist to join our Localization Data &amp; AI team, reporting to the Data Scientist Lead. 
The Loc Data &amp; AI team’s mission is to empower EA Localization through intelligent, data-driven solutions leveraging advanced analytics, scalable AI systems, and collaborative tools that enhance the quality and efficiency of localized content.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<p>As a Data Scientist in our team, you will focus on building end-to-end data science solutions, from exploratory analysis to model development and performance evaluation, working closely with Machine Learning Engineers and Data Engineers.</p>\n<ul>\n<li>Analyze large, multilingual datasets to generate actionable insights for localization workflows.</li>\n<li>Develop and refine feature engineering pipelines tailored for multilingual and multimodal datasets.</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>2+ years of hands-on experience in applied data science, ideally in NLP, prompt engineering or multilingual domains.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_afd939a6-f8c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Electronic Arts","sameAs":"https://jobs.ea.com","logo":"https://logos.yubhub.co/jobs.ea.com.png"},"x-apply-url":"https://jobs.ea.com/en_US/careers/JobDetail/Data-Scientist-Prompt/212016","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","core data science libraries","ML and DL frameworks","statistical methods","hypothesis testing","experimental design"],"x-skills-preferred":["NLP concepts and tools","BLEU","BERTScore","spacy","nltk","quality estimation"],"datePosted":"2026-01-16T02:04:41.381Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Madrid"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, core data 
science libraries, ML and DL frameworks, statistical methods, hypothesis testing, experimental design, NLP concepts and tools, BLEU, BERTScore, spacy, nltk, quality estimation"}]}