{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/table-formats"},"x-facet":{"type":"skill","slug":"table-formats","display":"Table Formats","count":3},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_27ad557d-d6e"},"title":"Senior Software Engineer II","description":"<p>We are seeking a Senior Software Engineer to join our Platform team. This role will involve leading projects end-to-end and contributing to impactful platform initiatives that power R&amp;D, operational excellence, and business intelligence analytics.</p>\n<p>As a Senior Software Engineer, you will partner with engineers, scientists, product managers, and business teams to identify high-leverage opportunities and build common solutions. 
You will integrate open-source, enterprise, and SaaS technologies into our evolving stack, design and ship components of a new platform architecture to enable multi-tenancy, fine-grained data governance, workload isolation, and scaling.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading projects end-to-end and contributing to impactful platform initiatives</li>\n<li>Partnering with engineers, scientists, product managers, and business teams to identify high-leverage opportunities and build common solutions</li>\n<li>Integrating open-source, enterprise, and SaaS technologies into our evolving stack</li>\n<li>Designing and shipping components of a new platform architecture to enable multi-tenancy, fine-grained data governance, workload isolation, and scaling</li>\n<li>Contributing to the growth of our Data Lakehouse platforms, enabling well-governed data products for analytical and operational use cases</li>\n<li>Continuously improving the Research Platform to meet our Science&#39;s evolving needs for experimentation, ML Ops, data processing, and analysis</li>\n<li>Helping shape how we approach data modeling, context engineering, and emerging semantic layers to make data easier to discover and use</li>\n<li>Advocating for a product mindset within Platform and Data engineering at Freenome, focusing on developer effectiveness and platform usability</li>\n<li>Exploring and piloting AI-assisted or agentic workflows to enhance individual and team productivity, sharing learnings with the broader organization</li>\n<li>Collaborating through system design, code reviews, and pairing, promoting a strong team culture of accountability, learning, and psychological safety</li>\n<li>Supporting platform users to troubleshoot issues and unblock critical work</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>6+ years of experience building and operating highly reliable production software systems, preferably in platform engineering teams or similar</li>\n<li>Proficiency 
with Python and experience with one or more other high-level programming languages</li>\n<li>Strong knowledge of Linux fundamentals, including networking and containerization</li>\n<li>Hands-on experience operating cloud services, storage, and compute using IaC with at least one major cloud provider, preferably Azure or GCP</li>\n<li>Operational experience managing and optimizing large Kubernetes clusters, preferably in single or multi-cluster environments with thousands of nodes</li>\n<li>A pragmatic approach to reliability, observability, performance tuning, and operational excellence</li>\n<li>Excellent communication and documentation skills</li>\n<li>Comfort with cross-functional collaboration and navigating tradeoffs</li>\n<li>BS or higher in computer science or a related technical field, or comparable experience</li>\n</ul>\n<p>Nice-to-have experience includes: columnar data processing, open lakehouse technologies, and table formats; supporting researchers, data scientists, and AI/ML teams; Flyte or other modern workflow orchestrators; an everything-as-code approach to infrastructure, policies, and data; operating production systems in Microsoft Azure; and open-source contribution and maintenance.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_27ad557d-d6e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Freenome","sameAs":"https://freenome.com/","logo":"https://logos.yubhub.co/freenome.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/freenome/jobs/8414748002","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$161,925 - $227,325","x-skills-required":["Python","Linux","Cloud services","Kubernetes","IaC","Containerization","Networking"],"x-skills-preferred":["Columnar data processing","Open lakehouse technologies","Table formats","Flyte","Workflow orchestrators","Everything as 
code","Microsoft Azure"],"datePosted":"2026-04-17T12:36:57.568Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Linux, Cloud services, Kubernetes, IaC, Containerization, Networking, Columnar data processing, Open lakehouse technologies, Table formats, Flyte, Workflow orchestrators, Everything as code, Microsoft Azure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":161925,"maxValue":227325,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d48b0655-2fa"},"title":"Data/Infrastructure Advocate Engineer","description":"<p>At Hugging Face, we&#39;re on a journey to democratise good AI. As our first Data/Infrastructure Advocate Engineer, you&#39;ll bridge the gap between cutting-edge data infrastructure and the global community of data engineers, researchers, and developers.</p>\n<p>You&#39;ll champion Xet storage on the Hugging Face Hub, empowering users to efficiently store, version, and collaborate on large-scale datasets. 
This role is for someone who thrives at the intersection of technical depth (storage, Parquet, deduplication) and community advocacy—helping define the future of open data workflows.</p>\n<p>Your main missions will be:</p>\n<ul>\n<li>Grow and nurture the open-source data/infra community—launch initiatives, collaborate with data-focused groups, and organise events or challenges.</li>\n<li>Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration—curate and showcase datasets, benchmarks, and tools like Xet.</li>\n<li>Highlight use cases like efficient large dataset updates, Parquet editing, and deduplication to demonstrate the Hub&#39;s value for data workflows.</li>\n<li>Create demos, benchmarks, and tools (e.g., Colab notebooks) to illustrate best practices for data storage and versioning.</li>\n<li>Experiment with Xet, Parquet, and other data formats to showcase their potential for ML and data engineering.</li>\n<li>Produce high-quality tutorials, blog posts, and videos that make complex topics accessible.</li>\n<li>Share insights on storage optimisation, dataset versioning, and deduplication to empower developers.</li>\n<li>Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.</li>\n<li>Ensure datasets and tools released on the Hub are well-documented, with clear examples, benchmarks, and use cases.</li>\n</ul>\n<p><strong>About you</strong></p>\n<p>You&#39;re a great fit if you:</p>\n<ul>\n<li>Have strong technical skills in Python, data libraries (e.g., pandas, pyarrow, huggingface/datasets), and storage systems (Parquet, Open Table Formats, S3).</li>\n<li>Are a hands-on builder who loves experimenting with data tools, storage optimisation, and dataset versioning.</li>\n<li>Can clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.</li>\n<li>Are active in developer 
communities (GitHub, Discord, forums) and passionate about open source and knowledge sharing.</li>\n<li>Thrive in fast-moving environments and enjoy building in public to inspire others.</li>\n</ul>\n<p>If you&#39;re interested in joining us but don&#39;t tick every box above, we still encourage you to apply! We&#39;re building a diverse team whose skills, experiences, and backgrounds complement one another.</p>\n<p><strong>More about Hugging Face</strong></p>\n<p>We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where you feel respected and supported—regardless of who you are or where you come from.</p>\n<p>Hugging Face is an equal opportunity employer, and we do not discriminate based on race, ethnicity, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or ability status.</p>\n<p>We value development. You will work with some of the smartest people in our industry.</p>\n<p>We provide all employees with reimbursement for relevant conferences, training, and education.</p>\n<p>We care about your well-being. We offer flexible working hours and remote options.</p>\n<p>We offer health, dental, and vision benefits for employees and their dependents.</p>\n<p>We also offer parental leave and flexible paid time off.</p>\n<p>We support our employees wherever they are. While we have office spaces in NYC and Paris, we&#39;re very distributed, and all remote employees have the opportunity to visit our offices.</p>\n<p>If needed, we&#39;ll also outfit your workstation to ensure you succeed.</p>\n<p>We want our teammates to be shareholders. 
All employees have company equity as part of their compensation package.</p>\n<p>If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d48b0655-2fa","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Hugging Face","sameAs":"https://huggingface.co/"},"x-apply-url":"https://apply.workable.com/j/5CA82A9A98","x-work-arrangement":"remote","x-experience-level":"entry","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","data libraries","pandas","pyarrow","huggingface/datasets","storage systems","Parquet","Open Table Formats","S3"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:34:41.656Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, data libraries, pandas, pyarrow, huggingface/datasets, storage systems, Parquet, Open Table Formats, S3"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f81a1dc8-ca4"},"title":"Data/Infrastructure Advocate Engineer - EMEA Remote","description":"<p>At Hugging Face, we&#39;re on a journey to democratize good AI. We are building the fastest-growing platform for AI builders, with over 5 million users &amp; 100k organisations who have collectively shared over 1M models, 300k datasets &amp; 300k apps. Our open-source libraries have more than 400k stars on GitHub.</p>\n<p>As our first Data/Infrastructure Advocate Engineer, you&#39;ll bridge the gap between cutting-edge data infrastructure and the global community of data engineers, researchers, and developers. 
You&#39;ll champion Xet storage on the Hugging Face Hub, empowering users to efficiently store, version, and collaborate on large-scale datasets.</p>\n<p>This role is for someone who thrives at the intersection of technical depth (storage, Parquet, deduplication) and community advocacy—helping define the future of open data workflows. You&#39;ll collaborate with teams like Datasets, Hub, and Infrastructure to shape how developers interact with data on our platform, and inspire a community to build better, faster, and more scalable data pipelines.</p>\n<p>Your Main Missions:</p>\n<ul>\n<li>Grow and nurture the open-source data/infra community—launch initiatives, collaborate with data-focused groups, and organise events or challenges. Engage with communities like Apache Parquet, Open Table Formats, and data engineering forums to promote best practices and Hugging Face tools.</li>\n</ul>\n<ul>\n<li>Promote the Hugging Face Hub as the go-to platform for data storage, versioning, and collaboration—curate and showcase datasets, benchmarks, and tools like Xet.</li>\n</ul>\n<ul>\n<li>Highlight use cases like efficient large dataset updates, Parquet editing, and deduplication to demonstrate the Hub’s value for data workflows.</li>\n</ul>\n<ul>\n<li>Create demos, benchmarks, and tools (e.g., Colab notebooks) to illustrate best practices for data storage and versioning.</li>\n</ul>\n<ul>\n<li>Experiment with Xet, Parquet, and other data formats to showcase their potential for ML and data engineering.</li>\n</ul>\n<ul>\n<li>Produce high-quality tutorials, blog posts, and videos that make complex topics accessible.</li>\n</ul>\n<ul>\n<li>Share insights on storage optimisation, dataset versioning, and deduplication to empower developers.</li>\n</ul>\n<ul>\n<li>Actively participate in online communities (Discord, GitHub, forums) to highlight contributions, answer questions, and foster collaboration.</li>\n</ul>\n<ul>\n<li>Ensure datasets and tools released on the Hub are 
well-documented, with clear examples, benchmarks, and use cases.</li>\n</ul>\n<p><strong>About you</strong></p>\n<p>You’re a great fit if you:</p>\n<ul>\n<li>Have strong technical skills in Python, data libraries (e.g., pandas, pyarrow, huggingface/datasets), and storage systems (Parquet, Open Table Formats, S3).</li>\n</ul>\n<ul>\n<li>Are a hands-on builder who loves experimenting with data tools, storage optimisation, and dataset versioning.</li>\n</ul>\n<ul>\n<li>Can clearly explain complex topics (e.g., deduplication, compression, Parquet editing) through writing, demos, or talks.</li>\n</ul>\n<ul>\n<li>Are active in developer communities (GitHub, Discord, forums) and passionate about open source and knowledge sharing.</li>\n</ul>\n<ul>\n<li>Thrive in fast-moving environments and enjoy building in public to inspire others.</li>\n</ul>\n<p>If you&#39;re interested in joining us but don&#39;t tick every box above, we still encourage you to apply! We&#39;re building a diverse team whose skills, experiences, and backgrounds complement one another. We&#39;re happy to consider where you might be able to make the biggest impact.</p>\n<p><strong>More about Hugging Face</strong></p>\n<p>We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where you feel respected and supported—regardless of who you are or where you come from. We believe this is foundational to building a great company and community, as well as the future of machine learning more broadly. 
Hugging Face is an equal opportunity employer, and we do not discriminate based on race, ethnicity, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or ability status.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f81a1dc8-ca4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Hugging Face","sameAs":"https://huggingface.co/"},"x-apply-url":"https://apply.workable.com/j/7C7F63E87A","x-work-arrangement":"remote","x-experience-level":"entry","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Python","data libraries","pandas","pyarrow","huggingface/datasets","storage systems","Parquet","Open Table Formats","S3"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:34:10.184Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, data libraries, pandas, pyarrow, huggingface/datasets, storage systems, Parquet, Open Table Formats, S3"}]}