Senior Data Engineer
About kaiko
In cancer care, treatment decisions can take many days, but patients don’t have that time. One reason for the delay? Cancer patients’ data is scattered across many places: doctors’ notes, medical imagery, genomics data. At kaiko, we are developing AI foundational models to bring this data together and integrate it into clinical workflows, enabling doctors to make faster, more effective treatment decisions.
We also collaborate closely with the leading Dutch cancer research institute (NKI) on multiple AI research projects and a joint clinical validation initiative. In 2025, we plan to expand our partnerships to more hospitals.
We raised significant long-term funding and have offices in Zurich and Amsterdam. Over the past year, our team has nearly doubled in size, now comprising 70+ people from 25 countries. Our multidisciplinary team brings expertise in LLM and foundational model development, data science, product management, compliance, growth, and operations.
About the role
We are seeking a highly skilled Senior Data Engineer with a passion for building robust and reproducible data pipelines to empower our AI research team in their daily work. You'll play a vital role in making our ambitious AI healthcare solutions a practical reality. The role is based in either the Netherlands or Switzerland.
Your responsibilities
- Engineer robust and reproducible data pipelines that enable machine learning projects.
- Contribute to the design and development of data-centric processes and to their integration with our existing information systems, data lakehouse, and data platform.
- Manage the development infrastructure requirements for data exploration, data processing and data storage.
- Collaborate across domain teams with researchers, product teams and other stakeholders to support their data needs.
Why kaiko
At kaiko, we believe the best ideas come from collaboration, ownership and ambition. We’ve built a team of international experts where your work has direct impact. Here’s what we value:
- We act like owners: You’ll have the autonomy to set your own goals, make critical decisions, and see the direct impact of your work.
- We thrive on collaboration: You’ll approach disagreement with curiosity, build on common ground, and create solutions together.
- We work with ambitious people: You’ll be surrounded by people who set high standards for themselves and others, who see obstacles as opportunities, and who are relentless in their work to create better outcomes for patients.
In addition, we offer:
- An attractive and competitive salary, a good pension plan and 25 vacation days per year.
- Great offsites and team events to strengthen the team and celebrate successes together.
- A EUR 1000 learning and development budget to help you grow.
- Autonomy to do your work the way that works best for you, whether you have a kid or prefer early mornings.
- An annual commuting subsidy.
About you
- 3+ years of experience building production data pipelines and data platforms.
- Expert-level experience with modern data engineering tools: data transformation (e.g. Spark, dbt), data replication (e.g. Fivetran, Airbyte), and data orchestration (e.g. Airflow, Dagster, Prefect).
- Understanding of data modelling techniques and principles (e.g. Data Vault, Medallion Architecture).
- Experience with cloud-based data platform/warehouse services (e.g. Databricks, Snowflake, BigQuery).
- Hands-on experience with cloud-based object storage (e.g. S3, MinIO, Azure Blob Storage) and storage formats such as Parquet, Delta, and Iceberg.
- Experience with at least one cloud platform (e.g. AWS, Azure or Google Cloud).
- Strong coding skills in at least one programming language (e.g. Python, Scala, Java, C++).
- Excellent problem-solving and communication skills.
- Self-motivated and able to work well in a fast-paced startup environment.
Nice to have:
- Experience in an AI/ML environment.
- Track record of engineering large-scale data pipelines that process and serve petabytes of data.
- Experience with OCR and/or anonymization of tabular and free text data.
- Understanding of data standards in the medical domain, such as DICOM, FHIR, pathology slide images (Whole Slide Images).
- Experience with CI/CD tools (e.g. GitLab CI/CD, GitHub Actions, or CircleCI), containerization (e.g. Docker), and orchestration tools (e.g. Kubernetes, Helm, Kustomize).
- Knowledge of monitoring, logging, alerting, and observability tools (e.g. Prometheus, Grafana, the ELK Stack, or Datadog).
- Experience working with and contributing to the open-source community.
This Senior Data Engineer position is a full-time role. Applicants must be residents of the Netherlands or Switzerland, hold a valid work permit, and preferably live within commuting distance of our offices in Amsterdam or Zürich. Because kaiko works with sensitive data, a Certificate of Conduct will be required upon finalizing the employment contract.
We are excited to gather a broad range of perspectives in our team, as we believe it will help us build better products to support a broader set of people. If you’re excited about us but don’t fit every single qualification, we still encourage you to apply: we’ve had incredible team members join us who didn’t check every box!
- Department: Platform Engineering
- Locations: Amsterdam, Zürich (Puls 5)
- Remote status: Hybrid Remote