Senior ML Engineer – Agentic AI
About kaiko
kaiko.ai is building a next-generation agentic clinical AI assistant that helps clinicians reason across patient data, guidelines, and diagnostics.
Healthcare decisions are rarely made by a single person or from a single data source. kaiko’s assistant maintains longitudinal patient context across encounters, clinicians, and institutions, enabling collaboration, second opinions, and complex diagnostic workflows. The system is designed to operate safely in real clinical environments, with human oversight, auditability, and regulatory alignment at its core.
Our assistant core supports broadly applicable clinical tasks such as patient data navigation, guideline interaction, multimodal interaction (chat and voice), and care coordination. On top of this foundation, we are developing specialized diagnostic agents in areas such as oncology, radiology, and pathology.
We build in close collaboration with leading hospitals and research centers, including the Netherlands Cancer Institute (NKI). kaiko is a well-funded company with a growing international team, operating from Zurich and Amsterdam.
About the role
You will join kaiko’s ML Engineering team building the agentic system, the software harness that turns powerful models into something clinicians can rely on in real workflows.
In healthcare, this matters more than anywhere else. Doctors don’t need another interface that produces fluent text. They need a system that supports structured clinical thinking and collaboration, synthesizes messy context, makes uncertainty explicit, and produces artifacts that can be inspected, discussed, and improved with expert feedback.
As a Senior ML Engineer, you will design, ship, and evaluate the harness components that make our agentic system safe and effective. You’ll contribute to reliability, evaluation, and the engineering that brings agents into clinical practice.
You’ll be based in Zurich or Amsterdam, with an expectation to spend around half your time in the office.
You will help build and evaluate:
context lifecycle management from intent to verified, persistent outputs
tools and integrations across internal systems and external data sources
durable memory and state that support long-running, multi-step clinical work
evaluation and verification loops that reduce drift, context loss, and hallucinations
About you
Strong Python skills and solid Git collaboration experience (PRs, branching, code review)
Experience building LLM-driven features such as prompt and context design, RAG-based retrieval and grounding, tool use, and practical failure handling
Experience building agents with a clear view of what makes them succeed or fail, including state management, planning and execution tradeoffs, tool reliability, and guardrails
Experience with at least one agentic framework such as LangChain, AutoGen, or similar, or experience building custom production-ready agentic systems
Strong ML foundation with particular strength in transformers and how LLMs and VLMs behave in practice, including capabilities, limitations, and evaluation
Experience designing evaluation for LLM and agentic systems, including deterministic test suites, scenario-based evaluations, and rubric-based or LLM-as-judge approaches
Clear communicator comfortable with design discussions, code review, and cross-functional collaboration
Nice to have:
Experience with knowledge graphs or structured representations for reasoning and retrieval
Experience using workflow orchestration tools such as Dagster or similar
Familiarity with distributed execution frameworks (e.g., Ray) and scaling workloads cleanly
You stay up to date with the latest developments and literature on agentic systems and can turn new ideas into shippable engineering
Some experience with reinforcement learning
We are excited to gather a broad range of perspectives in our team, as we believe it will help us build better products to support a broader set of people. If you’re excited about us but don’t fit every single qualification, we still encourage you to apply: we’ve had incredible team members join us who didn’t check every box!
Why kaiko
At kaiko, we believe that building transformative healthcare technology requires collaboration, ownership, and ambition.
Ownership: You will have real responsibility for shaping kaiko’s commercial direction and long-term impact.
Collaboration: You will work closely with world-class researchers, clinicians, and product teams.
Ambition: You will help build a platform that meaningfully improves how care is delivered and decisions are made.
In addition, we offer
An attractive and competitive salary, a good pension plan and 25 vacation days per year.
Great offsites and team events to strengthen the team and celebrate successes together.
A EUR 1000 learning and development budget to help you grow.
Autonomy to do your work the way that works best for you, whether you have children to plan around or simply prefer early mornings.
An annual commuting subsidy.
Our interview process
Our interview process is designed to assess mutual fit across skills, motivation, and values. It typically includes the following steps, though the exact process may vary:
Screening call: A short conversation to align on your motivation, career goals, and initial fit for the role.
Online coding assessment: Focused on core programming skills, problem-solving ability, and fundamental data structures and algorithms.
Coding assessment review: A follow-up discussion to review your submission.
ML interview: An in-depth discussion on ML foundations, with particular focus on LLM knowledge and experience.
Onsite presentation: A technical presentation on a project of your choosing that’s relevant to the role. This is followed by a deep-dive discussion on your problem-solving approach and key decisions, plus conversations with team members to assess collaboration style and day-to-day fit.
Locations: Amsterdam, Zürich (Puls 5)
Remote status: Hybrid