PhD in Data Infrastructure and AI Workflows for Self-Driving Laboratories

You cannot apply for this job anymore (deadline was 30 Sep)

Join the energy transition at DIFFER! At the Dutch Institute for Fundamental Energy Research (DIFFER), we explore cutting-edge solutions for a sustainable future. We are looking for a motivated PhD candidate to join our international team and contribu...

Vacancy ID 3428

Academic fields

Natural sciences

Job types

PhD

Education level

University graduate

Weekly hours

38 hours per week

Salary indication

€2968–€3801 per month

Location

De Zaale 20, 5612AJ, Eindhoven

Job description

Accelerating the discovery of clean energy materials requires autonomous experimentation environments known as Self-Driving Laboratories (SDLs). These labs depend on robust data infrastructures that support automation, reproducibility, and integration with machine learning tools. At DIFFER, in close collaboration with our research partners, we are developing such infrastructure to support a remote physical SDL dedicated to AI-driven experimentation in energy materials.

The goal of this PhD project is to design and implement a machine-learning-ready data infrastructure to manage and structure experimental data generated by this remote SDL. This includes ensuring that data from synthesis, characterization, and analytical instruments is consistently captured, harmonized, and annotated in formats suitable for AI-assisted experimentation. The candidate will focus on developing standardized data schemas and processing pipelines for experimental outputs, implementing metadata management and provenance tracking to ensure transparency, and building interfaces for accessing structured data through machine learning tools and workflows.

As a secondary objective, the candidate will explore how the structured datasets can be used in basic machine learning tasks, such as trend identification, clustering, or dimensionality reduction. This will help evaluate the quality and readiness of the data infrastructure for more complex AI applications and provide initial insights into its potential to support future experiment design.

This interdisciplinary project offers a unique opportunity to advance clean energy innovation at the intersection of data engineering, AI, and energy materials research.

Requirements

Responsibilities:
  1. Design and implement data pipelines to transform experimental outputs into structured, machine-learning-ready formats using standardized schemas and metadata models.
  2. Facilitate the use of structured data by machine learning tools through appropriate access and formatting strategies.
  3. Explore basic machine learning tasks such as trend detection, clustering, or dimensionality reduction to assess data quality and infrastructure readiness.
  4. Collaborate with researchers in the SDL consortium to align infrastructure design with experimental workflows and project objectives.
  5. Supervise BSc/MSc student projects when appropriate.
  6. Contribute to scientific dissemination, including research publications, presentations at conferences, and stakeholder meetings.
  7. Complete a PhD thesis based on the research within four years.

Requirements:
  1. A Master’s degree in computer science, data science, artificial intelligence, or a related field with a strong focus on data engineering or applied machine learning.
  2. Experience with data structuring, transformation, and pipeline development, including database design (SQL/NoSQL), data preprocessing, and data integration.
  3. Proficiency in Python programming and familiarity with relevant libraries for data handling and processing (e.g. pandas, NumPy, h5py, xarray).
  4. Knowledge of graph databases and their application for managing and querying interconnected datasets.
  5. Experience in applying machine learning techniques (e.g. clustering, dimensionality reduction) for insight generation and exploratory data analysis.
  6. Awareness of FAIR data principles or experience handling simulation or experimental data in a reproducible and structured way.
  7. Good communication skills and the ability to work effectively in a multidisciplinary and collaborative environment.
  8. Proficiency in written and spoken English.

Conditions of employment

This is a 1 FTE position for a period of 4 years, graded in pay scale PhD (currently € 2,968 gross per month in year 1, rising to € 3,801 gross per month in year 4). The position is based at DIFFER (www.differ.nl), with the working location at TU Eindhoven. As a DIFFER employee, you will have employee status at NWO and can participate in all the employee benefits NWO offers. We have a number of regulations that support employees in finding a good work-life balance. At DIFFER we believe that a workforce diverse in gender, age and cultural background is key to performing excellent research. We therefore strongly encourage everyone to apply. More information on working at NWO can be found on the NWO website (https://www.nwo-i.nl/en/working-at-nwo-i/jobsatnwoi/).

Employer

Dutch Institute for Fundamental Energy Research

The Dutch Institute for Fundamental Energy Research (DIFFER) performs leading fundamental research on materials, processes, and systems for a global sustainable energy infrastructure. We work in close partnership with (inter)national academia and industry. Our user facilities are open to industry and university researchers. As an institute of the Dutch Research Council (NWO), DIFFER plays a key role in fundamental research for the energy transition.

We apply a multidisciplinary approach to two key areas: chemical energy for the conversion and storage of renewable energy, and nuclear fusion as a clean source of energy.