Job description
Apply for this PhD position to research the next generation of mobile autonomous robotics. You will work on cutting-edge AI technologies for robotics, including end-to-end transformer models that enable robots to autonomously navigate and interact with open-world environments. As this PhD is part of the NWO Perspectief FIND project, you will collaborate with leading research organisations and companies, including TNO and NXP.
Information
This PhD position concentrates on designing innovative AI architectures and training techniques for end-to-end robotics. Specifically, you will investigate Vision Language Models (VLMs), Multi-modal Large Language Models (MLLMs), and Vision Language Action models (VLAs) for applications involving spatial scene understanding and spatial reasoning. The challenge is achieving a level of spatial understanding that allows robots to handle new environments and tasks with minimal human-provided demonstrations or descriptions, while maintaining reliable and safe operation. Another challenge particular to robotics is that these end-to-end models must be efficient enough to operate in real time. Certain tasks, e.g., collision prevention in manipulation, require guaranteed timely responses, while others, e.g., reasoning, are allowed to take longer. These ‘fast’ and ‘slow’ tasks need to be supported simultaneously by the end-to-end architecture and should be compatible with the limited compute and energy resources of embedded systems.
Your daily activities will include reviewing the latest developments in the field, identifying current limitations, hypothesising possible causes and solutions, designing improved network architectures and training methods, setting up validation methods to test hypotheses, and reporting on findings via presentations and publications. You are expected to perform these tasks professionally and with great independence, and to be able to engage in in-depth discussions, make informed choices, and be open about the limitations of your research. Additionally, given the rapid pace of AI progress, we find it very important that PhD candidates not only work individually but are also eager to work in teams with students, peers, and supervisors.
Research group and company
This position is embedded in the Mobile Perception Systems (MPS) laboratory of the Signal Processing Systems (SPS) group within the Electrical Engineering department at Eindhoven University of Technology (TU/e). The MPS lab has a strong track record in researching AI architectures, specifically for vision modalities, that achieve state-of-the-art accuracy while being orders of magnitude more efficient than earlier methods. The MPS lab regularly publishes in top-tier venues such as CVPR, ICCV, and IEEE RA-L. This PhD position will be supervised by Gijs Dubbelman, associate professor and head of the MPS lab; Daan de Geus, assistant professor in the MPS lab; and expert researchers from TNO and NXP.
This PhD project is executed in close collaboration with TNO and NXP and is part of the FIND program. FIND is a research program funded by the Dutch government and industry that brings together 5 universities, 11 companies (startups to multinationals), and 2 knowledge institutes to develop foundation models (large AI models) for the Dutch high-tech industry, with strong emphasis on edge deployment, privacy, and timely decision-making. Partners include ASML, NXP, Canon, ASMPT, Technolution, Signify, Shell, TNO, and others. A total of 12 PhD candidates will be employed on the FIND program, covering topics from foundation model pre-training and multimodal adaptation to architectures and compression for edge deployment, while targeting real-world validation in domains such as HealthTech, smart industry, and autonomous mobility.