Are you eager to advance research at the crossroads of language, AI, and geography? As a PhD candidate in the ERC-funded GeoTrAnsQData project at Utrecht University, you will build models and methods to parse natural language questions into geo-analytical workflows, combining NLP and semantic representations to improve how complex spatial questions can be answered.
Your job
Answering geographic questions like “What is the potential to reduce urban heat in Amsterdam by installing green roofs on existing buildings?” is important in fields such as urban planning, sustainability, and public health. Most current Geographic Question Answering (GeoQA) tools only return short factual answers, but many real-world questions, like this example, require deeper spatial analysis.
In such cases, maps must be created or transformed from data rather than simply retrieved. The GeoTrAnsQData project addresses this by developing a GeoQA method that converts questions into executable geo-analytical workflows, turning geodata into new answer maps. We use knowledge graphs to model these transformations and apply AI methods to scale them across large map repositories, enabling users to explore many ways maps can be reused to answer different kinds of questions.
This PhD position focuses on developing a method for parsing and formalising geo-analytical questions. You will explore hybrid AI-supported approaches to help users formulate natural language questions and translate them into structured representations that can be linked to geospatial data sources and workflows.
You will contribute to the design of the linguistic and conceptual interface between natural language questions and formal workflow models over a geodata repository. In this project, you will:
- build and annotate a corpus of geo-analytical questions and their associated purposes, data needs, and analytical steps in standard geo-analytical scenarios;
- develop a model of geo-analytical purposes (transformation requests) and a corresponding question grammar;
- perform a user study on geo-analytical question formulation to express such purposes;
- contribute to the formalisation of spatial question types using a purpose-driven taxonomy;
- develop a hybrid question parsing pipeline using NLP and formal semantic representations; investigate Large Language Models (LLMs) as well as symbolic AI for question parsing;
- evaluate models against a gold standard of geo-analytical purposes and questions;
- collaborate with a technical assistant, another PhD candidate (on geodata source modelling), and a postdoc (on the GeoQA reasoning engine).
This position is part of the ERC-funded project GeoTrAnsQData, which develops the foundations of a transformative GeoQA methodology through an integrated research program across geoinformatics, AI, and geography. The project is based at the Department of Human Geography and Spatial Planning, Utrecht University, and contributes to cutting-edge research on spatial reasoning, semantic technologies, and interdisciplinary AI for geosciences and geography.