PhD position on understanding algorithmic collusion by decentralized multiagent reinforcement learning

Published	Deadline	Location
2 Feb	4 Mar	Enschede

You cannot apply for this job anymore (deadline was 4 Mar 2024).


Job description

How can we design decentralized multiagent reinforcement learning (MARL) algorithms with equilibrium selection control? This is the central research question of this PhD project. We are particularly interested in determining whether decentralized MARL algorithms used for pricing or trading can learn to collaborate to raise prices. By developing an understanding of whether and how this is possible, we can inform policymakers on how to regulate the use of algorithms in such environments.

To address this issue, we will begin by characterizing the probability of reaching different equilibria under the dynamics of provably convergent decentralized MARL algorithms. With this characterization in hand, we will study how design choices in the algorithmic structure affect these probabilities. These two steps will initially be executed for simple environments and with deterministic dynamics, and then they will be scaled up to more complex environments with stochastic dynamics. Our starting point will be the Decentralized Q-learning algorithm, as it offers significant flexibility in its design and is provably convergent in a large class of games.
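As a rough illustration of what "the probability of reaching different equilibria" can mean in the simplest setting, the sketch below runs plain independent epsilon-greedy Q-learning many times and counts where the agents end up. It is not the Decentralized Q-learning algorithm named above and it is not part of the project. The 2x2 payoff matrix, the function names, and all parameter values are hypothetical choices made for this example.

```python
import random
from collections import Counter

# Hypothetical 2x2 coordination game (an assumption for illustration):
# both (0, 0) and (1, 1) are pure Nash equilibria.
# PAYOFF[a0][a1] = (reward to agent 0, reward to agent 1)
PAYOFF = [
    [(2.0, 2.0), (0.0, 0.0)],
    [(0.0, 0.0), (1.0, 1.0)],
]


def greedy(q_row):
    """Return the action with the larger Q-value (ties go to action 0)."""
    return 0 if q_row[0] >= q_row[1] else 1


def run_once(seed, steps=2000, alpha=0.1, epsilon=0.1):
    """Run two independent, stateless Q-learners; return their final greedy joint action."""
    rng = random.Random(seed)
    # One Q-value per action per agent, with tiny random initial values to break ties.
    q = [[rng.random() * 0.01 for _ in range(2)] for _ in range(2)]
    for _ in range(steps):
        # Each agent picks its action on its own, without seeing the other agent.
        actions = [
            rng.randrange(2) if rng.random() < epsilon else greedy(q[i])
            for i in range(2)
        ]
        rewards = PAYOFF[actions[0]][actions[1]]
        # Each agent updates only the Q-value of the action it played.
        for i in range(2):
            a = actions[i]
            q[i][a] += alpha * (rewards[i] - q[i][a])
    return tuple(greedy(q[i]) for i in range(2))


def equilibrium_frequencies(n_runs=200):
    """Empirical frequency of each final joint action over independent seeded runs."""
    counts = Counter(run_once(seed) for seed in range(n_runs))
    return {joint: count / n_runs for joint, count in counts.items()}


if __name__ == "__main__":
    print(equilibrium_frequencies())
```

Changing `alpha`, `epsilon`, or the initial Q-values and re-running gives a crude view of how a design choice shifts these frequencies in this toy game.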

Specifications

max. 40 hours per week
€2770–€3539 per month
Enschede

University of Twente (UT)

Requirements

You are an enthusiastic and highly motivated researcher.
The societal impact of your work plays an important role in your motivation.
You have, or will shortly acquire, a master's degree in mathematics, theoretical computer science, theoretical physics, or a related field.
You have a creative mindset and excellent analytical and communication skills.
You have a good team spirit and like to work in an interdisciplinary and internationally oriented environment.
You are proficient in English.
You preferably have previous experience with reinforcement learning and/or game theory.
The UT and the faculty EEMCS are inclusive toward underrepresented groups and strive to increase the proportion of female staff. Female applicants are particularly welcome.

Conditions of employment

As a PhD candidate at UT, you will be appointed to a full-time position for four years, with a qualifier in the first year, within a very stimulating and exciting scientific environment;
The University offers a dynamic ecosystem with enthusiastic colleagues;
Your salary and associated conditions are in accordance with the collective labour agreement for Dutch universities (CAO-NU);
You will receive a gross monthly salary ranging from € 2.770,- (first year) to € 3.539,- (fourth year);
There are excellent benefits including a holiday allowance of 8% of the gross annual salary, an end-of-year bonus of 8.3%, and a solid pension scheme;
You have the flexibility to work (partially) from home;
You are entitled to a minimum of 232 leave hours in case of full-time employment, based on a formal workweek of 38 hours. Full-time employment in practice means 40 hours a week, which results in 96 extra leave hours on an annual basis;
You have free access to sports facilities on campus;
The UT is a family-friendly institution that offers parental leave (both paid and unpaid);
You will have a training programme as part of the Twente Graduate School where you and your supervisors will determine a plan for a suitable education and supervision;
We encourage a high degree of responsibility and independence, while collaborating with close colleagues, researchers and other staff.

Department

The position will be in the Applied Mathematics department. The department has an active research portfolio in stochastic operations research, algorithmic discrete mathematics, complex networks, statistics, systems theory, computational science, and artificial intelligence, with applications in health care, energy systems, traffic, and imaging. See MOR, SACS, and MDS.

Our research group, Stochastic Operations Research (SOR), conducts education and research of a high international standard in stochastic processes and the mathematics of operations research. The group aims to contribute to the development of mathematics in a multidisciplinary engineering environment and to a better understanding and functioning of our increasingly complex society. See SOR.

High Tech and Human Touch

Join the university of technology that puts people first. Create new possibilities for yourself, your colleagues and society as a whole. Use modern technology and science to drive innovation, change and progress. That’s what it means to work at the University of Twente.


Specifications

PhD
Engineering
University graduate
1633

Employer

University of Twente (UT)


Location

Drienerlolaan 5, 7522NB, Enschede

