Third-cycle subject: Electrical Engineering
Reinforcement Learning (RL) agents tackle sequential decision-making problems by interacting with an environment and refining their behavior through trial and error. These methods have demonstrated remarkable success across various domains, in some cases surpassing human performance. A major drawback, however, is their reliance on vast amounts of training data. In domains such as video games or recommendation systems, training data is abundant; in fields such as finance or healthcare, data collection can be risky, expensive, or even legally restricted.
RL algorithms can be broadly categorized into Model-Free and Model-Based approaches. Model-Free RL learns directly from interactions with the real (target) environment, whereas Model-Based RL trains agents using a learned model of the environment, which reduces the number of real interactions needed. As a result, Model-Based methods are generally more sample-efficient than their Model-Free counterparts. They are not exempt from challenges, though, two of the most critical being compounding errors and the Sim2Real gap.
The objective of this thesis is to address these issues by exploring innovative diffusion models that enhance both the learning process and the robustness of the learned policy in RL. Specifically, we plan to leverage guidance techniques within these diffusion models to steer the generation of trajectories toward those that are most beneficial for the learning process. By 'beneficial,' we refer to two key aspects: (1) fostering policies that are robust to modeling errors and adaptable to unseen environments, and (2) accelerating the learning process, ultimately improving sample efficiency. We will also explore ways to provide analytical guarantees for these methods.
Supervision: Professor Alexandre Proutiere
To be admitted to postgraduate education (Chapter 7, 39 § Swedish Higher Education Ordinance), the applicant must have basic eligibility in accordance with either of the following:
In addition to the above, there is also a mandatory requirement for English equivalent to English B/6.
In order to succeed as a doctoral student at KTH you need to be goal oriented and persevering in your work. During the selection process, candidates will be assessed upon their ability to:
A strong background in statistics, probability and optimization is appreciated.
After the qualification requirements, great emphasis will be placed on personal skills.
Target degree: Doctoral degree
Only those admitted to postgraduate education may be employed as a doctoral student. The total length of employment may not be longer than what corresponds to full-time doctoral education in four years' time. An employed doctoral student can, to a limited extent (maximum 20%), perform certain tasks within their role, e.g. training and administration. A new position as a doctoral student is for a maximum of one year, and then the employment may be renewed for a maximum of two years at a time. In the case of studies that are to be completed with a licentiate degree, the total period of employment may not be longer than what corresponds to full-time doctoral education for two years.
Contact information for union representatives.
Contact information for doctoral section.
Apply for the position and admission through KTH's recruitment system. It is the applicant’s responsibility to ensure that the application is complete in accordance with the instructions in the advertisement.
Applications must be received by midnight CET/CEST (Central European Time/Central European Summer Time) on the last application date.
Applications must include the following elements:
Striving towards gender equality, diversity and equal conditions is both a question of quality for KTH and a given part of our values.
For information about processing of personal data in the recruitment process.
It may be the case that a position at KTH is classified as a security-sensitive role in accordance with the Protective Security Act (2018:585). If this applies to the specific position, a security clearance will be conducted for the applicant in accordance with the same law with the applicant's consent. In such cases, a prerequisite for employment is that the applicant is approved following the security clearance.
We firmly decline all contact with staffing and recruitment agencies and job ad salespersons.
Disclaimer: In case of discrepancy between the Swedish original and the English translation of the job announcement, the Swedish version takes precedence.
Type of employment | Temporary position |
---|---|
Contract type | Full time |
First day of employment | According to agreement |
Salary | Monthly salary according to KTH's doctoral student salary agreement |
Number of positions | 2 |
Full-time equivalent | 100% |
City | Stockholm |
County | Stockholms län |
Country | Sweden |
Reference number | PA-2025-1789 |
Published | 05.Jun.2025 |
Last application date | 12.Aug.2025 |