Modelling variations in human learning in probabilistic decision-making tasks

Hunt, Dominic. 2020. Modelling variations in human learning in probabilistic decision-making tasks. Doctoral thesis, Goldsmiths, University of London [Thesis]

Text (Modelling variations in human learning in probabilistic decision-making tasks)
PSY_thesis_HuntD_2020.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract or Description

This thesis evaluated how well models of human learning capture the action choices of a range of individuals performing probabilistic decision-making tasks.

To do so, an extensible evaluation framework, Tinker Taylor py (TTpy), was developed in Python allowing models to be compared like-for-like across a range of tasks. TTpy allows models, tasks and fitting methods to be added or replaced without affecting the other parts of the simulation and fitting process.

Models were drawn from the reinforcement learning literature along with a few similarly structured Bayesian learning models. The fitting assumed that the same model was used throughout a task to make all the choices.

Using TTpy, significant uncertainty was found in parameter recovery for short, simple tasks across a range of models. This was traced back to significant overlap in the action sequences plausibly produced by different combinations of parameters. Replacing softmax with epsilon-greedy as the method for calculating the action choice probabilities was found to improve parameter recovery in simulated data.
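As an illustration of the two action-choice rules contrasted above, here is a minimal sketch in plain Python. It is not taken from TTpy: the function names, the inverse-temperature parameter `beta` and the exploration rate `epsilon` are assumptions made for this example, and the thesis may parameterise either rule differently.

```python
import math


def softmax_probs(values, beta):
    """Softmax choice probabilities; `beta` is an assumed inverse temperature."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(values)
    weights = [math.exp(beta * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def epsilon_greedy_probs(values, epsilon):
    """Epsilon-greedy: explore uniformly with probability `epsilon`,
    otherwise choose a highest-valued action (ties share the greedy mass)."""
    n = len(values)
    top = max(values)
    best = [i for i, v in enumerate(values) if v == top]
    explore = epsilon / n
    greedy = (1.0 - epsilon) / len(best)
    return [explore + (greedy if i in best else 0.0) for i in range(n)]


values = [0.2, 0.5, 0.3]  # hypothetical learned action values
print(softmax_probs(values, beta=3.0))
print(epsilon_greedy_probs(values, epsilon=0.1))
```

In this sketch the softmax probabilities depend on the size of the value differences scaled by `beta`, whereas the epsilon-greedy probabilities depend only on which action currently has the highest value.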

Datasets from three existing unpublished probabilistic decision-making tasks were examined. These datasets were chosen because they contained information on extraversion for all their participants, their tasks were well established, and the tasks had a gains-only promotion focus. For only one of the three tasks did most of the model fits to individual participants show strong evidence of fitting better than uniform random action choices.
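The comparison against uniform random action choices can be sketched as follows. The abstract does not say which evidence measure was used, so this example assumes a Bayesian information criterion (BIC) difference purely for illustration; the function names and the example probabilities are hypothetical.

```python
import math


def uniform_log_likelihood(n_trials, n_actions):
    """Log-likelihood of the choices if every action were equally likely."""
    return n_trials * math.log(1.0 / n_actions)


def bic(log_likelihood, n_params, n_trials):
    """Bayesian information criterion: lower values indicate a better fit."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood


def delta_bic_vs_uniform(choice_probs, n_params, n_actions):
    """BIC of the uniform baseline minus BIC of the fitted model.

    `choice_probs` holds the model's probability of the action the
    participant actually took on each trial. Positive values favour the model.
    """
    n_trials = len(choice_probs)
    model_ll = sum(math.log(p) for p in choice_probs)
    model_bic = bic(model_ll, n_params, n_trials)
    # The uniform baseline has no free parameters.
    baseline_bic = bic(uniform_log_likelihood(n_trials, n_actions), 0, n_trials)
    return baseline_bic - model_bic


probs = [0.7, 0.6, 0.8, 0.4, 0.9, 0.75]  # hypothetical per-trial probabilities
print(delta_bic_vs_uniform(probs, n_params=2, n_actions=2))
```

With few trials the parameter penalty can outweigh a modest gain in likelihood, so a short task can leave a fitted model indistinguishable from the uniform baseline under this measure.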

In light of the difficulties in parameter recovery for individual participants, the unusual step was taken of averaging the recovered parameters across a subset of the best performing and most consistently recovered models within the same family. A significant correlation was found between the averaged learning rate parameter and the participant extraversion measure when the softmax parameter variance was taken into account.

Item Type:

Thesis (Doctoral)

Identification Number (DOI):


Reinforcement learning, Bayesian learning, individual differences, extraversion, fitting, probabilistic decision-making

Departments, Centres and Research Units:



31 March 2020

Item ID:


Date Deposited:

08 Jun 2021 09:40

Last Modified:

07 Sep 2022 17:18


