Individual Submission Summary

Poster #27 - Developmental changes in explore-exploit strategy in adolescents

Thu, March 21, 4:00 to 5:15pm, Baltimore Convention Center, Floor: Level 1, Exhibit Hall B

Integrative Statement

Introduction
Should a decision maker pursue what they already know and obtain rewards close to what they expect (‘exploit’), or search for more information to gain something potentially better (‘explore’)? Advantageous decision-making requires an adaptive trade-off between exploiting known sources of reward and exploring alternatives. Although risk-taking has been considered a personality characteristic of adolescents, their exploratory decisions have not been closely examined. Using computational decision-making models and a bandit task, we aim to depict the developmental trajectory of strategic exploration in adolescents.

Methods
One hundred and twenty-four volunteers aged 16 to 23 years (59 female) completed an explore-exploit paradigm in which they chose among four slot machines to discover each machine’s payoffs (Daw et al., 2006). The task consisted of two sessions of 150 trials each. On each trial, participants were presented with pictures of four differently colored slot machines and selected one using a button box. After a slot machine was chosen, the number of points it earned was displayed (see Figure 1). The participants’ objective was to accumulate as much reward as they could across the trials. Because the expected value of each slot machine changed over time, there was a direct trade-off between exploiting a single arm for its expected payoff and exploring other arms for potentially larger rewards.
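To make the task structure concrete, here is a minimal sketch of a four-armed bandit whose expected payoffs change over time. The abstract does not specify how the payoffs drifted, so the Gaussian random walk, the 0 to 100 point range, and the names `simulate_restless_bandit`, `drift_sd`, and `noise_sd` are all illustrative assumptions rather than the study's actual settings.

```python
import random


def simulate_restless_bandit(n_trials=150, n_arms=4, drift_sd=3.0,
                             noise_sd=4.0, seed=0):
    """Return (means, payoffs) for a drifting multi-armed bandit.

    means[t][a]   -- expected payoff of arm a on trial t
    payoffs[t][a] -- noisy payoff arm a would deliver on trial t
    All numeric settings are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    # Start each arm at a random expected value (assumed range).
    means = [[rng.uniform(20.0, 80.0) for _ in range(n_arms)]]
    for _ in range(n_trials - 1):
        prev = means[-1]
        # Each arm drifts independently; values are clipped to [0, 100].
        means.append([min(100.0, max(0.0, m + rng.gauss(0.0, drift_sd)))
                      for m in prev])
    # Observed payoffs are the expected values plus Gaussian noise.
    payoffs = [[m + rng.gauss(0.0, noise_sd) for m in row] for row in means]
    return means, payoffs


means, payoffs = simulate_restless_bandit()
best_arm_per_trial = [row.index(max(row)) for row in means]
```

Because the best arm can change from trial to trial, a participant who never samples the other machines may miss a better option, which is the trade-off the task is designed to probe.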

Results
We used a standard Q-learning model to describe participants’ choices and estimated two parameters for each participant: the learning rate α, which determines how strongly the value of the chosen option is updated, and β, which captures the degree of stochasticity in choice (i.e., the exploration/exploitation parameter). To assess the developmental trends of the learning rate (α) and the exploration/exploitation parameter (β), we conducted regression analyses with both linear and quadratic models, using age as a continuous predictor and α or β as the outcome variable. The learning rate increased linearly with age (beta coefficient of age = 0.907, p = 0.002, adjusted R-squared = 0.066, Figure 2), whereas the exploration/exploitation parameter followed a quadratic curve over the course of development (beta coefficient of age-squared = -15.5242, p = 0.017, adjusted R-squared = 0.031, Figure 3).
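As an illustration of how α and β enter such a model, the sketch below computes the negative log-likelihood of a choice sequence under Q-learning with a softmax choice rule. The abstract says only "standard Q-learning", so the softmax rule, the reading of β as an inverse temperature (higher β means more exploitation), the initial values, and the function names `softmax_probs` and `q_learning_nll` are assumptions made for this example.

```python
import math


def softmax_probs(q_values, beta):
    """Choice probabilities under a softmax rule with inverse temperature beta."""
    scaled = [beta * q for q in q_values]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def q_learning_nll(choices, rewards, alpha, beta, n_arms=4, q_init=0.0):
    """Negative log-likelihood of observed choices given alpha and beta.

    choices -- chosen arm index on each trial (0 .. n_arms - 1)
    rewards -- points received on each trial
    """
    q = [q_init] * n_arms
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        probs = softmax_probs(q, beta)
        nll -= math.log(probs[choice])
        # Delta rule: move the chosen arm's value toward the obtained reward.
        q[choice] += alpha * (reward - q[choice])
    return nll


# Toy data (hypothetical): three trials, rewards rescaled to [0, 1].
example_nll = q_learning_nll([0, 1, 0], [1.0, 0.0, 1.0], alpha=0.5, beta=2.0)
```

Per-participant estimates of α and β could then be obtained by minimizing this quantity with a numerical optimizer. With β = 0 every arm is equally likely regardless of learned values, while a large β makes the model choose the highest-valued arm almost deterministically.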

Conclusions
The present study suggests that individuals’ ability to make adaptive decisions based on unexpected rewards develops with age during adolescence. However, adolescents around 19 years old are more reward-driven (exploitation) and less information-seeking (exploration) than those at other ages across adolescence.

Authors