2014
Humans use directed and random exploration to solve the explore–exploit dilemma.
Abstract: All adaptive organisms face the fundamental tradeoff between pursuing a known reward (exploitation) and sampling lesser-known options in search of something better (exploration). Theory suggests at least two strategies for solving this dilemma: a directed strategy in which choices are explicitly biased toward information seeking, and a random strategy in which decision noise leads to exploration by chance. In this work we investigated the extent to which humans use these two strategies. In our “Horizon task,” …
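To make the two strategies in the abstract concrete, here is a minimal sketch of a two-option choice rule. It is an illustration under stated assumptions, not the model fitted in the paper. The function names, the `info_bonus` and `noise` parameters, and the +1/0 coding of which option is more informative are all hypothetical. Directed exploration is represented by a bonus added to the value of the more informative option, and random exploration by decision noise that flattens the choice probabilities.

```python
import math
import random


def choice_probability(mean_a, mean_b, info_a, info_b, info_bonus, noise):
    """Probability of choosing option A under a logistic choice rule.

    mean_a, mean_b : observed mean reward of each option so far.
    info_a, info_b : 1 if the option is the less-sampled (more informative)
                     one, else 0 (hypothetical coding for this sketch).
    info_bonus     : size of the directed-exploration bias (assumed parameter).
    noise          : decision noise; larger values mean more random
                     exploration (assumed parameter, must be > 0).
    """
    if noise <= 0:
        raise ValueError("noise must be positive")
    # Directed exploration: shift the value difference toward the
    # more informative option.
    value_diff = (mean_a - mean_b) + info_bonus * (info_a - info_b)
    # Random exploration: dividing by noise pushes the probability toward 0.5.
    return 1.0 / (1.0 + math.exp(-value_diff / noise))


def choose(mean_a, mean_b, info_a, info_b, info_bonus, noise, rng=random):
    """Sample a choice ("A" or "B") from the rule above."""
    p_a = choice_probability(mean_a, mean_b, info_a, info_b, info_bonus, noise)
    return "A" if rng.random() < p_a else "B"
```

In this sketch, raising `info_bonus` makes the less-sampled option more likely to be chosen even when its observed mean is lower, while raising `noise` makes choices less sensitive to any difference between the options.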
Cited by 659 publications
(1,037 citation statements)
References 39 publications
“…In this study, we adopted behavioral, self-reported, and computational measures to investigate the processes underlying healthy and pathological information-seeking. Our results showed that in contrast to previous bandit studies, which found HCs to accord value to general information [4, 5], our careful analyses indicate that HCs have a specific novelty bonus, and little to no effect of general information-seeking. Moreover, we found that HCs and PGs adopt distinct information-seeking modes.…”
Section: Discussion (contrasting)
confidence: 99%
“…Consistent with earlier studies [6], we found a clear increase of exploratory behaviour in the long horizon condition (t-test on the difference between information parameters in long versus short horizon, Experiment 1: t(149) = 9.492248, p < .001, BF10 = 1.071e+14; Experiment 2: t(99) = 7.47, p < 0.001, BF10 = 2.782e+08, t-test on the choice temperature difference quantifying random exploration, Experiment 1: t(149) = 7.77164, p < .001, BF10 = 6.225e+09, Experiment 2: t(99) = 6.22, p < 0.001, BF10 = 9.401e+05; Figs 2 and 3). The 3-way ANOVA on confidence judgements revealed a significant effect of choosing a lower value option (Experiment 1: F = 181.36, p < 0.001; Experiment 2: F = 126.61, p < 0.001) and choosing an uncertain option (Experiment 1: F = 101.72, p < 0.001; Experiment 2: F = 71.442, p < 0.001).…”
Section: Results (supporting)
confidence: 94%
