Overview
This package modularizes the Markov Decision Process (MDP) into six core components, letting users flexibly construct Rescorla-Wagner models for multi-armed bandit tasks (see Sutton & Barto, 2018). Beginners can define models with simple if-else logic, which makes model construction more accessible; three basic models are built in (see Niv et al., 2012).
Step 1: Build Reinforcement Learning Models (run_m())
Step 2: Parameter and Model Recovery (rcv_d())
Step 3: Fit Real Data (fit_p())
Step 4: Replay the Experiment (rpl_e())
These four steps follow the ten simple rules for the computational modeling of behavioral data (Wilson & Collins, 2019).
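Step 2, parameter recovery, can be illustrated with a self-contained base-R sketch that does not use this package's API: simulate a Rescorla-Wagner agent with known parameters, then refit those parameters by maximum likelihood and check they are recovered. All names here (simulate_rw, neg_loglik) are hypothetical helpers for illustration only, not functions of this package.

```r
# Hypothetical parameter-recovery sketch (not this package's API).
set.seed(1)

# Simulate a Rescorla-Wagner agent on a two-armed bandit.
simulate_rw <- function(alpha, beta, n_trials = 500, p_reward = c(0.2, 0.8)) {
  q <- c(0, 0)
  choice <- reward <- integer(n_trials)
  for (t in seq_len(n_trials)) {
    p1 <- 1 / (1 + exp(-beta * (q[1] - q[2])))   # two-arm soft-max
    choice[t] <- if (runif(1) < p1) 1L else 2L
    reward[t] <- rbinom(1, 1, p_reward[choice[t]])
    q[choice[t]] <- q[choice[t]] + alpha * (reward[t] - q[choice[t]])
  }
  data.frame(choice, reward)
}

# Negative log-likelihood of the choices under (alpha, beta).
neg_loglik <- function(par, dat) {
  alpha <- par[1]; beta <- par[2]
  q <- c(0, 0); nll <- 0
  for (t in seq_len(nrow(dat))) {
    p1 <- 1 / (1 + exp(-beta * (q[1] - q[2])))
    p_choice <- if (dat$choice[t] == 1L) p1 else 1 - p1
    nll <- nll - log(p_choice + 1e-10)
    q[dat$choice[t]] <- q[dat$choice[t]] +
      alpha * (dat$reward[t] - q[dat$choice[t]])
  }
  nll
}

dat <- simulate_rw(alpha = 0.3, beta = 4)
fit <- optim(c(0.5, 1), neg_loglik, dat = dat,
             method = "L-BFGS-B", lower = c(0.01, 0.1), upper = c(1, 20))
fit$par   # should land near the generating values (0.3, 4)
```

Model recovery extends the same idea: simulate from each candidate model, fit all candidates to each simulated dataset, and check that the generating model wins on AIC/BIC.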
Installation
# Install the stable version from CRAN
install.packages("multiRL")
# Install the latest version from GitHub
remotes::install_github("yuki-961004/multiRL@*release")
# Load package
library(multiRL)
# Open the help documentation
?binaryRL
Markov Decision Process

Value update (learning rate \(\alpha\)):
\[ Q_{new} = Q_{old} + \alpha \cdot (R - Q_{old}) \]
Action selection (soft-max with inverse temperature \(\beta\)):
\[ P_{t}(a) = \frac{ \exp\left( \beta \cdot \left( Q_t(a) - \max_{j} Q_t(a_j) \right) \right) }{ \sum_{i=1}^{k} \exp\left( \beta \cdot \left( Q_t(a_i) - \max_{j} Q_t(a_j) \right) \right) } \]
Subjective utility (curvature \(\gamma\)):
\[ U(R) = R^{\gamma} \]
Exploration bonus (weight \(\delta\)):
\[ \text{Bias} = \delta \cdot \sqrt{\frac{\log(N + e)}{N + 10^{-10}}} \]
Random exploration (\(\epsilon\)-greedy):
\[ P(x) = \begin{cases} \epsilon, & x=1 \\ 1-\epsilon, & x=0 \end{cases} \]
Forgetting (decay toward the initial value, rate \(\zeta\)):
\[ W_{new} = W_{old} + \zeta \cdot (W_{0} - W_{old}) \]
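The six components above translate directly into base R. The following is a minimal sketch of each equation; function names (update_q, softmax, and so on) are illustrative and do not correspond to this package's API:

```r
# Value update: Q_new = Q_old + alpha * (R - Q_old)
update_q <- function(q_old, reward, alpha) {
  q_old + alpha * (reward - q_old)
}

# Soft-max choice rule; subtracting max(q) keeps exp() numerically stable
softmax <- function(q, beta) {
  z <- exp(beta * (q - max(q)))
  z / sum(z)
}

# Subjective utility: U(R) = R^gamma
utility <- function(reward, gamma) reward^gamma

# Exploration bonus: delta * sqrt(log(N + e) / (N + 1e-10))
bias <- function(n, delta) delta * sqrt(log(n + exp(1)) / (n + 1e-10))

# Forgetting: W_new = W_old + zeta * (W_0 - W_old)
forget <- function(w_old, w0, zeta) w_old + zeta * (w0 - w_old)

# One trial of a two-armed bandit:
q <- c(0.5, 0.5)
p <- softmax(q, beta = 3)                        # choice probabilities
q[1] <- update_q(q[1], reward = 1, alpha = 0.1)  # learn from arm 1's reward
```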
References
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.
- Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, e49547. https://doi.org/10.7554/eLife.49547
- Niv, Y., Edlund, J. A., Dayan, P., & O’Doherty, J. P. (2012). Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. Journal of Neuroscience, 32(2), 551-562. https://doi.org/10.1523/JNEUROSCI.5498-10.2012
- Collins, A. G., & Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7), 1024-1035. https://doi.org/10.1111/j.1460-9568.2011.07980.x
- Eckstein, M. K., & Collins, A. G. (2020). Computational evidence for hierarchically structured reinforcement learning in humans. Proceedings of the National Academy of Sciences, 117(47), 29381-29389. https://doi.org/10.1073/pnas.1912330117

