Learning Rate: \(\alpha\)
$$Q_{new} = Q_{old} + \alpha_{-} \cdot (R - Q_{old}), \quad R < Q_{old}$$
$$Q_{new} = Q_{old} + \alpha_{+} \cdot (R - Q_{old}), \quad R \ge Q_{old}$$
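The two-branch update can be illustrated with a minimal R sketch. The helper `update_q` and its argument names are hypothetical and are not part of multiRL; the sketch only assumes that the learning rate is chosen by comparing the reward with the current value, as in the equations above.

```r
# Hypothetical helper (not a multiRL function): one risk-sensitive update.
update_q <- function(q_old, reward, alpha_neg, alpha_pos) {
  # Use alpha_neg when the reward is below the current value (R < Q_old),
  # otherwise use alpha_pos (R >= Q_old).
  alpha <- if (reward < q_old) alpha_neg else alpha_pos
  q_old + alpha * (reward - q_old)
}

# Better-than-expected outcome: 0.5 + 0.3 * (1 - 0.5) = 0.65
q_up <- update_q(q_old = 0.5, reward = 1, alpha_neg = 0.1, alpha_pos = 0.3)

# Worse-than-expected outcome: 0.5 + 0.1 * (0 - 0.5) = 0.45
q_down <- update_q(q_old = 0.5, reward = 0, alpha_neg = 0.1, alpha_pos = 0.3)
```

With \(\alpha_{+} > \alpha_{-}\), as in this example, the value moves further after a positive prediction error than after a negative one of the same size.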
Inverse Temperature: \(\beta\)
$$ P_{t}(a) = \frac{ \exp(\beta \cdot Q_{t}(a)) }{ \sum_{i=1}^{k} \exp(\beta \cdot Q_{t}(a_{i})) } $$
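The softmax rule can likewise be sketched in a few lines of R. The function `softmax_choice` is a hypothetical illustration and not the package's own implementation; subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the probabilities unchanged.

```r
# Hypothetical helper (not a multiRL function): softmax choice probabilities.
softmax_choice <- function(q, beta) {
  z <- beta * q
  z <- z - max(z)          # guards against overflow; result is unchanged
  exp(z) / sum(exp(z))
}

q <- c(0.2, 0.8)
p_low  <- softmax_choice(q, beta = 0)   # beta = 0: uniform choice
p_high <- softmax_choice(q, beta = 5)   # larger beta: favors the higher value
```

A larger \(\beta\) makes choices more deterministic with respect to the value differences, while \(\beta = 0\) gives every option the same probability.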
Arguments
- params
Parameters used by the model's internal functions; see params.
Body
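RSTD <- function(params) {
  # Map the positional parameter vector to named free parameters:
  # alphaN and alphaP are the two learning rates, beta the inverse temperature.
  params <- list(
    free = list(alphaN = params[1], alphaP = params[2], beta = params[3])
  )
  # data, behrule, colnames, funcs, priors and settings are not arguments of
  # this function; R looks them up in the enclosing environment.
  multiRL.model <- multiRL::run_m(
    data = data,
    behrule = behrule,
    colnames = colnames,
    params = params,
    funcs = funcs,
    priors = priors,
    settings = settings
  )
  # Store the fitted model in multiRL.env so it can be accessed later.
  assign(x = "multiRL.model", value = multiRL.model, envir = multiRL.env)
  return(.return_result(multiRL.model))
}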