Learning Rate: \(\alpha\)
$$Q_{new} = Q_{old} + \alpha \cdot (R - Q_{old})$$
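The update rule above can be sketched in a few lines of R. This is a minimal standalone illustration, not the package's internal code: the helper name `update_q` and the reward sequence are made up for the example, with \(R\) taken to be the reward received on a trial.

```r
# Hypothetical helper: one step of the update Q_new = Q_old + alpha * (R - Q_old).
update_q <- function(q_old, reward, alpha) {
  q_old + alpha * (reward - q_old)
}

# Start from Q = 0 and apply three rewards with alpha = 0.5.
q <- 0
for (r in c(1, 1, 0)) {
  q <- update_q(q, r, alpha = 0.5)
}
q  # 0 -> 0.5 -> 0.75 -> 0.375
```

A larger \(\alpha\) moves the estimate further toward the most recent reward, while \(\alpha = 0\) leaves it unchanged.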
Inverse Temperature: \(\beta\)
$$ P_{t}(a) = \frac{ \exp(\beta \cdot Q_{t}(a)) }{ \sum_{i=1}^{k} \exp(\beta \cdot Q_{t}(a_{i})) } $$
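The softmax rule above can be sketched as follows. Again, this is only an illustration under stated assumptions: `softmax_choice` is a hypothetical helper, not a function exported by the package, and the Q-values are made up.

```r
# Hypothetical helper: softmax choice probabilities with inverse temperature beta.
softmax_choice <- function(q, beta) {
  z <- beta * q
  z <- z - max(z)  # subtracting the maximum avoids overflow and leaves the result unchanged
  exp(z) / sum(exp(z))
}

q <- c(0.2, 0.5, 0.8)
p_low  <- softmax_choice(q, beta = 0)   # beta = 0: every option is equally likely
p_high <- softmax_choice(q, beta = 10)  # large beta: choices concentrate on the highest Q
```

With these values, `p_low` is uniform (1/3 each), while `p_high` puts about 0.95 of the probability on the third option.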
Arguments
- params
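Parameters used by the model’s internal functions; see params. For this model, the first element is the learning rate \(\alpha\) and the second is the inverse temperature \(\beta\).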
Body
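TD <- function(params){
  # Map the numeric vector onto named free parameters:
  # element 1 is the learning rate (alpha), element 2 the inverse temperature (beta).
  params <- list(
    free = list(alpha = params[1], beta = params[2])
  )
  # data, behrule, colnames, funcs, priors and settings are not arguments of TD();
  # R looks them up in the function's enclosing environment.
  multiRL.model <- multiRL::run_m(
    data = data,
    behrule = behrule,
    colnames = colnames,
    params = params,
    funcs = funcs,
    priors = priors,
    settings = settings
  )
  # Store the model object in multiRL.env under the name "multiRL.model".
  assign(x = "multiRL.model", value = multiRL.model, envir = multiRL.env)
  return(.return_result(multiRL.model))
}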