$$Q_{new} = Q_{old} + \alpha \cdot (R - Q_{old})$$
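As a quick numeric check of the update rule, the sketch below applies it once. The starting value, reward, and learning rate are made-up numbers chosen only for illustration.

```r
# Hypothetical values, not taken from any dataset.
q_old  <- 0.5  # current value estimate
reward <- 1    # reward observed on this trial
alpha  <- 0.1  # learning rate

# Move the estimate a fraction alpha of the way toward the reward.
q_new <- q_old + alpha * (reward - q_old)
print(q_new)
```

With these numbers the prediction error is 0.5, so the estimate moves from 0.5 to 0.55.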
Arguments
- shown
The options shown in this trial.
- qvalue
The expected Q values of the different behaviors, as produced by each system, updated up to the current trial.
- reward
The feedback the agent receives from the environment at trial t after executing action a.
- utility
The subjective value (internal representation) assigned by the agent to the objective reward.
- params
Parameters used by the model's internal functions; see params.
- system
Whether a single system or multiple systems are at work when the agent makes a decision; see system.
- ...
Additional information passed to the function. It currently contains the following; more may be added in future package versions.
  - idinfo:
      - subid
      - block
      - trial
  - exinfo: contains information whose column names are specified by the user.
      - Frame
      - RT
      - NetWorth
      - ...
  - behave: includes the following:
      - action: the behavior performed by the human in the given trial.
      - latent: the object updated by the agent in the given trial.
      - simulation: the actual behavior performed by the agent.
      - position: the position of the stimulus on the screen.
  - cue and rsp: cues and responses within latent learning rules; see behrule.
  - state: the stimuli shown in the current trial (split into components by underscores) and the rewards associated with them.
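The function body unpacks these extra fields with list2env(list(...)) and then reads them by position. The sketch below isolates that pattern; peek_extras is a hypothetical helper, and the vector shapes and values passed in are assumptions made for illustration, not the package's actual data layout.

```r
# Hypothetical stand-in for a learning-rate function; only the handling of
# `...` is of interest here.
peek_extras <- function(qvalue, reward, ...) {
  # Copy every named element of `...` into this function's environment.
  list2env(list(...), envir = environment())
  # Read by position, since column names may not survive.
  list(trial = idinfo[3], frame = exinfo[1], action = behave[1])
}

out <- peek_extras(
  qvalue = 0,
  reward = 1,
  idinfo = c(1, 2, 7),             # subid, block, trial (made-up values)
  exinfo = c("Gain", "532", "10"), # Frame, RT, NetWorth (made-up values)
  behave = c("A", "A", "A", "L")   # action, latent, simulation, position
)
str(out)
```

Positional indexing keeps the lookup working even when the names are dropped, at the cost of depending on the field order listed above.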
Body
func_alpha <- function(
  shown,
  qvalue,
  reward,
  utility,
  params,
  system,
  ...
){
  # Expose the extra fields passed via ... (idinfo, exinfo, behave, ...)
  # as local variables.
  list2env(list(...), envir = environment())
  # If you need this extra information:
  # column names may be lost (C++), so indexing by position is recommended
  # e.g.
  # Trial <- idinfo[3]
  # Frame <- exinfo[1]
  # Action <- behave[1]

  # Learning rates; a parameter that is not free in the model is NULL.
  alpha <- params[["alpha"]]
  alphaN <- params[["alphaN"]]
  alphaP <- params[["alphaP"]]

  # Determine the model currently in use based on which parameters are free.
  if (
    system == "RL" && !(is.null(alpha)) && is.null(alphaN) && is.null(alphaP)
  ) {
    model <- "TD"
  } else if (
    system == "RL" && is.null(alpha) && !(is.null(alphaN)) && !(is.null(alphaP))
  ) {
    model <- "RSTD"
  } else if (
    system == "WM"
  ) {
    model <- "WM"
    # Working memory stores the latest reward directly.
    alpha <- 1
  } else {
    stop("Unknown Model! Please modify your learning rate function")
  }

  # TD: a single learning rate
  if (model == "TD") {
    update <- qvalue + alpha * (reward - qvalue)
  # RSTD: separate learning rates for negative and positive prediction errors
  } else if (model == "RSTD" && reward < qvalue) {
    update <- qvalue + alphaN * (reward - qvalue)
  } else if (model == "RSTD" && reward >= qvalue) {
    update <- qvalue + alphaP * (reward - qvalue)
  # WM: alpha is fixed at 1, so the update equals the reward
  } else if (model == "WM") {
    update <- qvalue + alpha * (reward - qvalue)
  }

  return(update)
}
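To see how the RSTD and WM branches differ from plain TD, the sketch below condenses them into two small helpers. rstd_update and wm_update are hypothetical names introduced here for illustration, and the parameter values are made up.

```r
# Hypothetical helper mirroring the RSTD branch: the learning rate depends
# on the sign of the prediction error.
rstd_update <- function(qvalue, reward, alphaN, alphaP) {
  alpha <- if (reward < qvalue) alphaN else alphaP
  qvalue + alpha * (reward - qvalue)
}

# Hypothetical helper mirroring the WM branch: alpha is fixed at 1.
wm_update <- function(qvalue, reward) {
  qvalue + 1 * (reward - qvalue)
}

up   <- rstd_update(qvalue = 0.5, reward = 1, alphaN = 0.2, alphaP = 0.6)
down <- rstd_update(qvalue = 0.5, reward = 0, alphaN = 0.2, alphaP = 0.6)
wm   <- wm_update(qvalue = 0.5, reward = 1)
print(c(up = up, down = down, wm = wm))
```

With alphaP larger than alphaN, the estimate rises faster after a better-than-expected reward than it falls after a worse one, while the WM update simply replaces the old value with the reward.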