
$$W_{new} = W_{old} + \zeta \cdot (W_{0} - W_{old})$$
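As a worked illustration of the update above, assume an old value of 0.8, an initial value of 0, and a decay rate of 0.1 (all three numbers are chosen here for illustration and do not come from the package):

$$W_{new} = 0.8 + 0.1 \cdot (0 - 0.8) = 0.8 - 0.08 = 0.72$$

The value moves one tenth of the way from its old level back toward the initial value.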

Usage

func_zeta(
  shown,
  is.nb,
  value0,
  values,
  reward,
  utility,
  system,
  rownum,
  params,
  hidden,
  ...
)

Arguments

shown

Which options are shown in this trial.

is.nb

Whether this is a new block.

value0

The initial values for all actions.

values

The current expected values for all actions.

reward

The feedback the agent receives from the environment on trial t after executing action a.

utility

The subjective value (internal representation) assigned by the agent to the objective reward.

system

Whether a single system or multiple systems are at work when the agent makes a decision; see system.

rownum

The trial number.

params

Parameters used by the model's internal functions; see params. This function reads zeta, bonus, and reset.

hidden

All hidden variables within the MDP process belong here.

...

Additional information passed to the function. It currently contains the following; more may be added in future package versions.

  • idinfo:

    • subid

    • block

    • trial

  • exinfo: contains information whose column names are specified by the user.

    • Frame

    • RT

    • NetWorth

    • ...

  • behave: includes the following:

    • action: the behavior performed by the human in the given trial.

    • latent: the object updated by the agent in the given trial.

    • simulation: the actual behavior performed by the agent.

    • position: the position of the stimulus on the screen.

  • cue and rsp: Cues and responses within latent learning rules, see behrule

  • state: stores the stimuli shown in the current trial (split into components by underscores) and the rewards associated with them.
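The function body uses list2env(list(...), envir = environment()) to turn these extra entries into local variables. A minimal sketch of that mechanism follows; peek_extra is a hypothetical helper, and the character vectors passed to it are made-up inputs whose types and contents are assumptions, not the package's actual data.

```r
# Hypothetical helper, not part of the package: shows how entries passed
# through `...` become local variables via list2env().
peek_extra <- function(...) {
  list2env(list(...), envir = environment())

  # Access by position, since element names may be lost on the C++ side
  list(
    trial  = idinfo[3],
    frame  = exinfo[1],
    action = behave[1]
  )
}

# Made-up inputs for illustration only
res <- peek_extra(
  idinfo = c(subid = "s01", block = "1", trial = "12"),
  exinfo = c(Frame = "Gain", RT = "0.53"),
  behave = c(action = "A", latent = "A", simulation = "B", position = "L")
)

print(unname(res$trial))
```

Positional access keeps working even if the names on idinfo, exinfo, or behave are dropped, which is why the body recommends indexes.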

Value

A list with the following elements:

  • output [NumericVector]

    The values of unchosen options after decay according to the specified decay rate. If is.nb is TRUE and the reset parameter is not NA, every value is set to reset instead.

  • hidden [CharacterVector]

    User-defined internal variables generated by this function. These represent intermediate (latent) states produced during computation, which can be read or modified by other functions in the MDP process.

Body

func_zeta <- function(
    shown,
    is.nb,
    value0,
    values,
    reward,
    utility,
    system,
    rownum,
    params,
    hidden,
    ...
){

  list2env(list(...), envir = environment())

  # Extra information passed via ... (idinfo, exinfo, behave, ...) is now
  # available as local variables. Column names may be lost on the C++ side,
  # so access elements by index, e.g.
  # Trial  <- idinfo[3]
  # Frame  <- exinfo[1]
  # Action <- behave[1]

  zeta       <-  params[["zeta"]]
  bonus      <-  params[["bonus"]]
  reset      <-  params[["reset"]]

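  # On a new block, reset all Q values to `reset` (skipped when reset is NA)
  if (is.nb && !is.na(reset)) {
    decay <- rep(reset, length(values))
    # Record in the hidden variables that a reset took place
    hidden[6] <- "reset"
    return(list(output = decay, hidden = hidden))
  }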

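  if (reward == 0) {
    # No reward: decay values toward their initial values
    decay <- values + zeta * (value0 - values)
  } else if (reward < 0) {
    # Negative reward: decay, then add the bonus
    decay <- values + zeta * (value0 - values) + bonus
  } else if (reward > 0) {
    # Positive reward: decay, then subtract the bonus
    decay <- values + zeta * (value0 - values) - bonus
  }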

  return(list(output = decay, hidden = hidden))
}
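
To see how the branches in the body behave on concrete numbers, here is a minimal sketch. decay_sketch is a hypothetical stand-in that mirrors the reset and reward-sign logic only; it is not the package function, and the argument names is_new_block, zeta, bonus, and reset as plain arguments, along with all input values, are assumptions made for illustration.

```r
# Hypothetical stand-in mirroring the branch logic of the body above.
decay_sketch <- function(values, value0, reward, zeta, bonus,
                         is_new_block = FALSE, reset = NA) {
  # A non-missing `reset` on a new block overrides decay entirely
  if (is_new_block && !is.na(reset)) {
    return(rep(reset, length(values)))
  }

  # Pull every value toward its initial value by a fraction zeta
  decayed <- values + zeta * (value0 - values)

  # sign(reward) is -1, 0, or 1: the bonus is added after a negative reward,
  # subtracted after a positive reward, and ignored when reward is zero
  decayed - sign(reward) * bonus
}

values <- c(0.8, 0.2)

no_reward <- decay_sketch(values, value0 = 0, reward = 0,  zeta = 0.1, bonus = 0.05)
loss      <- decay_sketch(values, value0 = 0, reward = -1, zeta = 0.1, bonus = 0.05)
gain      <- decay_sketch(values, value0 = 0, reward = 1,  zeta = 0.1, bonus = 0.05)
reset_all <- decay_sketch(values, value0 = 0, reward = 1,  zeta = 0.1, bonus = 0.05,
                          is_new_block = TRUE, reset = 0.5)

print(rbind(no_reward, loss, gain, reset_all))
```

With these inputs the decayed values are 0.72 and 0.18; the negative-reward case shifts both up by 0.05, the positive-reward case shifts both down by 0.05, and the reset case returns 0.5 for every option.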