
This is an internal helper function for fit_p. Its primary purpose is to provide a unified interface for interacting with various optimization packages, adapting inputs and outputs so that eight distinct algorithms can be used interchangeably, regardless of the underlying solver.

The function supports the following optimization algorithms: L-BFGS-B, GenSA, GA, DEoptim, PSO, Bayesian optimization, CMA-ES, and the algorithms provided by the nloptr package.

For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/

Usage

optimize_para(
  policy = "off",
  estimate = "MLE",
  data,
  id,
  n_trials,
  n_params,
  obj_func,
  lower,
  upper,
  priors = NULL,
  initial_params = NA,
  initial_size = 50,
  iteration = 10,
  seed = 123,
  algorithm
)

Arguments

policy

[character]

Specifies the learning policy to be used. This determines how the model updates action values based on observed or simulated choices. It can be either "off" or "on".

  • Off-Policy (Q-learning): This is the most common approach for modeling reinforcement learning in Two-Alternative Forced Choice (TAFC) tasks. In this mode, the model's goal is to learn the underlying value of each option by observing the human participant's behavior. It achieves this by consistently updating the value of the option that the human actually chose. The focus is on understanding the value representation that likely drove the participant's decisions.

  • On-Policy (SARSA): In this mode, the target policy and the behavior policy are identical. The model first computes the selection probability for each option based on their current values. Critically, it then uses these probabilities to sample its own action, and the value update is performed on the action that the model itself selected. This approach focuses on directly mimicking the stochastic choice patterns of the agent, rather than just learning the underlying values from a fixed sequence of observed actions. (A minimal sketch contrasting the two update rules follows below.)

default: policy = "off"
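
A minimal sketch of the difference, assuming a simple delta-rule update and a softmax choice rule (all names here are illustrative, not binaryRL's internal implementation):

# Illustrative sketch only; not binaryRL's internal implementation.
update_value <- function(Q, chosen, reward, alpha = 0.1) {
  Q[chosen] <- Q[chosen] + alpha * (reward - Q[chosen])
  Q
}

Q <- c(L = 0.5, R = 0.5)

# policy = "off": update the value of the option the human chose.
Q_off <- update_value(Q, chosen = "L", reward = 1)

# policy = "on": the model samples its own action from the softmax
# probabilities, then updates the value of that sampled action.
tau <- 1
prob <- exp(Q / tau) / sum(exp(Q / tau))
model_choice <- sample(names(Q), size = 1, prob = prob)
Q_on <- update_value(Q, chosen = model_choice, reward = 1)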

estimate

[character]

Estimation method. Can be either "MLE" or "MAP".

  • "MLE": (Default) Maximum Likelihood Estimation. This method finds the parameter values that maximize the log-likelihood of the data. A higher log-likelihood indicates that the parameters provide a better explanation for the observed human behavior. In other words, data simulated using these parameters would most closely resemble the actual human data. This method does not consider any prior information about the parameters.

  • "MAP": Maximum A Posteriori Estimation. This method finds the parameter values that maximize the posterior probability. It is an iterative process based on the Expectation-Maximization (EM) framework.

    • Initialization: The process begins by assuming a uniform distribution as the prior for each parameter, making the initial log-prior zero. The first optimization is thus equivalent to MLE.

    • Iteration: After finding the best parameters for all subjects, the algorithm assesses the actual distribution of each parameter and fits a normal distribution to it. This fitted distribution becomes the new empirical prior.

    • Re-estimation: The parameters are then re-optimized to maximize the updated posterior probability.

    • Convergence: This cycle repeats until the posterior probability converges or the maximum number of iterations (specified by iteration_g) is reached.

    Using this method requires that the priors argument be specified to define the initial prior distributions. A schematic of this cycle is sketched below.

default: estimate = "MLE"
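
The MAP/EM cycle described above can be summarized schematically. This is a sketch that mirrors the description, not the package's actual internals; the Gaussian group-level prior and all variable names are assumptions for illustration:

# Schematic of the MAP/EM cycle; illustrative only.
log_posterior <- function(params, loglik_fn, prior_mu, prior_sd) {
  # log posterior = log likelihood + log prior (up to a constant)
  loglik_fn(params) + sum(dnorm(params, prior_mu, prior_sd, log = TRUE))
}

# Conceptually, the cycle then runs:
#   1. Optimize every subject (first pass under a flat prior ~ MLE).
#   2. Fit a normal distribution to each parameter across subjects,
#      e.g. prior_mu <- colMeans(est); prior_sd <- apply(est, 2, sd)
#   3. Re-optimize each subject's log_posterior under the new prior.
#   4. Repeat until convergence or iteration_g is reached.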

data

[data.frame]

This data should include the following mandatory columns (a toy example follows the list):

  • "sub"

  • "time_line" (e.g., "Block", "Trial")

  • "L_choice"

  • "R_choice"

  • "L_reward"

  • "R_reward"

  • "sub_choose"

id

[character]

Specifies the ID of the subject whose optimal parameters will be fitted. This parameter accepts either string or numeric values. The provided ID must correspond to an existing subject identifier within the raw dataset provided to the function.

n_trials

[integer]

The total number of trials in your experiment.

n_params

[integer]

The number of free parameters in your model.

obj_func

[function]

The objective function that the optimization algorithm package accepts. This function must strictly take only one argument, fit_p (a vector of model parameters), and its output must be a single numeric value representing the loss function to be minimized. For more detailed requirements and examples, please refer to the relevant documentation (TD, RSTD, Utility).
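
A skeletal objective function satisfying this contract might look as follows. The model logic and parameter names are placeholders; only the one-argument signature and the scalar return value are requirements stated here:

# Hypothetical skeleton; only the single-argument signature and the
# single numeric return value are required.
my_obj_func <- function(params) {
  alpha <- params[1]  # e.g. a learning rate
  tau   <- params[2]  # e.g. a softmax temperature
  # ... simulate the model over the subject's trials and accumulate
  # the log-likelihood of each observed choice into logL ...
  logL <- 0  # placeholder for the accumulated log-likelihood
  -logL      # return the loss to be minimized
}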

lower

[vector]

Lower bounds of the free parameters.

upper

[vector]

Upper bounds of the free parameters.

priors

[list]

A list specifying the prior distributions for the model parameters. This argument is mandatory when using estimate = "MAP". There are two primary scenarios for its use:

1. Static MAP Estimation (Non-Hierarchical)

This approach is used when you have a strong, pre-defined belief about the parameter priors and do not want the model to update them iteratively.

Configuration:

  • Set estimate = "MAP".

  • Provide a list defining your confident prior distributions.

  • Keep iteration_g = 0 (the default).

Behavior:

The algorithm maximizes the posterior probability based solely on your specified priors. It will not use the EM (Expectation-Maximization) framework to learn new priors from the data.

2. Hierarchical Bayesian Estimation via EM

This approach is used to let the model learn the group-level (hierarchical) prior distributions directly from the data.

Configuration:

  • Set estimate = "MAP".

  • Specify a weak or non-informative initial prior, such as a uniform distribution for all parameters.

  • Set iteration_g to a value greater than 0.

Behavior:

With a uniform prior, the initial log-posterior equals the log-likelihood, making the first estimation step equivalent to MLE. The algorithm then initiates the EM procedure: it iteratively assesses the actual parameter distribution across all subjects and updates the group-level priors. This cycle continues until the posterior converges or iteration_g is reached. A hypothetical sketch of both scenarios follows below.

default: priors = NULL
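
Purely as an illustration of the two scenarios, suppose each element of priors were a log-density function. This structure is an assumption made for the sketch; consult fit_p's documentation for the exact format it expects:

# Hypothetical sketch; the exact structure expected by priors is
# defined by fit_p, not by this example.

# Scenario 1: confident, informative priors (static MAP, iteration_g = 0)
informative_priors <- list(
  alpha = function(x) dbeta(x, shape1 = 5, shape2 = 5, log = TRUE),
  tau   = function(x) dnorm(x, mean = 1, sd = 0.5, log = TRUE)
)

# Scenario 2: flat starting priors for the EM cycle (iteration_g > 0)
flat_priors <- list(
  alpha = function(x) dunif(x, min = 0, max = 1, log = TRUE),
  tau   = function(x) dunif(x, min = 0, max = 5, log = TRUE)
)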

initial_params

[vector]

Initial values for the free parameters from which the optimization algorithm starts its search. These are primarily relevant for algorithms that require an explicit starting point, such as L-BFGS-B. If not specified, the function automatically generates initial values close to zero.

default: initial_params = NA

initial_size

[integer]

This parameter corresponds to the population size in genetic algorithms (GA). It specifies the number of initial candidate solutions that the algorithm starts with for its evolutionary search. It is only required by optimization algorithms that operate on a population, such as GA or DEoptim.

default: initial_size = 50

iteration

[integer]

The number of iterations the optimization algorithm will perform when searching for the best-fitting parameters during the fitting phase. A higher number of iterations may increase the likelihood of finding a global optimum but also increases computation time.

default: iteration = 10

seed

[integer]

Random seed. This ensures that the results are reproducible and remain the same each time the function is run.

default: seed = 123

algorithm

[character]

Choose an algorithm package from: L-BFGS-B, GenSA, GA, DEoptim, PSO, Bayesian, CMA-ES.

In addition, any algorithm from the nloptr package is also supported. If your chosen nloptr algorithm requires a local search, supply a character vector of length two: the first element specifies the algorithm used for the global search, and the second the algorithm used for the local search.

Value

The result of binaryRL with the optimal parameters.

Examples

if (FALSE) { # \dontrun{
binaryRL.res <- binaryRL::optimize_para(
  data = binaryRL::Mason_2024_G2,
  id = 1,
  obj_func = binaryRL::RSTD,
  n_params = 3,
  n_trials = 360,
  lower = c(0, 0, 0),
  upper = c(1, 1, 1),
  iteration = 10,
  seed = 123,
  algorithm = "L-BFGS-B"   # Gradient-Based (stats)
  #algorithm = "GenSA"    # Simulated Annealing (GenSA)
  #algorithm = "GA"       # Genetic Algorithm (GA)
  #algorithm = "DEoptim"  # Differential Evolution (DEoptim)
  #algorithm = "PSO"      # Particle Swarm Optimization (pso)
  #algorithm = "Bayesian" # Bayesian Optimization (mlrMBO)
  #algorithm = "CMA-ES"   # Covariance Matrix Adapting (cmaes)
  #algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA")
)
summary(binaryRL.res)
} # }