Reinforcement Learning
Fill me in
Grid World
VLQuantitativeFinancePackage.MyPeriodicRectangularGridWorldModel — Type
mutable struct MyPeriodicRectangularGridWorldModel <: AbstractWorldModel
The MyPeriodicRectangularGridWorldModel mutable struct represents a rectangular grid world model.
Required fields
number_of_rows::Int: The number of rows in the grid.
number_of_cols::Int: The number of columns in the grid.
coordinates::Dict{Int,Tuple{Int,Int}}: A dictionary that holds the coordinates of each cell in the grid, where the key is the cell index and the value is a tuple of the row and column index.
states::Dict{Tuple{Int,Int},Int}: A dictionary that holds the state of each cell in the grid, where the key is the cell coordinates and the value is the state index.
moves::Dict{Int,Tuple{Int,Int}}: A dictionary that holds the moves that can be made from each cell in the grid, where the key is the move index and the value is a tuple of the row and column changes associated with the move.
rewards::Dict{Int,Float64}: A dictionary that holds the reward for each cell in the grid, where the key is the cell index and the value is the reward.
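The field layout above can be illustrated with a minimal stand-alone sketch. It does not call the package. The row-by-row cell numbering, the four moves, and the wrap-around reading of "periodic" are all assumptions made for illustration, and wrapped_move is a hypothetical helper name.

```julia
# Hypothetical stand-alone sketch; not the package's implementation.
nrows, ncols = 3, 4

coordinates = Dict{Int,Tuple{Int,Int}}()   # cell index -> (row, column)
states = Dict{Tuple{Int,Int},Int}()        # (row, column) -> state index

# Assumption: cells are numbered row by row, starting from 1.
for row in 1:nrows, col in 1:ncols
    index = (row - 1) * ncols + col
    coordinates[index] = (row, col)
    states[(row, col)] = index
end

# Assumption: four moves, each stored as (row change, column change).
moves = Dict{Int,Tuple{Int,Int}}(1 => (-1, 0), 2 => (1, 0), 3 => (0, -1), 4 => (0, 1))

# Assumption: "periodic" means a move off one edge re-enters from the opposite edge.
function wrapped_move(position::Tuple{Int,Int}, move::Tuple{Int,Int}, nrows::Int, ncols::Int)
    return (mod1(position[1] + move[1], nrows), mod1(position[2] + move[2], ncols))
end

# Moving up from the first row wraps around to the last row.
next_position = wrapped_move((1, 1), moves[1], nrows, ncols)
```

Using mod1 rather than mod keeps the wrapped coordinates in the 1-based range that Julia arrays and the dictionaries above use.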
VLQuantitativeFinancePackage.build — Method
function build(type::MyPeriodicRectangularGridWorldModel, nrows::Int, ncols::Int, rewards::Dict{Tuple{Int,Int}, Float64}; defaultreward::Float64 = -1.0) -> MyPeriodicRectangularGridWorldModel
The build method constructs an instance of the MyPeriodicRectangularGridWorldModel type using the data in the NamedTuple.
Arguments
type::MyPeriodicRectangularGridWorldModel: The type of model to build.
data::NamedTuple: The data to use to build the model.
The data::NamedTuple must contain the following keys:
nrows::Int: The number of rows in the grid.
ncols::Int: The number of columns in the grid.
rewards::Dict{Tuple{Int,Int}, Float64}: A dictionary that maps the coordinates of the grid to the rewards at those coordinates.
defaultreward::Float64: The default reward for the grid. This is set to -1.0 by default.
Return
This function returns a populated instance of the MyPeriodicRectangularGridWorldModel type.
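The role of defaultreward can be sketched as follows. This is one plausible reading, not the package's code: coordinates that appear in rewards keep their value, and every other cell receives the default. The helper name cell_rewards and the row-by-row cell numbering are assumptions.

```julia
# Hypothetical helper; the package's build method may work differently.
function cell_rewards(nrows::Int, ncols::Int, rewards::Dict{Tuple{Int,Int},Float64};
    defaultreward::Float64 = -1.0)

    cellrewards = Dict{Int,Float64}()
    for row in 1:nrows, col in 1:ncols
        index = (row - 1) * ncols + col   # assumed row-by-row cell numbering
        # Use the listed reward if there is one, otherwise the default.
        cellrewards[index] = get(rewards, (row, col), defaultreward)
    end
    return cellrewards
end

rewards = Dict{Tuple{Int,Int},Float64}((2, 2) => 10.0, (1, 3) => -100.0)
cellrewards = cell_rewards(3, 3, rewards)
```

The result is keyed by cell index, which matches the Dict{Int,Float64} type of the rewards field in the struct.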
Wolfram policies and grids
Fill me in
VLQuantitativeFinancePackage.MyOneDimensionalElementarWolframRuleModel — Type
mutable struct MyOneDimensionalElementarWolframRuleModel <: AbstractPolicyModel
The MyOneDimensionalElementarWolframRuleModel mutable struct represents a one-dimensional elementary Wolfram rule model.
Required fields
index::Int: The index of the rule.
radius::Int: The radius, i.e., the number of cells that influence the next state for this rule.
rule::Dict{Int,Int}: A dictionary that holds the rule, where the key is the binary representation of the neighborhood and the value is the next state.
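To make the rule dictionary concrete, here is a minimal sketch of how a rule index could be expanded into such a table. The helper decode_rule is hypothetical and is not the package's build method. It assumes the standard Wolfram numbering, in which digit i of the index (in base colors) gives the next state for the neighborhood whose encoded value is i, and it follows the field description above in treating radius as the number of cells in the neighborhood.

```julia
# Hypothetical helper; not the package's build method.
function decode_rule(index::Int; colors::Int = 2, radius::Int = 3)
    number_of_neighborhoods = colors^radius
    # digits returns the least significant digit first, so entry i + 1
    # is the next state for the neighborhood whose encoded value is i.
    next_states = digits(index, base = colors, pad = number_of_neighborhoods)
    rule = Dict{Int,Int}()
    for i in 0:(number_of_neighborhoods - 1)
        rule[i] = next_states[i + 1]
    end
    return rule
end

rule30 = decode_rule(30)
# The neighborhood (1, 0, 0) reads as binary 100, i.e., key 4.
next_state = rule30[4]
```

For rule 30, whose binary expansion is 00011110, the neighborhoods with keys 1 through 4 map to 1 and all others map to 0.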
VLQuantitativeFinancePackage.build — Method
function build(modeltype::Type{MyOneDimensionalElementarWolframRuleModel}, data::NamedTuple) -> MyOneDimensionalElementarWolframRuleModel
This build method constructs an instance of the MyOneDimensionalElementarWolframRuleModel type using the data in a NamedTuple.
Arguments
modeltype::Type{MyOneDimensionalElementarWolframRuleModel}: The type of model to build, in this case, the MyOneDimensionalElementarWolframRuleModel type.
data::NamedTuple: The data to use to build the model.
The data::NamedTuple must contain the following keys:
index::Int64: The index of the Wolfram rule.
colors::Int64: The number of colors in the rule.
radius::Int64: The radius, i.e., the number of cells to consider in the rule.
Return
This function returns a populated instance of the MyOneDimensionalElementarWolframRuleModel type.
VLQuantitativeFinancePackage.MyTwoDimensionalTotalisticWolframRuleModel — Type
mutable struct MyTwoDimensionalTotalisticWolframRuleModel <: AbstractPolicyModel
The MyTwoDimensionalTotalisticWolframRuleModel mutable struct represents a two-dimensional totalistic Wolfram rule model.
Required fields
index::Int: The index of the rule.
radius::Int: The radius, i.e., the number of cells that influence the next state for this rule.
number_of_colors::Int: The number of colors in the rule, i.e., the number of states a cell can exist in.
rule::Dict{Int,Int}: A dictionary that holds the rule, where the key is the index of the neighborhood state and the value is the next state.
neighborhoodstatesmap::Dict{Float64, Int64}: A dictionary whose keys are the averages of the neighborhood states and whose values are the corresponding neighborhood state indices.
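The relationship between rule and neighborhoodstatesmap can be sketched under a few assumptions. In a totalistic rule the next state depends only on the sum (equivalently, the average) of the neighborhood, so with radius cells and colors possible values there are radius * (colors - 1) + 1 distinct sums. The helper totalistic_tables is hypothetical, and rounding the averages to two digits is an assumption made so that Float64 keys can be looked up reliably.

```julia
# Hypothetical helper; not the package's build method.
function totalistic_tables(index::Int; colors::Int = 2, radius::Int = 4)
    # With `radius` cells, each in 0:(colors - 1), the sum ranges over 0:radius*(colors - 1).
    number_of_states = radius * (colors - 1) + 1
    next_states = digits(index, base = colors, pad = number_of_states)

    rule = Dict{Int,Int}()
    neighborhoodstatesmap = Dict{Float64,Int64}()
    for i in 0:(number_of_states - 1)
        rule[i] = next_states[i + 1]
        # Assumption: averages are rounded so that lookups on Float64 keys are stable.
        neighborhoodstatesmap[round(i / radius, digits = 2)] = i
    end
    return rule, neighborhoodstatesmap
end

rule, statesmap = totalistic_tables(777; colors = 3, radius = 4)
```

With three colors and four neighbors the averages run from 0.0 to 2.0 in steps of 0.25, giving nine neighborhood states.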
VLQuantitativeFinancePackage.build — Method
function build(modeltype::Type{MyTwoDimensionalTotalisticWolframRuleModel}, data::NamedTuple) -> MyTwoDimensionalTotalisticWolframRuleModel
This build method constructs an instance of the MyTwoDimensionalTotalisticWolframRuleModel type using the data in a NamedTuple.
Arguments
modeltype::Type{MyTwoDimensionalTotalisticWolframRuleModel}: The type of model to build.
data::NamedTuple: The data to use to build the model.
The data::NamedTuple must contain the following keys:
index::Int64: The index of the Wolfram rule.
colors::Int64: The number of colors in the rule.
radius::Int64: The radius, i.e., the number of neighbor cells to consider in the rule.
Return
This function returns a populated instance of the MyTwoDimensionalTotalisticWolframRuleModel type.
VLQuantitativeFinancePackage.solve — Method
solve(rulemodel::MyTwoDimensionalTotalisticWolframRuleModel, initialstate::Array{Int64,2}; steps::Int64 = 100) -> Dict{Int64, Array{Int64,2}}
The solve function solves the two-dimensional totalistic Wolfram rule model for a given initial state.
Arguments
rulemodel::MyTwoDimensionalTotalisticWolframRuleModel: An instance of the MyTwoDimensionalTotalisticWolframRuleModel type that defines the rule model parameters.
initialstate::Array{Int64,2}: A two-dimensional array that represents the initial state of the world.
steps::Int64: The number of steps to simulate. The default value is 100.
Returns
Dict{Int64, Array{Int64,2}}: A dictionary where the keys are integers (time steps) and the values are two-dimensional arrays that represent the state of the world at each time step.
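A simulation of this kind can be sketched without the package as shown below. The function simulate is a hypothetical stand-in, not the solve method itself. It assumes a neighborhood of the four nearest cells, wrap-around boundaries, and that key 0 holds the initial state. The rule and map are written out by hand for a two-color example in which a cell becomes 1 only when exactly one neighbor is 1.

```julia
# Hypothetical stand-alone sketch; `simulate` is not the package's solve method.
function simulate(rule::Dict{Int,Int}, neighborhoodstatesmap::Dict{Float64,Int},
    initialstate::Array{Int64,2}; steps::Int64 = 100)

    # Assumption: key 0 stores the initial state, keys 1:steps the later states.
    frames = Dict{Int64,Array{Int64,2}}(0 => copy(initialstate))
    nrows, ncols = size(initialstate)
    for t in 1:steps
        current = frames[t - 1]
        updated = copy(current)
        for row in 1:nrows, col in 1:ncols
            # Assumption: four nearest neighbors with wrap-around boundaries.
            total = current[mod1(row - 1, nrows), col] + current[mod1(row + 1, nrows), col] +
                    current[row, mod1(col - 1, ncols)] + current[row, mod1(col + 1, ncols)]
            key = round(total / 4, digits = 2)
            updated[row, col] = rule[neighborhoodstatesmap[key]]
        end
        frames[t] = updated
    end
    return frames
end

# Two colors, four neighbors: a cell turns on only when exactly one neighbor is on.
rule = Dict(0 => 0, 1 => 1, 2 => 0, 3 => 0, 4 => 0)
neighborhoodstatesmap = Dict(0.0 => 0, 0.25 => 1, 0.5 => 2, 0.75 => 3, 1.0 => 4)

initialstate = zeros(Int64, 5, 5)
initialstate[3, 3] = 1
frames = simulate(rule, neighborhoodstatesmap, initialstate; steps = 2)
```

Each step reads from the previous frame and writes into a copy, so all cells update simultaneously rather than seeing partially updated neighbors.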
Wolfram Q-learning
Fill me in
VLQuantitativeFinancePackage.MyWolframRuleQLearningAgentModel — Type
mutable struct MyWolframRuleQLearningAgentModel <: AbstractLearningModel
The MyWolframRuleQLearningAgentModel mutable struct represents a Q-learning agent model for a Wolfram rule.
Required fields
states::Array{Int,1}: The states of the model.
actions::Array{Int,1}: The actions of the model.
γ::Float64: The discount factor.
α::Float64: The learning rate.
Q::Array{Float64,2}: The Q-table of the model.
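The way γ, α, and Q interact can be illustrated with the standard tabular Q-learning update, Q[s, a] ← Q[s, a] + α (r + γ max Q[s', :] − Q[s, a]). The sketch below is a minimal stand-alone version. The helper q_update! is hypothetical and is not a function exported by the package, and it assumes that rows of Q index states and columns index actions.

```julia
# Hypothetical helper; not a function exported by the package.
# Assumption: rows of Q index states and columns index actions.
function q_update!(Q::Array{Float64,2}, s::Int, a::Int, r::Float64, next_s::Int;
    α::Float64 = 0.1, γ::Float64 = 0.95)

    # Bootstrapped target: immediate reward plus discounted best value of the next state.
    target = r + γ * maximum(Q[next_s, :])
    Q[s, a] += α * (target - Q[s, a])
    return Q
end

Q = zeros(Float64, 3, 2)
Q[2, 1] = 1.0
q_update!(Q, 1, 2, 1.0, 2; α = 0.5, γ = 0.9)
```

Here the target is 1.0 + 0.9 * 1.0 = 1.9, so Q[1, 2] moves halfway from 0.0 toward it.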
VLQuantitativeFinancePackage.build — Method
function build(modeltype::Type{MyWolframRuleQLearningAgentModel}, data::NamedTuple) -> MyWolframRuleQLearningAgentModel
VLQuantitativeFinancePackage.MyWolframGridWorldModel — Type
mutable struct MyWolframGridWorldModel <: AbstractWorldModel
The MyWolframGridWorldModel mutable struct represents a grid world model for a Wolfram rule.
Required fields
number_of_states::Int64: The number of states of the agent.
data::Dict{Int64, Array{Int64,1}}: A dictionary that holds temporal playback data, where the key is the time step and the value is an array holding the neighborhood of the agent (the first radius elements) and the next state (the last element).
policymap::Dict{Float64, Int64}: A dictionary that holds the policy map, where the key is the average of the neighborhood and the value is the state index.
world::Function: A function that represents the world model. The function takes the world model, the time step t, the state s, and the proposed action a, and returns the next state of the agent and the immediate reward for taking action a in state s.
VLQuantitativeFinancePackage.build — Method
function build(modeltype::Type{MyWolframGridWorldModel}, data::NamedTuple) -> MyWolframGridWorldModel
VLQuantitativeFinancePackage.sample — Method
sample(agent::MyWolframRuleQLearningAgentModel, environment::MyWolframGridWorldModel; maxsteps::Int = 100, ϵ::Float64 = 0.2) -> MyWolframRuleQLearningAgentModel
The sample function samples the environment using the Q-learning agent model for a given number of steps using an epsilon-greedy method. At each step, the agent selects an action based on the epsilon-greedy method and updates the Q-table using the Q-learning update rule by calling the world function contained in the environment model.
Arguments
agent::MyWolframRuleQLearningAgentModel: An instance of the MyWolframRuleQLearningAgentModel type that defines the agent model parameters.
environment::MyWolframGridWorldModel: An instance of the MyWolframGridWorldModel type that defines the world model.
maxsteps::Int: The number of steps to sample. The default value is 100.
ϵ::Float64: The epsilon value for the epsilon-greedy method. The default value is 0.2.
Returns
MyWolframRuleQLearningAgentModel: An updated instance of the MyWolframRuleQLearningAgentModel type, where the Q-table has been updated.
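The sampling loop described above can be sketched in a self-contained form. Everything here is hypothetical: sample_sketch stands in for the package's sample method, toy_world is an invented two-state environment, and for brevity the world function takes only the time step, state, and action rather than the world model as well. The starting state and the values of α and γ are also assumptions.

```julia
using Random

# Hypothetical toy world: two states, two actions; only action 2 in state 1 is rewarded.
# Returns the next state and the immediate reward.
toy_world(t::Int, s::Int, a::Int) = (s == 1 && a == 2) ? (2, 1.0) : (1, 0.0)

# Hypothetical stand-in for the sampling loop; not the package's sample method.
function sample_sketch(Q::Array{Float64,2}, world::Function; maxsteps::Int = 100,
    ϵ::Float64 = 0.2, α::Float64 = 0.1, γ::Float64 = 0.95)

    number_of_actions = size(Q, 2)
    s = 1   # assumed starting state
    for t in 1:maxsteps
        # Explore with probability ϵ, otherwise act greedily on the current Q-table.
        a = rand() < ϵ ? rand(1:number_of_actions) : argmax(Q[s, :])
        next_s, r = world(t, s, a)
        # Standard tabular Q-learning update.
        Q[s, a] += α * (r + γ * maximum(Q[next_s, :]) - Q[s, a])
        s = next_s
    end
    return Q
end

Random.seed!(1234)   # fixed seed so repeated runs give the same Q-table
Q = sample_sketch(zeros(Float64, 2, 2), toy_world; maxsteps = 500)
```

Because the Q-table starts at zero, the greedy choice initially favors action 1. The rewarded action is first discovered through the ϵ exploration branch, which is the reason for mixing random and greedy choices.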