In Part 2 we switch to Julia and set up the PythonCall and CondaPkg Julia packages to interoperate with the reused Python code. We also collect the code to be reused in a Python module called env_simulator.py and load this module into the Julia notebook. After this we follow the usual structure of RxInfer projects (even though RxInfer is not yet used in this part).
0 Active Inference: Bridging Minds and Machines
In recent years, the landscape of machine learning has undergone a profound transformation with the emergence of active inference, a novel paradigm that draws inspiration from the principles of biological systems to inform intelligent decision-making processes. Unlike traditional approaches to machine learning, which often passively receive data and adjust internal parameters to optimize performance, active inference represents a dynamic and interactive framework where agents actively engage with their environment to gather information and make decisions in real-time.
At its core, active inference is rooted in the notion of agents as embodied entities situated within their environments, constantly interacting with and influencing their surroundings. This perspective mirrors the fundamental processes observed in living organisms, where perception, action, and cognition are deeply intertwined to facilitate adaptive behavior. By leveraging this holistic view of intelligence, active inference offers a unified framework that seamlessly integrates perception, decision-making, and action, thereby enabling agents to navigate complex and uncertain environments more effectively.
One of the defining features of active inference is its emphasis on the active acquisition of information. Rather than waiting passively for sensory inputs, agents proactively select actions that are expected to yield the most informative outcomes, thus guiding their interactions with the environment. This active exploration not only enables agents to reduce uncertainty and make more informed decisions but also allows them to actively shape their environments to better suit their goals and objectives.
Furthermore, active inference places a strong emphasis on the hierarchical organization of decision-making processes, recognizing that complex behaviors often emerge from the interaction of multiple levels of abstraction. At each level, agents engage in a continuous cycle of prediction, inference, and action, where higher-level representations guide lower-level processes while simultaneously being refined and updated based on incoming sensory information.
The applications of active inference span a wide range of domains, including robotics, autonomous systems, neuroscience, and cognitive science. In robotics, active inference offers a promising approach for developing robots that can adapt and learn in real-time, even in unpredictable and dynamic environments. In neuroscience and cognitive science, active inference provides a theoretical framework for understanding the computational principles underlying perception, action, and decision-making in biological systems.
In conclusion, active inference represents a paradigm shift in machine learning, offering a principled and unified framework for understanding and implementing intelligent behavior in artificial systems. By drawing inspiration from the principles of biological systems, active inference holds the promise of revolutionizing our approach to building intelligent machines and understanding the nature of intelligence itself.
1 BUSINESS UNDERSTANDING
This project addresses a client's need to make optimal investment decisions for a given portfolio. Although the present example covers only two financial instruments (stocks, bonds, funds, etc.), the code can easily be extended to more instruments. For each instrument, a decision is to buy, hold, or sell.
versioninfo() ## Julia version
# VERSION ## Julia version
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_NUM_THREADS =
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
CondaPkg Found dependencies: /home/vscode/.julia/environments/v1.10/CondaPkg.toml
CondaPkg Found dependencies: /home/vscode/.julia/packages/PythonCall/Nr75f/CondaPkg.toml
CondaPkg Dependencies already up to date
CondaPkg Status /home/vscode/.julia/environments/v1.10/CondaPkg.toml
Environment
/home/vscode/.julia/environments/v1.10/.CondaPkg/env
Packages
matplotlib v3.9.1 (3.9.1)
numpy v2.1.0
pandas v2.2.2
2 DATA UNDERSTANDING
There is no pre-existing data to be analyzed.
3 DATA PREPARATION
There is no pre-existing data to be prepared.
4 MODELING
4.1 Narrative
Please review the narrative in section 1.
4.2 Core Elements
This section attempts to answer three important questions:
What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?
For this problem, the only metric we are interested in is the amount of profit we make after each decision window. A single type of decision is made at the start of each window: for each asset, whether to buy, hold, or sell it at its current price. The only source of uncertainty is the prices of the assets.
4.3 System-Under-Steer / Environment / Generative Process
The simulation of the system-under-steer (sustr), environment (envir), or generative process (genpr) will be handled by the reused Python code included in the env_simulator.py Python module.
Restart if packages are not found in the previous step; this can happen when rebuilding the container.
The system-under-steer/environment/generative process is an investment portfolio with a 5-dimensional state vector.
4.3.1 State variables
The state variables represent what we need to know. The state at time \(t\) of the system-under-steer (sustr), also referred to as the environment (envir), or the generative process (genpr) will be given by:
In the Python project, we have
\[
S_t = (R_t, R^0_t, p_t)
\] where
\(\mathcal E = \{ \text{AAA}, \text{BBB} \}\)
\(R_t = (R_{te})_{e \in \cal E}\)
\(R_{te}\) = Our position (in shares) in a stock \(e \in \cal E\), where \(R_{te}\) can be either
The exogenous information variables represent what we did not know (when we made a decision). These are the variables that we cannot control directly. The information in these variables becomes available after we make the decision \(x_t\).
When we assume that the price in each time period is revealed, without any model to predict the price based on past prices, we have, using approach 1:
\[
p_{t+1} = W_{t+1}
\]
Alternatively, when we assume that we observe the change in price \(\hat{p}_{t+1}=p_{t+1}-p_{t}\), we have, using approach 2:
\[
p_{t+1} = p_t + \hat{p}_{t+1}, \qquad W_{t+1} = \hat{p}_{t+1}
\]
We will make use of approach 2 which means that the exogenous information, \(W_{t+1}\), is the observed change in price of the share.
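The two approaches differ only in what the exogenous variable carries. A minimal sketch of that difference follows; the helper names `next_price_level` and `next_price_change` are hypothetical and are not part of env_simulator.py.

```python
def next_price_level(p_t, W_tt1):
    """Approach 1: the exogenous information is the new price itself."""
    return W_tt1


def next_price_change(p_t, W_tt1):
    """Approach 2: the exogenous information is the change in price."""
    return p_t + W_tt1


p_t = 100.0
print(next_price_level(p_t, 101.5))   # W is the revealed price
print(next_price_change(p_t, 1.5))    # W is the observed change
```

Both calls lead to the same next price; under approach 2 the simulator only has to model price increments.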
The exogenous information is obtained by a call to
W_tt1 = SIM.simulate()
where SIM is a global PriceSimulator instance.
The latest exogenous information can be accessed by calling the following method from class Model(), which returns a simulated price change for each asset:
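To make the idea of a bias-driven price change concrete, here is a simplified sketch. It is not the module's code: it uses Python's standard `random` module instead of NumPy's `RandomState`, it takes the current bias as an argument (the listed `simulate` method draws it at random), and the names `next_bias` and `simulate_price_change`, the seed, and the `spread` parameter (used as a standard deviation) are assumptions for illustration. The CDF values are copied from `W_BIAS_CDFS`.

```python
import random

# Rows: current bias; entries: cumulative probability that the next bias
# is 'Up', 'Neutral', 'Down' (values copied from W_BIAS_CDFS).
BIAS_CDFS = {
    "Up":      (0.9, 1.0, 1.0),
    "Neutral": (0.2, 0.8, 1.0),
    "Down":    (0.0, 0.1, 1.0),
}
BIAS_STEP = {"Up": 1, "Neutral": 0, "Down": -1}


def next_bias(current_bias, coin):
    """Map a uniform draw `coin` in [0, 1) to the next bias via the CDF row."""
    up_cdf, neutral_cdf, _ = BIAS_CDFS[current_bias]
    if coin < up_cdf:
        return "Up"
    if coin < neutral_cdf:
        return "Neutral"
    return "Down"


def simulate_price_change(current_bias, rng, spread=2.0):
    """Draw a price change centred on the step of the next bias."""
    bias = next_bias(current_bias, rng.random())
    return bias, rng.gauss(BIAS_STEP[bias], spread)


rng = random.Random(0)  # hypothetical seed, for repeatability only
bias, change = simulate_price_change("Neutral", rng)
print(bias, round(change, 3))
```

Separating `next_bias` from the random draw keeps the CDF lookup deterministic, so it can be checked independently of the random number generator.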
The observation function can be represented by \[
\mathbf{y}_t = \tilde{\mathbf{s}}_t
\]
because we will not have any observation noise.
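A noise-free observation amounts to reading the state as it is. The sketch below flattens the state into the 5-dimensional vector used in the simulation loops later in this document; the function name `observe` and the `ASSETS` constant are illustrative assumptions, not names from env_simulator.py.

```python
from collections import namedtuple

State = namedtuple("State", ["R_t", "R0_t", "p_t"])
ASSETS = ["AAA", "BBB"]


def observe(state):
    """Noise-free observation: [R_AAA, R_BBB, R0, p_AAA, p_BBB]."""
    return (
        [float(state.R_t[e]) for e in ASSETS]
        + [float(state.R0_t)]
        + [float(state.p_t[e]) for e in ASSETS]
    )


s_0 = State(R_t={"AAA": 0, "BBB": 0}, R0_t=1000.0, p_t={"AAA": 100.0, "BBB": 50.0})
y_0 = observe(s_0)
print(y_0)  # [0.0, 0.0, 1000.0, 100.0, 50.0]
```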
4.3.5 Objective function
The objective function is such that the Bethe free energy is minimized. This aspect will be handled by the RxInfer Julia package.
4.3.6 Implementation of the System-Under-Steer / Environment / Generative Process
The agent and the environment interact through a Markov blanket. Because states of the agent are unknown to the world, we wrap them in a closure that only returns functions for interacting with the agent. Internal beliefs cannot be directly observed, and interaction is only allowed through the Markov blanket of the agent (i.e. the sensors and actuators).
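The same encapsulation idea applies to either side of the blanket. As a rough sketch, assuming a single asset and a fixed list of price changes, an environment can keep its state hidden and expose only two functions. The name `make_environment` and its arguments are hypothetical and do not appear in the project code.

```python
def make_environment(initial_price, price_changes):
    """Return (execute, observe); the price stays hidden inside the closure."""
    price = initial_price
    changes = iter(price_changes)

    def execute(action):
        # The action is accepted, but in this sketch only the exogenous
        # price change moves the hidden price (clipped at a penny).
        nonlocal price
        price = max(0.01, price + next(changes))

    def observe():
        return price

    return execute, observe


execute, observe = make_environment(100.0, [1.5, -2.0])
execute(0)
execute(0)
print(observe())  # 99.5
```

Callers can only reach the price through `execute` and `observe`, which play the role of actuators and sensors.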
In Part 2 of this project we will not use RxInfer yet.
The Python code for the implementation can be found in the env_simulator.py Python module:
from collections import namedtuple, defaultdict
import pandas as pd
import numpy as np
import random
import matplotlib.pyplot as plt
import matplotlib as mpl

## PARAMETERS
SNAMES = [ ## state variable names
    'R_t',  ## resource
    'R0_t', ## cash
    'p_t',  ## price
]
xNAMES = ['x_t'] ## decision variable names
eNAMES = ['AAA', 'BBB']
piNAMES = ['X__HighLow'] ## policy names
thNAMES = ['thLo', 'thHi'] ## theta names
SEED_TRAIN = 77777777
x = ['Up', 'Neutral', 'Down']
W_BIAS_CDFS = pd.DataFrame(
    [[.9, 1., 1.],  ## 'Up' cdf
     [.2, .8, 1.],  ## 'Neutral' cdf
     [0., .1, 1.]], ## 'Down' cdf
    index=x,
    columns=x,
)
INIT_PRICE = {eNAMES[0]: 100.0, eNAMES[1]: 50.0}
W_UP_STEP = 1
W_DOWN_STEP = -1
W_VARIANCE = 2
INIT_RESOURCE = {eNAMES[0]: 0, eNAMES[1]: 0}
INIT_CASH = 1_000.00
S_0_INFO = {
    'R_t': {en: INIT_RESOURCE[en] for en in eNAMES},
    'R0_t': INIT_CASH,
    'p_t': {en: INIT_PRICE[en] for en in eNAMES},
}
##                   'AAA'                          'BBB'
TH_HI = {eNAMES[0]: (200.0, 200.5, .1), eNAMES[1]: (90.0, 90.5, .1)}
TH_LO = {eNAMES[0]: (168.0, 168.5, .1), eNAMES[1]: (72.0, 72.5, .1)}

class PriceSimulator():
    def __init__(self, biasCdfs=W_BIAS_CDFS, upStep=W_UP_STEP, downStep=W_DOWN_STEP, variance=W_VARIANCE, seed=None):
        self.biasCdfs = biasCdfs
        self.upStep = upStep
        self.downStep = downStep
        self.variance = variance
        self.prng = np.random.RandomState(seed)
        self.bias = 'Neutral'

    def simulate(self):
        ## assume the change in price is normal with mean bias and variance 2
        phat_tt1_dict = {}
        b_tt1_dict = {}
        b_tt1_val_dict = {}
        for e in eNAMES:
            b_t = self.prng.choice(['Down', 'Neutral', 'Up'])
            biasCdf = self.biasCdfs.loc[[b_t]]
            coin = self.prng.random_sample()
            if (coin < float(biasCdf['Up'].iloc[0])):
                b_tt1 = 'Up' ## new bias
                b_tt1_val = self.upStep ## bias
            elif (coin >= float(biasCdf['Up'].iloc[0]) and coin < float(biasCdf['Neutral'].iloc[0])):
                b_tt1 = 'Neutral' ## new bias
                b_tt1_val = 0 ## bias
            else:
                b_tt1 = 'Down' ## new bias
                b_tt1_val = self.downStep ## bias
            self.bias = b_tt1
            ## note: NumPy's normal() takes a standard deviation as its second argument
            phat_tt1_dict[e] = self.prng.normal(b_tt1_val, self.variance) ## change in price
            b_tt1_dict[e] = b_tt1
            b_tt1_val_dict[e] = b_tt1_val
        W_tt1 = {
            "p_t": {e: phat_tt1_dict[e] for e in eNAMES},
            "b_t": {e: b_tt1_dict[e] for e in eNAMES}, ## just for display
            "b_t_val": {e: b_tt1_val_dict[e] for e in eNAMES} ## just for display
        }
        return W_tt1

def create_data(T__sim):
    price_sim = PriceSimulator(seed=SEED_TRAIN)
    PriceData = []
    for i in range(T__sim):
        res = price_sim.simulate()
        entry = [itm[1][e] for itm in list(res.items()) for e in eNAMES]
        PriceData.append(entry)
    labels = [f'{itm[0]}_{e}' for itm in list(res.items()) for e in eNAMES]
    df = pd.DataFrame.from_records(data=PriceData, columns=labels); df[:10]
    ## df = pd.DataFrame.from_records(data=PriceData); df[:10]
    return df

class Model():
    def __init__(self, S_0_info):
        self.S_0_info = S_0_info
        self.State = namedtuple('State', SNAMES) ## 'class'
        self.S_t = self.build_state(S_0_info) ## 'instance'
        self.Decision = namedtuple('Decision', xNAMES) ## 'class'
        self.Ccum = 0.0 ## cumulative reward

    def build_state(self, info):
        return self.State(*[info[sn] for sn in SNAMES])

    def build_decision(self, info):
        return self.Decision(*[info[xn] for xn in xNAMES])

    ## exogenous information, dependent on a random process (the change in price)
    def W_fn(self, SIM):
        W_tt1 = SIM.simulate()
        return W_tt1

    def S__M_fn(self, S_t, x_t, W_tt1, theta, piName):
        ## print(f'...in S__M_fn()...\n\t{S_t=}\n\t{x_t=}\n\t{W_tt1=}\n\t{theta=}')
        ## R_t
        R_tt1 = {}
        for en in eNAMES:
            R_tt1[en] = S_t.R_t[en] + x_t.x_t[en]
        ## R0_t
        cost = 0.0
        for en in eNAMES:
            cost += x_t.x_t[en]*S_t.p_t[en]
        R0_tt1 = S_t.R0_t - cost
        ## p_t
        ## W_tt1['p_t'] has CHANGE in price
        ## clipped at a penny, else division by zero in X__HighLow
        p_t = S_t.p_t
        p_tt1 = {}
        for en in eNAMES:
            p_tt1[en] = max(0.01, p_t[en] + W_tt1['p_t'][en])
        S_tt1 = self.build_state({
            'R_t': R_tt1,
            'R0_t': R0_tt1,
            'p_t': p_tt1,
        })
        return S_tt1

    def C_fn(self, S_t, x_t, W_tt1):
        ## print(f'...in C_fn()...\n\t{S_t=}\n\t{x_t=}\n\t{W_tt1=}')
        C_t = 0.0
        for en in eNAMES:
            C_t += -S_t.p_t[en]*x_t.x_t[en]
        return C_t

    def step(self, x_t, theta, piName, SIM):
        ## print(f'...in step()...\n\t{x_t=}\n\t{theta=}')
        W_tt1 = self.W_fn(SIM)
        C = self.C_fn(self.S_t, x_t, W_tt1)
        self.Ccum += C
        self.S_t = self.S__M_fn(self.S_t, x_t, W_tt1, theta, piName)
        return (self.S_t, self.Ccum, x_t, W_tt1['b_t_val']) ## for plotting
We will simulate the share price \(p_{t+1} = p_t + \hat{p}_{t+1} = p_t + W_{t+1}\), following approach 2 described above, for each asset.
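A single transition can be worked through by hand. The sketch below mirrors the arithmetic of `S__M_fn` and `C_fn` with plain dictionaries; the function names `transition` and `contribution` and the numbers for the decision and price changes are illustrative assumptions.

```python
ASSETS = ["AAA", "BBB"]


def transition(R_t, R0_t, p_t, x_t, price_change):
    """One step of the dynamics: holdings, cash, and clipped prices."""
    R_next = {e: R_t[e] + x_t[e] for e in ASSETS}
    cost = sum(x_t[e] * p_t[e] for e in ASSETS)
    R0_next = R0_t - cost
    p_next = {e: max(0.01, p_t[e] + price_change[e]) for e in ASSETS}
    return R_next, R0_next, p_next


def contribution(p_t, x_t):
    """Cash flow of the decision: buying is negative, selling is positive."""
    return sum(-p_t[e] * x_t[e] for e in ASSETS)


R, R0, p = {"AAA": 0, "BBB": 0}, 1000.0, {"AAA": 100.0, "BBB": 50.0}
x = {"AAA": 5, "BBB": 0}        # buy 5 shares of AAA, hold BBB
W = {"AAA": 1.5, "BBB": -0.5}   # illustrative price changes
print(contribution(p, x))       # -500.0
print(transition(R, R0, p, x, W))
```

Buying 5 shares at 100.0 moves 500.0 from cash into the position, and the new prices apply only from the next step onward.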
4.5 Agent / Generative Model
4.5.1 State variables
According to the agent the state of the system-under-steer/environment/generative process will be \(s_t\), rather than \(\tilde{\mathbf{s}}_t\) which will then be given by
According to the agent the action on the environment at time \(t\) will be represented by \(u_t\), also known as the control state of the agent.
4.5.3 Implementation of the Agent / Generative Model / Internal Model
We will not have a probabilistic model for the agent yet. In this part of the project the agent will behave according to a rule-based policy. For the Python implementation, the rule is given by:
\[
X^{HighLow}(S_{te}|\theta^{HighLow}) =
\begin{cases}
-1 & \text{if } p_{te} < \theta^{low}_e \text{ or } p_{te} > \theta^{high}_e \\
-1 & \text{if } t = T \text{ and } R_{te} = 1 \\
0 & \text{otherwise }
\end{cases}
\] for each asset \(e \in \mathcal{E}\)
A slight change in symbols represents its form for the Julia case:
\[
\pi^{HighLow}(\mathbf{s}_{t,e}|\theta^{HighLow}) =
\begin{cases}
-1 & \text{if } p_{te} < \theta^{low}_e \text{ or } p_{te} > \theta^{high}_e \\
-1 & \text{if } t = T \text{ and } R_{te} = 1 \\
0 & \text{otherwise }
\end{cases}
\] for each asset \(e \in \mathcal{E}\)
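In the env_simulator.py listing, the `X__HighLow` method treats the two thresholds differently: it buys when the price is below the low threshold and sells when it is above the high threshold. A minimal per-asset sketch of that classification, with a hypothetical function name `high_low_signal`, could look like this:

```python
def high_low_signal(price, theta_low, theta_high):
    """Return 'buy', 'sell', or 'hold' for one asset, following X__HighLow."""
    if price < theta_low:
        return "buy"
    if price > theta_high:
        return "sell"
    return "hold"


# Thresholds used later in this document for asset AAA.
for price in (95.0, 105.0, 115.0):
    print(price, high_low_signal(price, theta_low=100, theta_high=110))
```

The end-of-horizon case, where every position is sold, is handled separately in the listing and is left out here.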
4.5.3.1 Generative Model for the portfolio
In Part 2 of this project we will not use RxInfer yet.
The Python code for the implementation can be found in the env_simulator.py Python module:
class Policy():
    def __init__(self, model):
        self.model = model
        self.Policy = namedtuple('Policy', piNAMES) ## 'class'
        self.Theta = namedtuple('Theta', thNAMES) ## 'class'

    def build_policy(self, info):
        return self.Policy(*[info[pin] for pin in piNAMES])

    def build_theta(self, info):
        return self.Theta(*[info[thn] for thn in thNAMES])

    def X__HighLow(self, t, S_t, theta, N): ## T is for lookahead horizon in AIF
        ## print(f'...in X__HighLow()...\n\t{t=}\n\t{S_t=}\n\t{theta=}')
        x_t_info = {
            'x_t': {en: 0 for en in eNAMES} ## default is hold
        }
        ## print(f'\t%%% {S_t.R0_t=}, {S_t.R_t=}, {S_t.p_t=}')
        tickersToSell = []
        tickersToBuy = []
        ## sell all at end
        if (t == N - 1):
            tickersToSell = [en for en in eNAMES]
            for ticker in tickersToSell:
                nShares = S_t.R_t[ticker]
                x_t_info['x_t'][ticker] = -nShares
            return self.model.build_decision(x_t_info)
        ## identify buys and sells
        for en in eNAMES:
            if (S_t.p_t[en] < theta.thLo[en]): ## buy
                tickersToBuy.append(en)
            elif (S_t.p_t[en] > theta.thHi[en]): ## sell
                tickersToSell.append(en)
        totalFunds = S_t.R0_t; ##print(f'\t%%% {totalFunds=}')
        ## sell
        ## print(f'\t%%% {tickersToSell=}')
        if len(tickersToSell) > 0:
            for ticker in tickersToSell:
                nShares = S_t.R_t[ticker]
                x_t_info['x_t'][ticker] = -nShares
                totalFunds += nShares*S_t.p_t[ticker]
        ## print(f'\t%%% totalFunds after selling: {totalFunds}')
        ## buy
        ## print(f'\t%%% {tickersToBuy=}')
        if len(tickersToBuy) > 0:
            availFundsPerTicker = totalFunds/len(tickersToBuy); ##print(f'{availFundsPerTicker=}')
            for ticker in tickersToBuy:
                nShares = int(availFundsPerTicker/S_t.p_t[ticker])
                x_t_info['x_t'][ticker] = +nShares
                totalFunds -= nShares*S_t.p_t[ticker]
        ## print(f'\t%%% totalFunds after buying: {totalFunds}')
        return self.model.build_decision(x_t_info)

    def run_policy_sample_paths(self, theta, piName, N, SIM): ## T is for lookahead horizon in AIF
        record = []
        M = Model(S_0_INFO)
        P = Policy(M)
        for t in range(N): ## for each transition/step
            ## print(f'\t%%% {t=}')
            x_t = getattr(self, piName)(t, M.S_t, theta, N)
            S_t, Ccum, x_t, b_t_val = M.step(x_t, theta, piName, SIM)
            record_t = [t] +\
                [S_t.R_t[en] for en in eNAMES] + [S_t.R0_t] + [S_t.p_t[en] for en in eNAMES] +\
                [Ccum] +\
                [x_t.x_t[en] for en in eNAMES] +\
                [b_t_val[en] for en in eNAMES] ## rather than b_t which is text and not ordered
            record.append(record_t)
        return record
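The share counts produced by `X__HighLow` follow from simple arithmetic: sell proceeds are added to cash, and the total is split equally among the tickers to buy, rounded down to whole shares. The sketch below reproduces that calculation in isolation; the function name `allocate` and the example numbers are assumptions for illustration.

```python
def allocate(cash, holdings, prices, to_sell, to_buy):
    """Share changes per ticker: sell whole positions, then split funds equally."""
    decision = {ticker: 0 for ticker in prices}  # default is hold
    funds = cash
    for ticker in to_sell:
        decision[ticker] = -holdings[ticker]
        funds += holdings[ticker] * prices[ticker]
    if to_buy:
        per_ticker = funds / len(to_buy)
        for ticker in to_buy:
            decision[ticker] = int(per_ticker / prices[ticker])
    return decision


decision = allocate(
    cash=200.0,
    holdings={"AAA": 0, "BBB": 10},
    prices={"AAA": 95.0, "BBB": 62.0},
    to_sell=["BBB"],
    to_buy=["AAA"],
)
print(decision)  # {'AAA': 8, 'BBB': -10}
```

Selling 10 shares of BBB at 62.0 raises the available funds from 200.0 to 820.0, which buys 8 whole shares of AAA at 95.0.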
Next, we define the agent:
# function create_agent(; T=20, Rᵃ, x₊, s₀, ξ=0.1, σ=1e-4)
function create_agent(; s̃₀, theta, N, SIM, M, P)
    ## Bayesian inference by message passing
    ## The `compute` function is the heart of the agent
    ## It calls the `RxInfer.infer` function to perform Bayesian inference by message passing
    compute = (υₜ::Float64, ŷₜ::Vector{Float64}) -> begin
    end

    ## The `act` function returns the inferred best possible action
    act = (t, S_t, theta, N) -> begin
        S_t_INFO = Dict(
            "R_t" => Dict("AAA" => S_t[1], "BBB" => S_t[2]),
            "R0_t" => S_t[3],
            "p_t" => Dict("AAA" => S_t[4], "BBB" => S_t[5])
        )
        s̃ₜ₋₁ = M.build_state(S_t_INFO) ## use the model passed in as keyword argument
        aₜ = P.X__HighLow(t, s̃ₜ₋₁, theta, N)
        return aₜ
    end

    ## The `future` function returns the inferred future states
    future = () -> begin
    end

    slide = () -> begin
    end

    return (act, future, compute, slide)
end
create_agent (generic function with 1 method)
4.6 Agent Policy Evaluation
4.6.1 Training/Tuning
4.6.1.1 No actions
Just to set up the RxInfer procedure, we create an environment but do not apply any actions. The only dynamics will come from the exogenous variables. The name decoration naive is used for this case.
_M = Model(S_0_INFO)
_P = Policy(_M)
_SIM = PriceSimulator(seed=SEED_TRAIN)
_theta = _P.build_theta(Dict(
    "thLo" => Dict("AAA" => 100, "BBB" => 50),
    "thHi" => Dict("AAA" => 110, "BBB" => 60)))
_Nⁿᵃⁱᵛᵉ = 100 ## Total simulation time
_s̃₀ = _M.build_state(S_0_INFO)
(execute_naive, observe_naive) = create_envir(;
    s̃₀ = _s̃₀, theta = _theta, N = _Nⁿᵃⁱᵛᵉ, SIM = _SIM, M = _M, P = _P);
_yⁿᵃⁱᵛᵉ = Vector{Vector{Float64}}(undef, _Nⁿᵃⁱᵛᵉ) ## Observations
_Ccum = Vector{Float64}(undef, _Nⁿᵃⁱᵛᵉ)
_x = Vector{Vector{Float64}}(undef, _Nⁿᵃⁱᵛᵉ) ## Actions
for t = 1:_Nⁿᵃⁱᵛᵉ
    ## 3. Execute (environmental process)
    pytmp = _M.build_decision(Dict("x_t" => Dict("AAA" => 0, "BBB" => 0))) ## dummy action
    v = [pyconvert(Integer, pytmp.x_t[i]) for i in eNAMES]
    pytmp1, pytmp2 = execute_naive(t, v)
    _Ccum[t] = pyconvert(Float64, pytmp1)
    v = [pyconvert(Integer, pytmp2.x_t[i]) for i in eNAMES]
    _x[t] = v
    ## 4. Observe
    pytmp = observe_naive() ## Observe external states
    v = vcat(
        [pyconvert(Integer, pytmp.R_t[i]) for i in eNAMES],
        [pyconvert(Float64, pytmp.R0_t)],
        [pyconvert(Float64, pytmp.p_t[i]) for i in eNAMES])
    _yⁿᵃⁱᵛᵉ[t] = v
end
Now we are going to apply the HighLow rule-based policy that was mentioned above. Actions will be generated according to the rule. Note that the name decoration ai (for active inference) is used even though this principle is not yet applied.
_M = Model(S_0_INFO)
_P = Policy(_M)
_SIM = PriceSimulator(seed=SEED_TRAIN)
_theta = _P.build_theta(Dict(
    "thLo" => Dict("AAA" => 100, "BBB" => 50),
    "thHi" => Dict("AAA" => 110, "BBB" => 60)))
_Nᵃⁱ = 100 ## Total simulation time
_s̃₀ = _M.build_state(S_0_INFO)
(execute_ai, observe_ai) = create_envir(;
    s̃₀ = _s̃₀, theta = _theta, N = _Nᵃⁱ, SIM = _SIM, M = _M, P = _P);
(act_ai, future_ai, compute_ai, slide_ai) = create_agent(;
    s̃₀ = _s̃₀, theta = _theta, N = _Nᵃⁱ, SIM = _SIM, M = _M, P = _P)
_yᵃⁱ = Vector{Vector{Float64}}(undef, _Nᵃⁱ) ## Observations
_yᵃⁱ_init = [0.1, 0.1, 0.1, 0.1, 0.1]
_Ccum = Vector{Float64}(undef, _Nᵃⁱ)
_x = Vector{Vector{Float64}}(undef, _Nᵃⁱ) ## Actions
for t = 1:_Nᵃⁱ
    ## 1. Act
    if t > 1
        pytmp = act_ai(t, _yᵃⁱ[t-1], _theta, _Nᵃⁱ)
    else
        pytmp = act_ai(t, _yᵃⁱ_init, _theta, _Nᵃⁱ)
    end
    v = [pyconvert(Integer, pytmp.x_t[i]) for i in eNAMES]
    _x[t] = v
    ## 2. Future
    ## _fs[t] = future_ai() ## Fetch the predicted future states
    ## 3. Execute
    pytmp1, pytmp2 = execute_ai(t, _x[t]) ## The action influences hidden external states
    _Ccum[t] = pyconvert(Float64, pytmp1)
    v = [pyconvert(Integer, pytmp2.x_t[i]) for i in eNAMES]
    _x[t] = v
    ## 4. Observe
    pytmp = observe_ai() ## Observe external states
    v = vcat(
        [pyconvert(Integer, pytmp.R_t[i]) for i in eNAMES],
        [pyconvert(Float64, pytmp.R0_t)],
        [pyconvert(Float64, pytmp.p_t[i]) for i in eNAMES])
    _yᵃⁱ[t] = v
    ## 5. Infer:
    ## compute_ai(_as[t], _ys[t]) ## Infer beliefs from current model state (update q)
    ## 6. Slide:
    ## slide_ai() ## Prepare for next iteration
end