Back to Portfolio of Projects | LearnableLoopAI.com | Blog |

Hidden Markov Model with Control (Part 2)

1 BUSINESS UNDERSTANDING

This is the second part of an effort to add control to the Hidden Markov Model RxInfer example. Eventually this work will be used in a more elaborate scheduling project making use of active inference.

In this part, we will add control inputs to the Roomba, instructing it to move to a specific room. The 3 rooms are the master bedroom, the living room, and the bathroom. In addition, we add the act() function in the structure below:

create_envir()
- execute()
- observe()
create_agent()
- act() [added in this notebook]
- future() [not yet]
- compute()
- slide() [not yet]

In the compute() function, which takes care of inference, we will still estimate the $A$ matrix. However, we will not estimate the $B$ matrix (as was done in Part 1). Instead we will make use of a fixed $B$ -matrix tensor. This tensor can be considered a stack of 3 $B$ -matrices - each associated with a specific control action to go to a specific room.

Unfortunately, the implementation broke during experimentation. Hopefully this will be fixed soon. It was important to publish this effort (even with the breakage) to allow for assistance in context.

versioninfo() ## Julia version

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
  JULIA_NUM_THREADS =

import Pkg
Pkg.add(Pkg.PackageSpec(;name="RxInfer", version="3.6.0"))
Pkg.add(Pkg.PackageSpec(;name="Random"))
Pkg.add(Pkg.PackageSpec(;name="Plots", version="1.40.1"))
# Pkg.add(Pkg.PackageSpec(;name="XLSX"))
# Pkg.add(Pkg.PackageSpec(;name="DataFrames"))
# Pkg.add(Pkg.PackageSpec(;name="BenchmarkTools"))
# Pkg.add(Pkg.PackageSpec(;name="Distributions"))

# using RxInfer, Random, BenchmarkTools, Distributions, Plots, XLSX, DataFrames
using RxInfer, Plots, Random

   Resolving package versions...
  No Changes to `~/.julia/environments/v1.10/Project.toml`
  No Changes to `~/.julia/environments/v1.10/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.10/Project.toml`
  No Changes to `~/.julia/environments/v1.10/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.10/Project.toml`
  No Changes to `~/.julia/environments/v1.10/Manifest.toml`

Pkg.status()

Status `~/.julia/environments/v1.10/Project.toml`
  [6e4b80f9] BenchmarkTools v1.5.0
  [31c24e10] Distributions v0.25.111
⌃ [91a5bcdd] Plots v1.40.1
  [86711068] RxInfer v3.6.0
  [9a3f8284] Random
Info Packages marked with ⌃ have new versions available and may be upgradable.

2 DATA UNDERSTANDING

The data will consist of a time-series of 1-hot encoded position states. When the associated component of the state vector is a 1, the Roomba will be present in the associated room.

3 DATA PREPARATION

We will use the data from the simulation directly. There is no need to perform additional data preparation.

4 MODELING

4.1 Narrative

The next figure (from Dr. Bert de Vries at Eindhoven University of Technology) shows the interactions between the agent and the environment:

The grey area shows the markov blanket of the agent. The interactions between the agent and environment can be summarized by:

$\begin{aligned} a_{t} & \sim q (u_{t}) \\ x_{t} & = y_{t} \end{aligned}$

This means that actions on the environment are sampled from the posterior over control signals. We will explain a bit more down below.

In general, we will have the following symbol conventions:

$t$ : time
$k$ : future time
$l$ : past time
$x_{t}$ : scalar random variable
$x_{t}$ : sequence of scalar random variables
$x_{t}$ : 1-hot encoded random variable

Global code variables will always be prefixed with an underscore ’_’.

4.2 Core Elements

This section attempts to answer three important questions:

What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?

The only metric we are interested in is the postition of the Roomba, i.e. in which of the 3 rooms it finds itself.

There are no control/steering decisions to be made (yet). We are simply interested in the behavior of the Roomba.

There are two sources of uncertainty. The first has to do with the fact that the state transitions are not deterministic but rather stochastic. This will be captured in the transition matrix $B$ . The second relates to observations. The Roomba does not always measure accurately the floor surface in a room. This uncertainty will be captured in the observation matrix $A$ .

4.3 System-Under-Steer / Environment / Generative Process

First, in order to track the Roomba’s movements using RxInfer, we need to come up with a model. Since we have a discrete set of rooms in the apartment, we can use a categorical distribution to represent the Roomba’s position. There are three rooms in the apartment, meaning we need three states in our categorical distribution. At time $t$ , let’s call the Roomba’s true position ${\tilde{s}}_{t}$ .

However, we also know that some rooms are more accessible than others, meaning the Roomba is more likely to move between these rooms - for example, it’s rare to have a door directly between the bathroom and the master bedroom. We can encode this information using a transition matrix, which we will call $B$ .

Our Roomba is equipped with a small camera that tracks the surface it is moving over. We will use this camera to obtain our observations since we know that there is

hardwood floors in the master bedroom
carpet in the living room, and
tiles in the bathroom.

However, this method is not foolproof, and sometimes the Roomba will make mistakes and mistake the hardwood floor for tiles or the carpet for hardwood. Don’t be too hard on the little guy, it’s just a Roomba after all.

At time $t$ , we will call our observations $y_{t}$ and encode the mapping from the Roomba’s position to the observations in a matrix we call $A$ . $A$ also encodes the likelihood that the Roomba will make a mistake and get the wrong observation. This leaves us with the following model specification:

$\begin{aligned} s_{t} & \sim C a t (B s_{t - 1}), \\ x_{t} & \sim C a t (A s_{t}) . \end{aligned}$

This type of discrete state space model is known as a Hidden Markov Model or HMM for short. Our goal is to learn the matrices $A$ and $B$ so we can use them to track the whereabouts of our little cleaning agent.

_seed = 1909
_rng = MersenneTwister(_seed)

MersenneTwister(1909)

_s̃₀ = [1.0, 0.0, 0.0] ## initial state

3-element Vector{Float64}:
 1.0
 0.0
 0.0

4.3.1 State and Observation variables

The state variables represent what we need to know. The true state at time $t$ of the Roomba will be given by a 1-hot encoded vector ${\tilde{s}}_{t} = (S)$

where

$S = {bedroom, livingroom, bathroom}$

$[1, 0, 0]^{T}$ : bedroom
$[0, 1, 0]^{T}$ : livingroom
$[0, 0, 1]^{T}$ : bathroom

The 1-hot observation made by the Roomba at time $t$ will be given by

$y_{t} = (O)$

where

$O = {hardwood, carpet, tiles}$

$[1, 0, 0]^{T}$ : hardwood
$[0, 1, 0]^{T}$ : carpet
$[0, 0, 1]^{T}$ : tiles

4.3.2 Decision variables

There are no decisions to be made (yet). We are simply observing the behavior of the Roomba, i.e. there is no control/steering applied to the environment.

4.3.3 Exogenous information / Autonomous variables

There are no exogenous information variables (yet). We may add some in followup parts, for example, having certain doors closed at random.

4.3.4 Transition and Observation functions

The environment evolves accoring to

$\begin{aligned} {\tilde{s}}_{t} & \sim C a t (B {\tilde{s}}_{t - 1}), \\ y_{t} & \sim C a t (A {\tilde{s}}_{t}) . \end{aligned}$

where ${\tilde{s}}_{t}$ is the state and $y_{t}$ is the observation.

The agent/environment interaction may be expressed as:

$(y_{t}, \tilde{s_{t}}) = R_{t} ({\tilde{s}}_{t - 1}, a_{t}, w_{t})$

where

${\tilde{s}}_{t - 1}$ is the previous state
$a_{t}$ is the action [not yet]
$w_{t}$ is the exogenous information / autonomous state [not yet]
$y_{t}$ is the outcome or observation

Transition model $\tilde{B}$

In Part 1, we had this transition model:

states ${\tilde{s}}_{t - 1}$ states ${\tilde{s}}_{t}$	bed	living	bath
bed	0.9	0.0	0.1
living	0.0	0.9	0.1
bath	0.1	0.1	0.8

In this notebook, Part 2, we setup the following arrangement of rooms:

1 `bed`	2 `liv`
3 `bath`

Firthermore, there are doors between bed and liv, and between bed and bath. For now, these doors are all permanently open. Note that there is no door between bath and liv. For this notebook, if the action is to move from bath to liv, there will be a 50% chance to stay where it is, and a 50% chance to move to bed instead.

To capture the transition model $B_{u_{k}}$ , we setup 3 matrices, one for each of the control actions $a ₜ \in {1, 2, 3}$ :

Go to bedroom:

$a_{t} = 1$	1	2	3
1	1.0	1.0	1.0
2	0.0	0.0	0.0
3	0.0	0.0	0.0

Go to livingroom:

$a_{t} = 2$	1	2	3
1	0.0	0.0	0.5
2	1.0	1.0	0.0
3	0.0	0.0	0.5

Go to bathroom:

$a_{t} = 3$	1	2	3
1	0.0	0.5	0.0
2	0.0	0.5	0.0
3	1.0	0.0	1.0

_B̃ = [
    [1.  1.  1.; ## go to bedroom
     0.  0.  0.;
     0.  0.  0.],
    [0.  0.  .5; ## go to livingroom
     1.  1.  0.;
     0.  0.  .5],
    [0.  .5  0.; ## go to bathroom
     0.  .5  0.;
     1.  0.  1.]
]

3-element Vector{Matrix{Float64}}:
 [1.0 1.0 1.0; 0.0 0.0 0.0; 0.0 0.0 0.0]
 [0.0 0.0 0.5; 1.0 1.0 0.0; 0.0 0.0 0.5]
 [0.0 0.5 0.0; 0.0 0.5 0.0; 1.0 0.0 1.0]

Observation model $\tilde{A}$

states ${\tilde{s}}_{t}$ observations $y_{t}$	bed	living	bath
bed	0.9	0.05	0.05
living	0.05	0.9	0.05
bath	0.05	0.05	0.9

_Ã = [    
    0.9  0.05 0.05;
    0.05 0.9  0.05;
    0.05 0.05 0.9]

3×3 Matrix{Float64}:
 0.9   0.05  0.05
 0.05  0.9   0.05
 0.05  0.05  0.9

4.3.5 Objective function

The objective function is such that the Bethe free energy (BFE) or Generalized free energy (GFE) is minimized. This aspect will be handled by the RxInfer Julia package.

4.3.6 Implementation of the System-Under-Steer / Environment / Generative Process

## returns a one-hot encoding of a random sample from a categorical distribution
function rand_1hot_vec(rng, distribution::Categorical)
    K = ncategories(distribution)
    sample = zeros(K)
    drawn_category = rand(rng, distribution)
    sample[drawn_category] = 1.0
    return sample
end

rand_1hot_vec (generic function with 1 method)

The execute function now gets an argument $a_{t}$ which is the action to execute on the environment.

function create_envir(; Ã, B̃, s̃₀, seed=42)
    rng = MersenneTwister(seed)
    s̃ₜ₋₁ = s̃₀
    s̃ₜ = s̃ₜ₋₁
    yₜ = s̃ₜ
    ## Execute a move to position aₜ
    execute = (aₜ::Vector{Float64}) -> begin
        i = argmax(aₜ)
        s̃ₜ = rand_1hot_vec(rng, Categorical(B̃[i]*s̃ₜ₋₁))
        yₜ = rand_1hot_vec(rng, Categorical(Ã*s̃ₜ))
        s̃ₜ₋₁ = s̃ₜ
    end

    observe = () -> begin
        return yₜ
    end
    
    return (execute, observe)
end

create_envir (generic function with 1 method)

In order to generate data to mimic the observations of the Roomba, we need to specify two things: the actual transition probabilities between the states (i.e., how likely is the Roomba to move from one room to another), and the observation distribution (i.e., what type of texture will the Roomba encounter in each room). We can then use these specifications to generate observations from our hidden Markov model (HMM).

To generate our observation data, we’ll follow these steps: 1. Assume an initial state for the Roomba. For example, we can start the Roomba in the bedroom. 2. Determine where the Roomba went next by drawing from a Categorical distribution with the transition probabilities between the different rooms. 3. Determine the observation encountered in this room by drawing from a Categorical distribution with the corresponding observation probabilities. 4. Repeat steps 2-3 for as many samples as we want.

The following code implements this process and generates our observation data:

We will generate 600 data points to simulate 600 ticks of the Roomba moving through the apartment. _ys will contain the Roomba’s measurements of the floor it’s currently on, and _s̃s will contain information on the room the Roomba was actually in.

We will use random control actions, drawn from a uniform distribution.

_N = 600
_s̃₀ = [1.0, 0.0, 0.0] ## initial state
(execute_sim, observe_sim) = create_envir(; 
    Ã=_Ã,
    B̃=_B̃,
    s̃₀=_s̃₀
)

_s̃s = Vector{Vector{Float64}}(undef, _N) ## States
_ys = Vector{Vector{Float64}}(undef, _N) ## Observations
for t = 1:_N
    aₜ = rand_1hot_vec(_rng, Categorical(fill(1. / 3., 3)))
    _s̃s[t] = execute_sim(aₜ) ## Hidden external states
    _ys[t] = observe_sim() ## Observe the current environmental outcome
end

_s̃s ## true states

600-element Vector{Vector{Float64}}:
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [0.0, 1.0, 0.0]
 [0.0, 1.0, 0.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 ⋮
 [0.0, 1.0, 0.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [0.0, 0.0, 1.0]
 [0.0, 0.0, 1.0]

_ys ## observations

600-element Vector{Vector{Float64}}:
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [0.0, 1.0, 0.0]
 [0.0, 1.0, 0.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 ⋮
 [0.0, 1.0, 0.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [1.0, 0.0, 0.0]
 [0.0, 0.0, 1.0]
 [0.0, 0.0, 1.0]

plot(title="States for Simulated Data")
plot!(
    argmax.(_s̃s), linetype=:steppre, linestyle=:solid, leg=:right, 
    label="state", xlabel="Time", yticks=([1,2,3,4], ["Bedroom", "Living room", "Bathroom"]))

plot(title="Observations for Simulated Data")
plot!(
    argmax.(_ys), linetype=:steppre, linestyle=:solid, leg=:right, 
    label="observation", xlabel="Time", yticks=([1,2,3], ["Bedroom", "Living room", "Bathroom"]))

4.4 Uncertainty Model

The uncertainty model is captured in the transition matrix $B$ and the observation matrix $A$ .

4.5 Agent / Generative Model

The agent consists of:

A free energy functional $F [q] = E_{q} [\log \frac{q (z)}{p (x, z)}]$ where
- $p (x, z) = Π_{k} p (x_{k}, z_{k} ∣ z_{k - 1})$ is a generative model with:
  - observations ${x_{k}}$
  - latent variables ${z_{k}} = {{s_{k}}, {θ_{k}}, {u_{k}}}$
  - $k$ is a time index
- $q (z)$ is a recognition model
A procedure to minimize the free energy $F [q]$

4.5.1 State and Observation variables

On the agent’s side, the state of the environment at time $t$ will be given by a 1-hot encoded vector $s_{t}$ . The initial state prior is given by $p (s_{0}) = Cat (s_{0} ∣ d)$

where $d$ parameterizes the categorical distribution of $s_{0}$ .

The observation made by the Roomba at time $t$ will be given by $x_{t}$ where

$x_{t} = [1, 0, 0]^{T}$ : hardwood
$x_{t} = [0, 1, 0]^{T}$ : carpet
$x_{t} = [0, 0, 1]^{T}$ : tiles

4.5.2 Decision variables

On the agent’s side, the action on the environment at time $t$ will be represented by a 1-hot encoded vector $u_{t}$ . The control prior is given by $p (u_{k}) = Cat (u_{k} ∣ e_{k})$

where $e_{k}$ parameterizes the categorical distribution of $u_{k}$ .

4.5.3 Preference variables / Setpoints

There will be no preference variables (yet).

4.5.4 Transition and Observation functions

The transition function is given by

$\begin{aligned} p (s_{k} ∣ s_{k - 1}, u_{k}) & = \prod_{κ} Cat (s_{k} ∣ B_{κ} s_{k - 1})^{u_{κ k}} \\ = Cat (s_{k} ∣ B_{u_{k}} s_{k - 1}) \end{aligned}$

where

$B_{u_{k}} s_{k - 1}$ parameterizes the categorical distribution of $s_{k}$ , and $u_{κ k}$ selects the associated categorical distribution for the specific control action.

The observation function is given by

$\begin{aligned} p (x_{k} ∣ s_{k}) & = Cat (x_{k} ∣ A s_{k}) \end{aligned}$

where

$A s_{k}$ parameterizes the categorical distribution of $x_{k}$ .

An entry in $A$ captures the probability of a specific observation given a specific state. Each column in $A$ contains a categorical distribution. A specific column is selected by multiplying with $s$ .

4.5.5 Implementation of the Agent / Generative Model / Internal Model

We start by specifying a probabilistic model for the agent that describes the agent’s internal beliefs over the external dynamics of the environment. Assuming the current time is $t$ ,

The generative model is defined as follows:

$\begin{array}{r} p^{'} (x, s, A, u) = \underset{\begin{array}{c} Initial \\ state \\ prior \end{array}}{\underset{⏟}{p (s_{t - 1})}} \underset{\begin{array}{c} Parameter \\ prior for \\ A \end{array}}{\underset{⏟}{p (A)}} \prod_{k = t}^{t + T} \underset{\begin{array}{c} Observation \\ model \end{array}}{\underset{⏟}{p (x_{k} ∣ s_{k}, A)}} \underset{\begin{array}{c} Transition \\ model \end{array}}{\underset{⏟}{p (s_{k} ∣ s_{k - 1}, u_{k}}}) \underset{\begin{array}{c} Control \\ prior \end{array}}{\underset{⏟}{p (u_{k})}} \end{array}$

The generative model includes future time steps.

To infer goal-driven (i.e. purposeful) behavior, we add prior beliefs $p^{+} (x)$ about desired future observations$. This leads to an extended agent model:

$\begin{aligned} p (x, s, u) & = \frac{p^{'} (x, s, u) p^{+} (x)}{\int_{x} p^{'} (x, s, u) p^{+} (x) d x} \\ \propto \underset{original generative model}{\underset{⏟}{p (s_{t - 1}) \prod_{k = t}^{t + T} p (x_{k} ∣ s_{k}) p (s_{k} ∣ s_{k - 1}, u_{k}) p (u_{k})}} \underset{\begin{array}{c} extension \\ Goal \\ prior \end{array}}{\underset{⏟}{p^{+} (x_{k})}} \\ \propto \underset{\begin{array}{c} Initial \\ state \\ prior \end{array}}{\underset{⏟}{p (s_{t - 1})}} \prod_{k = t}^{t + T} \underset{\begin{array}{c} Observation \\ model \end{array}}{\underset{⏟}{p (x_{k} ∣ s_{k})}} \underset{\begin{array}{c} Transition \\ model \end{array}}{\underset{⏟}{p (s_{k} ∣ s_{k - 1}, u_{k}}}) \underset{\begin{array}{c} Control \\ prior \end{array}}{\underset{⏟}{p (u_{k})}} \underset{\begin{array}{c} Goal \\ prior \end{array}}{\underset{⏟}{p^{+} (x_{k})}} \end{aligned}$

Next, we place 1-hot encodings on all random variables. Using regular bold font to indicate the 1-hot encodings, and also setting $t = 1$ , we have:

$\begin{aligned} p (x, s, u) & \propto \underset{\begin{array}{c} Initial \\ state \\ prior \end{array}}{\underset{⏟}{p (s_{0})}} \prod_{k = 1}^{T + 1} \underset{\begin{array}{c} Observation \\ model \end{array}}{\underset{⏟}{p (x_{k} ∣ s_{k})}} \underset{\begin{array}{c} Transition \\ model \end{array}}{\underset{⏟}{p (s_{k} ∣ s_{k - 1}, u_{k}}}) \underset{\begin{array}{c} Control \\ prior \end{array}}{\underset{⏟}{p (u_{k})}} \underset{\begin{array}{c} Goal \\ prior \end{array}}{\underset{⏟}{p^{+} (x_{k})}} \end{aligned}$

where

$\begin{aligned} p (s_{0}) & = Cat (s_{0} ∣ d) \\ p (x_{k} ∣ s_{k}) & = Cat (x_{k} ∣ A s_{k}) \\ p (s_{k} ∣ s_{k - 1}, u_{k}) & = \prod_{κ} [Cat (s_{k} ∣ B_{κ} s_{k - 1})]^{u_{κ k}} \\ = Cat (s_{k} ∣ B_{u_{k}} s_{k - 1}) \\ p (u_{k}) & = Cat (u_{k} ∣ e_{k}) \\ p^{+} (x_{k}) & = Cat (x_{k} ∣ c_{k}) \end{aligned}$

In general, if $a$ is a 1-hot encoded random variable, and has a categorical (aka multinoulli) distribution, then

$p (a ∣ ρ) = Cat (a ∣ ρ) = \prod_{i} ρ_{i}^{a_{i}}$

This means the $i$ th component of vector $a$ selects the $i$ th component of the probability vector $ρ$ of the distribution.

If the probability vector is $ρ = [\begin{matrix} 0.05 \\ 0.05 \\ 0.50 \\ 0.10 \\ 0.10 \\ 0.20 \end{matrix}]$ and the random variable $a$ is $a = [\begin{matrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{matrix}]$

then $\begin{aligned} p (a = 4 ∣ ρ) & = \prod_{i} ρ_{i}^{a_{i}} \\ = ({0.05}^{0}) ({0.05}^{0}) ({0.50}^{0}) ({0.10}^{1}) ({0.10}^{0}) ({0.20}^{0}) \\ = 0.10 \end{aligned}$

Now it is time to build our model. As mentioned earlier, we will use Categorical distributions for the states and observations. To learn the $A$ matrix we can use a MatrixDirichlet prior.

As for the observations, we have good reason to trust our Roomba’s measurements. To represent this, we will add large values to the diagonal of our prior on $A$ . However, we also acknowledge that the Roomba is not infallible, so we will add some noise on the off-diagonal entries.

Since we will use Variational Inference, we also have to specify inference constraints. We will use a structured variational approximation to the true posterior distribution, where we decouple the variational posterior over the states (q(s_0, s)) from the posterior over the transition matrices (q(A)). This dependency decoupling in the approximate posterior distribution ensures that inference is tractable. Let’s build the model!

@model function hidden_markov_model(x, B, T)
# @model function hidden_markov_model(x)    
    ## B ~ MatrixDirichlet(ones(3, 3))
    A ~ MatrixDirichlet([10.0 1.0 1.0; 
                         1.0 10.0 1.0; 
                         1.0 1.0 10.0 ])
    d = fill(1.0/3.0, 3) #.

    s₀ ~ Categorical(d) #.
    e = fill(1. / 3., 3) #.
    
    sₖ₋₁ = s₀
    ## for k in eachindex(x)
    for k in 1:T
        dist = Categorical(e)
        u[k] ~ dist
        i = rand(_rng, dist)
        s[k] ~ Transition(sₖ₋₁, B[i])
        x[k] ~ Transition(s[k], A)
        sₖ₋₁ = s[k]
    end
end

@constraints function hidden_markov_model_constraints()
    q(s₀, s, A) = q(s₀, s)q(A)
end

## callbacks 
function before_model_creation()
    println("The model is about to be created")
end
function after_model_creation(model::ProbabilisticModel)
    println("The model has been created")
    println("  The number of factor nodes is: ", length(RxInfer.getfactornodes(model)))
    println("  The number of latent states is: ", length(RxInfer.getrandomvars(model)))
    println("  The number of data points is: ", length(RxInfer.getdatavars(model)))
    println("  The number of constants is: ", length(RxInfer.getconstantvars(model)))
end
function before_inference(model::ProbabilisticModel)
    println("The inference procedure is about to start")
end
function after_inference(model::ProbabilisticModel)
    println("The inference procedure has ended")
end
function before_iteration(model::ProbabilisticModel, iteration::Int)
    println("The iteration ", iteration, " is about to start")
end
function after_iteration(model::ProbabilisticModel, iteration::Int)
    println("The iteration ", iteration, " has ended")
end
function before_data_update(model::ProbabilisticModel, data)
    println("The data is about to be processed")
end
function after_data_update(model::ProbabilisticModel, data)
    println("The data has been processed")
end
function on_marginal_update(model::ProbabilisticModel, name, update)
    ## println("New marginal update for ", name, " ", update)
    println("New marginal update for ", name) #.
end

on_marginal_update (generic function with 1 method)

Next, we define the agent and the time-stepping procedure.

# ////////////////////////////////////////////////////////////////////

_B̃

3-element Vector{Matrix{Float64}}:
 [1.0 1.0 1.0; 0.0 0.0 0.0; 0.0 0.0 0.0]
 [0.0 0.0 0.5; 1.0 1.0 0.0; 0.0 0.0 0.5]
 [0.0 0.5 0.0; 0.0 0.5 0.0; 1.0 0.0 1.0]

# aₜ = 2
aₜ = rand_1hot_vec(_rng, Categorical(fill(1. / 3., 3)))

3-element Vector{Float64}:
 0.0
 1.0
 0.0

argmax(aₜ)

_B̃[argmax(aₜ)]

3×3 Matrix{Float64}:
 0.0  0.0  0.5
 1.0  1.0  0.0
 0.0  0.0  0.5

mymodel = hidden_markov_model(B=_B̃[argmax(aₜ)], T=2)
# mymodel = hidden_markov_model()

GraphPPL.ModelGenerator{typeof(hidden_markov_model), Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, @NamedTuple{B::Matrix{Float64}, T::Int64}}, GraphPPL.PluginsCollection{Tuple{}}, RxInfer.ReactiveMPGraphPPLBackend}(hidden_markov_model, Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, @NamedTuple{B::Matrix{Float64}, T::Int64}}(:B => [0.0 0.0 0.5; 1.0 1.0 0.0; 0.0 0.0 0.5], :T => 2), GraphPPL.PluginsCollection{Tuple{}}(()), RxInfer.ReactiveMPGraphPPLBackend())

model = GraphPPL.create_model(mymodel)
# model = RxInfer.create_model(mymodel)

ErrorException: Model hidden_markov_model is not defined for 2 interfaces ((:B, :T)).

# \\\\\\\\\\\\

function create_agent(; T, s₀, ys)
    compute = (aₜ::Vector{Float64}, ŷₜ::Vector{Float64}) -> begin
        imarginals = @initialization begin
            q(A) = vague(MatrixDirichlet, 3, 3)
            ## q(B) = vague(MatrixDirichlet, 3, 3)
            q(s) = vague(Categorical, 3)
            q(u) = vague(Categorical, 3)
        end
        result = infer(
            ## model=hidden_markov_model(),
            model=hidden_markov_model(B=_B̃[argmax(aₜ)], T=T),
            data=(x=ys,),
            constraints=hidden_markov_model_constraints(),
            initialization=imarginals,
            returnvars = (    
                A=KeepLast(),
                s=KeepLast()),    
            iterations   =_its,
            free_energy  =true,
            ## callbacks      = (
            #     before_model_creation = before_model_creation,
            #     after_model_creation = after_model_creation,
            #     before_inference = before_inference,
            #     after_inference = after_inference,
            #     before_iteration = before_iteration,
            #     after_iteration = after_iteration,
            #     before_data_update = before_data_update,
            #     after_data_update = after_data_update,
            #     on_marginal_update = on_marginal_update)
        )
    end

    act = () -> begin
        ## if result !== nothing
            ## uₜ = mode(result.posteriors[:u][3])[1]
            e = fill(1. / 3., 3)
            uₜ = rand_1hot_vec(_rng, Categorical(e))
            return uₜ
        ## else
            ## return 0.0 ## Without inference result we return some 'random' action
            ## once proper inference of u happens, put the 'if-part' here 
            ## uₜ = [1, 0, 0]
            ## return uₜ ##for now, just always go to the bedroom
        ## end
    end

    return (act,   compute)
end

create_agent (generic function with 1 method)

4.6 Agent Evaluation

Now it’s time to perform inference and find out where the Roomba went in our absence. Did it remember to clean the bathroom?

We’ll be using Variational Inference to perform inference, which means we need to set some initial marginals as a starting point. RxInfer makes this easy with the vague function, which provides an uninformative guess. If you have better ideas, you can try a different initial guess and see what happens.

Since we’re only interested in the final result - the best guess about the Roomba’s position - we’ll only keep the last results. Let’s start the inference process!

4.6.1 Evaluate with simulated data

4.6.1.1 Naive approach {N/A}

4.6.1.2 Active inference approach

### Simulation parameters
## Total simulation time
_N = 600

## Lookahead time horizon
## _T = 50 ## Lookahead time horizon

## Initial state
_s₀ = [1.0, 0.0, 0.0]

## Iterations
_its = 20

### OVERRIDES

(execute_ai, observe_ai) = create_envir(;
    Ã=_Ã,
    B̃=_B̃,
    s̃₀=_s₀
)
##- _ys = Vector{Vector{Float64}}(undef, _N) ## Observations
##- ISSUE: UndefRefError (SG)
_ys = Vector{Float64}[ [0.0, 0.0, 0.0] for _ = 1:_N ]
(act_ai,   compute_ai) = create_agent(;
    T=_N,
    s₀=_s₀,
    ys=_ys ##used by compute() for infer()'s data
)

## Step through experimental protocol
_as = Vector{Vector{Float64}}(undef, _N) ##. Actions
_ss = Vector{Vector{Float64}}(undef, _N) ## States
for t = 1:_N
    _as[t] = act_ai() ##. Invoke an action from the agent
    _ss[t] = execute_ai(_as[t]) ##. The action influences hidden external states
    _ys[t] = observe_ai() ## Observe the current environmental outcome (update p)
    compute_ai(_as[t], _ys[t]) #.
end

ErrorException: Half-edge has been found: u_7. To terminate half-edges 'Uninformative' node can be used.

## _ss

## _ys

## println("Posterior Marginal for A:")

## _result = compute_ai([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]) #.
## _Aᵉˢᵗ = mean(_result.posteriors[:A]) #.

This compares well with:

_Ã = [
0.9  0.05 0.05;
0.05 0.9  0.05;
0.05 0.05 0.9]

## _result.posteriors

Finally, we can check if we were successful in keeping tabs on our Roomba’s whereabouts. We can also check if our model has converged by looking at the Free Energy.

## p1 = scatter(argmax.(_s̃s),
#     title="Simulation results", 
#     label="Real state", 
#     legend=:right,
#     xlabel="Time" ,
#     yticks=([1,2,3],["Bedroom","Living room","Bathroom"]),
#     markercolor=:cyan,
#     markersize=6,
#     markerstrokewidth=0,
#     alpha=.5,
#     size=(900,550)
# )
# p1 = scatter!(p1, argmax.(ReactiveMP.probvec.(_result.posteriors[:s])),
#     label="Inferred state",
#     markercolor=:green,
#     markersize=3,
#     markerstrokewidth=0,
#     alpha=.5
# )
# p2 = plot(_result.free_energy, 
#     label="Free energy",
#     xlabel="Iteration Number"
# )
# plot(p1, p2, layout=@layout([ a; b ]))

## p1 = scatter(argmax.(_s̃s),
#     title="Simulation results", 
#     label="Real state", 
#     legend=:right,
#     xlabel="Time" ,
#     yticks=([1,2,3],["Bedroom","Living room","Bathroom"]),
#     markercolor=:cyan,
#     markersize=6,
#     markerstrokewidth=0,
#     alpha=.5,
#     size=(900,550)
# )
# p1 = scatter!(p1, argmax.(ReactiveMP.probvec.(_result.posteriors[:s])),
#     label="Inferred state",
#     markercolor=:green,
#     markersize=3,
#     markerstrokewidth=0,
#     alpha=.5
# )
# p2 = plot(_result.free_energy, 
#     label="Free energy",
#     xlabel="Iteration Number"
# )
# plot(p1, p2, layout=@layout([ a; b ]))

Hidden Markov Model with Control (Part 2)

1 BUSINESS UNDERSTANDING

2 DATA UNDERSTANDING

3 DATA PREPARATION

4 MODELING

4.1 Narrative

4.2 Core Elements

4.3 System-Under-Steer / Environment / Generative Process

4.3.1 State and Observation variables

4.3.2 Decision variables

4.3.3 Exogenous information / Autonomous variables

4.3.4 Transition and Observation functions

Transition model B~

Observation model A~

4.3.5 Objective function

4.3.6 Implementation of the System-Under-Steer / Environment / Generative Process

4.4 Uncertainty Model

4.5 Agent / Generative Model

4.5.1 State and Observation variables

4.5.2 Decision variables

4.5.3 Preference variables / Setpoints

4.5.4 Transition and Observation functions

4.5.5 Implementation of the Agent / Generative Model / Internal Model

4.6 Agent Evaluation

4.6.1 Evaluate with simulated data

4.6.1.1 Naive approach {N/A}

4.6.1.2 Active inference approach

Transition model $\tilde{B}$

Observation model $\tilde{A}$