Using Bayesian Inference and RxInfer to estimate daily litter events
Urban Management Industry
Bayesian Inference
Active Inference
RxInfer
Julia
Author
Kobus Esterhuysen
Published
March 15, 2024
Modified
November 12, 2024
In this project the client is responsible for the delittering of a mile-long beach walkway in the Pacific Northwest in the USA. The density of foot traffic is roughly uniform along its length. Volunteers provide their services for cleaning up litter.
Symbols/Nomenclature/Notation (KUF)
[informed by Powell Universal Framework (PUF), Bert DeVries, AIF literature]
Taxonomy of Machine Learning
Supervised Learning (Regression, Classification)
next state function
next state starts with provision: acquisition of another state/datapoint
sequence of (ordered) correlated observations \(y\) (time/spatial)
Overall Structure
Experiment has one-to-many Batches
Batch (into the page) has one-to-many Sequences
Sequence (down the page) has one-to-many Datapoints
Datapoint (into the page) has one-to-many Matrices
Matrix (down the page) has one-to-many Vectors
Vector (towards right) has one-to-many Components
Component/Element of type
Numerical [continuous/proportional]
int/real/float (continuous)
Categorical [non-continuous/non-formal]
AIF calls it ‘discrete’
ordinal (ordered)
nominal (no order)
for computers, elements need to be numbers, so categoricals are encoded as numbers too
Most complex Datapoint handled is a multispectral image, i.e. 3D
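To make this structure concrete, such a 3D datapoint can be laid out with the nested Vector types used later by the provision functions (a minimal sketch; the variable names are illustrative, not from the original code):

## a hypothetical multispectral datapoint: _𝙼 = 3 bands (Matrices),
## _𝚅 = 4 rows (Vectors), _𝙲 = 5 columns (Components)
_𝙼, _𝚅, _𝙲 = 3, 4, 5
_dp = [[zeros(_𝙲) for v in 1:_𝚅] for m in 1:_𝙼]
_dp[1][2][3] = 7.0 ## band 1, row 2, column 3
typeof(_dp) ## Vector{Vector{Vector{Float64}}}, as used by the provision functions below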
True vs Inferred variables:
True variables associated with Generative Process genpr
e.g. \(\breve{s}, \breve{\mathbf{s}}, \breve{\theta}\)
Inferred variables associated with Generative Model agent
e.g. \(s, \mathbf{s}, \theta\)
General
Global code variables will be prefixed with an underscore '_'.
0 Active Inference: Bridging Minds and Machines
In recent years, the landscape of machine learning has undergone a profound transformation with the emergence of active inference, a novel paradigm that draws inspiration from the principles of biological systems to inform intelligent decision-making processes. Unlike traditional approaches to machine learning, which often passively receive data and adjust internal parameters to optimize performance, active inference represents a dynamic and interactive framework where agents actively engage with their environment to gather information and make decisions in real-time.
At its core, active inference is rooted in the notion of agents as embodied entities situated within their environments, constantly interacting with and influencing their surroundings. This perspective mirrors the fundamental processes observed in living organisms, where perception, action, and cognition are deeply intertwined to facilitate adaptive behavior. By leveraging this holistic view of intelligence, active inference offers a unified framework that seamlessly integrates perception, decision-making, and action, thereby enabling agents to navigate complex and uncertain environments more effectively.
One of the defining features of active inference is its emphasis on the active acquisition of information. Rather than waiting passively for sensory inputs, agents proactively select actions that are expected to yield the most informative outcomes, thus guiding their interactions with the environment. This active exploration not only enables agents to reduce uncertainty and make more informed decisions but also allows them to actively shape their environments to better suit their goals and objectives.
Furthermore, active inference places a strong emphasis on the hierarchical organization of decision-making processes, recognizing that complex behaviors often emerge from the interaction of multiple levels of abstraction. At each level, agents engage in a continuous cycle of prediction, inference, and action, where higher-level representations guide lower-level processes while simultaneously being refined and updated based on incoming sensory information.
The applications of active inference span a wide range of domains, including robotics, autonomous systems, neuroscience, and cognitive science. In robotics, active inference offers a promising approach for developing robots that can adapt and learn in real-time, even in unpredictable and dynamic environments. In neuroscience and cognitive science, active inference provides a theoretical framework for understanding the computational principles underlying perception, action, and decision-making in biological systems.
In conclusion, active inference represents a paradigm shift in machine learning, offering a principled and unified framework for understanding and implementing intelligent behavior in artificial systems. By drawing inspiration from the principles of biological systems, active inference holds the promise of revolutionizing our approach to building intelligent machines and understanding the nature of intelligence itself.
1 BUSINESS UNDERSTANDING
Although the current project covers a small part of the span of Active Inference, we would nevertheless like to execute it within this context.
The client is responsible for the delittering of a mile-long beach walkway in the Pacific Northwest in the USA. The density of foot traffic is roughly uniform along its length. Volunteers provide their services for cleaning up litter. One of the key determinants of the client's planning is an estimate of the number of daily litter events along this walkway. The client does not want to over-engage his team of volunteers, nor does he want litter to become too noticeable.
2 DATA UNDERSTANDING
The number of daily litter events will be modeled by a Poisson distribution with parameter \(\theta\). This parameter, usually denoted by \(\lambda\), represents both the mean and the variance of the Poisson distribution. The \(\theta\) parameter will be learned or inferred by a model.
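For intuition, this equality of mean and variance is easy to verify by simulation (a minimal sketch; the names are illustrative, and the rate of 15 anticipates the simulation below):

using Distributions, Random, Statistics

_rng = MersenneTwister(57)
_d = Poisson(15) ## a Poisson with rate θ = 15
_draws = rand(_rng, _d, 100_000)
mean(_draws), var(_draws) ## both ≈ 15.0, since mean(_d) == var(_d) == 15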
For additional insight, we will simulate some litter event data.
versioninfo() ## Julia version
Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_NUM_THREADS =
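The project environment can be set up along these lines (a sketch based on the package list reported by Pkg.status() below):

using Pkg
Pkg.add(["DataFrames", "LaTeXStrings", "Plots", "PrettyPrint",
         "PrettyPrinting", "PrettyTables", "RxInfer", "XLSX"])
using RxInfer, Distributions, Random, Plots ## assumed imports for the code below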
Pkg.status()
Status `/workspaces/2024-03-15^LitterModel/Project.toml`
⌃ [a93c6f00] DataFrames v1.6.1
⌃ [b964fa9f] LaTeXStrings v1.3.1
⌃ [91a5bcdd] Plots v1.40.2
[8162dcfd] PrettyPrint v0.2.0
[54e16d92] PrettyPrinting v0.4.2
⌃ [08abe8d2] PrettyTables v2.3.2
⌃ [86711068] RxInfer v3.6.0
⌃ [fdbf4ff8] XLSX v0.10.1
[9a3f8284] Random
Info Packages marked with ⌃ have new versions available and may be upgradable.
3 DATA PREPARATION
We will use simulated data to prepare the model. To apply the model, we will use data gathered from observations along the walkway. There is no need to perform additional data preparation.
4 MODELING
4.1 Narrative
Please review the narrative in section 1.
4.2 Core Elements
This section attempts to answer three important questions:
What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?
For this problem, the only metric we are interested in is the daily number of litter events, so that we can use Bayesian inference to estimate the mean of the Poisson distribution that represents the littering events.
4.3 Environment Model (Generative Process)
The number of daily litter events will be given by \[
n^{Daily} \sim Pois(\theta)
\]
4.3.1 State variables
We do not have state variables. The only variable that needs to be inferred is \(\theta\), the mean (and variance) of the generative process, i.e. the Poisson distribution.
4.3.2 Decision variables
There will be no decision variables for this project.
4.3.3 Exogenous information variables
We assume that the volunteers who inspect the walkway do not miscount litter events. Consequently, we will not make provision for exogenous information variables.
4.3.4 Next State function
The provision function, \(f_p()\), provides another state/datapoint, called the provision/pre-state. Because this is a combinatorial system, the provision function acquires the next state/datapoint making use of a simulation or a data set.
\[\mathbf{p}_{i} = f_p(i)\]
## provision function, provides another state/datapoint from simulation
function fˢⁱᵐₚ(s; θ̆, 𝙼, 𝚅, 𝙲, rng)
    dp = Vector{Vector{Vector{Float64}}}(undef, 𝙼)
    for m in 1:𝙼 ## Matrices
        dp[m] = Vector{Vector{Float64}}(undef, 𝚅)
        for v in 1:𝚅 ## Vectors
            dp[m][v] = Vector{Float64}(undef, 𝙲)
            for c in 1:𝙲 ## Components
                dp[m][v][c] = float(rand(rng, Poisson(θ̆)))
            end
        end
    end
    s̆ = dp
    return s̆
end

_s = 1 ## s for sequence
_θ̆ˢⁱᵐ = 15 ## lambda of Poisson distribution
_rng = MersenneTwister(57)
## _s̆ = fˢⁱᵐₚ(_s, θ̆=_θ̆ˢⁱᵐ, 𝙼=3, 𝚅=4, 𝙲=5, rng=_rng) ## color image with 3 colors, 4 rows, 5 cols of elements
## _s̆ = fˢⁱᵐₚ(_s, θ̆=_θ̆ˢⁱᵐ, 𝙼=1, 𝚅=4, 𝙲=5, rng=_rng) ## b/w image with 4 rows, 5 cols of elements
_s̆ = fˢⁱᵐₚ(_s, θ̆=_θ̆ˢⁱᵐ, 𝙼=1, 𝚅=1, 𝙲=5, rng=_rng) ## vector with 5 elements
## _s̆ = fˢⁱᵐₚ(_s, θ̆=_θ̆ˢⁱᵐ, 𝙼=1, 𝚅=1, 𝙲=1, rng=_rng) ## vector with 1 element
## provision function, provides another state/datapoint from field
function fᶠˡᵈₚ(s; 𝙼, 𝚅, 𝙲, df)
    dp = Vector{Vector{Vector{Float64}}}(undef, 𝙼)
    for m in 1:𝙼 ## Matrices
        dp[m] = Vector{Vector{Float64}}(undef, 𝚅)
        for v in 1:𝚅 ## Vectors
            dp[m][v] = Vector{Float64}(undef, 𝙲)
            for c in 1:𝙲 ## Components
                # dp[m][v][c] = float(rand(rng, Poisson(θ̆)))
                dp[m][v][c] = df[s, :incidents]
            end
        end
    end
    s̆ = dp
    return s̆
end

## _s = 1 ## s for sequence
## dp = fᶠˡᵈₚ(_s, 𝙼=3, 𝚅=4, 𝙲=5, df=_fld_df) ## color image with 3 colors, 4 rows, 5 cols of elements
## dp = fᶠˡᵈₚ(_s, 𝙼=1, 𝚅=4, 𝙲=5, df=_fld_df) ## b/w image with 4 rows, 5 cols of elements
## dp = fᶠˡᵈₚ(_s, 𝙼=1, 𝚅=1, 𝙲=5, df=_fld_df) ## vector with 5 elements
## dp = fᶠˡᵈₚ(_s, 𝙼=1, 𝚅=1, 𝙲=1, df=_fld_df) ## vector with 1 element
fᶠˡᵈₚ (generic function with 1 method)
Because there is no noise to be combined with, the next state becomes
\[\breve{\mathbf{s}}_{i} = \mathbf{p}_{i}\]
The breve/bowl indicates that the parameters and variables are hidden and not observed.
4.3.5 Observation function
The response function, \(f_r()\), provides the response to the state/datapoint, called the response: \[\mathbf{r}_{i} = f_{r}(\breve{\mathbf{s}}_{i})\]
## response function, provides the response to a state/datapoint
function fᵣ(s̆)
    return s̆ ## no noise
end
fᵣ(_s̆)
Because there is no noise to be combined with, the next observation becomes
\[\mathbf{y}_i = \mathbf{\breve{s}}_i\]
The breve/bowl indicates that the parameters and variables are hidden and not observed.
4.3.6 Implementation of the Environment Model (Generative Process)
Let's simulate some data with IID observations from a Poisson distribution representing the litter incidents. We also assume that the mean number of incidents per day is 15:
## Data comes from either a simulation/lab (sim|lab) OR from the field (fld)
## Data are handled either in batches (batch) OR online as individual points (point)
function sim_data(rng, 𝚂, 𝙳, 𝙼, 𝚅, 𝙲, θ̆)
    p = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    s̆ = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    r = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    y = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    for s in 1:𝚂 ## sequences
        p[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        s̆[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        r[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        y[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        for d in 1:𝙳 ## datapoints
            p[s][d] = fˢⁱᵐₚ(s; θ̆=θ̆, 𝙼=𝙼, 𝚅=𝚅, 𝙲=𝙲, rng=rng)
            s̆[s][d] = p[s][d] ## no system noise
            r[s][d] = fᵣ(s̆[s][d])
            y[s][d] = r[s][d]
        end
    end
    return y
end;

function fld_data(df, 𝚂, 𝙳, 𝙼, 𝚅, 𝙲)
    p = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    s̆ = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    r = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    y = Vector{Vector{Vector{Vector{Vector{Float64}}}}}(undef, 𝚂)
    for s in 1:𝚂 ## sequences
        p[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        s̆[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        r[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        y[s] = Vector{Vector{Vector{Vector{Float64}}}}(undef, 𝙳)
        for d in 1:𝙳 ## datapoints
            p[s][d] = fᶠˡᵈₚ(s; 𝙼=𝙼, 𝚅=𝚅, 𝙲=𝙲, df=df)
            s̆[s][d] = p[s][d] ## no system noise
            r[s][d] = fᵣ(s̆[s][d])
            y[s][d] = r[s][d]
        end
    end
    return y
end;
## number of Batches in an experiment
## _𝙱 = 1 ## not used yet
## number of Sequences/examples in a batch
_𝚂 = 365
## _𝚂 = 3
## number of Datapoints in a sequence
_𝙳 = 1
## _𝙳 = 2
## _𝙳 = 3
## number of Matrices in a datapoint
_𝙼 = 1
## number of Vectors in a matrix
_𝚅 = 1
## number of Components in a vector
_𝙲 = 1
_θ̆ˢⁱᵐ = 15 ## hidden lambda of Poisson distribution
_rng = MersenneTwister(57)
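With these dimensions in place, the simulated observations can be generated (a sketch; the generating call itself does not survive in the text, and the names _y and _x are illustrative):

_y = sim_data(_rng, _𝚂, _𝙳, _𝙼, _𝚅, _𝙲, _θ̆ˢⁱᵐ)
## flatten the nested Sequence→Datapoint→Matrix→Vector→Component structure
## to one count per day, for the 𝙳 = 𝙼 = 𝚅 = 𝙲 = 1 case used here
_x = [_y[s][1][1][1][1] for s in 1:_𝚂]
length(_x) ## 365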
4.5 Agent Model (Generative Model)
The agent assumes the following factorized generative model over the \(N\) daily observations:
\[
p(x_{1:N}, \theta) = p(\theta) \prod_{i=1}^{N} p(x_i \mid \theta)
\]
where \(x_i \in \{0, 1, ...\}\) is an observation induced by a Poisson likelihood while \(p(\theta)\) is a Gamma prior distribution on the parameter of the Poisson distribution. We are interested in inferring the posterior distribution of \(\theta\).
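Because the Gamma prior is conjugate to the Poisson likelihood, the exact posterior is available in closed form (a standard conjugacy result, added here as a sanity check on the message-passing output rather than taken from the original text). With a shape-scale prior \(\theta \sim \Gamma(\alpha, \beta)\) and observations \(x_1, \dots, x_N\):
\[
p(\theta \mid x_{1:N}) = \Gamma\left(\alpha + \sum_{i=1}^{N} x_i,\; \frac{\beta}{N\beta + 1}\right)
\]
so the posterior mean \(\left(\alpha + \sum_{i=1}^{N} x_i\right)\frac{\beta}{N\beta + 1}\) is pulled from the prior mean \(\alpha\beta\) toward the empirical daily average as \(N\) grows.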
4.5.1 Implementation of the Agent Model (Generative Model)
We will use the RxInfer Julia package. RxInfer stands at the forefront of Bayesian inference tools within the Julia ecosystem, offering a powerful and versatile platform for probabilistic modeling and analysis. Built upon the robust foundation of the Julia programming language, RxInfer provides researchers, data scientists, and practitioners with a streamlined workflow for conducting Bayesian inference tasks with unprecedented speed and efficiency.
At its core, RxInfer leverages cutting-edge techniques from the realm of reactive programming to enable dynamic and interactive model specification and estimation. This unique approach empowers users to define complex probabilistic models with ease, seamlessly integrating prior knowledge, data, and domain expertise into the modeling process.
With RxInfer, conducting Bayesian inference tasks becomes a seamless and intuitive experience. The package offers a rich set of tools for performing parameter estimation, model comparison, and uncertainty quantification, all while leveraging the high-performance capabilities of Julia to deliver results in a fraction of the time required by traditional methods.
Whether tackling problems in machine learning, statistics, finance, or any other field where uncertainty reigns supreme, RxInfer equips users with the tools they need to extract meaningful insights from their data and make informed decisions with confidence.
RxInfer represents a paradigm shift in the world of Bayesian inference, combining the expressive power of Julia with the flexibility of reactive programming to deliver a state-of-the-art toolkit for probabilistic modeling and analysis. With its focus on speed, simplicity, and scalability, RxInfer is poised to become an indispensable tool for researchers and practitioners seeking to harness the power of Bayesian methods in their work.
To transfer the above factorized generative model to the RxInfer package, we need to include each of the factors:
\(N\) Kronecker-\(\delta\) factors (for the N observations)
\(1\) Gamma factor (for the prior distribution)
\(N\) Poisson factors (for the litter events)
## parameters for the prior distribution
_αᴳᵃᵐ, _θᴳᵃᵐ = 350., .05
(350.0, 0.05)
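As a quick sanity check on this choice (the arithmetic is added here for clarity), a shape-scale \(\Gamma(350, 0.05)\) prior has mean \(\alpha\theta = 350 \times 0.05 = 17.5\) and variance \(\alpha\theta^2 = 0.875\), i.e. a fairly confident prior belief of roughly 17.5 litter events per day:

using Distributions ## Gamma(α, θ) is shape-scale in Distributions.jl
_prior = Gamma(_αᴳᵃᵐ, _θᴳᵃᵐ)
mean(_prior), var(_prior) ## (17.5, 0.875)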
## Litter model: Gamma-Poisson
@model function litter_model(x, αᴳᵃᵐ, θᴳᵃᵐ)
    ## prior on θ parameter of the model
    θ ~ Gamma(αᴳᵃᵐ, θᴳᵃᵐ) ## 1 Gamma factor
    ## assume daily number of litter incidents is a Poisson distribution
    for i in eachindex(x)
        x[i] ~ Poisson(θ) ## not θ̆; N Poisson factors
    end
end
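The text jumps from the model definition straight to the conclusion, so the following is a sketch of how the inference step would typically look with RxInfer's infer function; the flattened observation vector _x and the names _result and _θ_post are illustrative assumptions, not from the original:

## run message passing on the factor graph; Gamma is conjugate to Poisson,
## so the resulting posterior over θ is again a Gamma distribution
_result = infer(
    model = litter_model(αᴳᵃᵐ = _αᴳᵃᵐ, θᴳᵃᵐ = _θᴳᵃᵐ),
    data  = (x = _x,), ## _x: one litter count per day (assumed; see section 4.3.6)
)
_θ_post = _result.posteriors[:θ]
mean(_θ_post), var(_θ_post) ## posterior mean and variance of θ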
The actual generative process had a much lower mean number of daily litter events, about 8 per day. The client can work with this value when planning how to deploy his volunteers in the field.