Using Bayesian Inference and RxInfer to estimate daily litter events
Urban Management Industry
Bayesian Inference
Active Inference
RxInfer
Julia
Author
Kobus Esterhuysen
Published
March 15, 2024
Modified
September 4, 2024
0 Active Inference: Bridging Minds and Machines
In recent years, the landscape of machine learning has undergone a profound transformation with the emergence of active inference, a novel paradigm that draws inspiration from the principles of biological systems to inform intelligent decision-making processes. Unlike traditional approaches to machine learning, which often passively receive data and adjust internal parameters to optimize performance, active inference represents a dynamic and interactive framework where agents actively engage with their environment to gather information and make decisions in real-time.
At its core, active inference is rooted in the notion of agents as embodied entities situated within their environments, constantly interacting with and influencing their surroundings. This perspective mirrors the fundamental processes observed in living organisms, where perception, action, and cognition are deeply intertwined to facilitate adaptive behavior. By leveraging this holistic view of intelligence, active inference offers a unified framework that seamlessly integrates perception, decision-making, and action, thereby enabling agents to navigate complex and uncertain environments more effectively.
One of the defining features of active inference is its emphasis on the active acquisition of information. Rather than waiting passively for sensory inputs, agents proactively select actions that are expected to yield the most informative outcomes, thus guiding their interactions with the environment. This active exploration not only enables agents to reduce uncertainty and make more informed decisions but also allows them to actively shape their environments to better suit their goals and objectives.
Furthermore, active inference places a strong emphasis on the hierarchical organization of decision-making processes, recognizing that complex behaviors often emerge from the interaction of multiple levels of abstraction. At each level, agents engage in a continuous cycle of prediction, inference, and action, where higher-level representations guide lower-level processes while simultaneously being refined and updated based on incoming sensory information.
The applications of active inference span a wide range of domains, including robotics, autonomous systems, neuroscience, and cognitive science. In robotics, active inference offers a promising approach for developing robots that can adapt and learn in real-time, even in unpredictable and dynamic environments. In neuroscience and cognitive science, active inference provides a theoretical framework for understanding the computational principles underlying perception, action, and decision-making in biological systems.
In conclusion, active inference represents a paradigm shift in machine learning, offering a principled and unified framework for understanding and implementing intelligent behavior in artificial systems. By drawing inspiration from the principles of biological systems, active inference holds the promise of revolutionizing our approach to building intelligent machines and understanding the nature of intelligence itself.
1 BUSINESS UNDERSTANDING
Although the current project covers a small part of the span of Active Inference, we would nevertheless like to execute it within this context.
The client is responsible for delittering a mile-long beach walkway in the Pacific Northwest of the USA. The density of foot traffic is roughly uniform along its length. Volunteers provide their services for cleaning up litter. One of the key inputs to the client’s planning is an estimate of the number of daily litter events along this walkway. The client does not want to over-engage his team of volunteers, nor does he want litter to become too noticeable.
2 DATA UNDERSTANDING
The number of daily litter events will be modeled by a Poisson distribution with parameter \(\theta\). This parameter, usually denoted by \(\lambda\), is both the mean and the variance of the Poisson distribution. The \(\theta\) parameter will be inferred by a model.
For additional insight, we will simulate some litter event data.
versioninfo() ## Julia version
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_NUM_THREADS =
Resolving package versions...
No Changes to `/workspaces/2024-03-15^LitterModel/Project.toml`
No Changes to `/workspaces/2024-03-15^LitterModel/Manifest.toml`
Pkg.status()
Status `/workspaces/2024-03-15^LitterModel/Project.toml`
[a93c6f00] DataFrames v1.6.1
[b964fa9f] LaTeXStrings v1.3.1
⌃ [91a5bcdd] Plots v1.40.2
[86711068] RxInfer v3.6.0
⌃ [fdbf4ff8] XLSX v0.10.1
[9a3f8284] Random
Info Packages marked with ⌃ have new versions available and may be upgradable.
_rng = MersenneTwister(57)
_N = 365 ## daily measurements for 12 months
_θ̃ = 15 ## lambda of Poisson distribution
_simdata = [float.(rand(_rng, Poisson(_θ̃), _N))]
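A quick way to see the "mean equals variance" property in simulated data is to compare the sample mean and sample variance of the draws. The sketch below is illustrative only: the variable names are chosen here, and it assumes that `Poisson` is available through `using RxInfer` (which re-exports Distributions.jl), as the package list above suggests.

```julia
using RxInfer, Random, Statistics  # Poisson assumed to be re-exported by RxInfer

rng = MersenneTwister(57)          # same seed as in the text
N = 365                            # one year of daily counts
θ_true = 15                        # rate used for the simulation

counts = rand(rng, Poisson(θ_true), N)   # integer daily litter counts

sample_mean = mean(counts)
sample_var = var(counts)           # unbiased sample variance

println("sample mean     = ", round(sample_mean, digits = 2))
println("sample variance = ", round(sample_var, digits = 2))
println("variance / mean = ", round(sample_var / sample_mean, digits = 2))
```

For Poisson data the ratio should be close to 1. A ratio well above 1 in real observations would point to overdispersion, in which case a single-parameter Poisson model may be too restrictive.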
3 DATA PREPARATION
We will use simulated data to prepare the model. To apply the model, we will use data gathered from observations along the walkway. No additional data preparation is needed.
4 MODELING
4.1 Narrative
Please review the narrative in section 1.
4.2 Core Elements
This section attempts to answer three important questions:
What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?
For this problem, the only metric we are interested in is the daily number of litter events. We will use it with Bayesian inference to estimate the mean of the Poisson distribution that represents the littering events.
4.3 Environment Model (Generative Process)
The number of daily litter events will be given by \[
n^{Daily} \sim Pois(\theta)
\]
4.3.1 State variables
We do not have state variables. The only variable that needs to be inferred is \(\theta\), the mean (and variance) of the generative process, i.e. the Poisson distribution.
4.3.2 Decision variables
There will be no decision variables for this project.
4.3.3 Exogenous information variables
We assume that the volunteers who inspect the walkway do not miscount litter events. Consequently, we will not make provision for exogenous information variables.
4.3.4 Transition function
We will not use a transition function.
4.3.5 Objective function
We will not use an objective function.
4.3.6 Implementation of the Environment Model (Generative Process)
Let’s simulate some data with IID observations from a Poisson distribution that represents the litter incidents. We assume that the mean number of incidents per day is 15:
_rng = MersenneTwister(57)
_N = 365 ## daily measurements for 12 months
_θ̃ˢⁱᵐ = 15 ## hidden lambda of Poisson distribution
15
_simdata = float.(rand(_rng, Poisson(_θ̃ˢⁱᵐ), _N)) ## create data and convert to float
where \(x_i \in \{0, 1, ...\}\) is an observation induced by a Poisson likelihood while \(p(\theta)\) is a Gamma prior distribution on the parameter of the Poisson distribution. We are interested in inferring the posterior distribution of \(\theta\).
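Because the Gamma prior is conjugate to the Poisson likelihood, this posterior has a closed form, which is useful as a check on any numerical result. The derivation below is a sketch that assumes the shape-scale parameterization of the Gamma distribution; \(\alpha^{Gam}\) (shape) and \(\theta^{Gam}\) (scale) are written with superscripts here to keep the prior's parameters apart from the Poisson parameter \(\theta\):

\[
\begin{aligned}
p(\theta \mid x_{1:N}) &\propto p(\theta) \prod_{i=1}^{N} p(x_i \mid \theta) \\
&\propto \theta^{\alpha^{Gam} - 1} e^{-\theta / \theta^{Gam}} \prod_{i=1}^{N} \frac{\theta^{x_i} e^{-\theta}}{x_i!} \\
&\propto \theta^{\alpha^{Gam} + \sum_{i} x_i - 1} \, e^{-\theta \left(1/\theta^{Gam} + N\right)}
\end{aligned}
\]

This is again a Gamma distribution, with shape \(\alpha^{Gam} + \sum_i x_i\) and scale \(\theta^{Gam} / (1 + N\theta^{Gam})\). Its mean, \((\alpha^{Gam} + \sum_i x_i) / (1/\theta^{Gam} + N)\), moves from the prior mean \(\alpha^{Gam}\theta^{Gam}\) toward the sample mean as \(N\) grows.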
4.5.1 Implementation of the Agent Model (Generative Model)
We will use the RxInfer Julia package. RxInfer stands at the forefront of Bayesian inference tools within the Julia ecosystem, offering a powerful and versatile platform for probabilistic modeling and analysis. Built upon the robust foundation of the Julia programming language, RxInfer provides researchers, data scientists, and practitioners with a streamlined workflow for conducting Bayesian inference tasks with unprecedented speed and efficiency.
At its core, RxInfer leverages cutting-edge techniques from the realm of reactive programming to enable dynamic and interactive model specification and estimation. This unique approach empowers users to define complex probabilistic models with ease, seamlessly integrating prior knowledge, data, and domain expertise into the modeling process.
With RxInfer, conducting Bayesian inference tasks becomes a seamless and intuitive experience. The package offers a rich set of tools for performing parameter estimation, model comparison, and uncertainty quantification, all while leveraging the high-performance capabilities of Julia to deliver results in a fraction of the time required by traditional methods.
Whether tackling problems in machine learning, statistics, finance, or any other field where uncertainty reigns supreme, RxInfer equips users with the tools they need to extract meaningful insights from their data and make informed decisions with confidence.
RxInfer represents a paradigm shift in the world of Bayesian inference, combining the expressive power of Julia with the flexibility of reactive programming to deliver a state-of-the-art toolkit for probabilistic modeling and analysis. With its focus on speed, simplicity, and scalability, RxInfer is poised to become an indispensable tool for researchers and practitioners seeking to harness the power of Bayesian methods in their work.
To transfer the above factorized generative model to the RxInfer package, we need to include each of the factors:
\(N\) Kronecker-\(\delta\) factors (for the N observations)
\(1\) Gamma factor (for the prior distribution)
\(N\) Poisson factors (for the litter events)
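Read together, these factors suggest the following factor-graph function. This is a sketch rather than the author's own notation: \(\tilde{x}_i\) is introduced here to denote the observed count for day \(i\), and \(\alpha^{Gam}\), \(\theta^{Gam}\) stand for the parameters of the Gamma prior.

\[
f(x_{1:N}, \theta) = \underbrace{Gamma(\theta \mid \alpha^{Gam}, \theta^{Gam})}_{\text{1 Gamma factor}} \; \prod_{i=1}^{N} \underbrace{Pois(x_i \mid \theta)}_{\text{Poisson factor}} \, \underbrace{\delta(x_i - \tilde{x}_i)}_{\text{Kronecker-}\delta\text{ factor}}
\]

Each \(\delta\) factor clamps \(x_i\) to its observed value, so the only quantity left to infer is \(\theta\).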
## parameters for the prior distribution
_αᴳᵃᵐ, _θᴳᵃᵐ = 350., .05
(350.0, 0.05)
## Litter model: Gamma-Poisson
@model function litter_model(x, αᴳᵃᵐ, θᴳᵃᵐ)
    ## prior on θ parameter of the model
    θ ~ Gamma(αᴳᵃᵐ, θᴳᵃᵐ) ## 1 Gamma factor
    ## assume daily number of litter incidents is a Poisson distribution
    for i in eachindex(x)
        x[i] ~ Poisson(θ) ## not θ̃; N Poisson factors
    end
end
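For this particular model the exact posterior is also available in closed form, because a Gamma prior is conjugate to a Poisson likelihood: the posterior shape is the prior shape plus the total count, and the posterior rate is the prior rate plus the number of days. The sketch below computes that reference posterior directly so that it can be compared with whatever the message-passing run returns. It is not the author's code: `gamma_poisson_posterior` is a hypothetical helper written for this illustration, and it assumes that `Gamma(α, θ)` uses the shape-scale convention of Distributions.jl (re-exported by RxInfer).

```julia
using RxInfer, Random, Statistics  # Gamma/Poisson assumed to be re-exported by RxInfer

# Hypothetical helper (not part of RxInfer): closed-form Gamma-Poisson update.
# `shape` and `scale` follow the shape-scale convention of `Gamma(α, θ)`.
function gamma_poisson_posterior(shape, scale, x)
    post_shape = shape + sum(x)                    # add the total event count
    post_scale = scale / (1 + length(x) * scale)   # same as adding N to the rate
    return Gamma(post_shape, post_scale)
end

rng = MersenneTwister(57)
x = float.(rand(rng, Poisson(15), 365))            # simulated daily counts

posterior = gamma_poisson_posterior(350.0, 0.05, x)

println("prior mean     = ", 350.0 * 0.05)
println("posterior mean = ", round(mean(posterior), digits = 3))
println("posterior std  = ", round(std(posterior), digits = 3))
println("95% interval   = (", round(quantile(posterior, 0.025), digits = 3),
        ", ", round(quantile(posterior, 0.975), digits = 3), ")")
```

With a prior scale of 0.05 the prior rate is 20, so the prior carries roughly the weight of 20 days of observations. After 365 days the data dominate, and the posterior mean should sit close to the sample mean rather than the prior mean of 17.5.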
The actual generative process had a much lower mean number of daily litter events, about 8 events per day. The client can work with this value when planning how to use his volunteers in the field.