Example: Gaussian with Given Variance
As a model we are given a univariate Gaussian distribution with unknown mean and known standard deviation σ = 1. We have one data point at 2.0. For this toy example we could compute the posterior over the mean analytically. But to illustrate likelihood-free inference, let us assume here that we can only sample from the model:
using LikelihoodfreeInference, Distributions, Random
model(x) = randn() .+ x
model (generic function with 1 method)
LikelihoodfreeInference.jl passes parameter values as vectors to the model, even in the one-dimensional case. In our definition of the model we assume that x[1] is the mean.
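For example, calling the model with a one-element parameter vector returns a one-element sample (your value will differ, since the model is stochastic):
model([1.5])  # 1-element vector: one draw from Normal(1.5, 1)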
Approximate the Posterior
Our first goal is to find the posterior over the mean, given this observation and a Gaussian prior with mean 0 and standard deviation 5.
data = [2.0]
prior = MultivariateNormal([0.], [5.])
DiagNormal(
dim: 1
μ: [0.0]
Σ: [25.0]
)
The true posterior is a Gaussian distribution with mean 25/26 * 2 ≈ 1.923 and standard deviation sqrt(25/26) ≈ 0.98.
trueposterior = pdf.(Normal(25/26*2.0, sqrt(25/26)), -1:.01:5)
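As a quick check of these values, here is the standard conjugate-Gaussian update spelled out (plain arithmetic, not part of the package):
post_prec = 1/5^2 + 1/1^2                  # prior precision + data precision = 26/25
post_var = 1/post_prec                     # = 25/26
post_mean = post_var * (0/5^2 + 2.0/1^2)   # = 25/26 * 2 ≈ 1.923
(post_mean, sqrt(post_var))                # mean ≈ 1.923, standard deviation ≈ 0.98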
using Plots
gr()
figure = plot(-1:.01:5, trueposterior, label = "posterior")
Now, we will use an adaptive sequential Monte Carlo method:
smc = AdaptiveSMC(prior = prior, K = 10^4)
result = run!(smc, model, data, verbose = true, maxfevals = 10^6);
iteration elapsed fevals epsilon ess
0 0 seconds 10000 Inf 1
1 0 seconds 19000 8.769743e+00 0.8999999999999997
2 0 seconds 27101 7.114152e+00 0.8101000000000007
3 0 seconds 34392 5.968265e+00 0.7291000000000007
4 1 second 40954 5.096213e+00 0.6561999999999997
5 1 second 46860 4.471808e+00 0.5906
6 1 second 52176 3.938934e+00 0.5316
Resampling...
7 1 second 62176 3.482193e+00 0.9999999999999998
8 1 second 71176 3.092114e+00 0.8999999999999998
9 1 second 79279 2.737535e+00 0.8102999999999995
10 1 second 86572 2.441145e+00 0.7292999999999995
11 1 second 93136 2.170033e+00 0.6564
12 1 second 99044 1.927520e+00 0.5908
13 2 seconds 104362 1.730865e+00 0.5318
Resampling...
14 2 seconds 114362 1.555458e+00 0.9999999999999998
15 2 seconds 123362 1.392426e+00 0.8999999999999997
16 2 seconds 131464 1.256187e+00 0.8102000000000004
17 2 seconds 138756 1.128468e+00 0.7291999999999996
18 2 seconds 145319 1.013145e+00 0.6563000000000002
19 2 seconds 151226 9.081418e-01 0.5906999999999998
20 2 seconds 156543 8.231226e-01 0.5317000000000001
Resampling...
21 2 seconds 166543 7.377469e-01 0.9999999999999998
22 2 seconds 175543 6.645589e-01 0.8999999999999998
23 3 seconds 183644 5.940121e-01 0.8101000000000006
24 3 seconds 190936 5.339343e-01 0.7291999999999996
25 3 seconds 197499 4.798643e-01 0.6563000000000002
26 3 seconds 203406 4.314831e-01 0.5906999999999998
27 3 seconds 208723 3.858804e-01 0.5317000000000001
Resampling...
28 3 seconds 218723 3.474274e-01 0.9999999999999998
29 3 seconds 227726 3.138295e-01 0.9003
30 3 seconds 235829 2.823995e-01 0.8102999999999997
31 3 seconds 243124 2.526575e-01 0.7295000000000006
32 4 seconds 249690 2.261661e-01 0.6566000000000001
33 4 seconds 255600 2.030621e-01 0.5910000000000004
34 4 seconds 260919 1.821672e-01 0.5319000000000002
Resampling...
35 4 seconds 270919 1.641645e-01 0.9999999999999998
36 4 seconds 279920 1.469560e-01 0.9001
37 4 seconds 288022 1.332949e-01 0.8102000000000004
38 4 seconds 295315 1.194527e-01 0.7292999999999995
39 4 seconds 301879 1.074574e-01 0.6563999999999999
40 4 seconds 307957 1.000000e-01 0.6077999999999999
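To build intuition for the epsilon column: in the simplest form of ABC, a parameter draw is kept whenever its simulated data falls within epsilon of the observation, and shrinking epsilon tightens the approximation at the cost of more rejected simulations. Here is a minimal rejection-ABC sketch for our one-dimensional example (illustration only; this is not AdaptiveSMC's algorithm):
function rejection_abc(model, data, prior, epsilon; n = 10^4)
    accepted = Vector{Vector{Float64}}()
    for _ in 1:n
        x = rand(prior)                     # propose parameters from the prior
        # keep the draw if the simulation lands close to the observation
        abs(model(x)[1] - data[1]) < epsilon && push!(accepted, x)
    end
    accepted
end
rejection_abc(model, data, prior, 0.1)      # accepted draws approximate the posterior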
As with any Monte Carlo method, the result is a list of particles
particles(smc)
10000-element Array{Array{Float64,1},1}:
[2.004767940014156]
[2.496850672329263]
[0.07075309496922522]
[1.527226942535811]
[1.3742892036798986]
[0.33594454798232554]
[3.0772166168618678]
[1.0515420804460813]
[2.191300117276455]
[3.3606675389161396]
⋮
[3.0539995074526924]
[2.014040171857891]
[2.9210170056258393]
[1.255363541392468]
[2.251619089673593]
[2.8316182149697897]
[0.38192142502851545]
[1.4759198647079739]
[2.6562673165976007]
with corresponding weights
weights(smc)
10000-element Array{Float64,1}:
0.0
0.0
0.0
0.0
0.0
0.00016452780519907864
0.0
0.0
0.00016452780519907864
0.00016452780519907864
⋮
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
0.00016452780519907864
The mean of the posterior is approximated by the weighted sum sum(weights(smc) .* particles(smc)), which the mean function computes for us.
mean(smc)
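Spelled out, that weighted sum is
sum(weights(smc) .* particles(smc))  # ≈ mean(smc), up to floating-point error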
figure = histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")
plot!(figure, -1:.01:5, trueposterior, label = "posterior")
The result object above also contains these weights and particles, along with some additional information.
keys(result)
(:weights, :particles, :n_sims, :epsilons)
AdaptiveSMC reduced the epsilon parameter adaptively, as we saw in the epsilon column of the run above. We can plot this sequence of epsilons against the cumulative number of model evaluations.
scatter(cumsum(result.n_sims)[2:end], result.epsilons,
yscale = :log10, ylabel = "epsilon", xlabel = "number of model evaluations")
Alternatively, we may want to use KernelABC.
kabc = KernelABC(prior = prior,
kernel = Kernel(),
delta = 1e-12,
K = 10^4)
result = run!(kabc, model, data, maxfevals = 10^4)
mean(kabc)
figure = histogram(vcat(particles(kabc)...), weights = weights(kabc),
xlims = (-1, 5), bins = 100,
normalize = true, label = "KernelABC")
plot!(figure, -1:.01:5, trueposterior, label = "posterior")
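For intuition on what KernelABC does: each prior draw is simulated once, and the weights solve a regularized linear system in the Gram matrix of the simulated data, with delta acting as the regularizer. A conceptual sketch (simplified; gausskernel and kernelabc_weights are illustrative names, not the package API):
using LinearAlgebra
gausskernel(a, b; h = 1.0) = exp(-sum(abs2, a .- b) / (2h^2))  # squared-exponential kernel
function kernelabc_weights(sims, data; delta = 1e-12)
    n = length(sims)
    G = [gausskernel(s, t) for s in sims, t in sims]  # Gram matrix of simulated datasets
    ky = [gausskernel(s, data) for s in sims]         # similarity of each simulation to the data
    (G + n * delta * I) \ ky                          # regularized kernel weights
end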
Point Estimates
Sometimes we just want a point estimate. We will use BayesianOptimization.jl here to minimize the default QDLoss. We know that the true maximum a posteriori estimate is at mean = 25/26 * 2 ≈ 1.923.
using BayesianOptimization
p = PointEstimator(optimizer = bo([-10.], [10.]), prior = prior, K = 100)
result = run!(p, model, data, maxfevals = 5*10^4, verbose = false);
result.x
1-element Array{Float64,1}:
1.9149246288596158
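This is close to the analytic value:
abs(result.x[1] - 25/26*2)  # ≈ 0.008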
KernelRecursiveABC is an alternative method that often requires only a few model evaluations in low- and medium-dimensional problems.
k = KernelRecursiveABC(prior = prior,
kernel = Kernel(),
kernelx = Kernel(),
delta = 1e-2,
K = 100)
result = run!(k, model, data, maxfevals = 2*10^3)
result.x
1-element Array{Float64,1}:
2.0585323321846327
iid Samples
Let us suppose here that the data consists of multiple independent and identically distributed samples.
data_iid = [[2.0], [1.9], [2.8], [2.1]]
4-element Array{Array{Float64,1},1}:
[2.0]
[1.9]
[2.8]
[2.1]
There are two ways to deal with this data. Either we treat it as one four-dimensional vector
data_onevec = vcat(data_iid...)
4-element Array{Float64,1}:
2.0
1.9
2.8
2.1
and we define the model as
model_iid_onevec(x) = vcat([model(x) for _ in 1:4]...)
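Each call now returns four independent draws around the mean, e.g.
model_iid_onevec([2.0])  # 4-element Vector{Float64}: four noisy draws near 2.0
and we run AdaptiveSMC as before: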
smc = AdaptiveSMC(prior = prior, K = 10^4)
result = run!(smc, model_iid_onevec, data_onevec, verbose = true, maxfevals = 10^6);
histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")
Alternatively, we keep the data as a list of samples and use a distance function that compares sets of samples, such as the energy distance:
model_iid(x) = [model(x) for _ in 1:4]
smc = AdaptiveSMC(prior = prior, K = 10^4, distance = energydistance)
result = run!(smc, model_iid, data_iid, verbose = true, maxfevals = 10^6);
histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")
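For reference, the energy distance between two sets of samples X and Y estimates 2E‖X − Y‖ − E‖X′ − X‖ − E‖Y′ − Y‖, which is zero exactly when the two distributions coincide. A minimal sketch (the package already exports energydistance; this is only illustrative):
function energy_dist(xs, ys)
    d(a, b) = sqrt(sum(abs2, a .- b))   # Euclidean distance between two samples
    m(as, bs) = sum(d(a, b) for a in as, b in bs) / (length(as) * length(bs))
    2 * m(xs, ys) - m(xs, xs) - m(ys, ys)
end
energy_dist(model_iid([2.0]), data_iid)  # small when simulated samples match the data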
This page was generated using Literate.jl.