Example: Gaussian with Given Variance

As a model we are given a univariate Gaussian distribution with unknown mean and known standard deviation σ = 1. We have one data point at 2.0. For this toy example, the posterior over the mean can be computed in closed form. But to illustrate likelihood-free inference, let us assume here that we can only sample from the model:

using LikelihoodfreeInference, Distributions, Random
model(x) = randn() .+ x
model (generic function with 1 method)

LikelihoodfreeInference.jl passes parameter values as vectors to the model, even in the one-dimensional case. In our definition of the model we assume that x[1] is the mean.
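
For example, calling the model with a one-element parameter vector returns one random draw centered at that mean:

model([1.5])  # one sample from Normal(1.5, 1), returned as a 1-element vector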

Approximate the Posterior

Our first goal is to find the posterior over the mean, given the observation and a Gaussian prior with mean 0 and standard deviation 5.

data = [2.0]
prior = MultivariateNormal([0.], [5.])
DiagNormal(
dim: 1
μ: [0.0]
Σ: [25.0]
)

The true posterior is a Gaussian distribution with mean 25/26*2 ≈ 1.923 and standard deviation sqrt(25/26) ≈ 0.98.
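
This follows from the standard conjugate update for a Gaussian likelihood with known variance: the posterior precision is the sum of the prior and likelihood precisions, and the posterior mean is the precision-weighted combination of the prior mean and the observation. A quick check of the numbers used below:

post_var  = 1 / (1/25 + 1/1)           # = 25/26
post_mean = post_var * (0/25 + 2.0/1)  # = 25/26 * 2 ≈ 1.923
post_std  = sqrt(post_var)             # ≈ 0.98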

trueposterior = pdf.(Normal(25/26*2.0, sqrt(25/26)), -1:.01:5)
using Plots
gr()
figure = plot(-1:.01:5, trueposterior, label = "posterior")

Now, we will use an adaptive sequential Monte Carlo method:

smc = AdaptiveSMC(prior = prior, K = 10^4)
result = run!(smc, model, data, verbose = true, maxfevals = 10^6);
 iteration       elapsed   fevals  epsilon      ess
         0     0 seconds    10000      Inf        1
         1     0 seconds    19000 8.769743e+00 0.8999999999999997
         2     0 seconds    27101 7.114152e+00 0.8101000000000007
         3     0 seconds    34392 5.968265e+00 0.7291000000000007
         4      1 second    40954 5.096213e+00 0.6561999999999997
         5      1 second    46860 4.471808e+00   0.5906
         6      1 second    52176 3.938934e+00   0.5316
Resampling...
         7      1 second    62176 3.482193e+00 0.9999999999999998
         8      1 second    71176 3.092114e+00 0.8999999999999998
         9      1 second    79279 2.737535e+00 0.8102999999999995
        10      1 second    86572 2.441145e+00 0.7292999999999995
        11      1 second    93136 2.170033e+00   0.6564
        12      1 second    99044 1.927520e+00   0.5908
        13     2 seconds   104362 1.730865e+00   0.5318
Resampling...
        14     2 seconds   114362 1.555458e+00 0.9999999999999998
        15     2 seconds   123362 1.392426e+00 0.8999999999999997
        16     2 seconds   131464 1.256187e+00 0.8102000000000004
        17     2 seconds   138756 1.128468e+00 0.7291999999999996
        18     2 seconds   145319 1.013145e+00 0.6563000000000002
        19     2 seconds   151226 9.081418e-01 0.5906999999999998
        20     2 seconds   156543 8.231226e-01 0.5317000000000001
Resampling...
        21     2 seconds   166543 7.377469e-01 0.9999999999999998
        22     2 seconds   175543 6.645589e-01 0.8999999999999998
        23     3 seconds   183644 5.940121e-01 0.8101000000000006
        24     3 seconds   190936 5.339343e-01 0.7291999999999996
        25     3 seconds   197499 4.798643e-01 0.6563000000000002
        26     3 seconds   203406 4.314831e-01 0.5906999999999998
        27     3 seconds   208723 3.858804e-01 0.5317000000000001
Resampling...
        28     3 seconds   218723 3.474274e-01 0.9999999999999998
        29     3 seconds   227726 3.138295e-01   0.9003
        30     3 seconds   235829 2.823995e-01 0.8102999999999997
        31     3 seconds   243124 2.526575e-01 0.7295000000000006
        32     4 seconds   249690 2.261661e-01 0.6566000000000001
        33     4 seconds   255600 2.030621e-01 0.5910000000000004
        34     4 seconds   260919 1.821672e-01 0.5319000000000002
Resampling...
        35     4 seconds   270919 1.641645e-01 0.9999999999999998
        36     4 seconds   279920 1.469560e-01   0.9001
        37     4 seconds   288022 1.332949e-01 0.8102000000000004
        38     4 seconds   295315 1.194527e-01 0.7292999999999995
        39     4 seconds   301879 1.074574e-01 0.6563999999999999
        40     4 seconds   307957 1.000000e-01 0.6077999999999999
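
The ess column shows the (normalized) effective sample size of the particle weights; whenever it becomes too small, the particles are resampled (the Resampling... lines above). As a rough sketch, a common definition is the following (the package's exact normalization may differ):

ess(w) = 1 / sum(abs2, w ./ sum(w))  # between 1 and length(w); divide by length(w) to normalize to (0, 1]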

As for any Monte Carlo method, the result is a list of particles

particles(smc)
10000-element Array{Array{Float64,1},1}:
 [2.004767940014156]
 [2.496850672329263]
 [0.07075309496922522]
 [1.527226942535811]
 [1.3742892036798986]
 [0.33594454798232554]
 [3.0772166168618678]
 [1.0515420804460813]
 [2.191300117276455]
 [3.3606675389161396]
 ⋮
 [3.0539995074526924]
 [2.014040171857891]
 [2.9210170056258393]
 [1.255363541392468]
 [2.251619089673593]
 [2.8316182149697897]
 [0.38192142502851545]
 [1.4759198647079739]
 [2.6562673165976007]

with corresponding weights

weights(smc)
10000-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.00016452780519907864
 0.0
 0.0
 0.00016452780519907864
 0.00016452780519907864
 ⋮
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864
 0.00016452780519907864

The mean of the posterior is given by the weighted sum of the particles, which is computed by the mean function.

mean(smc)
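
Explicitly, since the weights sum to one, this is

sum(weights(smc) .* particles(smc))  # should match mean(smc)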

figure = histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")
plot!(figure, -1:.01:5, trueposterior, label = "posterior")

The result returned above also contains these weights and particles, along with some additional information.

keys(result)
(:weights, :particles, :n_sims, :epsilons)

AdaptiveSMC reduced the epsilon parameter adaptively, as seen in the epsilon column of the run above. We can plot this sequence.

scatter(cumsum(result.n_sims)[2:end], result.epsilons,
        yscale = :log10, ylabel = "epsilon", xlabel = "number of model evaluations")

Alternatively, we may want to use KernelABC.

kabc = KernelABC(prior = prior,
                 kernel = Kernel(),
                 delta = 1e-12,
                 K = 10^4)
result = run!(kabc, model, data, maxfevals = 10^4)
mean(kabc)

figure = histogram(vcat(particles(kabc)...), weights = weights(kabc),
                   xlims = (-1, 5), bins = 100,
                   normalize = true, label = "KernelABC")
plot!(figure, -1:.01:5, trueposterior, label = "posterior")

Point Estimates

Sometimes we just want a point estimate. We will use BayesianOptimization.jl here to minimize the default QDLoss. We know that the mode of the true posterior (the MAP estimate) is at mean = 25/26*2 ≈ 1.923.

using BayesianOptimization
p = PointEstimator(optimizer = bo([-10.], [10.]), prior = prior, K = 100)
result = run!(p, model, data, maxfevals = 5*10^4, verbose = false);
result.x
1-element Array{Float64,1}:
 1.9149246288596158

KernelRecursiveABC is an alternative method that often requires only a few model evaluations in low- and medium-dimensional problems.

k = KernelRecursiveABC(prior = prior,
                       kernel = Kernel(),
                       kernelx = Kernel(),
                       delta = 1e-2,
                       K = 100)
result = run!(k, model, data, maxfevals = 2*10^3)
result.x
1-element Array{Float64,1}:
 2.0585323321846327

iid Samples

Let us suppose here that the data consists of multiple independent and identically distributed samples.

data_iid = [[2.0], [1.9], [2.8], [2.1]]
4-element Array{Array{Float64,1},1}:
 [2.0]
 [1.9]
 [2.8]
 [2.1]

There are two ways to deal with this data. Either we treat it as a single four-dimensional vector

data_onevec = vcat(data_iid...)
4-element Array{Float64,1}:
 2.0
 1.9
 2.8
 2.1

and we define the model as

model_iid_onevec(x) = vcat([model(x) for _ in 1:4]...)
smc = AdaptiveSMC(prior = prior, K = 10^4)
result = run!(smc, model_iid_onevec, data_onevec, verbose = true, maxfevals = 10^6);
histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")
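
As a quick check of the shapes involved, the simulated output is now a flat 4-element vector, just like data_onevec:

model_iid_onevec([2.0])  # a flat 4-element vector of samples around 2.0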

Alternatively, we keep the data as a list of samples and use a distance function that compares sets of samples:
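
For reference, the energy distance between two sets of samples is, roughly, twice the mean cross distance minus the two mean within-set distances. A minimal sketch (not necessarily the package's exact implementation):

using Statistics, LinearAlgebra

function energydistance_sketch(xs, ys)
    d(a, b) = norm(a .- b)                       # Euclidean distance between two samples
    2 * mean(d(x, y) for x in xs, y in ys) -     # mean cross distance
        mean(d(x, x2) for x in xs, x2 in xs) -   # mean within-set distance of xs
        mean(d(y, y2) for y in ys, y2 in ys)     # mean within-set distance of ys
end

The population version of this quantity is zero only if the two distributions coincide, which makes it a natural distance when data and simulations are lists of iid samples.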

model_iid(x) = [model(x) for _ in 1:4]
smc = AdaptiveSMC(prior = prior, K = 10^4, distance = energydistance)
result = run!(smc, model_iid, data_iid, verbose = true, maxfevals = 10^6);
histogram(vcat(particles(smc)...), weights = weights(smc), normalize = true, label = "AdaptiveSMC")

This page was generated using Literate.jl.