
Generative vs Declarative models, Imperative vs Declarative programs

In programming, there are a variety of "paradigms" which languages fall into. The most well known is the "imperative" style of language, typified by C/C++, Matlab, Java, Python, etc.

There are also "functional" languages, like Scheme, Lisp, Haskell, and so forth.

Another genre is "declarative" languages, like Prolog, SQL, and a few others. Other than SQL, these languages are somewhat rarer for the average programmer to have experience with.

Some languages are Object Oriented, though OO languages are usually imperative in style.

Finally, many languages have aspects of all of these. For example, R, Python, and to some extent C++ all support many aspects of the functional paradigm, and Common Lisp supports functional, object oriented, imperative, and declarative styles (the last through certain macro packages in which Prolog is built inside Common Lisp).

The distinction I'm trying to make here is between Imperative and Declarative languages, and how that relates to building statistical models in a Generative vs Declarative style (related to my previous posts on declarative models), so it's worthwhile to think about how they are different.

In an imperative language, you tell the computer what to do. For example, you might tell it to loop over an index i and add 1 to all the elements of an array, in R:

for(i in 1:10) {
   my_array[i] <- my_array[i]+1
}

In a declarative language, you tell the computer what result you want, and it figures out how to achieve it. In SQL:

select my_value + 1 from my_table where my_value > 0;

There is a proof (I'm told) that Prolog, a declarative language, can calculate anything that C or another imperative language can, with at most logarithmic extra space. So, while SQL is a very limited declarative language, the declarative paradigm is fully general (Turing complete).

Now, it's my opinion that this distinction carries over just about one-to-one when talking about building Bayesian statistical models. There are two basic paradigms:

Generative models:

In a generative model, the model is created as if the data and the parameters were the outputs of certain random number generators (RNGs). Michael Betancourt put it well in a discussion on the stan-user mailing list:

The generative picture is
1) generate truth, myfunction(x,a,b)
2) generate noise, epsilon ~ normal(0, sigma)
3) generate measurement, y = myfunction(x,a,b) + epsilon
This is a very "imperative" view of how you might specify a model. You can imagine coding it up as "sample the parameters a,b from the prior", then "calculate myfunction using the parameters", then generate a random perturbation from the normal distribution, then say "that's how y was created" (note: this last bit is almost always false, unless you're experimenting with your model on simulated data).
This paradigm has some advantages, one of which is that you can easily create fake data and see how well your model fits the real data.
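
As a concrete illustration of that recipe, here's a minimal prior-predictive sketch in Stan. The linear form of myfunction and every constant in it are made-up placeholders, not a recommendation; the point is only that the program reads like the imperative recipe above: draw parameters, draw noise, build y.

functions {
  // hypothetical stand-in for the "truth" function; the real one is problem-specific
  real myfunction(real x, real a, real b) {
    return a * x + b;
  }
}
data {
  int<lower=1> N;
  vector[N] x;
}
generated quantities {
  real a = normal_rng(1, 0.5);                      // 1) draw the parameters from their priors
  real b = uniform_rng(0, 10);
  vector[N] y_fake;
  for (n in 1:N) {
    real epsilon = normal_rng(0, 2);                // 2) draw the noise
    y_fake[n] = myfunction(x[n], a, b) + epsilon;   // 3) build the fake measurement
  }
}

Since nothing is being estimated here, you run this with Stan's fixed-parameter algorithm, and each draw is a complete fake data set you can push back through your fitting procedure.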

But it's not the only way to think about Bayesian modeling. The alternative is more declarative, or, since Bayesian models are a generalized logic, you can think like a Prolog "logic programmer".

Declarative models:

Lay out all the facts that you know:

a is between a_low and a_high with high probability, and is most probably around a_typ;

a ~ normal(a_typ, (a_high-a_low)/4); // or something similar

b is a positive number less than 10;

b ~ uniform(0,10);

The difference between the data and my predictions using myfunction is less than about 2 when the correct a and b are used:

y-myfunction(x,a,b) ~ normal(0,2);
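
Collected into a single Stan program, those facts might look something like the following sketch (myfunction, a_typ, a_low, and a_high are placeholders for whatever your problem actually supplies; this shows the shape of the model, not a finished implementation):

functions {
  // hypothetical prediction function; substitute whatever myfunction really is
  vector myfunction(vector x, real a, real b) {
    return a * x + b;
  }
}
data {
  int<lower=1> N;
  vector[N] x;
  vector[N] y;
  real a_typ;     // typical value of a
  real a_low;     // plausible low end for a
  real a_high;    // plausible high end for a
}
parameters {
  real a;
  real<lower=0, upper=10> b;                  // fact: b is a positive number less than 10
}
model {
  a ~ normal(a_typ, (a_high - a_low) / 4);    // fact: a is most probably around a_typ
  b ~ uniform(0, 10);
  y - myfunction(x, a, b) ~ normal(0, 2);     // fact: prediction errors are around 2 or less
}

The last line states the same fact as y ~ normal(myfunction(x,a,b), 2); writing it as a statement about y - myfunction(x,a,b) just keeps the declarative reading: "the misfit is around 2 or less".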

You can, of course, elaborate, as I did in my model for simultaneously measuring a molecular length and calibrating a set of lasers.

You're free to encode approximate knowledge of anything you like, so long as it's actually true knowledge (it's no good saying x ~ normal(0,3) when x is almost always > 50 in reality, unless you can get enough data to undo your incorrect assumption), and the Bayesian machinery (Stan!) will give you the approximate logical consequences of those statements.

I should say, though, that it's very possible to go wrong by putting in facts that are wrong, or by stating them in a way that is incompatible with what you mean. For example, consider the well known problem with attempting to say that a parameter a has a lognormal distribution:

log(a) ~ normal(0,log(1.2));

It's well known that taking the logarithm compresses intervals of a into much smaller intervals of log(a), and the degree of compression changes with the value of a. To figure out what the distribution of a is (density per unit a), given that b = log(a) has a normal distribution per unit b, you can do:

normal(b, sigma) db = normal(log(a), sigma) db = normal(log(a), sigma) * abs(db/da) da

to get a density in units of a, where

da/db = 1/(db/da) = 1/(dlog(a)/da) = 1/(1/a) = a

so abs(db/da) = 1/a, and the density of a per unit a is normal(log(a), sigma) * (1/a), not simply normal(log(a), sigma).
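
In Stan, the usual ways to state this fact without dropping that factor are to write the density directly in units of a, or to keep the statement about log(a) and add the log absolute Jacobian term yourself; a minimal sketch:

parameters {
  real<lower=0> a;
}
model {
  // density stated in units of a; equivalent to saying log(a) is normal(0, log(1.2))
  a ~ lognormal(0, log(1.2));

  // or, keep the declarative statement about log(a) and add the Jacobian term by hand:
  // log(a) ~ normal(0, log(1.2));
  // target += -log(a);    // log(abs(dlog(a)/da)) = log(1/a) = -log(a)
}

Both versions put the same distribution on a; the bare log(a) ~ normal(0, log(1.2)) statement, without the target += line, is the one that silently drops the 1/a.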

Why doesn't this same type of caveat apply in the case of a declaration about a function of data and parameters? Answering that will take a little more thought and care, and will be left for next time.

