I’ve been reading a lot of statistical and computational literature and it seems like expectation notation is abused as shorthand for integrals by decorating the expectation symbol with a subscripted distribution like so:

$$\mathbb{E}_{p(\theta)}[f(\theta)] =_{\textrm{def}} \int f(\theta) \cdot p(\theta) \, \textrm{d}\theta.$$
This is super confusing, because expectations are properly defined as functions of random variables.
For example, the square bracket convention arises because random variables are functions and square brackets are conventionally used for functionals (that is, functions that apply to functions).
Expectation is an operator
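With the proper notation, expectation is a linear operator on random variables, $\mathbb{E} : (\Omega \rightarrow \mathbb{R}^N) \rightarrow \mathbb{R}^N$, where $\Omega$ is the sample space and $\Omega \rightarrow \mathbb{R}^N$ is the type of a random variable. In the abused notation, expectation is not an operator because there’s no argument, just an expression $f(\theta)$ with an unbound variable $\theta$.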
In this post (and yesterday’s), I’ve been following standard notational conventions where capital letters like $A$ are random variables and their corresponding lower case letters like $a$ are used as bound variables. Then rather than using $p(\cdot)$ for every density, densities are subscripted with the random variables from which they were derived, so the density of random variable $A$ is written $p_A$.
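To make the operator view concrete, here is a minimal sketch on a finite sample space, where the integral reduces to a sum. The sample space (two fair coin flips), the random variables `A` and `B`, and the function `f` are all made up for illustration. The point is only that `expectation` takes a random variable (a function on the sample space) as its argument, and that the result agrees with summing $f(a) \cdot p_A(a)$ over bound values $a$.

```python
from fractions import Fraction

# Hypothetical finite sample space: outcomes of two fair coin flips.
OMEGA = ["HH", "HT", "TH", "TT"]
PROB = {omega: Fraction(1, 4) for omega in OMEGA}


def expectation(X):
    """E[X], where X is a random variable, i.e. a function on OMEGA."""
    return sum(X(omega) * PROB[omega] for omega in OMEGA)


def A(omega):
    """Random variable: number of heads."""
    return omega.count("H")


def B(omega):
    """Random variable: indicator that the first flip is heads."""
    return 1 if omega[0] == "H" else 0


def f(a):
    """Ordinary function of a bound variable a."""
    return a ** 2


def f_of_A(omega):
    """The random variable f(A), i.e. the composition of f with A."""
    return f(A(omega))


# Mass function p_A of A, derived from the probabilities on OMEGA.
p_A = {}
for omega in OMEGA:
    p_A[A(omega)] = p_A.get(A(omega), Fraction(0)) + PROB[omega]

# E[f(A)] two ways: as an operator applied to the random variable f(A),
# and as a sum over bound values a of f(a) * p_A(a).
operator_form = expectation(f_of_A)
sum_form = sum(f(a) * p for a, p in p_A.items())

# Linearity: E[2A + 3B] = 2 E[A] + 3 E[B].
lhs = expectation(lambda omega: 2 * A(omega) + 3 * B(omega))
rhs = 2 * expectation(A) + 3 * expectation(B)
```

Note that `expectation` never sees a bound variable or a density; those only show up when the expectation is expanded into a sum.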
Bayesian Data Analysis notation
Gelman et al.’s Bayesian Data Analysis book overloads notation using lower case $a$ for both $A$ and $a$. This requires the reader to do a lot of sleuthing to figure out which variables are random and which are bound. It led to no end of confusion for me when I was first learning this material. It turns out disambiguating a dense formula with ambiguous notation is easier when you already understand the result.
The overloaded notation from Bayesian Data Analysis is fine in most applied modeling work, but it makes it awkward to talk about random variables and bound variables simultaneously. For example, on page 20 of the third edition, Gelman et al. write (using $E$ for the expectation symbol, round parens instead of brackets, and an italic derivative symbol),

$$E(u) = \int u p(u) du.$$

Here, the $u$ in $E(u)$ is understood as a random variable and the other uses of $u$ as bound variables. It’s even worse with the covariance definition,

$$\textrm{var}(u) = \int (u - E(u))(u - E(u))^T p(u) du,$$

where the $u$ in $\textrm{var}(u)$ and $E(u)$ are random variables, whereas the other two uses are bound variables.
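Using more explicit notation which distinguishes random and bound variables, includes the multiplication operators, specifies range of integration, disambiguates the density symbol, and sets the derivative symbol in roman rather than italics, these become

$$\mathbb{E}[U] = \int_{\mathbb{R}^N} u \cdot p_U(u) \, \textrm{d}u$$

$$\textrm{var}[U] = \int_{\mathbb{R}^N} (u - \mathbb{E}[U]) \cdot (u - \mathbb{E}[U])^{\top} \cdot p_U(u) \, \textrm{d}u.$$

This lets us clearly write variance out as an expectation as

$$\textrm{var}[U] = \mathbb{E}\left[(U - \mathbb{E}[U]) \cdot (U - \mathbb{E}[U])^{\top}\right],$$

which would look as follows in Bayesian Data Analysis notation,

$$\textrm{var}(u) = E((u - E(u))(u - E(u))^T).$$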
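As a sanity check on the explicit form, here is a small sketch that computes a covariance matrix as an expectation for a discrete random vector, with sums standing in for the integrals. The mass function `p_U` and the helper `expect` are hypothetical and chosen only to keep the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical discrete random vector U in R^2. p_U plays the role of the
# density p_U(u), and sums over its support play the role of the integral.
p_U = {
    (0, 0): Fraction(3, 8),
    (1, 1): Fraction(3, 8),
    (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 8),
}

N = 2  # dimension of U


def expect(g):
    """Sum of g(u) * p_U(u) over bound values u, for scalar-valued g."""
    return sum(g(u) * p for u, p in p_U.items())


# E[U], computed one component at a time.
mean = [expect(lambda u, i=i: u[i]) for i in range(N)]

# var[U] = E[(U - E[U]) (U - E[U])^T], computed one matrix entry at a time.
cov = [
    [
        expect(lambda u, i=i, j=j: (u[i] - mean[i]) * (u[j] - mean[j]))
        for j in range(N)
    ]
    for i in range(N)
]
```

Inside `expect`, `u` is a bound variable ranging over values, while `mean` is a fixed quantity computed from the random variable as a whole. That is the same distinction the capital and lower case letters carry in the formulas.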
Conditional expectations and posteriors
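The problem’s even more prevalent with posteriors or other conditional expectations, which I often see written using notation

$$\mathbb{E}_{p(\theta \mid y)}[f(\theta)]$$

for what I would write using conditional expectation notation as

$$\mathbb{E}[f(\Theta) \mid Y = y] = \int f(\theta) \cdot p_{\Theta \mid Y}(\theta \mid y) \, \textrm{d}\theta.$$

As before, this uses random variable notation inside the expectation and bound variable notation outside, with $Y = y$ indicating that the random variable $Y$ takes on the value $y$.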
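The same reading carries over to code. Below is a minimal sketch of a conditional expectation for a made-up discrete joint distribution over $(\Theta, Y)$, where the integral becomes a sum. The joint mass function `p_joint` and the function `f` are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical joint mass function p_{Theta, Y}(theta, y), keyed by (theta, y).
p_joint = {
    (0, 0): Fraction(3, 10),
    (1, 0): Fraction(1, 10),
    (0, 1): Fraction(1, 5),
    (1, 1): Fraction(2, 5),
}


def p_theta_given_y(theta, y):
    """Conditional mass p_{Theta | Y}(theta | y) = p(theta, y) / p_Y(y)."""
    p_y = sum(p for (_, y2), p in p_joint.items() if y2 == y)
    return p_joint.get((theta, y), Fraction(0)) / p_y


def conditional_expectation(f, y):
    """E[f(Theta) | Y = y] as a sum over bound values theta."""
    thetas = {theta for (theta, _) in p_joint}
    return sum(f(theta) * p_theta_given_y(theta, y) for theta in thetas)


def f(theta):
    """Ordinary function of a bound variable theta."""
    return 10 * theta + 1


posterior_mean_y1 = conditional_expectation(f, 1)
posterior_mean_y0 = conditional_expectation(f, 0)
```

The argument `y` is the observed value, and `theta` appears only as a bound variable inside the sum, mirroring the right-hand side of the conditional expectation formula.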