Making differential equation models in Stan more computationally efficient via some analytic integration

We were having a conversation about differential equation models in pharmacometrics, in particular how to do efficient computation when fitting models for dosing, and Sebastian Weber pointed to this Stancon presentation that included a single-dose model.

Sebastian wrote:

Multiple doses lead to a quick explosion of the Stan codes – so things get a bit involved from the coding side.

– In practical applications I [Sebastian] would by now try to integrate over the dosing events. This allows you to avoid making the initials being vars such that you can a lot of speed. The trick is to make the ODE integrate “see” the dosing events and prevent the integrator to step over it. The cheap hack to do that is to insert a few observation points around the dosing event. This will make the integrator stop at the dosing time and make it realize that there are sudden changes in one of the compartments. This is not really clean, but CVODES handles it well and sundials even documents this technique as one way to handle it.

– I think we should one more time try the adjoint stuff. As we are about to get closures in Stan it should be straightforward to give to the integrator in Stan a log-prob function along with the ODE stuff. Once we have that, then we should be able to create something which is a lot faster than what we have now.

– Last time you tried the adjoint stuff your profiling showed that std::vector did stand out. This is because our AD system is darn slow to get the Jacobian of the ODE RHS. We should instead use an analytic derivative here – not sure how to automatically generate it ATM, but you get 2-3x speedups on the existing ODE system when you do that (and I have code around which does the automatic generation of the symbolic Jacobian…but this code is a well working prototype, not more)… hopefully the OCAML parser can churn these algebraic derivatives out.

– The other thing to explore is sparsity in the ODE RHS. Many ODE systems have a banded structure. One way to get this structural information is by specifying a chemical reaction network. Have a look at the dMod package on CRAN. The cool thing about knowing the sparistiy is that we don’t waste so much on derivative anyway not needed as they are constant and the ODE integrator can work more efficiently during the Newton iterations.

He continued:

Whenever we have long observation times of patients we usually follow a very regular administration pattern. As the dosing compartement often follows a first-order eliminiation scheme you can drastically simplify things. These first order elimination compartements have analytic solutions and for regular dosing patterns you can use the geometric series to sum up an arbitrary number of dosings very quickly. As a result you would just use the solution of the dosing compartment as a forcing function to the rest of the ODE system. This way you integrate out one state and you speedup a lot all the calculations.

Moreover, in many applications where we have these long observation times, we actually don’t care so much about getting everything right. We often only have data on steady-state and as such you can simplify drastically the models used wrt to the absorbtion process.

What I am saying is that with the use of our brains we can drastically make our life easier here… now, many people still just dump huge dosing histories into these programs and then complain about long running times…

As I have been burned by performance from this I would say that I found lots of good ways of how to avoid the performance killer imposed by ODE stuff + dosing.

This reminds me of something we did many years ago, at a much simpler level, explaining how analytical and simulation methods could be combined to get inferences for small probabilities.