Numerical tools for economists¶
As the models and problems we try to solve get increasingly complicated, the likelihood that a closed-form solution exists decreases. Even simple problems can be impossible to solve by hand. For example, consider the following demand equation: $$ q(p) = 0.1p^{-0.2} + 0.7p^{-0.5} $$ What is the price that clears the market for a quantity of two units?
When closed-form solutions are impossible to find, we will have to resort to using numerical methods.
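As a preview of the root-finding tools covered in Section 3, here is a minimal sketch of how that question could be answered numerically. It assumes the Roots.jl package (introduced later in these notes); the bracket endpoints are my own illustrative choices.

```julia
using Roots   # root-finding package discussed in Section 3

# Market clearing: find p such that q(p) = 2, i.e. a zero of excess demand
q(p) = 0.1 * p^(-0.2) + 0.7 * p^(-0.5)
excess_demand(p) = q(p) - 2.0

# Bracket the root on a positive interval and let a bracketing method locate it
p_star = find_zero(excess_demand, (0.01, 1.0), Bisection())
```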
1. Interpolation¶
In macroeconomics in particular, we are often interested in solving functional equations for entire functions, such as value functions and policy functions. However, we often face two challenging issues when trying to find them:
- Problems are often too complex for a closed-form solution to exist. Even in the simplest settings, a pen-and-paper solution is typically only available under very carefully chosen calibrations; otherwise we need a computer to find a numerical solution.
- In many situations, the functions we are interested in are continuous and defined over infinitely many points. Yet computers do not have enough memory to store infinitely many elements, much less perform computationally expensive operations on them.
Thus, we need computationally efficient ways to store and represent data (which, in any case, always arrive at discrete intervals even if the true underlying data generating process (DGP) is continuous). Scientists often use interpolation, a method for constructing new data points from a finite, discrete set of known points.
Interpolation needs at least three inputs:
- A set of inputs (inputs could be multi-dimensional),
- A corresponding set of outputs, and
- A family of basis functions.
The first two are straightforward for today's purposes (and in empirical work, we do not have the freedom to choose which data points we observe anyway; we will revisit this in future discussion sessions). For illustrative purposes, suppose the true DGP is simply the sine function $f(x) = \sin(x)$, but we do not know that.
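As a minimal sketch, assuming the Interpolations.jl package (one of several interpolation packages in Julia; the package and the cubic-spline basis are illustrative choices, not prescribed here):

```julia
using Interpolations   # assumed package choice

# (1) A discrete set of inputs and (2) the corresponding outputs from the "unknown" DGP
xs = range(0, 2π; length = 10)
ys = sin.(xs)

# (3) A family of basis functions: cubic splines on the grid
itp = cubic_spline_interpolation(xs, ys)

# Evaluate the interpolant off the grid and compare with the truth
itp(1.3), sin(1.3)
```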
2. Numerical integration¶
As economists, we often study decision-making under uncertainty. Thus, in many computational applications, we will need to compute the expected value of functions of a random variable that follows a distribution. This usually means computing definite integrals of the form: $$ \int_I f(x)\omega(x)dx $$ where $\omega(x)$ is a weighting function (e.g. identically $1$, or a probability density function).
As you may expect, computing this definite integral is usually not analytically feasible, hence we need to resort to numerical integration, or quadrature.[^1] All quadrature methods work by approximating the integral with a weighted sum of function evaluations, or $$ \int_I f(x)\omega(x)dx \approx \sum_{i=1}^N w_i f(x_i) $$ We need to make three important choices:
- The $x_i$'s, or quadrature nodes,
- the $w_i$'s, or quadrature weights, and
- $N$, the number of function evaluations.
There are many methods (Simpson's rule, the midpoint rule, the trapezoid rule), but for today, we will only cover Gaussian quadrature. This method picks the $x_i$'s and $w_i$'s to satisfy $2N$ "moment-matching" conditions (for a given $N$): $$ \int_I x^k \omega(x) dx = \sum_{i=1}^N w_i x_i^k \hspace{0.2cm} \text{for} \hspace{0.2cm} k = 0, \ldots, 2N-1. $$ When $\omega(x) \equiv 1$, we are calculating the area under a curve, and this is called Gauss-Legendre quadrature.
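For a concrete sketch, assuming the FastGaussQuadrature.jl package (one common source of Gaussian quadrature nodes and weights; the package and the test integrand are my own illustrative choices):

```julia
using FastGaussQuadrature   # assumed package choice

# Gauss-Legendre nodes x_i and weights w_i on [-1, 1] with N = 5
x, w = gausslegendre(5)

# Approximate ∫_{-1}^{1} exp(x) dx with the weighted sum Σ w_i f(x_i)
approx = sum(w .* exp.(x))
exact  = exp(1) - exp(-1)   # closed form, for comparison
```

With $N = 5$ nodes the rule is exact for polynomials up to degree $2N - 1 = 9$, so even a small $N$ gives a very accurate approximation for smooth integrands like this one.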
3. Optimization & root-finding¶
There are many different algorithms and methods for solving equations, but they can broadly be sorted into two categories: direct and iterative methods. A method is considered direct if it completes in a predetermined number of steps. Iterative methods take an initial guess of the solution and iterate on it in some pre-specified way until the algorithm converges to the true solution (or is within some pre-specified tolerance threshold).
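To make the iterative idea concrete, here is a bare-bones hand-rolled sketch of one such method, bisection (in practice you would rely on the packages discussed below). It assumes `f(a)` and `f(b)` have opposite signs so that a root is bracketed.

```julia
# Minimal bisection: repeatedly halve a bracketing interval [a, b]
# (f(a) and f(b) must have opposite signs) until it is narrower than `tol`.
function bisect(f, a, b; tol = 1e-10, maxit = 1_000)
    fa = f(a)
    for _ in 1:maxit
        m  = (a + b) / 2
        fm = f(m)
        (b - a) / 2 < tol && return m      # converged to within tolerance
        if sign(fm) == sign(fa)
            a, fa = m, fm                  # root lies in [m, b]
        else
            b = m                          # root lies in [a, m]
        end
    end
    return (a + b) / 2                     # best guess after maxit iterations
end

# The market-clearing price from the introduction
bisect(p -> 0.1 * p^(-0.2) + 0.7 * p^(-0.5) - 2.0, 0.01, 1.0)
```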
A non-exhaustive list of popular algorithms includes the bisection method, Newton's method, the secant method, golden-section search, Brent's method, Nelder-Mead, particle swarm, simulated annealing, and many more. Each algorithm has its own relative strengths and weaknesses, and their importance/relevance is highly situation-dependent. In practice, however, two of the most important factors are:
- Speed
- Robustness/reliability
Algorithms that require a gradient (closed-form or numerical), such as Newtonian methods, are typically significantly faster than non-derivative-based methods since, by construction, they require and exploit much more information about the problem (local slopes and curvature). If your problem is well-behaved, i.e. the function you are dealing with does not have nasty features like kinks, discontinuities, or extremely high curvature, gradient-based solvers are probably ideal.
Their major downside is that they can also be very unstable, potentially resulting in undesirable outcomes like convergence to local minima/maxima, economically nonsensical results, or outright non-convergence. Ill-behaved problems and functions, bad/unlucky initial guesses, corner solutions, etc. are some of the many things that can throw a wrench in these faster methods. If you don't know much about the properties of the function and problem you are dealing with, then the speed penalty from more robust algorithms may be worth it (after all, slowly converging to the solution is still faster than never converging).
Julia is home to a suite of excellent packages for root-finding and optimization. Some of the most popular are discussed below.
Some (very) opinionated thoughts on when to use each package¶
For finding the maximum (or minimum) of an objective function, I prefer Optim.jl due to its speed, its flexibility (both in the functions you can feed into it and the options you can choose from, like the specific solver and tolerance), its good documentation, and its relatively low syntactic overhead.
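As a minimal sketch of the kind of call I have in mind, comparing a gradient-based solver with a derivative-free one on a smooth placeholder objective (the test function and solver choices are my own illustration):

```julia
using Optim   # assumed to be installed

# A smooth, well-behaved test objective (a placeholder, not from the notes)
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
x0 = [0.0, 0.0]

# Gradient-based: fast on smooth problems (gradient supplied via forward-mode autodiff)
res_lbfgs = optimize(f, x0, LBFGS(); autodiff = :forward)

# Derivative-free: slower, but more robust to kinks and noise
res_nm = optimize(f, x0, NelderMead())

Optim.minimizer(res_lbfgs), Optim.minimizer(res_nm)
```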
Roots.jl is for root-finding, and shares a lot of the strengths of Optim.jl. It also supports bracketed (box-constrained) root-finding, which is very useful when you have a budget/resource constraint. Unfortunately, it can only handle univariate functions. For multivariate problems, you'll have to use IntervalRootFinding.jl or NonlinearSolve.jl. In my experience, NonlinearSolve.jl is faster than IntervalRootFinding.jl, but I find that IntervalRootFinding.jl is extremely useful during prototyping, especially if you are concerned about your function having multiple roots.
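For the multivariate case, a minimal NonlinearSolve.jl sketch (the two-equation toy system below is my own illustration):

```julia
using NonlinearSolve   # assumed to be installed

# A toy system F(u, p) = 0; the parameter argument p is unused here
F(u, p) = [u[1]^2 + u[2]^2 - 1.0,   # unit circle
           u[1] - u[2]]             # 45-degree line

prob = NonlinearProblem(F, [0.5, 0.5])   # initial guess
sol  = solve(prob, NewtonRaphson())
sol.u                                    # ≈ [1/√2, 1/√2]
```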
An aside on JuMP¶
JuMP is an extremely impressive package with three very appealing strengths:
- Constraints - JuMP is the best at constrained optimization, especially when you have multiple and/or complicated constraints. The other packages support constrained optimization and root-finding too, but their constraints are usually expressed as explicit upper and lower bounds, which is not nearly as flexible (or easy) as JuMP (see the sketch after this list).
- Ease - JuMP also leads the pack in terms of ease of translating a mathematical problem from pen and paper into code.
- Documentation - of the packages I listed above, JuMP is probably the most well-documented and accessible to newcomers. There are a lot of tutorials and resources which show off the many amazing features this package has. Moreover, if you ask a question about JuMP on the Julia Discourse, developers are usually very kind about answering it.
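As a small illustration of the constraint syntax mentioned in the first point above (a toy problem of my own; Ipopt is just one possible solver choice):

```julia
using JuMP, Ipopt   # Ipopt is an assumed solver choice; any compatible solver works

model = Model(Ipopt.Optimizer)

@variable(model, x >= 0)                  # choice variables with lower bounds
@variable(model, y >= 0)
@constraint(model, budget, x + y <= 1)    # a named resource constraint
@objective(model, Max, x * y)             # a simple placeholder objective

optimize!(model)
value(x), value(y)                        # ≈ (0.5, 0.5)
```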
Unfortunately, JuMP also suffers from two important drawbacks:
- Compatibility with other packages - unfortunately, JuMP does not play well with other Julia packages, especially when it comes to defining a complicated objective function that relies on other packages. At best, some of these issues can be surmounted at the cost of a lot of overhead from converting objects into a format that JuMP accepts.
- Differentiation - JuMP's solvers need your objective function to be differentiable. If your function has nasty features like kinks and discontinuities, then JuMP is likely going to struggle.