Update
This post was mainly about how, and under what circumstances, the complex-step method works. I was curious, and working it out in a blog post was interesting. The post was tangentially concerned with other methods of numerical differentiation, but it does not directly address the question of which is best.
Extracting the derivative of a real-valued function from its analytic continuation.
The first topic in Nick Higham’s talk at JuliaCon 2018 is using computation in the complex domain to numerically compute the derivative of a real-valued function. The goal of this post is to understand how and why this works, as well as its relation to other methods for computing a derivative. On the fourth slide (slides here) we find $$ f’(x) \approx \textrm{Im}\left\{ \frac{f(x + ih)}{h} \right\}. $$ This holds if $f(z)$ is complex-analytic at the real point $x$. This method is much more accurate than finite differences; in fact, it easily computes the derivative to machine precision[1].

The expression is easy to derive. Because the derivative of a complex-analytic function is independent of the direction of approach, we can write $$ f(x + ih) = f(x) + f’(x)(ih) + O(h^2) \quad\quad (1). $$ Taking the imaginary part and dividing by $h$ gives the desired result; the real part recovers $f(x)$ to the same order.

This method may be called “approximate automatic differentiation”. I have never seen a useful introductory description of automatic differentiation, although I suppose one exists somewhere. Automatic differentiation numerically evaluates symbolic derivatives. For instance, to compute the derivative of $\cos$ at a floating-point number $x$, the function $-\sin$ is evaluated at $x$. However, sums, products, the chain rule, and so on are not handled symbolically, but rather algorithmically. All the difficulty lies in doing the latter efficiently.
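To make that description concrete, here is a minimal dual-number sketch of forward automatic differentiation. It is illustrative only, not how any particular package is implemented; the names Dual and derivative are invented for the example.

# A minimal sketch of forward AD via dual numbers (illustrative only).
struct Dual
    val::Float64   # function value
    der::Float64   # derivative, propagated alongside the value
end
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)                  # sum rule
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)  # product rule
Base.cos(a::Dual) = Dual(cos(a.val), -sin(a.val) * a.der)   # chain rule; -sin is the symbolic derivative
derivative(f, x) = f(Dual(x, 1.0)).der
derivative(cos, pi/4)   # returns -sin(pi/4), the numerically evaluated symbolic derivative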
Complex step differentiation in Julia
The talk was at JuliaCon, so here are implementations of the complex-step method and a centered-difference method in Julia.
# complex-step method
deriv_cstep(f, h=1e-100) = x -> imag(f(complex(x, h))) / h
# Centered difference method
# The default h balances truncation error against floating-point rounding error.
deriv_diff(f, h=sqrt(8*eps())) = x -> (f(x + h) - f(x - h)) / (2h)
# complex-step derivative of cosine
const cos_deriv_cstep = deriv_cstep(cos)
# Centered difference derivative of cosine
const cos_deriv_diff = deriv_diff(cos)
# Numeric evaluation of exact derivative of cosine
const cos_deriv_exact = x -> -sin(x)
const y1 = pi / 4
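The values and timings below appear to be @show and BenchmarkTools @btime output; here is a sketch of a harness that would produce them (my reconstruction, not code from the talk).

using BenchmarkTools
# Print each value with @show; @btime reports minimum time and allocations.
@show cos_deriv_diff(y1)
@btime cos_deriv_diff(y1)
@show cos_deriv_exact(y1)
@btime cos_deriv_exact(y1)
@show cos_deriv_diff(y1) - cos_deriv_exact(y1)
@show cos_deriv_cstep(y1)
@btime cos_deriv_cstep(y1)
@show cos_deriv_cstep(y1) - cos_deriv_exact(y1)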
# Results and times
cos_deriv_diff(y1) = -0.7071067798694585
11.956 ns (0 allocations: 0 bytes)
cos_deriv_exact(y1) = -0.7071067811865475
10.308 ns (0 allocations: 0 bytes)
cos_deriv_diff(y1) - cos_deriv_exact(y1) = 1.31708899342442e-9
cos_deriv_cstep(y1) = -0.7071067811865475
26.843 ns (0 allocations: 0 bytes)
cos_deriv_cstep(y1) - cos_deriv_exact(y1) = 0.0
The complex-step method takes more than twice as long as the centered difference, but its accuracy is optimal.
Why is it optimal?
The method relies on the idea behind the Cauchy-Riemann equations, $$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}, $$ which relate the derivatives of the real and imaginary parts of an analytic function. The method works perfectly (not because of magic!) because this relation between the derivatives is necessarily encoded in the algorithm that computes the function for complex arguments. The function may be written $$ f(x + ih) = u(x, h) + i v(x, h), $$ where $x$, $h$, $u$, and $v$ are real. Then $$ \textrm{Im}\left\{ f(x + ih) \right\} = h \frac{\partial v(x,y)}{\partial y}\Bigg|_{y=0} + O(h^2), $$ where we have used $v(x,0)=0$, which follows from the fact that $f(x)=u(x,0)$ is real. Dividing by $h$, taking the limit $h \to 0$, and using the Cauchy-Riemann equations [or Eq. (1)], we find that $f’(x)$ is given by
$$ \frac{\partial v(x,y)}{\partial y}\Bigg|_{y=0} = \frac{\partial u(x,0)}{\partial x}. $$ The complex-step method gives the left-hand side. Finite-difference schemes approximate the right-hand side via evaluations of $u(x,0)$. Note that the two methods make use of two completely different real-valued functions, $u$ and $v$.
In fact, the algorithm for cos(z) in Julia explicitly employs
$$ \cos(x + iy) = \cos(x) \cosh(y) - i \sin(x) \sinh(y). $$ The imaginary part is $- \sin(x) \sinh(y)$. Furthermore, for $|y|$ less than a small cutoff ($10^{-100}$ is well below it), the algorithm for $\sinh(y)$ returns exactly $y$. So the complex-step method applied to cosine computes exactly $(-\sin(x) \cdot h) / h$, and the IEEE floating-point division returns exactly $-\sin(x)$. Now it is clear why this may be called “approximate automatic differentiation”.

The encoding of the derivative follows from complex arithmetic alone. Polynomials are analytic; consider $f(z) = z^2$: $$ f(z) = z^2 = x^2 - y^2 + i\,2xy. $$ Finite-difference schemes evaluate the real function $x^2$. The complex-step method gives $\textrm{Im}\{f(x+ih)\}/h = 2xh/h = 2x$, in this case exactly, for any $h = y > 0$.
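A quick check of the claims above, reusing deriv_cstep from earlier (h is well below the sinh cutoff):

x, h = pi/4, 1e-100
imag(cos(complex(x, h))) == -sin(x) * h   # true: sinh(h) returns exactly h
deriv_cstep(cos)(x) == -sin(x)            # true, matching the zero difference reported above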
Comparison to automatic differentiation
How does this compare to forward automatic differentiation?
julia> using ForwardDiff
julia> @btime ForwardDiff.derivative(cos, y1)
20.619 ns (0 allocations: 0 bytes)
-0.7071067811865475
julia> @btime deriv_cstep(cos)(y1)
26.636 ns (0 allocations: 0 bytes)
-0.7071067811865475
ForwardDiff is faster. This is not surprising, because the complex-step method performs complex arithmetic as well as unused computations. For instance, the real part is calculated but not used.
Non-analytic functions
The function $f(z)=\textrm{abs}(z)^2= z \bar z = x^2 + y^2$ is nowhere complex analytic, but is real analytic for all real $x$ with $f’(x)=2x$. The complex-step method is not applicable, but automatic differentiation is. We don’t need a computer to see that the complex-step method will return identically $0.0$, since the imaginary part of $x^2 + y^2$ vanishes, while automatic differentiation trivially gives the same result as for $f(x)=x^2$. In fact, the same holds for $f(z)=\textrm{abs}(z)$, except at $x=0$, where the real derivative does not exist.
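Here is a quick illustration, reusing deriv_cstep from above together with ForwardDiff; in Julia, abs2(z) computes $\textrm{abs}(z)^2$.

using ForwardDiff
# The imaginary part of abs2 at x + ih is identically zero, so the
# complex step returns 0.0 regardless of h.
deriv_cstep(abs2)(3.0)              # 0.0
# Forward AD differentiates abs2 on the reals like x^2.
ForwardDiff.derivative(abs2, 3.0)   # 6.0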
External libraries
The complex-step method is compatible with functions in external libraries, provided they accept complex arguments.
using PyCall
@pyimport mpmath
# Bernoulli polynomials via mpmath; convert the result back to Julia numbers
mybernpoly(n, z::Complex) = (res = mpmath.bernpoly(n, z); complex(float(res[:real]), float(res[:imag])))
mybernpoly(n, z::Real) = float(mpmath.bernpoly(n, z))
bpderiv_diff = deriv_diff(z -> mybernpoly(10, z))
bpderiv_cstep = deriv_cstep(z -> mybernpoly(10, z))
# Finite difference
julia> bpderiv_diff(2.0)
90.00000010344269
# Complex step
julia> bpderiv_cstep(2.0)
90.0
The exact value is $B_{10}’(2) = 10 B_9(2) = 90$, so the complex-step result is exact, while the finite difference errs in the eighth decimal place.
Higher order derivatives
Can we use the complex-step method to compute the second derivative of an analytic function $f(z)$? In general, no, we cannot. The second derivative is the derivative of the first derivative, and the first derivative $f’(z)$ is indeed complex analytic. But the method only gives us $f’$ at real arguments; applying the complex step again would require evaluating $f’$ at the complex point $x + ih$.
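Forward automatic differentiation, by contrast, composes with itself, so it can be nested to obtain higher derivatives. A minimal sketch (my example, not from the talk); second_deriv is an invented name:

using ForwardDiff
# Nest forward AD; ForwardDiff supports nested dual numbers.
second_deriv(f) = x -> ForwardDiff.derivative(y -> ForwardDiff.derivative(f, y), x)
second_deriv(cos)(pi/4)   # ≈ -cos(pi/4)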
Acknowledgments
Thanks to the people commenting in this thread for improving this post.
- “to machine precision” is slightly imprecise. In fact, the complex-step method gives exactly the numerical evaluation of the symbolic derivative. For instance, as computed by Julia (or Python, for instance), sin(pi/4) differs from sqrt(2)/2 by eps(sqrt(2)/2), the difference between successive floating-point numbers. The complex-step method for the derivative of cos(x) evaluated at pi/4 returns exactly -sin(pi/4) and, in particular, not -sqrt(2)/2.