.. _dual-doc: .. ipython:: python :suppress: import numpy as np ************ Dual Numbers ************ The ``rateslib.dual`` module creates dual number datatypes that provide this library with the ability to perform automatic differentiation (AD). The implementation style here is known as *forward mode*, as opposed to *reverse mode* (otherwise called *automatic adjoint differentiation* or *back propagation*). .. container:: twocol .. container:: leftside40 .. image:: _static/thumb_coding_2_1.png :alt: Coding Interest Rates: FX, Swaps and Bonds :target: https://www.amazon.com/dp/0995455562 :width: 145 :align: center .. container:: rightside60 The mathematics and theory used to implement *rateslib's* AD is documented thoroughly in the companion book *Coding Interest Rates: FX, Swaps and Bonds*. .. raw:: html
Summary ******* Classes ------- .. autosummary:: rateslib.dual.Dual rateslib.dual.Dual2 rateslib.dual.Variable Methods ------- .. autosummary:: rateslib.dual.gradient rateslib.dual.dual_exp rateslib.dual.dual_log rateslib.dual.dual_norm_pdf rateslib.dual.dual_norm_cdf rateslib.dual.dual_inv_norm_cdf rateslib.dual.dual_solve rateslib.dual.newton_1dim rateslib.dual.newton_ndim rateslib.dual.ift_1dim rateslib.dual.quadratic_eqn Example ******* Below, a standard Python function is created and is called with standard *floats*. First Derivatives ----------------- .. math:: f(x, y, z) = x^6 + e^{\frac{x}{y}} + \ln {z} \\ f(2, 1, 2) = 72.0822... .. ipython:: python from rateslib.dual import * def func(x, y, z): return x**6 + dual_exp(x/y) + dual_log(z) func(2, 1, 2) For extracting first derivatives, or gradients, we can depend on the *rateslib* :class:`~rateslib.dual.Dual` datatypes and its operator overloading: .. math:: \frac{\partial f}{\partial x} &= \left . 6 x^5 + \frac{1}{y} e^{\frac{x}{y}} \right |_{(2,1,2)} = 199.3890... \\ \frac{\partial f}{\partial y} &= \left . -\frac{x}{y^2} e^{\frac{x}{y}} \right |_{(2,1,2)} = -14.7781... \\ \frac{\partial f}{\partial z} &= \left . \frac{1}{z} \right |_{(2,1,2)} = 0.50 \\ .. ipython:: python x = Dual(2, ["x"], []) y = Dual(1, ["y"], []) z = Dual(2, ["z"], []) value = func(x, y, z) value.real gradient(value, ["x", "y", "z"]) Second Derivatives ------------------ For extracting second derivatives we must use the :class:`~rateslib.dual.Dual2` datatype: .. ipython:: python x = Dual2(2, ["x"], [], []) y = Dual2(1, ["y"], [], []) z = Dual2(2, ["z"], [], []) value = func(x, y, z) value.real value.dual gradient(value, ["x", "y"], order=2) Implementation *************** .. ipython:: python x = Dual(2.0, ["x"], []) y = Dual(3.0, ["y"], []) Forward mode AD is implemented using operating overloading and custom compatible functions. The operations implemented are; - addition (+) .. ipython:: python x + y + 1.5 - subtraction and negation (-), .. ipython:: python x - y - 1.5 - multiplication (*), .. ipython:: python x * y * 1.5 - division and inversion (/) (\*\*-1), .. ipython:: python x / y / 1.5 - dual and float type powers (\*\*), .. ipython:: python x ** y ** 1.5 - exponential and logarithms (which require the specific methods below), .. ipython:: python dual_exp(x) dual_log(y) - equality of dual numbers with integers and floats and with each other. .. ipython:: python x == y .. warning:: :class:`~rateslib.dual.Dual` and :class:`~rateslib.dual.Dual2` are not designed to operate with each other. The purpose for this is to avoid miscalculation of second derivatives. :class:`~rateslib.dual.Dual` should always be replaced by :class:`~rateslib.dual.Dual2` in this instance. ``TypeErrors`` will be raised otherwise. Compatability with NumPy ************************ To enable this library to perform its calculations in a vectorised way we need to leverage NumPy's array calculations. NumPy arrays containing dual numbers are forced to have an ``object`` dtype configuration. This is imposed by NumPy and means that certain functions may not be compatible, for example ``np.einsum`` (although, support for ``object`` dtypes was added to ``np.einsum`` as of version 1.25.0). However, many functions are compatible. Broadcasting ------------ Operations of :class:`~rateslib.dual.Dual` and :class:`~rateslib.dual.Dual2` with ``int`` and ``float`` dtypes permit the NumPy versions; np.int8, np.int16, np.int32, np.int64, np.float16, np.float32, np.float64, and np.float128. Broadcasting of arrays has been implemented so that the following operations work as expected. .. ipython:: python np_arr = np.array([1, 2]) Dual(3, ["x"], []) * np_arr np_arr / Dual(4, ["y"], []) Dual(4, ["x"], []) ** np_arr Elementwise Operations ---------------------- Simple operations on tensors also work as expected. .. ipython:: python x = np.array([Dual(1, ["x"], []), Dual(2, ["y"], [])]) y = np.array([Dual(3, ["x"], []), Dual(4, ["y"], [])]) x + y x * y x / y Linear Algebra -------------- Common linear algebraic operations are also available, such as: - ``np.matmul`` - ``np.inner`` - ``np.dot`` - ``np.tensordot`` .. ipython:: python np.dot(x, y) np.inner(x, y) np.matmul(x[:, np.newaxis], y[np.newaxis, :]) np.tensordot(x[np.newaxis, :, np.newaxis], y[np.newaxis, :], (1, 1)) Solving the linear system, :math:`Ax=b`, is not not directly possible from NumPy, thus a custom solver, :meth:`~rateslib.dual.dual_solve`, has been implemented using the Doolittle algorithm with partial pivoting. .. ipython:: python A = np.array([ [1, 0], [Dual(2, ["z"], []), 1] ], dtype="object") b = np.array([Dual(2, ["y"], []), Dual(5, ["x", "y"], [])])[:, np.newaxis] x = dual_solve(A, b) x np.matmul(A, x) Exogenous Variables ********************* The :class:`~rateslib.dual.Variable` class allows users to inject sensitivity into calculations without knowing which AD order is required for calculations - calculations will **not** raise *TypeErrors*. Upon first instance of a binary operation with an object of either order it will return an object of associated type. .. ipython:: python x = Variable(2.5, ["x"]) x * Dual(1.5, ["y"], [2.2]) x * Dual2(1.5, ["y"], [2.2], []) In order for other internal processes to function dynamically, *rateslib* maintains a **global AD order**. When a *Variable* performs a self operation from which it cannot infer the AD order, it will refer to this global state. .. ipython:: python defaults._global_ad_order = 1 dual_exp(x) defaults._global_ad_order = 2 dual_exp(x) defaults._global_ad_order = 1 # Reset Product Rule and ``keep_manifold`` *********************************** The ``keep_manifold`` argument is also exclusively available for :class:`~rateslib.dual.Dual2`. When extracting a first order gradient from a :class:`~rateslib.dual.Dual2` there is information available to also express the gradient of this gradient. Consider the function :math:`g(x)=x^2` at :math:`x=2`. This is the object: .. ipython:: python g = Dual2(4.0, ["x"], [4.0], [1.0]) The default, first order, ``gradient`` extraction will simply yield floating point values: .. ipython:: python gradient(g, order=1) However, by directly requesting to ``keep_manifold`` then the output will yield a :class:`~rateslib.dual.Dual2` datatype of this information with the appropriate sensitivity of this gradient derived from second order. .. ipython:: python gradient(g, order=1, keep_manifold=True) This is not used frequently within *rateslib* but its purpose is to preserve the chain rule, and allow two chained operations of the :meth:`~rateslib.dual.gradient` method. Consider the additional function :math:`h(y)=y^2`, evaluated at the point :math:`y=4`. This is the object: .. ipython:: python h = Dual2(16.0, ["y"], [8.0], [1.0]) It is natural to derive second order gradients of the product of these functions at the supposed points using: .. ipython:: python gradient(g * h, order=2) Below demonstrates the dual application of the method to derive the same values. .. ipython:: python gradient(gradient(g * h, keep_manifold=True)[0]) gradient(gradient(g * h, keep_manifold=True)[1])