Differential Operators: A Formal Algebraic Approach

Introduction#

Classical vector calculus presents students with a bewildering array of distinct operations—gradient, divergence, curl, Laplacian, Jacobian, Hessian—each with its own notation, definition, and geometric interpretation. While this zoo of operators serves physics and engineering well, it obscures a deeper unity: all these operators are just different ways of composing partial derivatives.

This note develops a systematic, algebraic framework for differential operators that treats them as first-class mathematical objects that can be manipulated, composed, and applied. The key insight is disarmingly simple: if we denote the partial derivative operator by \(\partial_i\) (or \(\partial_x\), \(\partial_y\), etc., when variables have names), then we can construct all the familiar operators of vector calculus through standard algebraic operations—scalar multiplication, vector dot products, cross products, and tensor products.

The Central Idea#

Consider the gradient operator. Traditionally, we define it as an operation that takes a scalar function \(f: \mathbb{R}^n \to \mathbb{R}\) and produces a vector field. But we can think of it more abstractly: the gradient operator is the column vector

\[\begin{split}\partial = \begin{bmatrix} \partial_1 \\ \partial_2 \\ \vdots \\ \partial_n \end{bmatrix}\end{split}\]

where each \(\partial_i\) represents the operator “take the partial derivative with respect to the \(i\)-th variable.”

When we write \(\partial \cdot f\), we’re using the notation of scalar multiplication from linear algebra—but instead of multiplying numbers, we’re applying operators to functions. The gradient of \(f\) is simply this formal product.

This shift in perspective—from operations on functions to algebraic manipulation of operators—has profound consequences. Once we accept that \(\partial\) is a legitimate vector of operators, the rest of vector calculus falls into place through familiar algebraic constructions:

The Laplacian \(\Delta f = \nabla^2 f\) becomes \(\partial^2 \cdot f\), the inner product of \(\partial\) with itself, applied to \(f\)
The Hessian becomes \(\partial \otimes \partial\), the tensor (outer) product of \(\partial\) with itself
The divergence of a vector field \(F\) is \(\partial \cdot F\), the inner product of \(\partial\) and \(F\).
The curl of a vector field \(F\) is \(\partial \times F\), the cross product
The Jacobian of a vector field \(F\) is \(\partial \otimes F\), the tensor product
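The list above can be sketched in a few lines of SymPy. This is a minimal illustration, not the accompanying operator engine: the operator vector is modelled as a plain Python list of closures, and all helper names (`partial`, `grad`, `div`, `curl`, `jacobian`, `hessian`, `laplacian`) are assumptions introduced here.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
coords = (x, y, z)

# The operator vector: one "take the partial derivative" closure per coordinate.
partial = [lambda expr, v=v: sp.diff(expr, v) for v in coords]

def grad(f):
    # Formal scalar multiplication: the column vector (d_i f).
    return sp.Matrix([d(f) for d in partial])

def div(F):
    # Formal dot product: sum_i d_i F_i.
    return sum(d(Fi) for d, Fi in zip(partial, F))

def curl(F):
    # Formal cross product, defined only for three components.
    d1, d2, d3 = partial
    return sp.Matrix([d2(F[2]) - d3(F[1]),
                      d3(F[0]) - d1(F[2]),
                      d1(F[1]) - d2(F[0])])

def jacobian(F):
    # Formal tensor product with a vector field: entry (i, j) is d_i F_j.
    return sp.Matrix(len(partial), len(F), lambda i, j: partial[i](F[j]))

def hessian(f):
    # Formal tensor product of the operator vector with itself: d_i d_j f.
    return sp.Matrix(len(partial), len(partial),
                     lambda i, j: partial[i](partial[j](f)))

def laplacian(f):
    # Formal dot product of the operator vector with itself: sum_i d_i d_i f.
    return sum(d(d(f)) for d in partial)

# Hypothetical example fields.
f = x**2 * y + sp.sin(z)
F = sp.Matrix([x * y, y * z, z * x])
```

With these definitions the Hessian of `f` coincides with the Jacobian of its gradient, and its trace is the Laplacian, which is the algebraic relationship the list expresses.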

Here is my problem. Any \(y \in \mathbb{R^m}\) defines the linear form

\[\begin{split}y^* : \left\{ \begin{array}{lr} \mathbb{R^m} \to \mathbb{R} \\ x \mapsto y^*(x) = y^\intercal \cdot x \end{array} \right . \end{split}\]

It’s important to keep in mind that \(y\) and \(y^\intercal\) are different:

\[\begin{split}&y \in \mathbb{R^m} \\ &y^* \in \mathbb{R^m}^* \end{split}\]

\(y^*\) is the dual of \(y\) and lives in the dual space \(\mathbb{R^m}^*\), a different world. In coordinates, \(y^*\) is represented by the row vector \(y^\intercal\), so the two can be treated as synonyms.

For the gradient \(\partial f\) of a function \(f: \mathbb{R^m} \to \mathbb{R}\) we have

\[\begin{split}&\partial f \in \mathbb{R^m} \\ &f(x + h) = f(x) + (\partial f)^\intercal h + o(h) \end{split}\]

So, the gradient is a column vector, and its dual appears in the crucial equation above.
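The expansion can be checked symbolically. The sketch below picks an arbitrary smooth function (an assumption for illustration), moves along a direction \(t h\), and confirms that the remainder and its first derivative in \(t\) both vanish at \(t = 0\), which is what the \(o(h)\) term requires.

```python
import sympy as sp

x1, x2, h1, h2, t = sp.symbols('x1 x2 h1 h2 t', real=True)
x = sp.Matrix([x1, x2])
h = sp.Matrix([h1, h2])

# Arbitrary smooth example function (hypothetical choice).
f = sp.exp(x1) * sp.sin(x2)

# The gradient as a column vector.
grad_f = sp.Matrix([sp.diff(f, v) for v in x])

# f(x + t h), substituting both coordinates at once.
f_shifted = f.subs({x1: x1 + t * h1, x2: x2 + t * h2}, simultaneous=True)

# Remainder of the first-order expansion: f(x + t h) - f(x) - (grad f)^T (t h).
remainder = f_shifted - f - (grad_f.T * (t * h))[0, 0]

# The remainder is o(t) if it and its first t-derivative vanish at t = 0.
r0 = sp.simplify(remainder.subs(t, 0))
r1 = sp.simplify(sp.diff(remainder, t).subs(t, 0))
```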

Let’s consider a linear mapping \(A: \mathbb{R^m} \to \mathbb{R^n}\). Is \(A\) an \(m \times n\) matrix (\(m\) rows, \(n\) columns) or is it \(n \times m\)? If what I said about vectors and their duals is true, then \(A\) should be \(m \times n\), acting as \(A(x) = A^\intercal x\) in the same way that the gradient acts through its transpose.

Its dual \(A^*\) is defined as

\[\begin{split}A^* : \left\{ \begin{array}{lr} \mathbb{R^n}^* \to \mathbb{R^m}^* \\ y^* \mapsto A^*(y^*) \end{array} \right . \end{split}\]

where \(A^*(y^*)\) is defined by:

\[A^*(y^*)(x) = y^*(A(x)) = y^\intercal A^\intercal x = (Ay)^\intercal x\]
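A small symbolic check of this chain of equalities, under the \(m \times n\) convention in which the map acts as \(x \mapsto A^\intercal x\). The dimensions and the helper name `apply_A` are assumptions chosen for illustration.

```python
import sympy as sp

m, n = 3, 2

# A is stored as an m x n matrix of symbols, following the convention above.
A = sp.Matrix(m, n, list(sp.symbols('a1:7')))
x = sp.Matrix(list(sp.symbols('x1:4')))   # x in R^m
y = sp.Matrix(list(sp.symbols('y1:3')))   # y in R^n

def apply_A(vec):
    # The linear map R^m -> R^n acts through the transpose of the stored matrix.
    return A.T * vec

# y^*(A(x)) = y^T A^T x
lhs = (y.T * apply_A(x))[0, 0]

# (A y)^T x: the dual map sends y^* to the dual of the R^m vector A y.
rhs = ((A * y).T * x)[0, 0]
```

Both sides expand to the same polynomial, and the shapes confirm that `A * y` lives in \(\mathbb{R}^m\) while `apply_A(x)` lives in \(\mathbb{R}^n\).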

Now, I am applying the same reasoning to the Jacobian. For the Jacobian \(\partial F\) of a function \(F: \mathbb{R^m} \to \mathbb{R^n}\) we have

\[\begin{split}&\partial F = \begin{bmatrix} \partial_{1}F_1 & \ldots & \partial_{1}F_n \\ \vdots & & \vdots \\ \partial_{m}F_1 & \ldots & \partial_{m}F_n \end{bmatrix} \\ &F(x + h) = F(x) + (\partial F)^\intercal h + o(h) \end{split}\]
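To make the layout concrete, the sketch below builds \(\partial F\) with entry \((i, j) = \partial_i F_j\) for a hypothetical \(F: \mathbb{R}^3 \to \mathbb{R}^2\) and compares it with SymPy's built-in `Matrix.jacobian`, which uses the transposed layout (one row per component of \(F\)). It then checks the first-order expansion in the same way as for the gradient.

```python
import sympy as sp

x1, x2, x3, t = sp.symbols('x1 x2 x3 t', real=True)
x = sp.Matrix([x1, x2, x3])                    # m = 3
h = sp.Matrix(list(sp.symbols('h1:4', real=True)))

# Hypothetical example map F: R^3 -> R^2.
F = sp.Matrix([x1 * x2, sp.sin(x3) + x1])      # n = 2

# The m x n layout used in this note: entry (i, j) is d_i F_j.
dF = sp.Matrix(3, 2, lambda i, j: sp.diff(F[j], x[i]))

# SymPy's Matrix.jacobian returns the n x m layout, i.e. the transpose.
sympy_layout = F.jacobian(x)

# Remainder of F(x + t h) = F(x) + (dF)^T (t h) + o(t).
F_shifted = F.subs(dict(zip(x, x + t * h)), simultaneous=True)
remainder = F_shifted - F - dF.T * (t * h)

r0 = remainder.subs(t, 0).applyfunc(sp.simplify)
r1 = remainder.diff(t).subs(t, 0).applyfunc(sp.simplify)
```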

Tangent and Cotangent Spaces#

Dimension = 1#

Let \(Q\) be a one-dimensional manifold given by a function \(x: [a, b] \to V\) where \(V\) is a vector space; \(x \in C^\infty([a, b], V)\).

We define the operator \(\partial_t\) by

\( \partial_t: \left\{ \begin{array}{lr} C^{\infty}(Q) \to C^{\infty}(Q) \\ f \mapsto \partial_t (f \circ x) \end{array} \right . \)

The operator \(\partial_t\) is called the directional derivative operator in the t-direction. The tangent space of \(Q\) at \(x(t)\) is defined by \( T_{x(t)} Q = \text{span}\{\partial_t\} \). This is a one-dimensional vector space.

The cotangent space of \(Q\) at \(x(t)\) is denoted by \(T^{*}_{x(t)} Q\). This is also a one-dimensional vector space; its elements are the linear mappings \(T_{x(t)} Q \to \mathbb{R}\). Each \(f \in C^{\infty}(Q)\) supplies one such mapping, \(df\), through

\( d: \left\{ \begin{array}{lr} C^{\infty}(Q) \to T^{*}_{x(t)} Q \\ f \mapsto df \end{array} \right . \)

where \(df\) is defined by:

\( df(\partial_t) = \partial_t f \)

so:

\( df(v) = v(f) = \lambda \partial_t f \)

for \(v = \lambda \partial_t \in \text{span}\{\partial_t\}\). \(df\) is called the differential of \(f\).
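A concrete instance may help. The sketch below takes \(V = \mathbb{R}^2\) and a circular arc as the curve (both hypothetical choices), implements \(\partial_t\) as differentiation of \(f \circ x\), and evaluates \(df\) on a tangent vector \(\lambda \partial_t\). The names `d_t` and `df` are introduced here for illustration.

```python
import sympy as sp

t, lam = sp.symbols('t lambda', real=True)
u, v = sp.symbols('u v', real=True)      # coordinates on V = R^2

# Hypothetical curve x(t): an arc of the unit circle.
curve = (sp.cos(t), sp.sin(t))

def d_t(func):
    # partial_t acting on f: differentiate the composition f(x(t)) in t.
    return sp.diff(func.subs({u: curve[0], v: curve[1]}), t)

def df(func, coeff):
    # The differential of f applied to the tangent vector coeff * partial_t.
    return coeff * d_t(func)

f = u**2 * v             # example function on V, restricted to Q
radius_sq = u**2 + v**2  # constant along the curve
```

Here `df(f, lam)` is linear in `lam`, and `d_t(radius_sq)` vanishes because the squared radius does not change along the circle.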

The tangent bundle is defined as a set of pairs (point, tangent vector):

\[\begin{split}T Q &= \bigcup_{x \in Q} (\{x \} \times T_{x} Q) \\ &= \{(x, v) \mid x \in Q, v \in T_{x} Q \} \\ &= \bigcup_{t \in [a, b]} (\{x(t) \} \times T_{x(t)} Q) \\ &= \{(x(t), \lambda \partial_t) \mid t \in [a, b], \lambda \in \mathbb{R} \}\end{split}\]

The cotangent bundle is defined as a set of pairs (point, covector):

\( T^{*} Q = \{(x, p) | x \in Q, p \in T^{*}_{x} Q \} \)

\[\partial_t: \left\{ \begin{array}{lr} C^{\infty}(Q) \to C^{\infty}(Q) \\ f \mapsto \partial_t (f \circ x) \end{array} \right . \qquad \partial = \begin{bmatrix} \partial_1 \\ \vdots \\ \partial_n \end{bmatrix}\]

\[\begin{split}T_{x(t)} Q &= \{\lambda \dot{x}(t) \mid \lambda \in \mathbb{R} \} \quad (1) \\ &= \left\{\frac{\partial}{\partial t} f(x(t)) \, \dot{x}(t) \mid f \in C^\infty(Q) \right\} \quad (2) \\ &= \left\{\frac{\partial}{\partial t} f(x(t)) \mid f \in C^\infty(Q) \right\} \quad (3) \\ &= \left\{\lambda \frac{\partial}{\partial t} \mid \lambda \in \mathbb{R} \right\} \quad (4)\end{split}\]

Notation and Conventions#

We adopt the following notational framework:

Basic derivative operators:

\(\partial_i\) denotes \(\frac{\partial}{\partial x_i}\) when variables are indexed as \(x_1, x_2, \ldots, x_n\)
\(\partial_x, \partial_y, \partial_z\) when variables have names
\(\partial\) denotes the gradient operator, the column vector \((\partial_1, \partial_2, \ldots, \partial_n)^T\)

Operator composition:

\(\partial \cdot f\) is the gradient of scalar function \(f\) (scalar multiplication)
\(\partial^2 = \partial \cdot \partial\) is the Laplacian operator (dot product of operators)
\(\partial \otimes \partial\) (written \(\partial^{\otimes 2}\) for brevity) is the Hessian operator
\(\partial^{\otimes k}\) represents the \(k\)-th order tensor of partial derivatives
For vector fields \(F\): \(\partial \cdot F\) is divergence, \(\partial \times F\) is curl, \(\partial \otimes F\) is the Jacobian

The symbol \(\otimes\) denotes the tensor/outer product. In the accompanying code, we use \ocross as a text representation.
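The higher tensor powers \(\partial^{\otimes k}\) can be sketched with SymPy's `derive_by_array`, which differentiates an expression or array with respect to a list of variables and returns an N-dimensional array. The helper `d_tensor` is a hypothetical name, not part of the accompanying code.

```python
import sympy as sp
from sympy import derive_by_array

x, y = sp.symbols('x y', real=True)
coords = [x, y]

def d_tensor(f, k):
    """Apply the k-fold tensor power of the operator vector to a scalar f."""
    result = f
    for _ in range(k):
        # Each pass adds one index that runs over the coordinates.
        result = derive_by_array(result, coords)
    return result

f = x**3 * y              # hypothetical example function
T2 = d_tensor(f, 2)       # rank 2: the Hessian
T3 = d_tensor(f, 3)       # rank 3: all third-order partial derivatives
```

For `k = 2` the result agrees with `sympy.hessian`; for `k = 3` it is the symmetric rank-3 array that appears in the cubic term of a Taylor expansion.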

Why This Approach?#

Conceptual clarity: Instead of memorizing separate definitions for each operator, we have a single unified principle: compose \(\partial\) using standard algebraic operations.

Dimensional reasoning: The algebraic structure makes dimensional constraints transparent. The curl \(\partial \times F\) involves a cross product, which only makes sense in \(\mathbb{R}^3\). The Laplacian \(\partial^2\) is a scalar operator in any dimension. These constraints emerge naturally from the algebra rather than being imposed ad hoc.

Extensibility: Need the Poisson bracket? It’s simply the bilinear operator

\[\{A, B\} = \partial_x A \cdot \partial_p B - \partial_p A \cdot \partial_x B\]

acting on pairs of functions. Want higher-order derivatives for Taylor expansions? Use \(\partial^{\otimes k}\).
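As a minimal sketch, the bracket \(\{A, B\} = \partial_x A \cdot \partial_p B - \partial_p A \cdot \partial_x B\) can be written with two gradient-like helpers, one per block of variables. The two-degree-of-freedom phase space, the Hamiltonian, and the names `d` and `poisson` are assumptions for illustration.

```python
import sympy as sp

x1, x2, p1, p2 = sp.symbols('x1 x2 p1 p2', real=True)
xs, ps = (x1, x2), (p1, p2)

def d(expr, variables):
    # Column vector of partial derivatives with respect to one block of variables.
    return sp.Matrix([sp.diff(expr, v) for v in variables])

def poisson(A, B):
    # {A, B} = d_x A . d_p B - d_p A . d_x B
    return d(A, xs).dot(d(B, ps)) - d(A, ps).dot(d(B, xs))

# Hypothetical Hamiltonian used to exercise the bracket.
H = (p1**2 + p2**2) / 2 + x1**2 * x2
```

With this Hamiltonian, `poisson(x1, H)` gives `p1` and `poisson(p1, H)` gives `-2*x1*x2`, the two sides of Hamilton's equations for the first degree of freedom.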

Computational implementation: This algebraic perspective translates directly into code. An “operator engine” can implement these constructions as composable objects that respect the algebraic rules and can be applied to symbolic expressions via SymPy.
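What "composable objects" could mean is sketched below. This is an illustrative toy, not the operator engine itself: the class `Op` and the factory `D` are hypothetical names, and only addition, composition, and multiplication by a plain Python number are covered.

```python
import sympy as sp

class Op:
    """A scalar differential operator wrapping a function expr -> expr."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, expr):
        # Applying the operator to an expression.
        return self.fn(expr)

    def __add__(self, other):
        # Sum of operators: (A + B)(f) = A(f) + B(f).
        return Op(lambda e: self(e) + other(e))

    def __mul__(self, other):
        # Product of operators is composition: (A * B)(f) = A(B(f)).
        return Op(lambda e: self(other(e)))

    def __rmul__(self, scalar):
        # Multiplication by a plain Python number: (c * A)(f) = c * A(f).
        return Op(lambda e: scalar * self(e))

def D(var):
    # The partial derivative operator with respect to var.
    return Op(lambda e: sp.diff(e, var))

x, y = sp.symbols('x y', real=True)
dx, dy = D(x), D(y)

# The two-dimensional Laplacian, built algebraically before any function appears.
laplace = dx * dx + dy * dy
```

The operator `laplace` exists independently of any function; `laplace(x**2 + y**2)` then evaluates to `4`.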

Formal vs. Applied#

It’s crucial to distinguish between the operator (a formal algebraic object) and its application to functions. The gradient operator \(\partial\) is a vector of derivative operators. When we apply it to a function \(f\) by writing \(\partial \cdot f\), we get the gradient of \(f\)—a vector field.

This distinction parallels the difference between a matrix \(A\) and its action \(A \mathbf{v}\) on a vector \(\mathbf{v}\). The matrix exists as an algebraic object independent of any particular vector; similarly, \(\partial\) exists as an operator independent of any particular function.

What Follows#

In the sections that follow, we’ll:

Formalize the partial derivative operators \(\partial_i\) and their basic properties
Construct the gradient operator \(\partial\) as a vector of operators
Define scalar, dot, cross, and tensor products of operators
Derive the classical operators (Laplacian, Hessian, divergence, curl, Jacobian) as special cases
Explore operator identities and how they correspond to theorems of vector calculus
Demonstrate the implementation of an “operator engine” that makes this machinery computational

The goal is not merely notational elegance. By treating differential operators as algebraic objects in their own right, we gain a systematic framework that unifies vector calculus, clarifies its structure, and extends naturally to more sophisticated settings—from tensor analysis to differential geometry to the functional analysis underlying variational calculus.

More Notation#

Einstein covariant, index down. \(\mu\) always runs from \(0\) to \(3\):

\[\begin{split}\partial_{\mu} = \begin{bmatrix} \partial_{0} \\ \partial_{1} \\ \partial_{2} \\ \partial_{3} \end{bmatrix}\end{split}\]

Einstein contravariant, index up:

\[\begin{split}\partial^{\mu} = \begin{bmatrix} \partial^{0} \\ \partial^{1} \\ \partial^{2} \\ \partial^{3} \end{bmatrix} = \begin{bmatrix} \partial_{0} \\ - \partial_{1} \\ - \partial_{2} \\ - \partial_{3} \end{bmatrix}\end{split}\]

\[\begin{split}g_{\mu \nu} = \begin{bmatrix} 1 &0 &0 &0 \\ 0 &-1 &0 &0 \\ 0 &0 &-1 &0 \\ 0 &0 &0 &-1 \end{bmatrix}\end{split}\]

\[\partial^{\mu} = g^{\mu \nu} \partial_{\nu}\]

where \(g^{\mu \nu}\), the inverse of \(g_{\mu \nu}\), has the same components.

The d’Alembert symbol:

\[\Box = \partial_{\mu} \cdot \partial^{\mu} = \partial_{0}^2 - \partial_{1}^2 - \partial_{2}^2 - \partial_{3}^2\]
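The index gymnastics can be sketched in SymPy. The code below assumes that \(\partial_0\) is the time derivative in units with \(c = 1\), which the text above does not state, and uses hypothetical helper names (`d_lower`, `d_upper`, `box`).

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z', real=True)
coords = (t, x, y, z)            # assumption: x^0 = t with c = 1

g = sp.diag(1, -1, -1, -1)       # metric with lower indices
g_inv = g.inv()                  # metric with upper indices, same components

def d_lower(f):
    # Covariant derivative components: index down.
    return sp.Matrix([sp.diff(f, v) for v in coords])

def d_upper(f):
    # Contravariant components: raise the index with the inverse metric.
    return g_inv * d_lower(f)

def box(f):
    # d'Alembert operator: sum over mu, nu of g^{mu nu} d_mu d_nu f.
    return sum(g_inv[mu, nu] * sp.diff(f, coords[mu], coords[nu])
               for mu in range(4) for nu in range(4))

# A wave travelling in the x-direction (hypothetical example).
wave = sp.sin(t - x)
```

Under these assumptions `box(wave)` vanishes, which is the statement that the travelling wave solves the wave equation.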

Some Formulae#

The divergence of a curl vanishes:#

\[\partial \cdot (\partial \times F) = 0\]

The curl of a gradient vanishes:#

\[\partial \times (\partial f) = 0\]
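Both identities can be confirmed for arbitrary smooth fields by using undefined SymPy functions as components. This is a sketch with locally defined helpers (`grad`, `div`, `curl`), not the accompanying code; it relies on SymPy ordering mixed partial derivatives canonically so that they cancel.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
coords = (x, y, z)

# Generic scalar field and vector field with unspecified components.
f = sp.Function('f')(x, y, z)
F = sp.Matrix([sp.Function(f'F_{i}')(x, y, z) for i in (1, 2, 3)])

def grad(s):
    return sp.Matrix([sp.diff(s, v) for v in coords])

def div(V):
    return sum(sp.diff(V[i], coords[i]) for i in range(3))

def curl(V):
    return sp.Matrix([sp.diff(V[2], y) - sp.diff(V[1], z),
                      sp.diff(V[0], z) - sp.diff(V[2], x),
                      sp.diff(V[1], x) - sp.diff(V[0], y)])

# Each mixed partial derivative appears twice with opposite signs.
div_of_curl = sp.simplify(div(curl(F)))
curl_of_grad = curl(grad(f)).applyfunc(sp.simplify)
```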

Product rules#

\[\begin{split}\partial (f \, g) &= \partial f \, g + f \, \partial g \\ \partial \cdot (F \, g) &= (\partial \cdot F) \, g + F \cdot \partial g \\ \partial \, (A \cdot B) &= (\partial \otimes A) \, B + (\partial \otimes B) \, A \\ \partial \cdot (A \times B) &= B \cdot (\partial \times A) - A \cdot (\partial \times B) \\ \partial \times (\partial \times F) &= \partial \, (\partial \cdot F) - \partial^2F\end{split}\]

from sympy import symbols
from sympy.matrices.dense import Matrix

from math4phys.diff_ops import (expr_equal, matrices_equal, 
                                make_scalar_field, make_vector_field, 
                                divergence, gradient, hessian, jacobian)

def test_diff_ops():
    # Five dimensions, so none of the identities below relies on R^3.
    n_dim = 5

    x = symbols(f'x_1:{n_dim + 1}', real=True)

    f = make_scalar_field('f', x)
    g = make_scalar_field('g', x)

    F = make_vector_field('F', x, n_dim)
    G = make_vector_field('G', x, n_dim)

    # Divergence product rule, with F . grad(g) written in both orders.
    assert expr_equal(divergence(F * g), divergence(F) * g + (F.T * gradient(g))[0, 0])
    assert expr_equal(divergence(F * g), divergence(F) * g + (gradient(g).T * F)[0, 0])

    # Gradient product rules for f * g and for the dot product F . G.
    assert matrices_equal(gradient(f * g), gradient(f) * g + f * gradient(g))
    assert matrices_equal(gradient(F.T * G), jacobian(F) * G + jacobian(G) * F)
    # The Hessian is the Jacobian of the gradient.
    assert matrices_equal(hessian(f), jacobian(gradient(f)))
    # Second-order product rule for the Hessian of f * g.
    assert matrices_equal(hessian(f * g),
                          (hessian(f) * g +
                           gradient(f) * gradient(g).T +
                           gradient(g) * gradient(f).T +
                           hessian(g) * f))
    
test_diff_ops()
print('all tests passed')

all tests passed
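The test above runs in five dimensions and therefore leaves out the two product rules that involve the cross product. A possible complement in \(\mathbb{R}^3\), written with plain SymPy and locally defined helpers rather than `math4phys.diff_ops` (whose curl support is not shown here), could look like this:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
coords = (x, y, z)

def field(name):
    # Generic vector field with three unspecified component functions.
    return sp.Matrix([sp.Function(f'{name}_{i}')(x, y, z) for i in (1, 2, 3)])

def grad(s):
    return sp.Matrix([sp.diff(s, v) for v in coords])

def div(V):
    return sum(sp.diff(V[i], coords[i]) for i in range(3))

def curl(V):
    return sp.Matrix([sp.diff(V[2], y) - sp.diff(V[1], z),
                      sp.diff(V[0], z) - sp.diff(V[2], x),
                      sp.diff(V[1], x) - sp.diff(V[0], y)])

def laplacian(s):
    return sum(sp.diff(s, v, 2) for v in coords)

A, B, F = field('A'), field('B'), field('F')

# Divergence of a cross product.
lhs_div = div(A.cross(B))
rhs_div = B.dot(curl(A)) - A.dot(curl(B))
div_residual = sp.simplify(sp.expand(lhs_div - rhs_div))

# Curl of a curl, with the Laplacian applied to each component of F.
lhs_curl = curl(curl(F))
rhs_curl = grad(div(F)) - F.applyfunc(laplacian)
curl_residual = (lhs_curl - rhs_curl).applyfunc(sp.simplify)
```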

from myst_nb import glue
from sympy import symbols, latex, sin, cos, Matrix

x = symbols('x')
expr = sin(x)**2 + cos(x)**2
glue("my_expr", latex(expr), display=False)
hessian = Matrix([[2*x, 1], [1, 2]])
glue("hessian_latex", latex(hessian), display=False)
glue("hessian_size", hessian.shape[0])

The expression \(\sin^{2}{\left(x \right)} + \cos^{2}{\left(x \right)}\) is always equal to 1.
