17.01.2026
Differential Calculus#
Johannes Siedersleben, April 2026
Introduction#
WORK IN PROGRESS
Linear Mappings, Matrices, Tensors#
Let \(A\) be a real \((m \times n)\) matrix, so \(A^\intercal\) is \((n \times m)\). \(A\) defines a linear mapping \(A:\mathbb{R}^m \to \mathbb{R}^n\) by
Likewise,
So, every vector and every matrix can also be considered a linear mapping defined as above. Physicists prefer the bracket notation:
To the right of the bar is the argument, and to the left is the function. No need to care about transpositions. We normally use brackets, but keep in mind that there are three alternative ways to express the same thing.
Let \(B\) be an \((n \times p)\) matrix, so \(B^\intercal\) is \((p \times n)\). \(B\) defines a linear mapping \(B:\mathbb{R}^n \to \mathbb{R}^p\). The composition \(B\circ A: \mathbb{R}^m \to \mathbb{R}^p\) is a linear mapping defined by
so:
The matrix \((AB)^\intercal\) of the composition \(B\circ A\) is \((p \times m)\). This is important for the chain rule. The case \(p = 1\) is frequent:
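A minimal sketch of the composition rule in plain Python. It assumes that an \((m \times n)\) matrix \(A\) acts as \(A(x) = A^\intercal x\); this reading is inferred from the stated dimensions and is an assumption, as are the helper names `apply` and `matmul`. Under that assumption, applying \(A\) and then \(B\) agrees with applying the single matrix \(AB\):

```python
def matmul(P, Q):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def apply(M, x):
    """Apply the mapping of M under the assumed convention M(x) = M^T x."""
    rows, cols = len(M), len(M[0])
    return [sum(M[i][j] * x[i] for i in range(rows)) for j in range(cols)]


A = [[1, 2, 0], [0, 1, 3]]    # (m x n) = (2 x 3): maps R^2 -> R^3
B = [[1, 0], [2, 1], [0, 4]]  # (n x p) = (3 x 2): maps R^3 -> R^2
x = [1, -1]

step_by_step = apply(B, apply(A, x))  # B(A(x))
via_product = apply(matmul(A, B), x)  # (AB)^T x

print(step_by_step, via_product)
```

Both results coincide, which is why the product \(AB\), not \(BA\), represents \(B \circ A\) in this convention.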
Tensors, Frobenius Product#
Let \(x \in \mathbb{R}^n\), \(y \in \mathbb{R}^m\). Then
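Let \(S\), \(T\) be tensors with shape \(\left(n_1, n_2,\ldots, n_q\right)\). Then the Frobenius product \(\langle S,T\rangle\) of \(S\) and \(T\) is defined by
\[\langle S,T\rangle = \sum_{i_1=1}^{n_1}\cdots\sum_{i_q=1}^{n_q} S_{i_1 \ldots i_q}\, T_{i_1 \ldots i_q}\]
or, in Einstein notation:
\[\langle S,T\rangle = S_{i_1 \ldots i_q}\, T_{i_1 \ldots i_q}\]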
So, \(h^{\otimes q}\) is a tensor with shape \((n, n, \ldots, n)\) with \(q\) times \(n\). With an \((n\times n)\) matrix \(A\) we have:
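A small numerical sketch for the case \(q = 2\). It assumes that the tensor power \(h^{\otimes 2}\) has entries \(h_i h_j\) and that the Frobenius product sums the entrywise products; the function names are illustrative. With these assumptions, \(\langle A, h^{\otimes 2}\rangle\) equals the quadratic form \(h^\intercal A h\):

```python
def outer(x, y):
    """Tensor (outer) product of two vectors: entries x_i * y_j."""
    return [[xi * yj for yj in y] for xi in x]


def frobenius(S, T):
    """Frobenius product of two matrices: sum of entrywise products."""
    return sum(s * t for row_s, row_t in zip(S, T) for s, t in zip(row_s, row_t))


def quadratic_form(A, h):
    """h^T A h written out as a double sum."""
    n = len(h)
    return sum(h[i] * A[i][j] * h[j] for i in range(n) for j in range(n))


A = [[2, 1], [0, 3]]
h = [1, 2]

lhs = frobenius(A, outer(h, h))
rhs = quadratic_form(A, h)
print(lhs, rhs)
```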
Vector Calculus in \(\mathbb{R}^n\)#
Let \(F:\mathbb{R}^n \to \mathbb{R}^m\) be a function. A linear mapping \(\text{DF}(x):\mathbb{R}^n \to \mathbb{R}^m\) is the derivative of \(F\) at \(x\) iff
\(\text{DF}(x)\) is also called the Jacobian of \(F\) at \(x\).
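A minimal numerical sketch of what the definition demands. It assumes the usual criterion that \(\lVert F(x+h) - F(x) - \text{DF}(x)h \rVert / \lVert h \rVert \to 0\) as \(h \to 0\), and stores the Jacobian with one row per component of \(F\). The example function and all names are chosen for illustration only:

```python
import math


def F(p):
    """Example function R^2 -> R^2 (an arbitrary choice for illustration)."""
    x, y = p
    return [x * y, x + y * y]


def jacobian_F(p):
    """Hand-computed Jacobian of F; row i holds the partial derivatives of F_i."""
    x, y = p
    return [[y, x], [1.0, 2.0 * y]]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def remainder_ratio(p, h):
    """||F(p + h) - F(p) - DF(p) h|| / ||h||, which should tend to 0 with h."""
    J = jacobian_F(p)
    shifted = F([pi + hi for pi, hi in zip(p, h)])
    linear = [sum(J[i][j] * h[j] for j in range(len(h))) for i in range(len(J))]
    rem = [a - b - c for a, b, c in zip(shifted, F(p), linear)]
    return norm(rem) / norm(h)


point = [1.0, 2.0]
ratios = [remainder_ratio(point, [t, t]) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratio shrinks in proportion to the step size, as expected for a differentiable function.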
Definition 3 (Directional Derivatives)#
Let \(x,h \in \mathbb{R}^n\), and let \(F:\mathbb{R}^n \to \mathbb{R}^m\) be defined in an open neighborhood of \(x\) and differentiable at \(x\). The directional derivative of \(F\) with respect to \(h\) is defined as:
The first equation is the definition of the directional derivative, the second is the definition of the derivative in one dimension, and the third follows from the chain rule. If \(b_j\) is the \(j\)-th basis vector, then
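A short sketch of the directional derivative as a one-dimensional derivative of \(t \mapsto F(x + th)\) at \(t = 0\), estimated here by central differences. The example function, the step size `eps` and the row-per-component Jacobian layout are assumptions made for illustration. The estimate is compared with the Jacobian applied to \(h\), and with a Jacobian column when \(h\) is a basis vector:

```python
def F(p):
    """Example function R^2 -> R^2 (an arbitrary choice for illustration)."""
    x, y = p
    return [x * y, x + y * y]


def directional_derivative(fn, p, h, eps=1e-6):
    """Central-difference estimate of d/dt fn(p + t h) at t = 0."""
    plus = fn([pi + eps * hi for pi, hi in zip(p, h)])
    minus = fn([pi - eps * hi for pi, hi in zip(p, h)])
    return [(a - b) / (2 * eps) for a, b in zip(plus, minus)]


point = [1.0, 2.0]
jac = [[2.0, 1.0], [1.0, 4.0]]  # hand-computed Jacobian of F at (1, 2)

h = [3.0, -1.0]
along_h = directional_derivative(F, point, h)
jac_times_h = [sum(jac[i][j] * h[j] for j in range(2)) for i in range(2)]

along_b1 = directional_derivative(F, point, [1.0, 0.0])  # first basis vector
first_column = [jac[0][0], jac[1][0]]

print(along_h, jac_times_h)
print(along_b1, first_column)
```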
Theorem 2 (Chain Rule)#
Let \(F:\mathbb{R}^n \to \mathbb{R}^m\) be differentiable at \(x\) and \(G:\mathbb{R}^m \to \mathbb{R}^p\) differentiable at \(F(x)\). Then \(G\circ F:\mathbb{R}^n \to \mathbb{R}^p\) is differentiable at \(x\) and we have:
or, omitting the argument:
Note that:
You have to multiply the transpose of \(\text{DG}(F)\) and \(\text{DF}\); \(\text{DG}(F)\) and \(\text{DF}\) cannot be multiplied. The case \(p = 1\):
So, for instance:
or, with an \((n\times m)\) matrix \(A\):
Proof:
So:
which proves the assertion via uniqueness.
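A numerical sanity check of the chain rule, sketched with central differences. It uses the common layout in which row \(i\) of a Jacobian holds the partial derivatives of the \(i\)-th component, so that \(D(G\circ F)(x) = \text{DG}(F(x))\,\text{DF}(x)\) as a plain matrix product; this layout may differ by a transpose from the one used in the text. The functions and helper names are illustrative assumptions:

```python
def jacobian(fn, p, eps=1e-6):
    """Central-difference Jacobian; row i holds the partials of output i."""
    columns = []
    for j in range(len(p)):
        plus = list(p)
        minus = list(p)
        plus[j] += eps
        minus[j] -= eps
        columns.append([(a - b) / (2 * eps) for a, b in zip(fn(plus), fn(minus))])
    n_out = len(columns[0])
    return [[columns[j][i] for j in range(len(p))] for i in range(n_out)]


def matmul(P, Q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Q)] for row in P]


def F(p):  # R^2 -> R^2
    x, y = p
    return [x * y, x + y * y]


def G(q):  # R^2 -> R^1, i.e. the case p = 1
    u, v = q
    return [u * u + 3.0 * v]


def G_after_F(p):
    return G(F(p))


point = [1.0, 2.0]
lhs = jacobian(G_after_F, point)                         # D(G o F)(x)
rhs = matmul(jacobian(G, F(point)), jacobian(F, point))  # DG(F(x)) DF(x)
print(lhs, rhs)
```

For this example the composite is \(x^2y^2 + 3x + 3y^2\), whose gradient at \((1, 2)\) is \((11, 16)\); both computations reproduce it up to rounding.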
Definition 2#
Let \(F:\mathbb{R}^n \to \mathbb{R}^m\) be a function. The partial derivative of \(F_i\) with respect to \(x_j\) is defined as:
Let \(\left(j_1, j_2,\ldots, j_q\right)\) be a sequence of indices. The partial derivative of \(F_i\) with respect to \(x_{j_1},x_{j_2},\ldots, x_{j_q}\) is recursively defined as:
Theorem 3 (Jacobian and Partial Derivatives)#
a) If \(F:\mathbb{R}^n \to \mathbb{R}^m\) is differentiable at \(x\), then
which is the same as (omitting the argument \(x\))
So:
b) If all \(\partial_j F_i\) are continuous in an open neighborhood \(U\) of \(x\), then \(F\) is differentiable in \(U\), and (a) holds for all \(y\in U\).
Theorem 4 (Partial derivatives are interchangeable)#
Let \(f:\mathbb{R}^n \to \mathbb{R}\) be a function. If \(\partial_i\partial_j f\) and \(\partial_j\partial_i f\) are continuously differentiable in an open neighborhood of \(x\), then
So, if all \(\partial_i\partial_j f\) are continuously differentiable in an open neighborhood of \(x\), then Theorem 3 applies and we have:
which is the same as
More generally, higher-order partial derivatives, such as
do not depend on the order of the indices, or, equivalently, the tensor of partial derivatives is fully symmetric.
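A quick numerical illustration of the symmetry for \(q = 2\). The example function \(f(x,y) = x^3y^2 + \sin(xy)\), the evaluation point and the step size are arbitrary choices. The first partial derivatives are written out by hand, and each is then differentiated numerically with respect to the other variable, so the two mixed derivatives are computed along genuinely different routes:

```python
import math


def df_dx(x, y):
    """Hand-computed partial derivative of f(x, y) = x^3 y^2 + sin(x y) in x."""
    return 3 * x**2 * y**2 + y * math.cos(x * y)


def df_dy(x, y):
    """Hand-computed partial derivative of the same f in y."""
    return 2 * x**3 * y + x * math.cos(x * y)


eps = 1e-5
x0, y0 = 0.7, -1.3

# d/dy of (df/dx) and d/dx of (df/dy), both by central differences
d_y_of_dx = (df_dx(x0, y0 + eps) - df_dx(x0, y0 - eps)) / (2 * eps)
d_x_of_dy = (df_dy(x0 + eps, y0) - df_dy(x0 - eps, y0)) / (2 * eps)

print(d_y_of_dx, d_x_of_dy)
```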
Differential Operators#
There is the dot notation for \(n=1\):
The terms \(\partial\), \(\partial^{\otimes q}\) and their aliases (\(\nabla\), grad, H, D, J) are operators that map a scalar- or vector-valued function (\(f\) or \(F\)) to a tensor-valued function, such as \(\partial f\).
Let \(F:\mathbb{R}^n\times \mathbb{R}^m \to \mathbb{R}^p\) be a function of two vectors. Then we write, assuming that the arguments are denoted by \(x\) and \(y\):
We assume that the subscript identifies the argument in question. So, writing \(\partial_{x_j}F(x,y)\) or \(\partial_{y_k}F(x,y)\) can be useful and unambiguous. In physics, we often encounter argument lists like \(L(x, \dot{x})\), where the notation indicates that the second argument is supposed to be the velocity. This leads to expressions like \(\partial_{\dot{x}}L\left(x,\dot{x}\right)\). In this case, \(\dot{x}\) is just the name of an argument that happens to have a dot on top.
Referencing arguments by name can be ambiguous. Take the functions \(f(x, y)\), \(g(x, y)\), \(h(x, y)\) and differentiate \(f(g(x,y), h(x,y))\). Here \(\partial_x\) could refer either to the first argument of \(f\) or to the variable \(x\) of the composite function.
The chain rule gives:
Ambiguities are eliminated by parentheses. Identifying the variables by indices works as well:
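As a worked illustration in index notation (a sketch, writing \(\partial_1\) and \(\partial_2\) for the derivative with respect to the first and second argument of whichever function follows), the chain rule for the composite \(f(g(x,y), h(x,y))\) reads:

\[
\partial_1\bigl[f(g(x,y),h(x,y))\bigr] = (\partial_1 f)(g(x,y),h(x,y))\,\partial_1 g(x,y) + (\partial_2 f)(g(x,y),h(x,y))\,\partial_1 h(x,y)
\]

\[
\partial_2\bigl[f(g(x,y),h(x,y))\bigr] = (\partial_1 f)(g(x,y),h(x,y))\,\partial_2 g(x,y) + (\partial_2 f)(g(x,y),h(x,y))\,\partial_2 h(x,y)
\]

The parentheses around \((\partial_1 f)\) make explicit that \(f\) is differentiated with respect to its own first argument before the inner functions are substituted.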
Some Formulae#
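\(\operatorname{div}(\operatorname{curl} F) = 0\)
\(\operatorname{curl}(\operatorname{grad} f) = 0\)
\(\operatorname{div}(fF) = f\,\operatorname{div} F + \operatorname{grad} f \cdot F\)
\(\Delta F = \operatorname{grad}(\operatorname{div} F) - \operatorname{curl}(\operatorname{curl} F)\)
\(\operatorname{div}(F \times G) = G \cdot \operatorname{curl} F - F \cdot \operatorname{curl} G\)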
Notation 4 (1+3 Dimensions)#
Definition 4 (Stationary Points)#
Let \(f:\mathbb{R}^n \to \mathbb{R}\) be a function, differentiable at \(x\). The point \(x\) is a stationary point of \(f\) if
\[\partial f(x) = 0\]
Theorem 3 (Local Extrema)#
Let \(f:\mathbb{R}^n \to \mathbb{R}\) be a function, differentiable at \(x\). If \(f\) has a local minimum or maximum at \(x\), then \(x\) is a stationary point of \(f\).
Proof (for minimum only): Let \(h \in \mathbb{R}^n\) be a small vector. Then:
So, for small positive \(\epsilon\) we get:
which proves the statement.
Theorem 4 (Derivative of the Inverse Function)#
a) Let \(F:\mathbb{R}^n \to \mathbb{R}^n\) be differentiable at \(x\), and let \(F^{-1}:\mathbb{R}^n \to \mathbb{R}^n\) be differentiable at \(F(x)\). Then:
which is the same as (setting \(y=F(x)\))
which is the same as (omitting \(x\))
Proof: Immediate from the chain rule:
b) Let \(F:\mathbb{R}^n \to \mathbb{R}^n\) be continuously differentiable in an open neighborhood of \(x\), and let \(\text{DF}(x)\) be invertible.
Then there is an open neighborhood \(U\) of \(x\) on which \(F\) is invertible, and \(F^{-1}\) is differentiable on \(F(U)\). So, (a) applies on \(U\).
Proof: Omitted; this is the inverse function theorem.
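A numerical sketch of part (a) for polar coordinates, \(F(r,\theta) = (r\cos\theta, r\sin\theta)\), whose inverse is known in closed form away from the origin. It assumes the statement \(D(F^{-1})(F(x)) = \text{DF}(x)^{-1}\) in the row-per-component Jacobian layout, so the product of the two Jacobians should be the identity matrix. The helpers and the evaluation point are illustrative:

```python
import math


def jacobian(fn, p, eps=1e-6):
    """Central-difference Jacobian; row i holds the partials of output i."""
    columns = []
    for j in range(len(p)):
        plus = list(p)
        minus = list(p)
        plus[j] += eps
        minus[j] -= eps
        columns.append([(a - b) / (2 * eps) for a, b in zip(fn(plus), fn(minus))])
    n_out = len(columns[0])
    return [[columns[j][i] for j in range(len(p))] for i in range(n_out)]


def matmul(P, Q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Q)] for row in P]


def F(p):
    """Polar to Cartesian coordinates."""
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]


def F_inv(q):
    """Cartesian to polar coordinates (valid away from the origin)."""
    x, y = q
    return [math.hypot(x, y), math.atan2(y, x)]


point = [2.0, 0.5]
product = matmul(jacobian(F_inv, F(point)), jacobian(F, point))
print(product)
```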
Theorem 5 (Derivative of Implicit Functions)#
Let \(F:\mathbb{R}^n\times\mathbb{R}^m \to \mathbb{R}^m\) be continuously differentiable in an open neighborhood of \((a,b)\), and let \(F(a,b) = 0\). \(F\) is a function of two vectors \(x\) and \(y\). The partial derivatives \(\partial_xF\), \(\partial_yF\) are defined as above (see Notation XX). Let \(\partial_yF(a,b)\) be invertible. Then there is a function \(G:A \to \mathbb{R}^m\) (with \(A\subset \mathbb{R}^n\) open and \(a\in A\)) such that:
and the derivative of \(G\) is given by
Proof: Let
\(H\) is invertible. The inverse \(H^{-1}\) is defined in an open neighborhood \(A\times B\) of \((a,0)\) (with \(A\subset \mathbb{R}^n\), \(B\subset \mathbb{R}^m\)), and there is a function \(K:A\times B \to \mathbb{R}^m\) such that
on \(A\times B\). So, for \(x\in A\) and \(y=0\) we get:
Setting \(G(x) = K(x,0)\) completes the proof. \(K\) is defined on \(A\times B\), \(G\) is defined on \(A\). The derivative of \(G\) is immediate from the chain rule.
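A one-dimensional sketch (\(n = m = 1\)) using the unit circle, \(F(x,y) = x^2 + y^2 - 1\), near the point \((a,b) = (0.6, 0.8)\). It assumes the usual form of the result, \(F(x, G(x)) = 0\) and \(\partial G(x) = -\left(\partial_yF(x,G(x))\right)^{-1}\partial_xF(x,G(x))\). Here \(G(x) = \sqrt{1-x^2}\) is known explicitly, so the formula can be compared with a direct numerical derivative; the step size and names are illustrative:

```python
import math


def F(x, y):
    """Implicit equation of the unit circle: F(x, y) = 0."""
    return x * x + y * y - 1.0


def G(x):
    """Explicit solution y = G(x) on the upper half of the circle."""
    return math.sqrt(1.0 - x * x)


a, b = 0.6, 0.8
eps = 1e-6

# Partial derivatives of F at (a, b) by central differences
dF_dx = (F(a + eps, b) - F(a - eps, b)) / (2 * eps)
dF_dy = (F(a, b + eps) - F(a, b - eps)) / (2 * eps)

from_formula = -dF_dx / dF_dy                    # -(d_y F)^(-1) d_x F
direct = (G(a + eps) - G(a - eps)) / (2 * eps)   # numerical derivative of G

print(F(a, G(a)), from_formula, direct)
```

Both values are close to \(-a/b = -0.75\), the slope of the circle at that point.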
Theorem 41 (Taylor in n dimensions)
Let \(f:\mathbb{R}^n\to \mathbb{R}\) and assume that \(\partial^{\otimes (k+1)} f\) is continuous on an open neighborhood \(U\) of \(x \in \mathbb{R}^n\). Let \(h\in \mathbb{R}^n\) be such that the segment \([x,x+h]\) lies in \(U\). Then there exists a \(\xi \in [x,x+h]\) such that:
Proof. It holds that, for \(j \leq k\):
because:
and so on. Taylor in one dimension tells us that, for \(g(t) = f(x+th)\):
for some \(\xi \in [0,1]\). Rewriting this equation in terms of \(f\) gives the statement.
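A numerical sketch of the statement, assuming it has the usual form \(f(x+h) = \sum_{j=0}^{k} \frac{1}{j!}\langle \partial^{\otimes j} f(x), h^{\otimes j}\rangle + \frac{1}{(k+1)!}\langle \partial^{\otimes (k+1)} f(\xi), h^{\otimes (k+1)}\rangle\). The example \(f(p) = \exp(\langle c, p\rangle)\) is chosen because its derivative tensors are \(f(p)\,c^{\otimes j}\), so every Frobenius product collapses to \(f(p)\,\langle c,h\rangle^j\). The vector `c`, the points and the order `k` are arbitrary:

```python
import math

c = (1.0, 2.0)


def f(p):
    """f(p) = exp(<c, p>); its j-th derivative tensor is f(p) times the j-fold tensor power of c."""
    return math.exp(c[0] * p[0] + c[1] * p[1])


def derivative_term(p, h, j):
    """Frobenius product of the j-th derivative tensor at p with the j-fold tensor power of h."""
    ch = c[0] * h[0] + c[1] * h[1]
    return f(p) * ch**j


def taylor(p, h, k):
    """Taylor polynomial of order k around p, evaluated at p + h."""
    return sum(derivative_term(p, h, j) / math.factorial(j) for j in range(k + 1))


x = (0.1, -0.2)
h = (0.3, 0.1)
k = 3
x_plus_h = (x[0] + h[0], x[1] + h[1])

error = abs(f(x_plus_h) - taylor(x, h, k))

# Remainder bound: f is monotone along the segment, so its maximum is at an endpoint.
ch = c[0] * h[0] + c[1] * h[1]
bound = max(f(x), f(x_plus_h)) * abs(ch) ** (k + 1) / math.factorial(k + 1)

print(error, bound)
```

The actual error stays below the bound obtained from the remainder term, as the theorem predicts.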
Calculus of Variations#
Definition 4 (Functional Derivatives)#
Let \(H\) be a Hilbert space, with the inner product of \(\phi, \psi \in H\) denoted by \(\langle \phi |\psi \rangle\). Often \(H=L^2(\Omega)\) for some region \(\Omega \subset \mathbb{R}^n\) and:
Let \(\phi, h \in H\), and let \(F:H \to \mathbb{R}\) be a functional defined on an open neighborhood of \(\phi\). The directional derivative of \(F\) with respect to \(h\) is defined as:
If \(\delta F(\phi)(h)\) can be expressed as an inner product:
then we write
or
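A discretized sketch of a functional derivative. It assumes \(H = L^2(0,1)\) with inner product \(\langle \phi|\psi\rangle = \int_0^1 \phi\,\psi\,dx\), approximated on a midpoint grid, and takes the example functional \(F(\phi) = \int_0^1 \phi(x)^2\,dx\), for which the functional derivative is \(2\phi\). The grid size, the test functions and all names are illustrative choices:

```python
import math

N = 200
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]  # midpoint grid on (0, 1)


def inner(u, v):
    """Discretized L^2 inner product on (0, 1)."""
    return sum(ui * vi for ui, vi in zip(u, v)) * dx


def F(phi):
    """Example functional: F(phi) = integral of phi^2."""
    return inner(phi, phi)


phi = [math.sin(math.pi * x) for x in xs]
h = [x * (1.0 - x) for x in xs]

# Directional derivative of F at phi in direction h, by central differences
eps = 1e-4
plus = [p + eps * d for p, d in zip(phi, h)]
minus = [p - eps * d for p, d in zip(phi, h)]
directional = (F(plus) - F(minus)) / (2 * eps)

# The same number written as an inner product with the candidate derivative 2 * phi
gradient = [2.0 * p for p in phi]
pairing = inner(gradient, h)

print(directional, pairing)
```

Because the directional derivative is reproduced by the inner product with \(2\phi\) for the chosen \(h\), the sketch is consistent with \(2\phi\) being the functional derivative of this \(F\).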
Definition 5 (Dirac Function)#
The Dirac function \(\delta \):
Easy Examples#
Product Rule#
Chain Rule#
The first two terms are known from V3. It is the last (third) term we have to compute. We assume initially \(\phi \) and \(h\) to be scalars as in V2.
Derivatives on Hilbert Spaces#
Definition 30 (Derivatives)
Let \(F:U \to V\). The derivative \(\partial F(x)\) of \(F\) at \(x\) is defined as
or, equivalently:
If such a mapping exists, \(F\) is called differentiable at \(x\). The derivative of \(F\) at \(x\) is a linear mapping
that approximates the function \(F\) near \(x\) as shown in (75). We will show that under weak assumptions, \(\partial F\) can be calculated as
The differential operator \(\partial\) maps a differentiable function \(F\) to a function that maps a vector \(x\) to the linear mapping \(\partial F(x)\):
Theorem 31 (Uniqueness of the Derivative)
The derivative is unique.
Proof. Assume that there are two linear mappings \(D, E\) satisfying (75):
Then:
which shows that \(D = E\).
Chain Rule, Product Rule#
Theorem 42 (Chain Rule)
Let \(G:\mathbb{R}^n \to \mathbb{R}^m\) be differentiable at \(x\) and \(F:\mathbb{R}^m \to \mathbb{R}^k\) differentiable at \(G(x)\). Then \(F \circ G: \mathbb{R}^n \to \mathbb{R}^k\) is differentiable at \(x\) and we have:
or, written with arguments:
You have to multiply the transpose of \(\text{DF}(G)\) and \(\text{DG}\); \(\text{DF}(G)\) and \(\text{DG}\) cannot be multiplied. The case \(k = 1\):
So, for instance:
or, with an \((n\times m)\) matrix \(A\):
Proof:
So:
which proves the assertion via uniqueness.