On theories, symmetries and gauge

For a more complete development see my preprint.

Definition of "theory"

Working definition of theory: a mechanism to select some fields. Given a space $M$ and the set $S = Γ (M, E)$ of sections of a certain bundle $E \to M$ , whose sections encompass all field degrees of freedom (metrics, matter fields, gauge connections, etc.), we aim to select a subset of $S$ by means of some criterion (to be determined). The selected sections will be the fields of our theory. Below, the letter $S$ denotes both the total field space and the selected subset, with the context telling them apart.
The criterion is usually a variational principle, denoted $E$ like the bundle.

Examples

Example 1. A zero-dimensional theory: space is a point.

Space: Let $M = \{p\}$ , a single-point manifold (0-dimensional spacetime).
Fields: Sections of a trivial bundle
$E = M \times R^{n} \to M,$
which simply means: an $R^{n}$ -valued field at the point $p$ . The space of sections is:
$S = Γ (M, E) ≅ R^{n} .$
Selection Mechanism (Theory): A variational principle defined by a function (note: integration becomes evaluation since $M$ is just a point):
$E [x] = \frac{1}{2} ∥ x ∥^{2} - ⟨ b, x ⟩,$
where $x \in R^{n}$ represents a field configuration, $∥ x ∥^{2} := \sum x_{i}^{2}$ , and $b \in R^{n}$ is a fixed parameter (think of an external source term).
Selected Fields (The Theory’s Solutions):
$S = \{x \in R^{n} \mid δ E [x] = 0\} = \{b\} .$
That is, the theory picks out a unique field configuration: the point $x = b$ in $R^{n}$ .
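The selection in Example 1 can be checked by hand or by machine. The following is a minimal Python sketch, not part of the original development: the dimension $n = 3$, the source vector `b` and the test steps are hypothetical choices, and exact rational arithmetic is used so that the comparisons are not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical source term b in R^3 (any fixed vector would do).
b = [Fraction(1), Fraction(-2), Fraction(3)]

def E(x):
    """E[x] = 1/2 ||x||^2 - <b, x>, evaluated exactly with rationals."""
    return Fraction(1, 2) * sum(xi * xi for xi in x) - sum(bi * xi for bi, xi in zip(b, x))

def grad_E(x):
    """Gradient of E: dE/dx_i = x_i - b_i."""
    return [xi - bi for xi, bi in zip(x, b)]

# The stationarity condition grad E = 0 holds at x = b ...
gradient_at_b = grad_E(b)

# ... and moving away from b by a step d raises E by ||d||^2 / 2.
steps = [
    [Fraction(1), Fraction(0), Fraction(0)],
    [Fraction(-1, 3), Fraction(2), Fraction(1, 7)],
]
increments = [E([bi + di for bi, di in zip(b, d)]) - E(b) for d in steps]
```

Since $E [b + d] - E [b] = \frac{1}{2} ∥ d ∥^{2}$ , every entry of `increments` is positive, which is the sense in which $x = b$ is the unique selected configuration.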

Example 2. A toy theory selecting fields supported on the line $x = 2$ .

Space: Let $M$ be a 2-dimensional manifold with a global chart $φ : M \to R^{2}$ , written $φ = (x, y)$ .
Fields: Sections of the trivial bundle

$E = M \times R \to M,$

i.e., scalar fields over $M$ .

Total Field Space:

$S = Γ (M, E) \equiv \{f : M \to R \mid f \text{ measurable}\} .$

Selection Mechanism (Theory): A variational principle with action

$E [f] = \iint_{R^{2}} (x - 2)^{2} (f_{φ} (x, y) - 0)^{2} \, d x \, d y + \iint_{R^{2}} δ (x - 2) (f_{φ} (x, y) - 1)^{2} \, d x \, d y,$

where $f_{φ} = f \circ φ^{- 1}$ . We are not being very rigorous at this point.

Selected Fields:

$S = \{f \in S \mid δ E [f] = 0\} = \{h\}, \quad h_{φ} (x, y) = \begin{cases} 1, & x = 2, \\ 0, & x \neq 2 . \end{cases}$

That is, the theory only selects one field, $h$ , which is entirely supported on the vertical line $x = 2$ , taking value 1 there and vanishing elsewhere.
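To see why only this field is selected, it can help to discretize. The sketch below is an illustration under assumptions that are not in the text: a small integer grid with unit spacing (so the line $x = 2$ is hit exactly) and the Dirac delta replaced by a Kronecker delta on the grid column $x = 2$ . On such a grid the functional decouples cell by cell.

```python
from fractions import Fraction

# Assumed discretization: integer grid, unit spacing.
XS = range(0, 5)
YS = range(-2, 3)

def cell_weights(x):
    """Weights (w, d) of the two terms of E in a cell with abscissa x."""
    w = Fraction((x - 2) ** 2)         # weight of the f^2 term
    d = Fraction(1 if x == 2 else 0)   # discretized delta(x - 2)
    return w, d

def discrete_E(f):
    """Discretized functional: sum over cells of w f^2 + d (f - 1)^2."""
    total = Fraction(0)
    for x in XS:
        for y in YS:
            w, d = cell_weights(x)
            total += w * f[(x, y)] ** 2 + d * (f[(x, y)] - 1) ** 2
    return total

def cellwise_minimizer():
    """Each cell decouples: w t^2 + d (t - 1)^2 is stationary at t = d / (w + d)."""
    h = {}
    for x in XS:
        for y in YS:
            w, d = cell_weights(x)
            h[(x, y)] = d / (w + d)    # w + d > 0 in every cell of this grid
    return h

h = cellwise_minimizer()
```

The resulting `h` equals 1 on the column $x = 2$ and 0 elsewhere, and `discrete_E(h)` vanishes, mirroring the continuum statement.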

Example 3a. General Relativity with matter.

Space: Let $M$ be a 4D smooth manifold (spacetime).
Fields: Sections of the bundle $E = Lor (M) \oplus F \to M$ where $Lor (M)$ is the bundle of Lorentzian metrics and $F$ encodes matter fields.
Total Field Space: $S = Γ (M, E) = Γ (M, Lor (M)) \times Γ (M, F)$
Selection Mechanism (Theory): A variational principle with the Einstein-Hilbert action: $E [g, ϕ] = \int_{M} (R (g) + L_{\text{matter}} (g, ϕ, \nabla ϕ)) \, \mathrm{vol}_{g}$
Selected Fields (The Theory's Solutions): $S = \{(g, ϕ) \in S \mid δ E [g, ϕ] = 0\}$ , i.e., solutions to the Einstein equations and the matter field equations.

Example 3b. General Relativity with Pressureless Dust

Space: Let $M$ be a 4-dimensional smooth manifold (spacetime).
Fields: Sections of the bundle $E = Lor (M) \oplus (Λ^{0} (M) \oplus T M) \to M,$ where:
- $Lor (M)$ : the bundle of Lorentzian metrics on $M$ ,
- $Λ^{0} (M)$ : scalar fields (the mass density $ρ$ ),
- $T M$ : tangent bundle (the dust 4‑velocity field $u^{a}$ ).
Total Field Space: $S = Γ (M, E) = Γ (Lor (M)) \times Γ (Λ^{0} (M)) \times Γ (T M) .$
Selection Mechanism (Theory): A variational principle with action $E [g, ρ, u] = \int_{M} (R (g) + ρ g_{a b} u^{a} u^{b}) \sqrt{- g} d^{4} x,$ with constraints:
- $g_{a b} u^{a} u^{b} = - 1$ (normalization of 4-velocity),
- $\nabla_{a} (ρ u^{a}) = 0$ (mass conservation).
Selected Fields (The Theory’s Solutions): $S = \left\{ (g, ρ, u) \in S \;\middle|\; \begin{aligned} & G_{a b} = 8 π ρ u_{a} u_{b}, \\ & u^{b} \nabla_{b} u^{a} = 0, \\ & \nabla_{a} (ρ u^{a}) = 0 \end{aligned} \right\} .$

Example 3c. Mercury perihelion problem.

Example 4. Fixed backgrounds.
Sometimes theories are formulated on a spacetime with a prescribed structure, typically a preferred field (a metric, a connection). These cases can also be framed within our definition, by simply adding a summand to the relevant bundle $E$ and a corresponding term to the criterion $E$ (something similar to the criterion $E$ in Example 2, which pins the field to a prescribed configuration).

Passive transformations

I prefer to call them relabelings. There are two types: relabeling the spacetime $M$ (coordinate changes), or relabeling the target space of the fields (frame changes, also known as gauge transformations). Of course, these relabelings change the description of the criterion $E$ , and this was a source of controversy when coordinates were not clearly distinguished from the objects themselves. This is related to Kretschmann's objection: any theory can be made generally covariant in the passive sense if we introduce enough mathematical objects.

Example 1 revisited

In the case of Example 1, since the base space $M$ is a single point, there is no coordinate system on $M$ to transform. The only possible relabeling is in the target space, i.e., in the fibers of the bundle $E = M \times R^{n} \to M$ , which is trivial here. These are the analogues of gauge transformations.

Let’s analyze a change of frame in the target space. For instance, let’s consider a linear change of basis given by an invertible matrix $A \in G L (n, R)$ . Suppose we use a new frame ${\tilde{e}}_{i} = A^{j}_{i} e_{j}$ , and write the field in this frame: if the field in the original frame is described by the vector $x \in R^{n}$ , then in the new frame it is described by

$\tilde{x} = A^{- 1} x .$

Then, the variational principle $E$ expressed in terms of $\tilde{x}$ becomes:

$E [\tilde{x}] = \frac{1}{2} ∥ A \tilde{x} ∥^{2} - ⟨ b, A \tilde{x} ⟩ .$

That is,

$E [\tilde{x}] = \frac{1}{2} {\tilde{x}}^{T} A^{T} A \tilde{x} - {\tilde{x}}^{T} A^{T} b .$

The minimizer in this frame is given by

$\tilde{x} = (A^{T} A)^{- 1} A^{T} b,$

which corresponds, via $x = A \tilde{x}$ , to

$x = b,$

as expected. So again, the selected field is the same, merely described in new coordinates.

This illustrates the principle behind passive gauge transformations: the theory and its solutions remain unchanged, only their description is transformed.
By the way, this could have been done with any $ϕ \in \mathrm{Diff} (R^{n})$ .
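The linear frame change can be verified on a concrete case. This is a minimal sketch under stated assumptions: $n = 2$ , and the matrix `A` and vector `b` below are hypothetical choices; the small helper functions are written out only to keep the example self-contained and exact.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

# Hypothetical frame change A in GL(2, R) and source term b.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
b = [Fraction(1), Fraction(-2)]

# Minimizer in the new frame: x~ = (A^T A)^{-1} A^T b.
At = transpose(A)
x_tilde = matvec(matmul(inv2(matmul(At, A)), At), b)

# Translating back to the original frame, x = A x~.
x_back = matvec(A, x_tilde)
```

Here `x_tilde` differs from `b` as a list of numbers, but `x_back` is `b` again: the description changed, the selected field did not.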

Example 2 revisited

Let's analyze a coordinate transformation in Example 2. If we take other coordinates $ψ = (a, b)$ , related to the original ones by the (passive) transformation

$(x, y) \overset{ϕ}{\mapsto} (a, b) = (- y, x),$

with $ϕ = ψ \circ φ^{- 1}$ , our criterion $E$ takes the form, in the new coordinates (note that $x = b$ and that the Jacobian of $ϕ$ is 1),

$E [f] = \iint_{R^{2}} (b - 2)^{2} f_{ψ} (a, b)^{2} \, d a \, d b + \iint_{R^{2}} δ (b - 2) (f_{ψ} (a, b) - 1)^{2} \, d a \, d b .$

The solution to this variational problem is

$h_{ψ} (a, b) = \begin{cases} 1 & \text{if } b = 2, \\ 0 & \text{otherwise}, \end{cases}$

which is nothing but the same distinguished $h$ of Example 2, now expressed in the coordinates $ψ = (a, b)$ .
Importantly, even if $ϕ$ is the $+ \frac{π}{2}$ rotation, we are not rotating anything; we are only changing the labeling.
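The statement that nothing is rotated can be made concrete: the value of the field at a given point is the same whichever label the point carries. A small Python sketch, with hypothetical function names and a hypothetical integer sample grid:

```python
def h_phi(x, y):
    """The distinguished field of Example 2 in the original coordinates (x, y)."""
    return 1 if x == 2 else 0

def transition(x, y):
    """Passive change of coordinates (x, y) -> (a, b) = (-y, x)."""
    return (-y, x)

def transition_inverse(a, b):
    """Inverse relabeling (a, b) -> (x, y) = (b, -a)."""
    return (b, -a)

def h_psi(a, b):
    """Same field in the new coordinates: h_psi = h_phi composed with the inverse relabeling."""
    return h_phi(*transition_inverse(a, b))

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

# A point labeled (x, y) in the old chart and (-y, x) in the new one carries the same value.
same_values = all(h_psi(*transition(x, y)) == h_phi(x, y) for x, y in grid)

# In the new labels, the support is described by b = 2.
expected_form = all(h_psi(a, b) == (1 if b == 2 else 0) for a, b in grid)
```

Both flags come out true: the support is still the same set of points of $M$ , only its equation changed from $x = 2$ to $b = 2$ .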

Similarly, since the bundle $E$ in Example 2 is a $G L (1)$ -bundle, we can consider a different trivialization. In each fibre $E_{p}$ , $p \in M$ , we can take the basis $b (p) \neq 0$ instead of 1 (this corresponds to a gauge transformation in the corresponding principal bundle). For instance, suppose $b (p) = 3$ for all $p \in M$ . A field $f$ , described in $φ$ -coordinates by $f_{φ}$ , is described with the new moving frame by ${\tilde{f}}_{φ} = \frac{1}{3} f_{φ}$ . So the new description of the criterion $E$ takes the form:

$E [f] = \iint_{R^{2}} (x - 2)^{2} (3 {\tilde{f}}_{φ} (x, y) - 0)^{2} \, d x \, d y + \iint_{R^{2}} δ (x - 2) (3 {\tilde{f}}_{φ} (x, y) - 1)^{2} \, d x \, d y .$

Obviously, the solution for this functional is

$g (x, y) = \begin{cases} \frac{1}{3} & \text{if } x = 2, \\ 0 & \text{otherwise}, \end{cases}$

which is the transformed version of the description of the distinguished $h$ in Example 2.
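The value $\frac{1}{3}$ can be recovered pointwise. The sketch below assumes a grid discretization that is not in the text (integer abscissas, delta replaced by a Kronecker delta on $x = 2$ ) and minimizes the integrand at each abscissa separately; `stationary_value` and `weights` are hypothetical helper names.

```python
from fractions import Fraction

def stationary_value(w, d, frame):
    """Stationary point t of w (frame * t)^2 + d (frame * t - 1)^2.

    Setting the derivative 2 frame [w frame t + d (frame t - 1)] to zero
    gives t = d / (frame * (w + d)), valid when w + d > 0.
    """
    return Fraction(d) / (frame * (Fraction(w) + Fraction(d)))

def weights(x):
    """Assumed discretization: w = (x - 2)^2, delta(x - 2) -> 1 on x = 2, else 0."""
    return (x - 2) ** 2, (1 if x == 2 else 0)

xs = range(0, 5)

# Description of the selected field in the original frame (basis 1) ...
old_frame = {x: stationary_value(*weights(x), frame=1) for x in xs}

# ... and in the new frame (basis 3 in every fibre).
new_frame = {x: stationary_value(*weights(x), frame=3) for x in xs}
```

Multiplying `new_frame` by 3 returns `old_frame`, so both dictionaries describe one and the same section of the bundle.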

Remark. In the passive (coordinate-change) picture on a natural bundle, a single spacetime diffeomorphism $ϕ$ carries out two simultaneous relabelings: it reassigns each point's coordinates on the base manifold and, via the Jacobian of $ϕ$ , it reassigns the local frame (fiber basis) in exactly the way a gauge frame change would. Only in this passive viewpoint, and only for bundles that are naturally tied to the base (tangent, cotangent, tensor bundles, etc.), does one diffeomorphism deliver both a base-point relabeling and an internal (frame) relabeling.

Active transformations

First, at the level of points, it is true that a general transformation (diffeomorphism) $F : M \to M$ can be viewed as a passive transformation and vice versa. Note, however, that when we consider manifolds endowed with a structure, this is no longer true. It is explained here.
The same happens to gauge transformations: they can be interpreted as a change in the description but also as an active transformation of the field itself.

Example 1 revisited

Let’s now interpret the same change $A \in G L (n, R)$ as an active transformation: a transformation of the field itself, rather than its description. That is, we define a new field

$\tilde{x} := A x,$

and construct a new variational principle:

$\tilde{E} [\tilde{x}] := E [A^{- 1} \tilde{x}] = \frac{1}{2} ∥ A^{- 1} \tilde{x} ∥^{2} - ⟨ b, A^{- 1} \tilde{x} ⟩ .$

Explicitly,

$\tilde{E} [\tilde{x}] = \frac{1}{2} {\tilde{x}}^{T} (A^{- 1})^{T} A^{- 1} \tilde{x} - {\tilde{x}}^{T} (A^{- 1})^{T} b .$

The new minimizer is then

$\tilde{x} = A b,$

which is clearly different from the original minimizer $x = b$ . That is, under the active transformation $x \mapsto A x$ , the set of selected fields changes.

This shows that active covariance is not automatic — the theory is not invariant under arbitrary active transformations of the target space unless these transformations belong to a special subgroup preserving $E$ . That subgroup defines the symmetry group of the theory.

In this particular example, consider orthogonal transformations $B \in O (n)$ such that $B b = b$ . In that case

$\tilde{E} [x] = E [B^{- 1} x] = \frac{1}{2} ∥ B^{- 1} x ∥^{2} - ⟨ b, B^{- 1} x ⟩ = \frac{1}{2} ∥ x ∥^{2} - ⟨ B b, x ⟩ = \frac{1}{2} ∥ x ∥^{2} - ⟨ b, x ⟩,$

which is exactly $E [x]$ . Therefore, the true gauge symmetry group of the theory (among linear transformations of the target) is the stabilizer of $b$ inside $O (n)$ , i.e., the orthogonal transformations that also preserve the linear term $⟨ b, x ⟩$ . These are the transformations under which the variational principle and the selected field remain unchanged.
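Both halves of the active story for Example 1 can be checked exactly in a small case. The sketch assumes $n = 2$ , with a hypothetical source `b`, a hypothetical generic matrix `A`, and the reflection `R` across the line spanned by `b` as a sample element of the stabilizer of $b$ in $O (2)$ .

```python
from fractions import Fraction

b = [Fraction(3), Fraction(4)]   # hypothetical source term

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def E(x):
    return Fraction(1, 2) * dot(x, x) - dot(b, x)

def transformed_E(T, x):
    """E~[x] := E[T^{-1} x], the principle induced by the active map x -> T x."""
    return E(matvec(inv2(T), x))

def grad_transformed_E(T, x):
    """Gradient of E~: T^{-T} (T^{-1} x - b)."""
    Tinv = inv2(T)
    residual = [u - v for u, v in zip(matvec(Tinv, x), b)]
    return matvec(transpose(Tinv), residual)

# A generic element of GL(2, R) moves the selected field from b to A b.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
new_minimizer = matvec(A, b)
grad_at_new_minimizer = grad_transformed_E(A, new_minimizer)

# Reflection across the line spanned by b: orthogonal, and R b = b.
bb = dot(b, b)
R = [[2 * b[i] * b[j] / bb - (1 if i == j else 0) for j in range(2)] for i in range(2)]

samples = [[Fraction(1), Fraction(0)], [Fraction(-2), Fraction(5)], [Fraction(7, 3), Fraction(1, 2)]]
R_preserves_E = all(transformed_E(R, x) == E(x) for x in samples)
A_preserves_E = all(transformed_E(A, x) == E(x) for x in samples)
```

With these choices `new_minimizer` is $(10, 15)$ rather than $(3, 4)$ , `R_preserves_E` is true and `A_preserves_E` is false.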

Example 2 revisited

Let's start with the analogue of coordinate changes. Following along with our previous example, we can consider the diffeomorphism $F : M \to M$ induced by the previous change of coordinates:

$\begin{matrix} p \in M & \overset{F}{\to} & q \in M \\ φ \downarrow & & \uparrow φ^{- 1} \\ (x, y) \in R^{2} & \overset{ϕ}{\to} & (- y, x) \in R^{2} \end{matrix}$

Then, $F$ induces a new (possibly different) variational principle $\tilde{E}$ :

$\tilde{E} [f] = E [f \circ F],$

which in this case is

$\tilde{E} [f] = \iint_{R^{2}} (y - 2)^{2} f_{φ} (x, y)^{2} \, d x \, d y + \iint_{R^{2}} δ (y - 2) (f_{φ} (x, y) - 1)^{2} \, d x \, d y .$

Now we have a different selected set of fields

$\tilde{S} = \{f \in S \mid δ \tilde{E} [f] = 0\} = \{\tilde{h}\}, \quad {\tilde{h}}_{φ} (x, y) = \begin{cases} 1, & y = 2, \\ 0, & y \neq 2 . \end{cases}$

In the original theory we were singling out the line $x = 2$ and now we are singling out its rotated version $y = 2$ .
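So active covariance is not trivial: it only takes place when we restrict to a particular group of transformations, called symmetries of the theory. In Example 2, one can check that any transformation of the form

$(x, y) \mapsto (x, α (x, y))$

with $| \partial α / \partial y | = 1$ , that is, $α (x, y) = \pm y + c (x)$ , is such that $\tilde{E} = E$ , so these are symmetries of the theory. For a general $α$ the change of variables $y \mapsto α (x, y)$ introduces the Jacobian factor $1 / | \partial α / \partial y |$ into both integrals, so $\tilde{E} \neq E$ as a functional, even though the selected set $S$ is still the same.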
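Whether a map $(x, y) \mapsto (x, α (x, y))$ leaves the functional itself unchanged depends on $\partial α / \partial y$ , and this can be probed numerically. The sketch below looks only at the first term of $E$ , uses a hypothetical Gaussian test field, and approximates the integral by a Riemann sum on an assumed grid covering $[- 8, 8]^{2}$ .

```python
import math

STEP = 0.1
GRID = [k * STEP for k in range(-80, 81)]   # covers [-8, 8]

def f(x, y):
    """Hypothetical smooth test field (a Gaussian bump); any decaying f would do."""
    return math.exp(-(x * x + y * y) / 2)

def first_term(alpha):
    """Riemann sum of (x - 2)^2 f(x, alpha(x, y))^2 over the grid."""
    total = 0.0
    for x in GRID:
        for y in GRID:
            total += (x - 2) ** 2 * f(x, alpha(x, y)) ** 2
    return total * STEP * STEP

identity = first_term(lambda x, y: y)
shear = first_term(lambda x, y: y + math.sin(x))   # |d alpha / dy| = 1
stretch = first_term(lambda x, y: 2 * y)           # |d alpha / dy| = 2
```

For this test field the exact value of the untransformed term is $\frac{9}{2} π$ . The shear reproduces it, while the stretch gives half of it, which is the Jacobian factor $1 / | \partial α / \partial y |$ at work.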
In Example 3, general relativity, any diffeomorphism $F : M \to M$ is a symmetry for the theory. This is linked to the fact that the relevant bundle is a natural bundle.
Note that the symmetries of the theory do not have to leave each element $f \in S$ invariant; they can permute the elements of $S$ . In this sense, the general covariance of general relativity is the statement that any diffeomorphism of spacetime is a permutation of the set $S$ of metrics and matter fields.

Something similar happens with gauge transformations in Example 2. We can regard $b (p) = 3$ not as a change of basis for the fibers $E_{p}$ but as a true transformation of the fibers. If we read the gauge transformation above in this way, we indeed obtain a different variational principle, which distinguishes a different preferred field. The group of gauge transformations that leave $E$ invariant is called the gauge symmetry group of the theory, or the gauge symmetries of the theory. Note that a gauge symmetry does not have to fix every $f \in S$ , but it leaves $S$ invariant.

The gauge symmetry group for this theory can be explicitly computed. It consists of the group of active transformations $f \mapsto b \cdot f$ where the function $b (x, y)$ is restricted to be exactly $1$ on the line $x = 2$ , and can take the values $\pm 1$ everywhere else:

$b (x, y) = \begin{cases} 1, & x = 2, \\ \pm 1, & x \neq 2 . \end{cases}$

This is seen by imposing the symmetry condition $E [f] = E [f / b]$ for all fields $f$ . The term in the action weighted by $(x - 2)^{2}$ immediately forces $b (x, y)^{2} = 1$ for all $x \neq 2$ . Simultaneously, the Dirac delta term, which constrains the fields on the line $x = 2$ , requires that $(f / b - 1)^{2} = (f - 1)^{2}$ on that line. For this to hold for arbitrary field values $f$ , we must have $b (x, y) = 1$ when $x = 2$ .
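The symmetry condition $E [f] = E [f / b]$ can be tested on a discretized version of the functional. As before, the integer grid and the Kronecker-delta replacement for $δ (x - 2)$ are assumptions of this sketch, and the test field `f` and the three candidate functions `b` are hypothetical choices.

```python
from fractions import Fraction

XS = range(0, 5)
YS = range(-2, 3)
CELLS = [(x, y) for x in XS for y in YS]

def discrete_E(f):
    """Assumed discretization of E: sum of (x - 2)^2 f^2 + [x == 2] (f - 1)^2."""
    total = Fraction(0)
    for (x, y) in CELLS:
        w = (x - 2) ** 2
        d = 1 if x == 2 else 0
        total += w * f[(x, y)] ** 2 + d * (f[(x, y)] - 1) ** 2
    return total

def divide(f, b):
    """The field f / b appearing in the symmetry condition E[f] = E[f / b]."""
    return {c: f[c] / b[c] for c in CELLS}

# A generic test field, nonzero on the line x = 2.
f = {(x, y): Fraction(x + 2 * y + 1, 3) for (x, y) in CELLS}

# b = 1 on the line, arbitrary signs (here a checkerboard) off the line.
allowed = {(x, y): (1 if x == 2 else (1 if (x + y) % 2 == 0 else -1)) for (x, y) in CELLS}

# Two candidates that violate the conditions derived in the text.
flip_on_line = {c: -1 for c in CELLS}   # b = -1 also on x = 2
rescale = {c: 3 for c in CELLS}         # b^2 != 1 off the line

E_f = discrete_E(f)
allowed_ok = discrete_E(divide(f, allowed)) == E_f
flip_ok = discrete_E(divide(f, flip_on_line)) == E_f
rescale_ok = discrete_E(divide(f, rescale)) == E_f
```

Only `allowed_ok` is true. A single test field cannot prove invariance for all $f$ , but it is enough to rule out the two forbidden candidates.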

Final reflection

It might look as if the symmetries (ordinary or gauge) of the theory are determined by the selected set $S$ , which in turn is determined by an action (Lagrangian) codified in $E$ . I am not sure, but I think that the process goes the other way around: we choose a criterion $E$ to determine $S$ (the Einstein-Hilbert action, or whatever) that respects the symmetries we observe in experiments.
This "gauge principle" is one of the most powerful ideas in theoretical physics, underlying both general relativity and the Standard Model of particle physics.