Let $FM$ denote the frame bundle and $\omega$ the connection 1-form of this connection. What is a vector $V \in T_u FM$? The point $u$ represents a point $p$ in the manifold together with a choice of a basis for $T_pM$, and $V$ represents the beginning of a curve $\gamma$ leaving $p$ together with a choice of a basis for every $T_{\gamma(t)}M$. The value $\omega(V)$ tells us how the basis is changing (if it changes at all) as we move along $\gamma$. This is "not natural" to $M$, and must be introduced by hand. This "change" is infinitesimal, since it corresponds to an infinitesimal step along $\gamma$, so it is measured by an element of the Lie algebra $\mathfrak{gl}(n, \mathbb{R})$. See Cartan geometry#Generalization of manifolds with affine connections for more about "the big picture".
Motivational introduction
Consider first the situation in $\mathbb{R}^n$. Let $X, Y$ be vector fields on $\mathbb{R}^n$. To define the directional derivative of the vector field $Y$ in the direction of the vector field $X$ at a point $p$, we can mimic the usual definition of the directional derivative:
$$(D_X Y)(p) = \lim_{t \to 0} \frac{Y(p + tX(p)) - Y(p)}{t}.$$
The result $D_X Y$ is a vector field on $\mathbb{R}^n$. You can check that the operation defined above satisfies the following two properties:
(1) $D_{fX} Y = f\, D_X Y$.
(2) $D_X (fY) = (Xf)\, Y + f\, D_X Y$.
Here, $X, Y$ are vector fields and $f$ is a scalar function. The function $Xf$ (at a point $p$) is the directional derivative of $f$ at $p$ in the direction $X(p)$.
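The limit formula, linearity over functions in the direction slot, and the Leibniz rule can all be checked numerically. The following is a minimal sketch, not part of the original argument: the fields `X`, `Y` and the function `f` on $\mathbb{R}^2$ are made-up examples, and the limit is approximated with a central difference.

```python
import math

def directional_derivative(Y, X, p, h=1e-6):
    """Approximate (D_X Y)(p) = lim [Y(p + t X(p)) - Y(p)] / t by a central difference."""
    v = X(p)
    forward = Y([pi + h * vi for pi, vi in zip(p, v)])
    backward = Y([pi - h * vi for pi, vi in zip(p, v)])
    return [(a - b) / (2 * h) for a, b in zip(forward, backward)]

# Hypothetical sample data on R^2 (chosen only for illustration).
def X(p):
    x, y = p
    return [1.0, x]

def Y(p):
    x, y = p
    return [x * y, math.sin(x)]

def f(p):
    x, y = p
    return x**2 + y

def fX(p):
    return [f(p) * c for c in X(p)]

def fY(p):
    return [f(p) * c for c in Y(p)]

p = [0.5, -1.2]
DXY = directional_derivative(Y, X, p)

# Property 1: D_{fX} Y = f D_X Y
prop1_lhs = directional_derivative(Y, fX, p)
prop1_rhs = [f(p) * d for d in DXY]

# Property 2 (Leibniz rule): D_X (fY) = (Xf) Y + f D_X Y
Xf = directional_derivative(lambda q: [f(q)], X, p)[0]
prop2_lhs = directional_derivative(fY, X, p)
prop2_rhs = [Xf * y + f(p) * d for y, d in zip(Y(p), DXY)]

print(DXY, prop1_lhs, prop1_rhs, prop2_lhs, prop2_rhs)
```

Both sides of each property agree up to the finite-difference error. The subtraction `forward - backward` is only meaningful here because all tangent spaces of $\mathbb{R}^2$ are identified with $\mathbb{R}^2$ itself.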
Now let us try to mimic the above construction on a general manifold $M$. Given vector fields $X, Y$ on $M$, we try to use the same formula and define
$$(\nabla_X Y)(p) = \lim_{t \to 0} \frac{Y(p + tX(p)) - Y(p)}{t}.$$
However, we see that there are two problems. First, the expression $p + tX(p)$ is not defined, because we have no way of adding a point $p \in M$ to a tangent vector $tX(p) \in T_pM$. This is not so bad, because we can replace the expression $p + tX(p)$ with any curve "which goes in the direction $X(p)$", such as the flow $\varphi_t(p)$ of $X$. The more serious problem is that we need to subtract the tangent vector $Y(p)$ from the tangent vector $Y(\varphi_t(p))$, and those are two tangent vectors that belong to different vector spaces, $T_pM$ and $T_{\varphi_t(p)}M$.
In general, without any extra data, we have no way of identifying tangent spaces at different points of $M$.
To summarize, we can differentiate vector fields along vector fields without any problem on $\mathbb{R}^n$, but we encounter problems when we try to do it on a general manifold. But $\mathbb{R}^n$ is also a manifold, so what makes it special? In $\mathbb{R}^n$, translation canonically identifies all the tangent spaces with each other; on a general manifold we need extra data.
The definition of an affine connection is meant to supply the manifold "externally" with an operation $\nabla$ which satisfies the two properties above and so allows us to differentiate vector fields along vector fields. That is, instead of defining the directional derivative of a vector field along a vector field, we require that somebody hands us a mechanism which satisfies the properties that the familiar derivative satisfied on $\mathbb{R}^n$, and then we think of it as a directional derivative.
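Alternatively, it can be inherited from an ambient space (for example, if the manifold is immersed in $\mathbb{R}^n$). The covariant derivative operator on a surface can be seen as inherited from the absolute parallelism of $\mathbb{R}^3$ and the metric of $\mathbb{R}^3$:
$$\nabla_X Y = \pi(D_X Y) = D_X Y - \langle D_X Y, N \rangle N,$$
where $D$ is the ordinary directional derivative in $\mathbb{R}^3$ and $\pi$ is the projection onto the surface along its unit normal $N$. For more see Gauss' Equation, Theorem 1.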
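The projection construction can be tried out on the unit sphere, where the unit normal at a point $p$ is $p$ itself. The following is a minimal sketch under that assumption: it differentiates in the ambient $\mathbb{R}^3$ and then removes the normal component. The field `V` (rotation about the $z$-axis) and the two sample points are illustrative choices, not taken from the text.

```python
def ambient_derivative(Y, v, p, h=1e-6):
    """Ordinary directional derivative of Y at p in direction v, taken in R^3."""
    forward = Y([pi + h * vi for pi, vi in zip(p, v)])
    backward = Y([pi - h * vi for pi, vi in zip(p, v)])
    return [(a - b) / (2 * h) for a, b in zip(forward, backward)]

def project_to_sphere(w, p):
    """Remove the normal component of w; on the unit sphere the unit normal at p is p."""
    dot = sum(wi * ni for wi, ni in zip(w, p))
    return [wi - dot * ni for wi, ni in zip(w, p)]

def covariant_derivative(Y, X, p):
    """Induced derivative on the sphere: project the ambient derivative onto the tangent plane."""
    return project_to_sphere(ambient_derivative(Y, X(p), p), p)

def V(p):
    """Hypothetical tangent field: infinitesimal rotation about the z-axis."""
    x, y, z = p
    return [-y, x, 0.0]

equator_point = [1.0, 0.0, 0.0]
tilted_point = [0.6, 0.0, 0.8]

at_equator = covariant_derivative(V, V, equator_point)
at_tilted = covariant_derivative(V, V, tilted_point)
print(at_equator, at_tilted)
```

On the equator the result vanishes, which reflects the fact that the equator is a geodesic of the sphere; away from the equator the result is a nonzero vector tangent to the sphere.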
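An operator $\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$, $(X, Y) \mapsto \nabla_X Y$, where $\mathfrak{X}(M)$ is the set of all vector fields on $M$, is called a covariant derivative operator, affine connection, or linear connection if it satisfies:
It commutes with addition: $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$.
Leibniz rule: $\nabla_X (fY) = (Xf)\, Y + f\, \nabla_X Y$.
It is tensorial (that is, $C^\infty(M)$-linear) with respect to the first argument: $\nabla_{fX + Z} Y = f\, \nabla_X Y + \nabla_Z Y$.
It commutes with index contraction.
Applied to a scalar field $f$, it coincides with the directional derivative with respect to the vector: $\nabla_X f = Xf$.
For scalar fields we have commutation: $\nabla_a \nabla_b f = \nabla_b \nabla_a f$ (the torsion-free condition).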
Obviously this raises quite a lot of questions:
Does such a mechanism always exist? (Yes).
Is it unique? (No).
Is there a natural choice of such a differentiation mechanism? (Yes, under certain circumstances).
Can we use this mechanism to recover the ability to identify tangent vectors at different points, which was necessary to define the regular directional derivative in $\mathbb{R}^n$? (Yes, at least along curves. This leads to the notion of parallel transport).
In this context, the result $\nabla_X Y$ is called the covariant derivative of $Y$ along $X$.
Once we have a covariant derivative on the tangent vector fields of a manifold, it can be extended to any tensor field using the same Christoffel symbols. This video explains how the covariant derivative of 1-forms can be computed, and this part of the same video applies it to any tensor.
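For example, given the vector $V = V^i \partial_i$, the 1-form $\omega = \omega_i\, dx^i$ and the (0,2)-tensor $T = T_{ij}\, dx^i \otimes dx^j$, with the Christoffel symbols defined by $\nabla_{\partial_k} \partial_j = \Gamma^i_{kj}\, \partial_i$:
$$\nabla_k V^i = \partial_k V^i + \Gamma^i_{kj}\, V^j,$$
$$\nabla_k \omega_i = \partial_k \omega_i - \Gamma^j_{ki}\, \omega_j,$$
$$\nabla_k T_{ij} = \partial_k T_{ij} - \Gamma^l_{ki}\, T_{lj} - \Gamma^l_{kj}\, T_{il}.$$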
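Alternatively, we could have defined the covariant derivative $\nabla_X$ of a tensor field directly by the following properties:
For any smooth function $f$, $\nabla_X f = Xf$.
For any tensor fields $S$ and $T$, $\nabla_X (S \otimes T) = (\nabla_X S) \otimes T + S \otimes (\nabla_X T)$.
For any tensor field $T$, 1-form $\omega$, and vector field $Y$ (written here for $T$ of type (1,1)), $\nabla_X (T(\omega, Y)) = (\nabla_X T)(\omega, Y) + T(\nabla_X \omega, Y) + T(\omega, \nabla_X Y)$.
For any smooth function $f$ and vector fields $X$ and $Y$, $\nabla_{fX + Y} T = f\, \nabla_X T + \nabla_Y T$.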
Worked example
Another approach to understanding the problem is as follows:
In $\mathbb{R}^2$ with Cartesian coordinates $(x, y)$, we have a basis $\{\partial_x, \partial_y\}$ for the different tangent spaces. Moreover, the tangent vector $\partial_x$ at a point $p$ and the vector $\partial_x$ at a point $q$ are "the same," in the sense that we can translate it from $p$ to $q$. This is because we are assuming the traditional notion of parallelism in $\mathbb{R}^2$.
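Now, let's consider other coordinates $(u, v) = (u^1, u^2)$, given by a transformation
$$x = x(u, v), \qquad y = y(u, v).$$
The vector $\partial_u$ at the point $p$ (note that this is the same $p$ as in the previous paragraph but expressed in coordinates $(u, v)$), and the vector $\partial_u$ at the point $q$ are now not the same, from the perspective of traditional parallelism in $\mathbb{R}^2$. Let's see this:
$\partial_u$ at $p$ can be expressed in Cartesian coordinates as
$$\partial_u|_p = \frac{\partial x}{\partial u}(p)\, \partial_x + \frac{\partial y}{\partial u}(p)\, \partial_y,$$
$\partial_u$ at $q$ can be expressed in Cartesian coordinates as
$$\partial_u|_q = \frac{\partial x}{\partial u}(q)\, \partial_x + \frac{\partial y}{\partial u}(q)\, \partial_y.$$
Therefore, whenever these partial derivatives vary from point to point, $\partial_u$ is not constant, even though its components in the basis $\{\partial_u, \partial_v\}$ are (they are $(1, 0)$ everywhere). This implies that if we want to differentiate a vector field in this new coordinate system, we cannot simply differentiate each component. For example, the component-wise derivative of $\partial_u$ would be 0, but we have just seen that it is not constant.
The way to fix this is to add correction terms to the traditional component-wise derivative that reflect the deformation of the axes themselves. Let's see this in the specific case of coordinates $(u, v)$ (note that for coordinates $(x, y)$, it would be sufficient to differentiate each component because the basis vectors are constant):
Consider the vector field $V = V^u \partial_u + V^v \partial_v$ and the field $W = W^u \partial_u + W^v \partial_v$. A consistent way to differentiate $V$ with respect to $W$ would be an operation $\nabla_W V$ that should satisfy:
$$\nabla_W V = W^u\, \nabla_{\partial_u} V + W^v\, \nabla_{\partial_v} V, \qquad \nabla_{\partial_i} V = \nabla_{\partial_i} (V^u \partial_u) + \nabla_{\partial_i} (V^v \partial_v).$$
Since the Leibniz rule should also hold for vectors, the above expression becomes:
$$\nabla_{\partial_i} V = (\partial_i V^u)\, \partial_u + V^u\, \nabla_{\partial_i} \partial_u + (\partial_i V^v)\, \partial_v + V^v\, \nabla_{\partial_i} \partial_v.$$
These "correction terms" will be fully determined when we calculate $\nabla_{\partial_u} \partial_u$, $\nabla_{\partial_u} \partial_v$, $\nabla_{\partial_v} \partial_u$, and $\nabla_{\partial_v} \partial_v$. Let's translate everything into Cartesian coordinates, where we can differentiate because parallelism exists. Since $\partial_x$ and $\partial_y$ are constant,
$$\nabla_{\partial_i} \partial_j = \frac{\partial^2 x}{\partial u^i\, \partial u^j}\, \partial_x + \frac{\partial^2 y}{\partial u^i\, \partial u^j}\, \partial_y.$$
Therefore, writing the result back in the basis $\{\partial_u, \partial_v\}$:
$$\nabla_{\partial_i} \partial_j = \Gamma^k_{ij}\, \partial_k, \qquad \Gamma^k_{ij} = \frac{\partial u^k}{\partial x} \frac{\partial^2 x}{\partial u^i\, \partial u^j} + \frac{\partial u^k}{\partial y} \frac{\partial^2 y}{\partial u^i\, \partial u^j}.$$
Thus, the "differentiation" becomes:
$$\nabla_W V = W^i \left( \partial_i V^k + \Gamma^k_{ij}\, V^j \right) \partial_k.$$
The components of $\nabla_{\partial_u} \partial_u$, $\nabla_{\partial_u} \partial_v$, $\nabla_{\partial_v} \partial_u$, and $\nabla_{\partial_v} \partial_v$ in the basis $\{\partial_u, \partial_v\}$ are called Christoffel symbols and depend on the notion of parallelism and the chosen coordinates. They are symbolized by $\Gamma^k_{ij}$. In Cartesian coordinates all of them are 0, whereas in curvilinear coordinates some of them are generally nonzero.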
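As a concrete illustration, assume the new coordinates are polar coordinates $(r, \theta)$ with $x = r\cos\theta$, $y = r\sin\theta$. This choice is an assumption made for the sketch and is not necessarily the transformation used above. The code writes the coordinate basis vectors in Cartesian components, differentiates them there by central differences, and re-expresses the result in the polar basis, following the recipe described in the text.

```python
import math

def basis(r, theta):
    """Polar coordinate basis (e_r, e_theta), written in Cartesian components."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-r * math.sin(theta), r * math.cos(theta))
    return e_r, e_theta

def christoffel(r, theta, h=1e-5):
    """Return gamma[k][i][j] ~ Gamma^k_{ij}, with index 0 = r and 1 = theta.

    Gamma^k_{ij} is the k-th component, in the polar basis, of the
    Cartesian derivative of the basis vector e_j along coordinate i.
    """
    (a, c), (b, d) = basis(r, theta)   # e_r = (a, c), e_theta = (b, d)
    det = a * d - b * c                # equals r, nonzero away from the origin
    steps = [(h, 0.0), (0.0, h)]       # small step in r, small step in theta
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for i, (dr, dth) in enumerate(steps):
        plus = basis(r + dr, theta + dth)
        minus = basis(r - dr, theta - dth)
        for j in range(2):
            wx = (plus[j][0] - minus[j][0]) / (2 * h)
            wy = (plus[j][1] - minus[j][1]) / (2 * h)
            # Solve (wx, wy) = g_r * e_r + g_theta * e_theta by Cramer's rule.
            gamma[0][i][j] = (wx * d - b * wy) / det
            gamma[1][i][j] = (a * wy - c * wx) / det
    return gamma

gamma = christoffel(2.0, 0.7)
for k, name in enumerate(["r", "theta"]):
    print(name, [[round(g, 6) for g in row] for row in gamma[k]])
```

Under this assumption the only nonzero symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, which the numerical values reproduce up to the finite-difference error.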
Ultimately, a connection will determine a way to identify tangent vectors at one point $p$ with those at another point $q$, although in general the identification will depend on the curve connecting them (parallel transport).
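Parallel transport can be sketched numerically by integrating $\frac{dV^k}{dt} + \Gamma^k_{ij}\, \frac{dx^i}{dt}\, V^j = 0$ along a curve. The example below is a hedged illustration, assuming the flat plane in polar coordinates $(r, \theta)$ (so $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$) and the curve $r = R$, $\theta = t$. The radius `R`, the starting vector, and the Runge-Kutta integrator are choices made here, not taken from the text.

```python
import math

R = 2.0  # radius of the circle r = R (hypothetical choice)

def rhs(v):
    """Parallel-transport equations along r = R, theta = t in polar components.

    dV^r/dt     = -Gamma^r_{theta theta} V^theta = R * V^theta
    dV^theta/dt = -Gamma^theta_{theta r} V^r     = -V^r / R
    """
    v_r, v_theta = v
    return (R * v_theta, -v_r / R)

def rk4_step(v, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(v)
    k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
    k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
    k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
    return (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def transport(v0, theta_end, steps=1000):
    """Transport the polar components v0 from theta = 0 to theta = theta_end."""
    h = theta_end / steps
    v = v0
    for _ in range(steps):
        v = rk4_step(v, h)
    return v

def to_cartesian(v, r, theta):
    """Convert polar components (V^r, V^theta) at (r, theta) to Cartesian components."""
    v_r, v_theta = v
    return (v_r * math.cos(theta) - v_theta * r * math.sin(theta),
            v_r * math.sin(theta) + v_theta * r * math.cos(theta))

v_start = (1.0, 0.0)                      # the vector e_r at theta = 0
v_end = transport(v_start, math.pi / 2)   # polar components after a quarter circle
cart_start = to_cartesian(v_start, R, 0.0)
cart_end = to_cartesian(v_end, R, math.pi / 2)
print(v_end, cart_start, cart_end)
```

The polar components change along the curve, but the Cartesian components stay equal to $(1, 0)$, as expected for the usual parallelism of the plane. In the plane the result does not depend on the curve; on a curved manifold it generally does.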