Introduction to the mathematics of general relativity

This article covers the minimal body of mathematics needed to understand general relativity. For a more complete overview, see Mathematics of general relativity.

An understanding of calculus and differential equations is necessary for the understanding of nonrelativistic physics. In order to understand special relativity one also needs an understanding of tensor calculus. To understand the general theory of relativity, one needs a basic introduction to the mathematics of curved spacetime that includes a treatment of curvilinear coordinates, nontensors, curved space, parallel displacement, Christoffel symbols, geodesics, covariant differentiation, the curvature tensor, Bianchi relations, and the Ricci tensor. This article follows the basic treatment in the lecture series on the topic, intended for advanced undergraduates, given by Paul Dirac at Florida State University.[Ref. 1]

For an introduction based on the specific physical example of particles orbiting a large mass in circular orbits, see Newtonian motivations for general relativity for a nonrelativistic treatment and Theoretical motivation for general relativity for a fully relativistic treatment.

Mathematics of special relativity

Vectors

Interval between two points

Spacetime physics requires four coordinates for the description of a point in spacetime:

ct=x^{0}\quad x=x^{1}\quad y=x^{2}\quad z=x^{3}

where c is the speed of light and x, y, and z are spatial coordinates.

A point very close to our original point is

x^{\mu }+dx^{\mu }\quad \mu \in \{0,1,2,3\}

.

The square of the distance, or interval, between the two points is

ds^{2}=-\left(dx^{0}\right)^{2}+\left(dx^{1}\right)^{2}+\left(dx^{2}\right)^{2}+\left(dx^{3}\right)^{2}

and is invariant under coordinate transformations. Here we are using the Minkowski metric.
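
The invariance of the interval can be checked numerically. The following Python sketch is illustrative only: a Lorentz boost along the x axis is assumed here as a representative coordinate transformation, and the names interval_squared and boost_x, as well as the sample numbers, are hypothetical and do not come from the source.

```python
import math

def interval_squared(dx):
    """ds^2 = -(dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2 for a displacement dx."""
    return -dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + dx[3] ** 2

def boost_x(dx, beta):
    """Lorentz boost along x with speed beta = v/c, where |beta| < 1 (assumed transformation)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    ct, x, y, z = dx
    return [gamma * (ct - beta * x), gamma * (x - beta * ct), y, z]

# Sample displacement (ct, x, y, z); the values are arbitrary.
dx = [2.0, 1.0, 0.5, -0.3]
dx_boosted = boost_x(dx, 0.6)
print(interval_squared(dx), interval_squared(dx_boosted))
```

The two printed values should agree up to floating-point rounding, although the individual components of the displacement differ between the two frames.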

Coordinate transformations

Transformation of dx

If one defines a new coordinate system $x^{\mu '}$ such that

x^{\mu '}=x^{\mu '}\left(x^{\mu }\right)

then

dx^{\mu '}={\partial x^{\mu '} \over \partial x^{\nu }}dx^{\nu }\equiv x_{,\nu }^{\mu '}\;dx^{\nu }

where repeated indices are summed according to the Einstein summation convention.

The comma in the subscript of the last term indicates differentiation.
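
The transformation rule for a small displacement can be illustrated numerically. The sketch below assumes, purely for illustration, a two-dimensional plane with Cartesian coordinates (x, y) as the old system and polar coordinates (r, phi) as the new one; the partial derivatives are estimated by central differences, and all function names are hypothetical.

```python
import math

def to_polar(p):
    """New coordinates (r, phi) as functions of the old coordinates (x, y)."""
    x, y = p
    return [math.hypot(x, y), math.atan2(y, x)]

def jacobian(f, p, h=1e-6):
    """J[m][n] approximates the partial derivative of f^m with respect to p^n."""
    in_dim = len(p)
    out_dim = len(f(p))
    J = [[0.0] * in_dim for _ in range(out_dim)]
    for n in range(in_dim):
        plus, minus = list(p), list(p)
        plus[n] += h
        minus[n] -= h
        f_plus, f_minus = f(plus), f(minus)
        for m in range(out_dim):
            J[m][n] = (f_plus[m] - f_minus[m]) / (2 * h)
    return J

def transform_displacement(f, p, dp):
    """dx'^m = (dx'^m / dx^n) dx^n, summed over the repeated index n."""
    J = jacobian(f, p)
    return [sum(J[m][n] * dp[n] for n in range(len(dp))) for m in range(len(J))]

# A small displacement at the point (1, 1), expressed in polar coordinates.
print(transform_displacement(to_polar, [1.0, 1.0], [1e-4, -2e-4]))
```

For a sufficiently small displacement, the result should agree with the direct difference of the polar coordinates of the two neighboring points, up to terms of second order in the displacement.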

Transformation of a scalar

A scalar quantity $f$ has the same value in both coordinate systems. Its derivatives transform as

{\partial f\left(x^{\mu }\right) \over \partial x^{\mu '}}={\partial x^{\nu } \over \partial x^{\mu '}}{\partial f\left(x^{\mu }\right) \over \partial x^{\nu }}=x_{,\mu '}^{\nu }{\partial f\left(x^{\mu }\right) \over \partial x^{\nu }}

.

Contravariant vectors

Quantities $A^{\mu }$ that transform in the same way as $dx^{\mu }$ under a change of coordinates,

A^{\mu '}=x_{,\nu }^{\mu '}\;A^{\nu }

,

form a contravariant vector. The squared length of the vector is the invariant quantity

(A,A)\equiv -\left(A^{0}\right)^{2}+\left(A^{1}\right)^{2}+\left(A^{2}\right)^{2}+\left(A^{3}\right)^{2}

.

The term on the left is the notation for the inner product of A with itself.

Covariant vectors

The covariant vector is defined as

A_{0}\equiv -A^{0}\quad A_{1}\equiv A^{1}\quad A_{2}\equiv A^{2}\quad A_{3}\equiv A^{3}

.

It transforms in the same way as the derivative of a scalar,

A_{\mu '}=x_{,\mu '}^{\nu }A_{\nu }

.

Inner product

The inner product of two vectors is written

(A,B)=A_{\mu }B^{\mu }=B_{\mu }A^{\mu }

.

This quantity is also invariant under coordinate transformations.
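
A short numerical sketch of lowering an index and forming the inner product follows. It assumes the Minkowski metric with the sign convention used above and, as a hypothetical test transformation, a Lorentz boost along x; the function names and sample vectors are illustrative only.

```python
import math

# Diagonal of the Minkowski metric with signature (-, +, +, +).
ETA = [-1.0, 1.0, 1.0, 1.0]

def lower(A):
    """Covariant components A_mu from contravariant A^mu: only A_0 changes sign."""
    return [ETA[m] * A[m] for m in range(4)]

def inner(A, B):
    """(A, B) = A_mu B^mu, summed over mu."""
    A_low = lower(A)
    return sum(A_low[m] * B[m] for m in range(4))

def boost_x(A, beta):
    """Lorentz boost of a contravariant vector along x (assumed transformation)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [gamma * (A[0] - beta * A[1]), gamma * (A[1] - beta * A[0]), A[2], A[3]]

A = [3.0, 1.0, 2.0, 0.0]
B = [1.0, 0.5, -1.0, 2.0]
print(inner(A, B), inner(boost_x(A, 0.8), boost_x(B, 0.8)))
```

Both printed numbers should coincide up to rounding, which is the numerical counterpart of the invariance statement.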

Tensors

Definition

A rank 2 contravariant tensor can be constructed from the outer product of vectors as

T^{\mu \nu }\equiv A^{\mu }B^{\nu }+C^{\mu }D^{\nu }+E^{\mu }F^{\nu }+\cdots

.

Contravariant tensor

The components of a rank 2 contravariant tensor transform in the same way as the quantities $A^{\mu }B^{\nu }$ ,

T^{\mu '\nu '}=x_{,\mu }^{\mu '}\;x_{,\nu }^{\nu '}\;T^{\mu \nu }

.

Covariant and mixed tensors

Higher rank tensors are constructed similarly, as are covariant and mixed tensors. For a rank 2 covariant tensor, the transformation is

T_{\mu '\nu '}=x_{,\mu '}^{\mu }\;x_{,\nu '}^{\nu }\;T_{\mu \nu }

.
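
The contravariant rule can be checked on a tensor built from an outer product. In the sketch below the matrix J stands for the constant array of derivatives of the new coordinates with respect to the old ones; its numerical values (a boost along x) and all function names are assumptions made for illustration.

```python
def outer(A, B):
    """T^{mu nu} = A^mu B^nu."""
    return [[a * b for b in B] for a in A]

def transform_vector(J, A):
    """A'^m = J[m][n] A^n, summed over n."""
    return [sum(J[m][n] * A[n] for n in range(len(A))) for m in range(len(J))]

def transform_tensor(J, T):
    """T'^{ab} = J[a][m] J[b][n] T^{mn}, summed over m and n."""
    dim = len(T)
    return [[sum(J[a][m] * J[b][n] * T[m][n]
                 for m in range(dim) for n in range(dim))
             for b in range(dim)]
            for a in range(dim)]

# Hypothetical transformation matrix (a boost along x with beta = 0.6).
J = [[1.25, -0.75, 0.0, 0.0],
     [-0.75, 1.25, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
A = [1.0, 2.0, 0.0, -1.0]
B = [0.5, 0.0, 3.0, 1.0]
T_new = transform_tensor(J, outer(A, B))
```

Transforming the tensor directly should give the same components as transforming A and B separately and then taking their outer product.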

Oblique axes

The interval and the metric tensor

An oblique coordinate system is one in which the axes are not necessarily orthogonal to each other. For oblique axes, the interval is

ds^{2}=g_{\mu \nu }dx^{\mu }dx^{\nu }=g_{\nu \mu }dx^{\mu }dx^{\nu }

where the coefficients $g_{\mu \nu }$ , called the metric tensor, depend on the system of oblique axes.

Determinant of the metric tensor

The determinant of $g_{\mu \nu }$ is denoted $g$ and is always negative for any real coordinate axes.

Inner product

The inner product of any two vectors

(A,B)=g_{\mu \nu }A^{\mu }B^{\nu }

is invariant.

Relation between covariant and contravariant tensors

Covariant tensors are related to contravariant tensors by

A_{\mu }=g_{\mu \nu }A^{\nu }

and

A^{\mu }=g^{\mu \nu }A_{\nu }

where $g^{\mu \nu }$ is the cofactor of the corresponding $g_{\mu \nu }$ in the determinant of the metric, divided by the determinant $g$ itself,

so that

g_{\mu \nu }g^{\nu \rho }=g_{\mu }^{\rho }\equiv {\begin{cases}1,&{\mbox{if }}\mu =\rho \\0,&{\mbox{if }}\mu \neq \rho \end{cases}}

.
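
The relation between the metric and its inverse can be made concrete in a small example. The sketch below assumes a two-dimensional spacetime (one time and one space coordinate) with a hypothetical oblique metric; the inverse is formed from cofactors divided by the determinant, and the function names are illustrative.

```python
def det2(g):
    """Determinant of a 2x2 metric."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def inverse_metric(g):
    """g^{mu nu}: cofactor of the corresponding g_{mu nu} divided by the determinant."""
    d = det2(g)
    return [[g[1][1] / d, -g[0][1] / d],
            [-g[1][0] / d, g[0][0] / d]]

def lower_index(g, A):
    """A_mu = g_{mu nu} A^nu."""
    return [sum(g[m][n] * A[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, A_low):
    """A^mu = g^{mu nu} A_nu."""
    return [sum(g_inv[m][n] * A_low[n] for n in range(2)) for m in range(2)]

# Hypothetical oblique metric; its determinant is negative.
g = [[-1.0, 0.3],
     [0.3, 1.0]]
g_inv = inverse_metric(g)
A = [2.0, -1.5]
print(raise_index(g_inv, lower_index(g, A)))
```

Lowering an index and then raising it again should return the original components, which is equivalent to the product of the metric and its inverse being the Kronecker delta.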

Nontensors

A nontensor is a tensor-like quantity ${N_{\mu }}$ that behaves like a tensor in the raising and lowering of indices,

N_{\mu }=g_{\mu \nu }N^{\nu }

and

N^{\mu }=g^{\mu \nu }N_{\nu }

,

but that does not transform like a tensor under a coordinate transformation.

Mathematics of general relativity

Curvilinear coordinates and curved spacetime

Curvilinear coordinates are coordinates in which the angles between axes can change from point to point. In other words, the metric tensor $g_{\mu \nu }$ in curvilinear coordinates is no longer constant but depends on the location in spacetime. It is therefore a field quantity.

Just as the surface of a ball is embedded in three-dimensional space, we can imagine four-dimensional spacetime as embedded in a flat space of higher dimension. The coordinates on the surface of the ball are curvilinear, while the coordinates in three-dimensional space can be rectilinear. Likewise, the coordinates of four-dimensional curved spacetime are curvilinear, while the higher-dimensional flat space in which it is embedded can be given rectilinear coordinates.

Parallel displacement

The interval in a high dimensional space

Imagine our four-dimensional curved spacetime is embedded in a larger N-dimensional flat space. Any true physical vector lies entirely in the curved physical space. In other words, the vector is tangent to the curved physical spacetime. It has no component normal to the four-dimensional curved spacetime.

In the N dimensional flat space with coordinates $z^{n}(n=1,2,3,\cdots ,N)$ the interval between neighboring points is

ds^{2}=h_{nm}dz^{n}dz^{m}

where $h_{nm}$ is the metric for the flat space. We do not assume the coordinates are orthogonal, only rectilinear.

The interval between two points in physical spacetime

To quote Dirac:

Physical spacetime forms a four dimensional "surface" in the flat N-dimensional space. Each point $x^{\mu }$ determines a definite point $y^{n}$ in the N-dimensional space. Each coordinate $y^{n}$ is a function of the four x's; say $y^{n}(x)$ . There are N-4 such equations.

The relation between neighboring contravariant vectors: Christoffel symbols

The difference in y for two neighboring points in the surface differing by $\delta x^{\mu }$ is

$\delta y^{n}=y_{,\mu }^{n}\delta x^{\mu }$

where

$y_{,\mu }^{n}={\partial y^{n}(x) \over \partial x^{\mu }}$ .

The interval between two neighboring points in physical spacetime becomes

ds^{2}=h_{nm}\delta y^{n}\delta y^{m}=h_{nm}y_{,\mu }^{n}y_{,\nu }^{m}\delta x^{\mu }\delta x^{\nu }=g_{\mu \nu }\delta x^{\mu }\delta x^{\nu }

where

g_{\mu \nu }=h_{nm}y_{,\mu }^{n}y_{,\nu }^{m}=y_{,\mu }^{n}y_{n,\nu }

.
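
As a lower-dimensional illustration, the same construction reproduces the familiar metric of a sphere. The sketch below assumes a unit 2-sphere embedded in flat three-dimensional Euclidean space, so that N = 3, the flat metric is the identity, and the surface coordinates are the angles theta and phi rather than four spacetime coordinates. Derivatives are estimated by central differences, and all names are hypothetical.

```python
import math

def embed(x):
    """Point y^n(x) in flat 3-space for surface coordinates x = (theta, phi)."""
    theta, phi = x
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def embedding_derivatives(x, h=1e-5):
    """dy[mu][n] approximates the derivative of y^n with respect to x^mu."""
    dy = []
    for mu in range(len(x)):
        plus, minus = list(x), list(x)
        plus[mu] += h
        minus[mu] -= h
        y_plus, y_minus = embed(plus), embed(minus)
        dy.append([(a - b) / (2 * h) for a, b in zip(y_plus, y_minus)])
    return dy

def induced_metric(x):
    """g_{mu nu} = h_{nm} y^n_{,mu} y^m_{,nu}, with h_{nm} taken as the identity."""
    dy = embedding_derivatives(x)
    dim = len(x)
    return [[sum(dy[mu][n] * dy[nu][n] for n in range(3)) for nu in range(dim)]
            for mu in range(dim)]

print(induced_metric([1.0, 0.7]))
```

Up to finite-difference error, the result should be diagonal with entries 1 and the square of sin(theta), the usual line element on the unit sphere.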

A contravariant vector $A^{\mu }$ at a point x in physical spacetime is related to the components $A^{n}$ of the same vector at the corresponding point y(x) in the N-dimensional space by the relation

A^{n}=y_{,\mu }^{n}A^{\mu }

.

The vector lies in the surface of physical spacetime.

Now shift the vector $A^{n}$ to the point $y^{n}(x+dx)$ , keeping it parallel to itself. In other words, we hold the components of the vector constant during the shift. Because of the curvature of the surface, the vector no longer lies in the surface.

The shifted vector can be split into two parts, one tangent to the surface and one normal to surface, as

A^{n}=A_{\mbox{tan}}^{n}+A_{\mbox{nor}}^{n}

.

The tangential part can be written in terms of a vector $K^{\mu }$ in physical spacetime as

A_{\mbox{tan}}^{n}=K^{\mu }y_{,\mu }^{n}(x+dx)

.

The normal vector $A_{\mbox{nor}}^{n}$ is normal to every vector in the surface, including the basis vectors $y_{,\mu }^{n}$ that define the components along $x^{\mu }$ . Therefore

A_{\mbox{nor}}^{n}\;\;y_{n,\mu }(x+dx)=0

.

This allows us to write

A^{n}\;y_{n,\nu }(x+dx)=K^{\mu }g_{\mu \nu }(x+dx)=K_{\nu }(x+dx)

or, expanding $y_{n,\nu }(x+dx)$ to first order in $dx^{\sigma }$ ,

K_{\nu }-A_{\nu }\equiv dA_{\nu }=A^{\mu }\;y_{,\mu }^{n}y_{n,\nu ,\sigma }dx^{\sigma }\equiv A^{\mu }\;\Gamma _{\mu \nu \sigma }dx^{\sigma }

where

\Gamma _{\mu \nu \sigma }\equiv y_{,\mu }^{n}y_{n,\nu ,\sigma }

is a nontensor called the Christoffel symbol of the first kind. It can be shown to be related to the metric tensor through the relation

\Gamma _{\mu \nu \sigma }={1 \over 2}\left(g_{\mu \nu ,\sigma }+g_{\mu \sigma ,\nu }-g_{\nu \sigma ,\mu }\right)

.

Since the Christoffel symbol can be written entirely in terms of the metric in physical spacetime, all reference to the N-dimensional space has disappeared.
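
The expression for the Christoffel symbol of the first kind in terms of the metric can be evaluated numerically. The sketch below assumes, as an illustrative stand-in for spacetime, the unit 2-sphere with coordinates (theta, phi) and metric diag(1, sin^2 theta); derivatives of the metric are taken by central differences, and the function names are hypothetical.

```python
import math

def metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi) (assumed example)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def metric_derivative(x, sigma, h=1e-5):
    """Returns d[m][n] approximating g_{mn,sigma}."""
    plus, minus = list(x), list(x)
    plus[sigma] += h
    minus[sigma] -= h
    g_plus, g_minus = metric(plus), metric(minus)
    return [[(g_plus[m][n] - g_minus[m][n]) / (2 * h) for n in range(2)]
            for m in range(2)]

def christoffel_first_kind(x):
    """Gamma[m][n][s] = (g_{mn,s} + g_{ms,n} - g_{ns,m}) / 2."""
    dg = [metric_derivative(x, s) for s in range(2)]  # dg[s][m][n] = g_{mn,s}
    return [[[0.5 * (dg[s][m][n] + dg[n][m][s] - dg[m][n][s])
              for s in range(2)]
             for n in range(2)]
            for m in range(2)]

Gamma = christoffel_first_kind([1.0, 0.3])
print(Gamma[0][1][1], Gamma[1][0][1])
```

With index 0 for theta and index 1 for phi, the only nonvanishing components should be Gamma[0][1][1], equal to minus sin(theta) cos(theta), and Gamma[1][0][1] = Gamma[1][1][0], equal to plus sin(theta) cos(theta).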

Christoffel symbol of the second kind

The Christoffel symbol of the second kind is defined as

\Gamma _{\nu \sigma }^{\mu }\equiv g^{\mu \lambda }\Gamma _{\lambda \nu \sigma }

.

This operation is allowed for nontensors.

This allows us to write

dA_{\nu }=A_{\mu }\Gamma _{\nu \sigma }^{\mu }dx^{\sigma }

and

dA^{\nu }=-A^{\mu }\Gamma _{\mu \sigma }^{\nu }dx^{\sigma }

.

The minus sign in the second expression can be seen from the invariance of an inner product of two vectors

d\left(A^{\nu }B_{\nu }\right)=0

.

The constancy of the length of the parallel displaced vector

From Dirac:

The constancy of the length of the vector follows from geometrical arguments. When we split up the vector into tangential and normal parts ... the normal part is infinitesimal and is orthogonal to the tangential part. It follows that, to the first order, the length of the whole vector equals that of its tangential part.
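
The constancy of length can also be observed numerically. The sketch below assumes the unit 2-sphere as an example surface, with the standard Christoffel symbols of the second kind for that surface (Gamma^theta_{phi phi} = -sin(theta) cos(theta) and Gamma^phi_{theta phi} = cot(theta)), and integrates the parallel displacement rule for contravariant components along a circle of constant theta with a fourth-order Runge-Kutta scheme. The function names and step count are hypothetical choices.

```python
import math

def transport_rhs(theta, A):
    """dA^nu/dphi = -Gamma^nu_{mu phi} A^mu on a circle of constant theta."""
    A_theta, A_phi = A
    return [math.sin(theta) * math.cos(theta) * A_phi,
            -(math.cos(theta) / math.sin(theta)) * A_theta]

def parallel_transport(theta, A, phi_total, steps=2000):
    """Displace A = (A^theta, A^phi) through an angle phi_total using RK4 steps."""
    h = phi_total / steps
    A = list(A)
    for _ in range(steps):
        k1 = transport_rhs(theta, A)
        k2 = transport_rhs(theta, [A[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = transport_rhs(theta, [A[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = transport_rhs(theta, [A[i] + h * k3[i] for i in range(2)])
        A = [A[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(2)]
    return A

def length_squared(theta, A):
    """g_{mu nu} A^mu A^nu with g = diag(1, sin^2 theta)."""
    return A[0] ** 2 + math.sin(theta) ** 2 * A[1] ** 2

theta = math.pi / 3
A_start = [1.0, 0.5]
A_end = parallel_transport(theta, A_start, 2 * math.pi)
print(length_squared(theta, A_start), length_squared(theta, A_end))
```

The two squared lengths should agree to within the integration error, even though the components of the vector generally do not return to their starting values after a full circuit.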

The covariant derivative

The covariant derivative of a vector with respect to a spacetime coordinate is composed of two parts: the ordinary partial derivative minus the change in the vector due to parallel displacement,

A_{\mu ;\nu }\equiv A_{\mu ,\nu }-\Gamma _{\mu \nu }^{\alpha }A_{\alpha }

.
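
A numerical sketch of the covariant derivative of a covariant vector field follows. It assumes the standard form, the ordinary partial derivative minus the Christoffel symbol of the second kind contracted with the vector, and uses the unit 2-sphere with coordinates (theta, phi) as an illustrative surface. The example field, the gradient of cos(theta), and all function names are hypothetical choices.

```python
import math

def christoffel_second_kind(x):
    """G[alpha][mu][nu] = Gamma^alpha_{mu nu} for the unit 2-sphere (assumed example)."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def covariant_derivative(A, x, h=1e-5):
    """D[mu][nu] = A_{mu,nu} - Gamma^alpha_{mu nu} A_alpha for a covector field A(x)."""
    G = christoffel_second_kind(x)
    A_here = A(x)
    D = [[0.0, 0.0], [0.0, 0.0]]
    for nu in range(2):
        plus, minus = list(x), list(x)
        plus[nu] += h
        minus[nu] -= h
        A_plus, A_minus = A(plus), A(minus)
        for mu in range(2):
            partial = (A_plus[mu] - A_minus[mu]) / (2 * h)
            D[mu][nu] = partial - sum(G[a][mu][nu] * A_here[a] for a in range(2))
    return D

def gradient_of_cos_theta(x):
    """A_mu = derivative of cos(theta) with respect to x^mu."""
    return [-math.sin(x[0]), 0.0]

print(covariant_derivative(gradient_of_cos_theta, [1.0, 0.4]))
```

For this particular field the result should be symmetric in its two indices and equal to minus cos(theta) times the metric diag(1, sin^2 theta), up to finite-difference error.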

Geodesics

Suppose we have a point $z^{\mu }$ that moves along a track in physical spacetime. Suppose the track is parameterized with the quantity $\tau$ . Then a "velocity" vector that points in the direction of motion in spacetime is

u^{\mu }\equiv {dz^{\mu } \over d\tau }

.

The variation of the velocity upon parallel displacement along the track is then

{du^{\nu } \over d\tau }+\Gamma _{\mu \sigma }^{\nu }u^{\mu }{dz^{\sigma } \over d\tau }

.

If there are no "forces" acting on the point, then the velocity is unchanged along the track and we have

{du^{\nu } \over d\tau }+\Gamma _{\mu \sigma }^{\nu }u^{\mu }{dz^{\sigma } \over d\tau }={d^{2}z^{\nu } \over d\tau ^{2}}+\Gamma _{\mu \sigma }^{\nu }{dz^{\mu } \over d\tau }{dz^{\sigma } \over d\tau }=0

,

which is called the geodesic equation.
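
The geodesic equation can be integrated numerically for a simple curved surface. The sketch below assumes the unit 2-sphere with coordinates (theta, phi) in place of spacetime, uses the standard Christoffel symbols of the second kind for that surface, and advances the equation with a fourth-order Runge-Kutta scheme. The function names, initial conditions, and step count are hypothetical.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of d^2 z^nu/dtau^2 = -Gamma^nu_{mu sigma} u^mu u^sigma on the unit sphere."""
    theta, phi, u_theta, u_phi = state
    return [u_theta,
            u_phi,
            math.sin(theta) * math.cos(theta) * u_phi ** 2,
            -2.0 * (math.cos(theta) / math.sin(theta)) * u_theta * u_phi]

def integrate_geodesic(state, tau_total, steps=4000):
    """Advance (theta, phi, u^theta, u^phi) through tau_total using RK4 steps."""
    h = tau_total / steps
    s = list(state)
    for _ in range(steps):
        k1 = geodesic_rhs(s)
        k2 = geodesic_rhs([s[i] + 0.5 * h * k1[i] for i in range(4)])
        k3 = geodesic_rhs([s[i] + 0.5 * h * k2[i] for i in range(4)])
        k4 = geodesic_rhs([s[i] + h * k3[i] for i in range(4)])
        s = [s[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(4)]
    return s

def speed_squared(state):
    """g_{mu nu} u^mu u^nu with g = diag(1, sin^2 theta)."""
    theta, _phi, u_theta, u_phi = state
    return u_theta ** 2 + math.sin(theta) ** 2 * u_phi ** 2

# Start on the equator, moving mostly eastward with a small southward component.
start = [math.pi / 2, 0.0, 0.3, 1.0]
end = integrate_geodesic(start, 5.0)
print(speed_squared(start), speed_squared(end))
```

The squared speed should be conserved along the track, and the track itself should stay on a great circle of the sphere, which is the geodesic of that surface.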

References

[1] P. A. M. Dirac (1996). General Theory of Relativity. Princeton University Press. ISBN 0-691-01146-X.

[2] Misner, Charles; Thorne, Kip S. & Wheeler, John Archibald (1973). Gravitation. San Francisco: W. H. Freeman. ISBN 0-7167-0344-0.

[3] Landau, L. D. and Lifshitz, E. M. (1975). Classical Theory of Fields (Fourth Revised English Edition). Oxford: Pergamon. ISBN 0-08-018176-7.

[4] R. P. Feynman, F. B. Morinigo, and W. G. Wagner (1995). Feynman Lectures on Gravitation. Addison-Wesley. ISBN 0-201-62734-5.

[5] Einstein, A. (1961). Relativity: The Special and General Theory. New York: Crown. ISBN 0-517-02961-8.