Paper (sections 1 to 3, of 22):


The Foundation of the General Theory of Relativity

paper by Albert Einstein (1916)

(Annalen der Physik 49)

A. FUNDAMENTAL CONSIDERATIONS ON THE POSTULATE OF RELATIVITY

1. Observations on the Special Theory of Relativity

THE special theory of relativity is based on the following postulate, which is also satisfied by the mechanics of Galileo and Newton.

If a system of co-ordinates K is chosen so that, in relation to it, physical laws hold good in their simplest form, the same laws also hold good in relation to any other system of co-ordinates K' moving in uniform translation relatively to K. This postulate we call the "special principle of relativity." The word "special" is meant to intimate that the principle is restricted to the case when K' has a motion of uniform translation relatively to K, but that the equivalence of K' and K does not extend to the case of non-uniform motion of K' relatively to K.

Thus the special theory of relativity does not depart from classical mechanics through the postulate of relativity, but through the postulate of the constancy of the velocity of light in vacuo, from which, in combination with the special principle of relativity, there follow, in the well-known way, the relativity of simultaneity, the Lorentzian transformation and the related laws for the behaviour of moving bodies and clocks.
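For reference, the Lorentzian transformation mentioned above may be written out as follows. This is a sketch in the standard textbook configuration, which is an assumption here and is not spelled out in this section: the axes of K and K' are parallel, K' moves along the X axis of K with speed v, and c denotes the velocity of light in vacuo. The symbols v, c and \gamma are introduced only for this illustration.

```latex
\[
\begin{aligned}
x' &= \gamma\,(x - v t), \qquad y' = y, \qquad z' = z, \\
t' &= \gamma\left(t - \frac{v x}{c^{2}}\right), \qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}} .
\end{aligned}
\]
% Relativity of simultaneity: two events simultaneous in K (\Delta t = 0)
% but separated by \Delta x are not simultaneous in K'.
\[
\Delta t' = -\,\gamma\,\frac{v\,\Delta x}{c^{2}} \neq 0
\quad\text{whenever}\quad \Delta t = 0,\ \Delta x \neq 0,\ v \neq 0 .
\]
```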

The modification to which the special theory of relativity has subjected the theory of space and time is indeed far-reaching, but one important point has remained unaffected.

For the laws of geometry, even according to the special theory of relativity, are to be interpreted directly as laws relating to the possible relative positions of solid bodies at rest; and, in a more general way, the laws of kinematics are to be interpreted as laws which describe the relations of measuring bodies and clocks. To two selected material points of a stationary rigid body there always corresponds a distance of quite definite length, which is independent of the locality and orientation of the body, and is also independent of the time.

To two selected positions of the hands of a clock at rest relatively to the privileged system of reference there always corresponds an interval of time of a definite length, which is independent of place and time. We shall soon see that the general theory of relativity cannot adhere to this simple physical interpretation of space and time.

-== ==-

2. The Need for an Extension of the Postulate of Relativity

IN classical mechanics, and no less in the special theory of relativity, there is an inherent epistemological defect which was, perhaps for the first time, clearly pointed out by Ernst Mach. We will elucidate it by the following example: - Two fluid bodies of the same size and nature hover freely in space at so great a distance from each other and from all other masses that only those gravitational forces need be taken into account which arise from the interaction of different parts of the same body. Let the distance between the two bodies be invariable, and in neither of the bodies let there be any relative movements of the parts with respect to one another.

But let either mass, as judged by an observer at rest relatively to the other mass, rotate with constant angular velocity about the line joining the masses. This is a verifiable relative motion of the two bodies. Now let us imagine that each of the bodies has been surveyed by means of measuring instruments at rest relatively to itself, and let the surface of S1 prove to be a sphere, and that of S2 an ellipsoid of revolution. Thereupon we put the question - What is the reason for this difference in the two bodies? No answer can be admitted as epistemologically satisfactory unless the reason given is an observable fact of experience. The law of causality has not the significance of a statement as to the world of experience, except when observable facts ultimately appear as causes and effects.

Newtonian mechanics does not give a satisfactory answer to this question. It pronounces as follows: - The laws of mechanics apply to the space R1, in respect to which the body S1 is at rest, but not to the space R2, in respect to which the body S2 is at rest. But the privileged space R1 of Galileo, thus introduced, is a merely fictitious cause, and not a thing that can be observed. It is therefore clear that Newton's mechanics does not really satisfy the requirement of causality in the case under consideration but only apparently does so, since it makes the fictitious cause R1 responsible for the observable difference in the bodies S1 and S2.

The only satisfactory answer must be that the physical system consisting of S1 and S2 reveals within itself no imaginable cause to which the differing behaviour of S1 and S2 can be referred. The cause must therefore lie outside this system. We have to take it that the general laws of motion, which in particular determine the shapes of S1 and S2, must be such that the mechanical behaviour of S1 and S2 is partly conditioned, in quite essential respects, by distant masses which we have not included in the system under consideration. These distant masses and their motions relative to S1 and S2 must then be regarded as the seat of the causes (which must be susceptible to observation) of the different behaviour of our two bodies S1 and S2. They take over the rôle of the fictitious cause R1. Of all imaginable spaces R1, R2, etc., in any kind of motion relatively to one another there is none which we may look upon as privileged a priori without reviving the above-mentioned epistemological objection. The laws of physics must be of such a nature that they apply to systems of reference in any kind of motion. Along this road we arrive at an extension of the postulate of relativity.

In addition to this weighty argument from the theory of knowledge, there is a well-known physical fact which favours an extension of the theory of relativity. Let K be a Galilean system of reference, i.e. a system relatively to which (at least in the four-dimensional region under consideration) a mass, sufficiently distant from other masses, is moving with uniform motion in a straight line. Let K' be a second system of reference which is moving relatively to K in uniformly accelerated translation. Then, relatively to K', a mass sufficiently distant from other masses would have an accelerated motion such that its acceleration and direction of acceleration are independent of the material composition and physical state of the mass.

Does this permit an observer at rest relatively to K' to infer that he is on a "really" accelerated system of reference? The answer is in the negative; for the above-mentioned relation of freely movable masses to K' may be interpreted equally well in the following way. The system of reference K' is unaccelerated, but the space-time territory in question is under the sway of a gravitational field, which generates the accelerated motion of the bodies relatively to K'.

This view is made possible for us by the teaching of experience as to the existence of a field of force, namely, the gravitational field, which possesses the remarkable property of imparting the same acceleration to all bodies. The mechanical behaviour of bodies relatively to K' is the same as presents itself to experience in the case of systems which we are wont to regard as "stationary" or as "privileged." Therefore, from the physical standpoint, the assumption readily suggests itself that the systems K and K' may both with equal right be looked upon as "stationary," that is to say, they have an equal title as systems of reference for the physical description of phenomena.
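The equivalence of the two descriptions can be checked numerically. The following is a minimal sketch, assuming Newtonian kinematics and motion along a single axis. The function names, the acceleration value and the sample masses are hypothetical and chosen only for illustration. It compares the position of a free mass referred to the accelerated system K' with the position of a mass falling in an unaccelerated system under a uniform gravitational field of opposite sign.

```python
import math

def position_seen_from_accelerated_frame(x0, v0, a, t):
    """Free mass moving uniformly in K, described relative to K'.

    Assumption: K' starts at rest at the origin of K and accelerates
    at the constant rate a along the x axis (Newtonian kinematics).
    """
    x_in_K = x0 + v0 * t                 # uniform straight-line motion in K
    origin_of_K_prime = 0.5 * a * t**2   # position of the origin of K' in K
    return x_in_K - origin_of_K_prime

def position_in_uniform_field(x0, v0, g, mass, t):
    """Mass in an unaccelerated frame under a uniform field g.

    The force is mass * g, so the acceleration force / mass does not
    depend on the mass.
    """
    force = mass * g
    acceleration = force / mass
    return x0 + v0 * t + 0.5 * acceleration * t**2

a = 9.81  # illustrative acceleration of K' in m/s^2
for mass in (0.001, 1.0, 1000.0):
    for t in (0.0, 0.5, 2.0):
        in_k_prime = position_seen_from_accelerated_frame(1.0, 3.0, a, t)
        in_field = position_in_uniform_field(1.0, 3.0, -a, mass, t)
        # The two descriptions agree for every mass and every instant.
        assert math.isclose(in_k_prime, in_field, abs_tol=1e-9)
```

The mass drops out of the second function entirely, which is the property of the gravitational field on which the argument rests.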

It will be seen from these reflexions that in pursuing the general theory of relativity we shall be led to a theory of gravitation, since we are able to "produce" a gravitational field merely by changing the system of co-ordinates. It will also be obvious that the principle of the constancy of the velocity of light in vacuo must be modified, since we easily recognize that the path of a ray of light with respect to K' must in general be curvilinear, if with respect to K light is propagated in a straight line with a definite constant velocity.
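The curvature of the light path can be made explicit in a rough sketch. It rests on assumptions not stated in the text: K' accelerates along the X axis of K at a constant rate a (a symbol introduced here), the change of system is treated with Newtonian kinematics, which is only an approximation valid while at is small compared with c, and the ray travels along the Y axis of K.

```latex
% Assumed change of system (Newtonian approximation):
\[
x' = x - \tfrac{1}{2}\,a\,t^{2}, \qquad y' = y, \qquad t' = t .
\]
% Ray in K: straight line along the Y axis with constant velocity c.
\[
x = 0, \qquad y = c\,t .
\]
% Eliminating t gives the path of the same ray with respect to K':
\[
x' = -\,\frac{a}{2c^{2}}\,y'^{2},
\]
% a parabola, with velocity components
\[
\frac{dx'}{dt'} = -\,a\,t', \qquad \frac{dy'}{dt'} = c ,
\]
% so that the speed relative to K', \sqrt{c^{2} + a^{2}t'^{2}}, is not constant.
```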

-== ==-

3. The Space-Time Continuum. Requirement of General Co-Variance for the Equations Expressing General Laws of Nature

IN classical mechanics, as well as in the special theory of relativity, the co-ordinates of space and time have a direct physical meaning. To say that a point-event has the X1 co-ordinate x1 means that the projection of the point-event on the axis of X1, determined by rigid rods and in accordance with the rules of Euclidean geometry, is obtained by measuring off a given rod (the unit of length) x1 times from the origin of co-ordinates along the axis of X1. To say that a point-event has the X4 co-ordinate x4=t, means that a standard clock, made to measure time in a definite unit period, and which is stationary relatively to the system of co-ordinates and practically coincident in space with the point-event, will have measured off x4=t periods at the occurrence of the event.

This view of space and time has always been in the minds of physicists, even if, as a rule, they have been unconscious of it. This is clear from the part which these concepts play in physical measurements; it must also have underlain the reader's reflexions on the preceding paragraph for him to connect any meaning with what he there read. But we shall now show that we must put it aside and replace it by a more general view, in order to be able to carry through the postulate of general relativity, if the special theory of relativity applies to the special case of the absence of a gravitational field.

In a space which is free of gravitational fields we introduce a Galilean system of reference K (x, y, z, t), and also a system of co-ordinates K' (x', y', z', t') in uniform rotation relatively to K. Let the origins of both systems, as well as their axes of Z, permanently coincide. We shall show that for a space-time measurement in the system K' the above definition of the physical meaning of lengths and times cannot be maintained. For reasons of symmetry it is clear that a circle around the origin in the X, Y plane of K may at the same time be regarded as a circle in the X', Y' plane of K'. We suppose that the circumference and diameter of this circle have been measured with a unit measure infinitely small compared with the radius, and that we have the quotient of the two results. If this experiment were performed with a measuring-rod at rest relatively to the Galilean system K, the quotient would be π. With a measuring-rod at rest relatively to K', the quotient would be greater than π. This is readily understood if we envisage the whole process of measuring from the "stationary" system K, and take into consideration that the measuring-rod applied to the periphery undergoes a Lorentzian contraction, while the one applied along the radius does not. Hence Euclidean geometry does not apply to K'. The notion of co-ordinates defined above, which presupposes the validity of Euclidean geometry, therefore breaks down in relation to the system K'.

So, too, we are unable to introduce a time corresponding to physical requirements in K', indicated by clocks at rest relatively to K'. To convince ourselves of this impossibility, let us imagine two clocks of identical constitution placed, one at the origin of co-ordinates, and the other at the circumference of the circle, and both envisaged from the "stationary" system K. By a familiar result of the special theory of relativity, the clock at the circumference -- judged from K -- goes more slowly than the other, because the former is in motion and the latter at rest. An observer at the common origin of co-ordinates, capable of observing the clock at the circumference by means of light, would therefore see it lagging behind the clock beside him. As he will not make up his mind to let the velocity of light along the path in question depend explicitly on the time, he will interpret his observations as showing that the clock at the circumference "really" goes more slowly than the clock at the origin. So he will be obliged to define time in such a way that the rate of a clock depends upon where the clock may be.
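The two effects just described can be put into numbers. The following is a minimal sketch, assuming the standard special-relativistic factors for a rod and a clock moving with the rim speed v = ωr as judged from K. The angular velocity ω, the radius r, and the function names are introduced here for illustration only and do not appear in the text.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, in m/s

def lorentz_factor(omega, r, c=C):
    """Factor 1 / sqrt(1 - v^2/c^2) for the rim speed v = omega * r."""
    v = omega * r
    if abs(v) >= c:
        raise ValueError("rim speed must stay below the velocity of light")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def circumference_to_diameter_ratio(omega, r, c=C):
    """Quotient found with measuring-rods at rest relatively to K'.

    Judged from K, rods laid along the periphery are contracted, so more
    of them fit; rods laid along the diameter are unaffected.
    """
    return math.pi * lorentz_factor(omega, r, c)

def rim_clock_rate(omega, r, c=C):
    """Rate of the clock at the circumference relative to the one at the origin."""
    return 1.0 / lorentz_factor(omega, r, c)

# Illustrative values in units where c = 1: rim speed of 0.6 c.
print(circumference_to_diameter_ratio(0.6, 1.0, c=1.0) / math.pi)
print(rim_clock_rate(0.6, 1.0, c=1.0))
```

For any non-zero rotation the first quantity exceeds π and the second falls below 1, in agreement with the qualitative argument above; for ω = 0 both reduce to the Euclidean and Galilean values.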

We therefore reach this result: -- In the general theory of relativity, space and time cannot be defined in such a way that differences of the spatial co-ordinates can be directly measured by the unit measuring-rod, or differences in the time co-ordinate by a standard clock.

The method hitherto employed for laying co-ordinates into the space-time continuum in a definite manner thus breaks down, and there seems to be no other way which would allow us to adapt systems of co-ordinates to the four-dimensional universe so that we might expect from their application a particularly simple formulation of the laws of nature. So there is nothing for it but to regard all imaginable systems of co-ordinates, on principle, as equally suitable for the description of nature. This comes to requiring that: --

The general laws of nature are to be expressed by equations which hold good for all the systems of co-ordinates, that is, are co-variant with respect to any substitutions whatever (generally co-variant).

It is clear that a physical theory which satisfies this postulate will also be suitable for the general postulate of relativity. For the sum of all substitutions in any case includes those which correspond to all relative motions of three-dimensional systems of co-ordinates. That this requirement of general co-variance, which takes away from space and time the last remnant of physical objectivity, is a natural one, will be seen from the following reflexion. All our space-time verifications invariably amount to a determination of space-time coincidences. If, for example, events consisted merely in the motion of material points, then ultimately nothing would be observable but the meetings of two or more of these points. Moreover, the results of our measurings are nothing but verifications of such meetings of the material points of our measuring instruments with other material points, coincidences between the hands of a clock and points on the clock-dial, and observed point-events happening at the same place at the same time.

The introduction of a system of reference serves no other purpose than to facilitate the description of the totality of such coincidences. We allot to the universe four space-time variables x1, x2, x3, x4 in such a way that for every point-event there is a corresponding system of values of the variables x1 . . . x4. To two coincident point-events there corresponds one system of values of the variables x1 . . . x4, i.e. coincidence is characterized by the identity of the co-ordinates. If, in place of the variables x1 . . . x4, we introduce functions of them, x'1, x'2, x'3, x'4, as a new system of co-ordinates, so that the systems of values are made to correspond to one another without ambiguity, the equality of all four co-ordinates in the new system will also serve as an expression for the space-time coincidence of the two point-events. As all our physical experience can be ultimately reduced to such coincidences, there is no immediate reason for preferring certain systems of co-ordinates to others, that is to say, we arrive at the requirement of general co-variance.
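The preservation of coincidences under a change of co-ordinates can be illustrated by a small sketch. The substitution used here, a passage to uniformly rotating spatial axes with the time co-ordinate left unchanged, is merely one assumed example of a one-to-one substitution; the function names and the sample events are hypothetical.

```python
import math

def to_rotating_coordinates(event, omega=0.5):
    """Map (x1, x2, x3, x4) to new co-ordinates (x'1, x'2, x'3, x'4).

    Assumed example of a one-to-one substitution: the spatial axes
    rotate about the x3 axis with angular velocity omega, and x'4 = x4.
    """
    x1, x2, x3, x4 = event
    cos_wt, sin_wt = math.cos(omega * x4), math.sin(omega * x4)
    return (x1 * cos_wt + x2 * sin_wt,
            -x1 * sin_wt + x2 * cos_wt,
            x3,
            x4)

def coincide(event_a, event_b):
    """Two point-events coincide when all four co-ordinates are identical."""
    return event_a == event_b

meeting_a = (1.0, 2.0, 0.0, 3.0)   # two material points meet here ...
meeting_b = (1.0, 2.0, 0.0, 3.0)   # ... so both carry the same four values
elsewhere = (1.0, 2.0, 0.0, 4.0)   # same place in K, but a later time

# Coincidence survives the substitution; distinct events stay distinct.
assert coincide(to_rotating_coordinates(meeting_a), to_rotating_coordinates(meeting_b))
assert not coincide(to_rotating_coordinates(meeting_a), to_rotating_coordinates(elsewhere))
```

Any other substitution would serve equally well, provided it makes the systems of values correspond to one another without ambiguity.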

-== ==-

© copyright 1998 by Eric Baird.
