Relativity: The Special and General Theory
Einstein's theories of relativity have profoundly affected the whole of
physics and astronomy. They dominate current theories of the origin, size and age
of the creation and it is important to understand their basis to make an
assessment of the reliability of the accepted "scientific" view.
Einstein claimed that any scientist who could not explain his work to a
schoolchild was a charlatan. His text can be followed by anyone with a secondary
education. It is far easier to gain an insight into relativity from Einstein's
own work than from the numerous texts which treat it mathematically and give
little or no insight into the assumptions behind the equations. Einstein makes his
assumptions clear. It is these assumptions, not the impeccable mathematics
usually presented, which need close attention.
The text is reproduced with thanks to the Gutenberg project. My comments are
marked [... PRS]
Albert Einstein
Relativity: The Special and General Theory
Preface
(December, 1916)
The present book is intended, as far as possible, to give an exact insight
into the theory of Relativity to those readers who, from a general scientific
and philosophical point of view, are interested in the theory, but who are not
conversant with the mathematical apparatus of theoretical physics. The work
presumes a standard of education corresponding to that of a university
matriculation examination, and, despite the shortness of the book, a fair amount
of patience and force of will on the part of the reader. The author has spared
himself no pains in his endeavour to present the main ideas in the simplest and
most intelligible form, and on the whole, in the sequence and connection in
which they actually originated. In the interest of clearness, it appeared to me
inevitable that I should repeat myself frequently, without paying the slightest
attention to the elegance of the presentation. I adhered scrupulously to the
precept of that brilliant theoretical physicist L. Boltzmann, according to whom
matters of elegance ought to be left to the tailor and to the cobbler. I make no
pretence of having withheld from the reader difficulties which are inherent to
the subject. On the other hand, I have purposely treated the empirical physical
foundations of the theory in a "step-motherly" fashion, so that
readers unfamiliar with physics may not feel like the wanderer who was unable to
see the forest for the trees. May the book bring some one
a few happy hours of suggestive thought!
December, 1916
A. EINSTEIN
Part I
The Special Theory of Relativity
Section 1
Physical Meaning of Geometrical Propositions
In your schooldays most of you who read this book made acquaintance with the
noble building of Euclid's geometry, and you remember — perhaps with more
respect than love — the magnificent structure, on the lofty staircase of which
you were chased about for uncounted hours by conscientious teachers. By reason
of our past experience, you would certainly regard everyone with disdain who
should pronounce even the most out-of-the-way proposition of this science to be
untrue. But perhaps this feeling of proud certainty would leave you immediately
if some one were to ask you: "What, then, do you mean by the assertion that
these propositions are true?" Let us proceed to give this question a little
consideration.
Geometry sets out from certain conceptions such as "plane,"
"point," and "straight line," with which we are able to
associate more or less definite ideas, and from certain simple propositions
(axioms) which, in virtue of these ideas, we are inclined to accept as
"true." Then, on the basis of a logical process, the justification of
which we feel ourselves compelled to admit, all remaining propositions are shown
to follow from those axioms, i.e. they are proven. A proposition is
then correct ("true") when it has been derived in the recognised
manner from the axioms. The question of "truth" of the individual
geometrical propositions is thus reduced to one of the "truth" of the
axioms. Now it has long been known that the last question is not only
unanswerable by the methods of geometry, but that it is in itself entirely
without meaning. We cannot ask whether it is true that only one straight line
goes through two points. We can only say that Euclidean geometry deals with
things called "straight lines," to each of which is ascribed the
property of being uniquely determined by two points situated on it. The concept
"true" does not tally with the assertions of pure geometry, because by
the word "true" we are eventually in the habit of designating always
the correspondence with a "real" object; geometry, however, is not
concerned with the relation of the ideas involved in it to objects of
experience, but only with the logical connection of these ideas among
themselves. [An important point. Mathematics
is not science. It deals with abstractions (which are often simplified
approximations to reality). A scientist can use mathematics as a powerful tool in
determining relationships in the real world if there is a strong correlation
between the mathematics and reality. Formerly mathematics was pursued almost
exclusively as a tool for dealing with reality. There was a blurring of the
distinction between a "mathematician" and a "scientist" (Brahe
and Newton, for example, were "mathematicians" who would today be more
likely thought of as "scientists"). Mathematics is
increasingly becoming a field completely divorced from reality, an
autonomous discipline which can make its own rules and is valid whether it
has any connection with the real world or not. PRS]
It is not difficult to understand why, in spite of this, we feel constrained
to call the propositions of geometry "true." Geometrical ideas
correspond to more or less exact objects in nature, and these last are
undoubtedly the exclusive cause of the genesis of those ideas. Geometry ought to
refrain from such a course, in order to give to its structure the largest
possible logical unity. The practice, for example, of seeing in a
"distance" two marked positions on a practically rigid body is
something which is lodged deeply in our habit of thought. We are accustomed
further to regard three points as being situated on a straight line, if their
apparent positions can be made to coincide for observation with one eye, under
suitable choice of our place of observation.
If, in pursuance of our habit of thought, we now supplement the propositions
of Euclidean geometry by the single proposition that two points on a practically
rigid body always correspond to the same distance (line-interval), independently
of any changes in position to which we may subject the body, the propositions of
Euclidean geometry then resolve themselves into propositions on the possible
relative position of practically rigid bodies.1)
Geometry which has been supplemented in this way is then to be treated as a
branch of physics. We can now legitimately ask as to the "truth" of
geometrical propositions interpreted in this way, since we are justified in
asking whether these propositions are satisfied for those real things we have
associated with the geometrical ideas. In less exact terms we can express this
by saying that by the "truth" of a geometrical proposition in this
sense we understand its validity for a construction with rule and compasses. [
Notice that Einstein here claims that with the addition of one proposition he
has converted the mathematical discipline of Geometry into a branch of science
which is competent to tell us the truth about the world. The confusion between
mathematics and science was a pitfall for ancient Greek attempts at science -
assuming that the real world must conform to mathematics. While in this
particular example it seems quite innocuous, Einstein is opening the door for
science to return to that state. Soddy's 1954
address noted that this is exactly what happened in physics following the
acceptance of Relativity. PRS]
Of course the conviction of the "truth" of geometrical propositions
in this sense is founded exclusively on rather incomplete experience. For the
present we shall assume the "truth" of the geometrical propositions,
then at a later stage (in the general theory of relativity) we shall see that
this "truth" is limited, and we shall consider the extent of its
limitation.
Notes
1) It follows
that a natural object is associated also with a straight line. Three points A,
B and C on a rigid body thus lie in a straight line when the
points A and C being given, B is chosen such that the
sum of the distances AB and BC is as short as possible. This
incomplete suggestion will suffice for the present purpose.
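The criterion in this note can be tried out numerically. What follows is only a sketch in Python: the function name `lies_between`, the tolerance and the sample points are illustrative assumptions and not part of Einstein's text. It treats B as lying on the straight line between A and C when the sum of the distances AB and BC is no longer than the direct distance AC.

```python
import math


def lies_between(a, b, c, tol=1e-9):
    """Return True when AB + BC equals AC to within `tol`.

    a, b, c are coordinate tuples of equal length. The tolerance is an
    assumption needed because floating-point distances are not exact.
    """
    detour = math.dist(a, b) + math.dist(b, c)
    direct = math.dist(a, c)
    return abs(detour - direct) <= tol


on_line = lies_between((0, 0), (1, 1), (2, 2))    # B on the segment AC
off_line = lies_between((0, 0), (1, 2), (2, 2))   # B displaced from AC
```

Any other position of B makes the route through B longer than the direct distance, which is what the note means by choosing B so that the sum is "as short as possible".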
Section 2
The System of Co-ordinates
On the basis of the physical interpretation of distance which has been
indicated, we are also in a position to establish the distance between two
points on a rigid body by means of measurements. For this purpose we require a
"distance" (rod S) which is to be used once and for all,
and which we employ as a standard measure. If, now, A and B
are two points on a rigid body, we can construct the line joining them according
to the rules of geometry; then, starting from A, we can mark off the
distance S time after time until we reach B. The number of
these operations required is the numerical measure of the distance AB.
This is the basis of all measurement of length. 1)
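The procedure of marking off the rod S time after time can be written out as a short sketch. The Python below is an illustration only: the function name `lay_off` and the sample lengths are assumptions, and exact fractions are used so that the count is not disturbed by rounding. The remainder it returns is the part "left over" that the footnote to this passage deals with by means of divided measuring-rods.

```python
from fractions import Fraction


def lay_off(distance, rod):
    """Mark off `rod` along `distance` and count the operations.

    Both lengths are in the same (arbitrary) unit. Returns a pair
    (number of whole rod lengths, length left over).
    """
    distance = Fraction(distance)
    rod = Fraction(rod)
    if rod <= 0:
        raise ValueError("the rod must have positive length")
    count = 0
    covered = Fraction(0)
    # Keep marking off the rod while a whole rod still fits.
    while covered + rod <= distance:
        covered += rod
        count += 1
    return count, distance - covered


whole, left_over = lay_off(Fraction(15, 2), 2)
```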
Every description of the scene of an event or of the position of an object in
space is based on the specification of the point on a rigid body (body of
reference) with which that event or object coincides. This applies not only to
scientific description, but also to everyday life. If I analyse the place
specification "Trafalgar Square, London," I arrive at the following
result. The earth is the rigid body to which the specification of place refers;
"Trafalgar Square, London," is a well-defined point, to which a name
has been assigned, and with which the event coincides in space.2)
This primitive method of place specification deals only with places on the
surface of rigid bodies, and is dependent on the existence of points on this
surface which are distinguishable from each other. But we can free ourselves
from both of these limitations without altering the nature of our specification
of position. If, for instance, a cloud is hovering over Trafalgar Square, then
we can determine its position relative to the surface of the earth by erecting a
pole perpendicularly on the Square, so that it reaches the cloud. The length of
the pole measured with the standard measuring-rod, combined with the
specification of the position of the foot of the pole, supplies us with a
complete place specification. On the basis of this illustration, we are able to
see the manner in which a refinement of the conception of position has been
developed.
(a) We imagine the rigid body, to which the place
specification is referred, supplemented in such a manner that the object whose
position we require is reached by the completed rigid body.
(b) In locating the position of the object, we make use
of a number (here the length of the pole measured with the measuring-rod)
instead of designated points of reference.
(c) We speak of the height of the cloud even when the
pole which reaches the cloud has not been erected. By means of optical
observations of the cloud from different positions on the ground, and taking
into account the properties of the propagation of light, we determine the length
of the pole we should have required in order to reach the cloud.
From this consideration we see that it will be advantageous if, in the
description of position, it should be possible by means of numerical measures to
make ourselves independent of the existence of marked positions (possessing
names) on the rigid body of reference. In the physics of measurement this is
attained by the application of the Cartesian system of co-ordinates.
This consists of three plane surfaces perpendicular to each other and rigidly
attached to a rigid body. Referred to a system of co-ordinates, the scene of any
event will be determined (for the main part) by the specification of the lengths
of the three perpendiculars or co-ordinates (x, y, z) which can be
dropped from the scene of the event to those three plane surfaces. The lengths
of these three perpendiculars can be determined by a series of manipulations
with rigid measuring-rods performed according to the rules and methods laid down
by Euclidean geometry.
In practice, the rigid surfaces which constitute the system of co-ordinates
are generally not available; furthermore, the magnitudes of the co-ordinates
are not actually determined by constructions with rigid rods, but by indirect
means. If the results of physics and astronomy are to maintain their clearness,
the physical meaning of specifications of position must always be sought in
accordance with the above considerations. 3)
We thus obtain the following result: Every description of events in space
involves the use of a rigid body to which such events have to be referred. The
resulting relationship takes for granted that the laws of Euclidean geometry
hold for "distances;" the "distance" being represented
physically by means of the convention of two marks on a rigid body.
Notes
1) Here we
have assumed that there is nothing left over i.e. that the measurement
gives a whole number. This difficulty is got over by the use of divided
measuring-rods, the introduction of which does not demand any fundamentally new
method.
2) It is not
necessary here to investigate further the significance of the expression
"coincidence in space." This conception is sufficiently obvious to
ensure that differences of opinion are scarcely likely to arise as to its
applicability in practice.
3) A
refinement and modification of these views does not become necessary until we
come to deal with the general theory of relativity, treated in the second part
of this book.
Section 3
Space and Time in Classical Mechanics
The purpose of mechanics is to describe how bodies change
their position in space with "time." I should load my conscience with
grave sins against the sacred spirit of lucidity were I to formulate the aims of
mechanics in this way, without serious reflection and detailed explanations. Let
us proceed to disclose these sins.
It is not clear what is to be understood here by "position" and
"space." I stand at the window of a railway carriage which is
travelling uniformly, and drop a stone on the embankment, without throwing it.
Then, disregarding the influence of the air resistance, I see the stone descend
in a straight line. A pedestrian who observes the misdeed from the footpath
notices that the stone falls to earth in a parabolic curve. I now ask: Do the
"positions" traversed by the stone lie "in reality" on a
straight line or on a parabola? Moreover, what is meant here by motion "in
space"? From the considerations of the previous section the answer is
self-evident. In the first place we entirely shun the vague word
"space," of which, we must honestly acknowledge, we cannot form the
slightest conception, and we replace it by "motion relative to a
practically rigid body of reference." The positions relative to the body of
reference (railway carriage or embankment) have already been defined in detail
in the preceding section. If instead of "body of reference" we
insert "system of co-ordinates," which is a useful idea for
mathematical description, we are in a position to say: The stone traverses a
straight line relative to a system of co-ordinates rigidly attached to the
carriage, but relative to a system of co-ordinates rigidly attached to the
ground (embankment) it describes a parabola. With the aid of this example it is
clearly seen that there is no such thing as an independently existing trajectory
(lit. "path-curve" 1)),
but only a trajectory relative to a particular body of reference.
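The two descriptions of the falling stone can be compared numerically. The following Python sketch rests on assumptions that are not in the text: the carriage speed, the height of the window, the value of g and the function name `stone_positions` are all chosen purely for illustration, and air resistance is disregarded as in the passage. It tabulates the same fall in co-ordinates attached to the carriage and in co-ordinates attached to the embankment.

```python
G = 9.81  # m/s^2, assumed acceleration of free fall


def stone_positions(v, h, times):
    """Positions (x, y) of the dropped stone in both systems of co-ordinates.

    v: assumed speed of the carriage along the embankment (m/s)
    h: assumed height of the window above the ground (m)
    """
    carriage = []
    embankment = []
    for t in times:
        y = h - 0.5 * G * t * t          # height is the same in both systems
        carriage.append((0.0, y))        # stone stays below the window
        embankment.append((v * t, y))    # stone also shares the carriage's motion
    return carriage, embankment


v, h = 20.0, 2.0
times = [i * 0.1 for i in range(6)]
carriage, embankment = stone_positions(v, h, times)
```

Relative to the carriage every point has the same x, so the path is a vertical straight line. Relative to the embankment the points satisfy y = h - g x^2 / (2 v^2), which is a parabola.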
In order to have a complete description of the motion, we must
specify how the body alters its position with time; i.e. for every
point on the trajectory it must be stated at what time the body is situated
there. These data must be supplemented by such a definition of time that, in
virtue of this definition, these time-values can be regarded essentially as
magnitudes (results of measurements) capable of observation. If we take our
stand on the ground of classical mechanics, we can satisfy this requirement for
our illustration in the following manner. We imagine two clocks of identical
construction; the man at the railway-carriage window is holding one of them,
and the man on the footpath the other. Each of the observers determines the
position on his own reference-body occupied by the stone at each tick of the
clock he is holding in his hand. In this connection we have not taken account of
the inaccuracy involved by the finiteness of the velocity of propagation of
light. With this and with a second difficulty prevailing here we shall have to
deal in detail later.
Notes
1) That is, a
curve along which the body moves.
Section 4
The Galileian System of Co-ordinates
As is well known, the fundamental law of the mechanics of Galilei-Newton,
which is known as the law of inertia, can be stated thus: A body
removed sufficiently far from other bodies continues in a state of rest or of
uniform motion in a straight line. [Note that
Newton did not say this. His first law is usually stated "every body
continues in its state of rest or uniform motion in a straight line except
insofar as it is compelled to change that state by external impressed forces".
PRS].
This law not only says something about the motion of the bodies, but it also
indicates the reference-bodies or systems of co-ordinates, permissible in
mechanics, which can be used in mechanical description. The visible fixed stars
are bodies for which the law of inertia certainly holds to a high degree of
approximation. Now if we use a system of co-ordinates which is rigidly
attached to the earth, then, relative to this system, every fixed star describes
a circle of immense radius in the course of an astronomical day, a result which
is opposed to the statement of the law of inertia. [i.e.
to Einstein's above statement of it PRS] So that if we adhere to this
law we must refer these motions only to systems of co-ordinates relative to which
the fixed stars do not move in a circle. A system of co-ordinates of which the
state of motion is such that the law of inertia holds relative to it is called a
"Galileian system of co-ordinates." The laws of the mechanics of
Galilei-Newton can be regarded as valid only for a Galileian system of
co-ordinates.
Section 5
The Principle of Relativity
(in the restricted sense)
In order to attain the greatest possible clearness, let us return to our
example of the railway carriage supposed to be travelling uniformly. We call its
motion a uniform translation ("uniform" because it is of constant
velocity and direction, "translation" because although the carriage
changes its position relative to the embankment yet it does not rotate in so
doing). Let us imagine a raven flying through the air in such a manner that its
motion, as observed from the embankment, is uniform and in a straight line. If
we were to observe the flying raven from the moving railway carriage, we should
find that the motion of the raven would be one of different velocity and
direction, but that it would still be uniform and in a straight line. Expressed
in an abstract manner we may say: If a mass m is
moving uniformly in a straight line with respect to a co-ordinate system K,
then it will also be moving uniformly and in a straight line relative to a
second co-ordinate system K1 provided that
the latter is executing a uniform translatory motion with respect to K.
In accordance with the discussion contained in the preceding section, it follows
that:
If K is a Galileian co-ordinate
system, then every other co-ordinate system K1 is a
Galileian one, when, in relation to K, it is in a
condition of uniform motion of translation. Relative to K1
the mechanical laws of Galilei-Newton hold good exactly as they do with respect
to K.
We advance a step farther in our generalisation when we express the tenet
thus: If, relative to K, K1
is a uniformly moving co-ordinate system devoid of rotation, then natural
phenomena run their course with respect to K1
according to exactly the same general laws as with respect to K.
This statement is called the principle of relativity (in the restricted
sense).
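The raven example of this section can be checked with a few numbers. The Python sketch below is an illustration under stated assumptions: it uses the classical rule that a position in K1 is the position in K less the distance the carriage has travelled, with the same time in both systems, and the velocities, the starting point and the function names are hypothetical. It shows that equal steps per second in K remain equal steps per second in K1, though of a different size and direction.

```python
def to_moving_system(position, system_velocity, t):
    """Classical change of co-ordinates from K to the uniformly moving K1."""
    return tuple(p - sv * t for p, sv in zip(position, system_velocity))


def displacements(points):
    """Change of position between successive observations."""
    return [tuple(b - a for a, b in zip(p, q)) for p, q in zip(points, points[1:])]


raven_start = (0.0, 50.0)        # metres, relative to the embankment K (assumed)
raven_velocity = (12.0, 3.0)     # m/s, uniform in K (assumed)
carriage_velocity = (20.0, 0.0)  # m/s, velocity of K1 relative to K (assumed)
times = [0.0, 1.0, 2.0, 3.0]

in_K = [tuple(s + u * t for s, u in zip(raven_start, raven_velocity)) for t in times]
in_K1 = [to_moving_system(p, carriage_velocity, t) for p, t in zip(in_K, times)]

steps_K = displacements(in_K)    # the same step every second
steps_K1 = displacements(in_K1)  # also the same step every second, but a different one
```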
As long as one was convinced that all natural phenomena were capable of
representation with the help of classical mechanics, there was no need to doubt
the validity of this principle of relativity. But in view of the more recent
development of electrodynamics and optics it became more and more evident that
classical mechanics affords an insufficient foundation for the physical
description of all natural phenomena. At this juncture the question of the
validity of the principle of relativity became ripe for discussion, and it did
not appear impossible that the answer to this question might be in the negative.
[It was primarily experiments which failed to show any
velocity of the earth through space (Michelson and Morley's in particular) which
led to this. PRS]
Nevertheless, there are two general facts which at the outset speak very much
in favour of the validity of the principle of relativity. Even though classical
mechanics does not supply us with a sufficiently broad basis for the theoretical
presentation of all physical phenomena, still we must grant it a considerable
measure of "truth," since it supplies us with the actual motions of
the heavenly bodies with a delicacy of detail little short of wonderful. The
principle of relativity must therefore apply with great accuracy in the domain
of mechanics. But that a principle of such broad generality should hold
with such exactness in one domain of phenomena, and yet should be invalid for
another, is a priori not very probable.
We now proceed to the second argument, to which, moreover, we shall return
later. If the principle of relativity (in the restricted sense) does not hold,
then the Galileian co-ordinate systems K, K1, K2,
etc., which are moving uniformly relative to each other, will not be equivalent
for the description of natural phenomena. In this case we should be constrained
to believe that natural laws are capable of being formulated in a particularly
simple manner, and of course only on condition that, from amongst all possible
Galileian co-ordinate systems, we should have chosen one (K0)
of a particular state of motion as our body of reference. We should then be
justified (because of its merits for the description of natural phenomena) in
calling this system "absolutely at rest," and all other Galileian
systems K "in motion."
[The possibility of the existence of such a "preferred" frame of
reference is anathema to modern science. PRS] If, for instance, our
embankment were the system K0 then our
railway carriage would be a system K, relative to which
less simple laws would hold than with respect to K0.
This diminished simplicity would be due to the fact that the carriage K
would be in motion (i.e. "really") with respect to K0.
In the general laws of nature which have been formulated with reference to K,
the magnitude and direction of the velocity of the carriage would necessarily
play a part. We should expect, for instance, that the note emitted by an
organ-pipe placed with its axis parallel to the direction of travel would be
different from that emitted if the axis of the pipe were placed perpendicular to
this direction. [As far as I can see Einstein should
say the note "could" be different, I see no justification for his
"would" be different. PRS]
Now in virtue of its motion in an orbit round the sun, our earth is
comparable with a railway carriage travelling with a velocity of about 30
kilometres per second. If the principle of relativity were not valid we should
therefore expect that the direction of motion of the earth at any moment would
enter into the laws of nature, and also that physical systems in their behaviour
would be dependent on the orientation in space with respect to the earth. For
owing to the alteration in direction of the velocity of revolution of the earth
in the course of a year, the earth cannot be at rest relative to the
hypothetical system K0 throughout the whole
year. However, the most careful observations have never revealed such
anisotropic properties in terrestrial physical space, i.e. a physical
non-equivalence of different directions. This is a very powerful argument in
favour of the principle of relativity. [This is a
crucial paragraph and deserves careful thought. Einstein begins by taking for
granted the absolute certainty of the motion of the earth in an orbit around the
sun. He then points out that the most careful observations cannot detect a trace
of this motion. This "powerful argument" forms a basis of his theory
of Relativity which concludes that it is impossible to tell whether any
astronomical body (including the earth) is actually in motion or not! PRS]
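The figure of about 30 kilometres per second quoted in this section can be reproduced by rough arithmetic. The Python sketch below assumes a circular orbit with a radius of roughly 149.6 million kilometres and a year of 365.25 days. These round values and the function name are supplied here for illustration and do not come from the text.

```python
import math

ORBIT_RADIUS_KM = 149.6e6             # assumed mean earth-sun distance
YEAR_SECONDS = 365.25 * 24 * 3600     # assumed length of the year


def orbital_speed_km_per_s(radius_km, period_s):
    """Speed on a circular orbit: circumference divided by period."""
    return 2 * math.pi * radius_km / period_s


speed = orbital_speed_km_per_s(ORBIT_RADIUS_KM, YEAR_SECONDS)
```

With these inputs the speed comes out a little under 30 km/s, in agreement with the rounded value in the text.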
Section 6
The Theorem of the
Addition of Velocities
Employed in Classical Mechanics
Let us suppose our old friend the railway carriage to be travelling along the
rails with a constant velocity v, and that a man
traverses the length of the carriage in the direction of travel with a velocity w.
How quickly or, in other words, with what velocity W
does the man advance relative to the embankment during the process? The only
possible answer seems to result from the following consideration: If the man
were to stand still for a second, he would advance relative to the embankment
through a distance v equal numerically to the velocity
of the carriage. As a consequence of his walking, however, he traverses an
additional distance w relative to the carriage, and
hence also relative to the embankment, in this second, the distance w
being numerically equal to the velocity with which he is walking. Thus in total
he covers the distance W = v + w relative to the embankment
in the second considered. We shall see later that this result, which expresses
the theorem of the addition of velocities employed in classical mechanics,
cannot be maintained; in other words, the law that we have just written down
does not hold in reality. For the time being, however, we shall assume its
correctness.
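The reasoning of this section amounts to adding two distances covered in the same second. As a minimal sketch in Python, with speeds and a function name that are assumed for illustration only, the classical theorem reads as follows.

```python
def distance_relative_to_embankment(v, w, seconds=1.0):
    """Classical reckoning of how far the man advances along the embankment.

    v: velocity of the carriage relative to the embankment
    w: velocity of the man relative to the carriage
    """
    carried = v * seconds   # distance he is carried along by the carriage
    walked = w * seconds    # additional distance covered by walking
    return carried + walked


v = 20.0   # m/s, assumed speed of the carriage
w = 1.5    # m/s, assumed walking speed
W = distance_relative_to_embankment(v, w) / 1.0   # distance in one second, i.e. v + w
```

This is only the classical rule; as the text states, it is assumed correct for the time being and is not maintained later.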
Section 7
The Apparent Incompatibility of the
Law of Propagation of Light with the
Principle of Relativity
There is hardly a simpler law in physics than that according to which light
is propagated in empty space. Every child at school knows, or believes he knows,
that this propagation takes place in straight lines with a velocity c =
300,000 km./sec. At all events we know with great exactness that this velocity
is the same for all colours, because if this were not the case, the minimum of
emission would not be observed simultaneously for different colours during the
eclipse of a fixed star by its dark neighbour. By means of similar
considerations based on observations of double stars, the Dutch astronomer De
Sitter was also able to show that the velocity of propagation of light cannot
depend on the velocity of motion of the body emitting the light. The assumption
that this velocity of propagation is dependent on the direction "in
space" is in itself improbable.
In short, let us assume that the simple law of the constancy of the velocity
of light c (in vacuum) is justifiably believed by the
child at school. Who would imagine that this simple law has plunged the
conscientiously thoughtful physicist into the greatest intellectual
difficulties? Let us consider how these difficulties arise. [A
point to note is that the child at school would almost certainly have a
different idea of what he means by "the constancy of the velocity of
light". Einstein is actually saying that however quickly, and in whatever
direction an observer is moving he will always measure the speed at which any
beam of light strikes him as 300,000 km/sec. Einstein and his
peers came to believe this because (a) experiments designed to measure
the speed of the earth around the sun could detect no difference between light
impact speeds throughout the year and (b) they were confident that the
earth does, indeed, orbit the sun. PRS]
Of course we must refer the process of the propagation of light (and indeed
every other process) to a rigid reference-body (co-ordinate system). As such a
system let us again choose our embankment. We shall imagine the air above it to
have been removed. If a ray of light be sent along the embankment, we see from
the above that the tip of the ray will be transmitted with the velocity c
relative to the embankment. Now let us suppose that our railway carriage is
again travelling along the railway lines with the velocity v,
and that its direction is the same as that of the ray of light, but its velocity
of course much less. Let us inquire about the velocity of propagation of the ray
of light relative to the carriage. It is obvious that we can here apply the
consideration of the previous section, since the ray of light plays the part of
the man walking along relatively to the carriage. The velocity W
of the man relative to the embankment is here replaced by the velocity of light
relative to the embankment. w is the required velocity
of light with respect to the carriage, and we have
w = c-v.
The velocity of propagation of a ray of light relative to the carriage thus
comes out smaller than c.
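To see the size of the effect that the classical reckoning predicts, one may put in numbers. In the Python sketch below the value of c is the one quoted in the text, while the speed of the carriage (0.03 km/sec., about 108 km per hour) and the function name are assumptions made for the sake of the example.

```python
C_KM_PER_S = 300_000.0   # velocity of light as quoted in the text
v_carriage = 0.03        # km/s, hypothetical speed of the carriage


def classical_light_speed_in_carriage(c, v):
    """Classical rule w = c - v for a ray travelling in the direction of the carriage."""
    return c - v


w = classical_light_speed_in_carriage(C_KM_PER_S, v_carriage)
```

The difference from c is tiny for any real train, but on the classical view it is not zero, and that is the source of the conflict discussed next.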
But this result comes into conflict with the principle of relativity set
forth in Section 5. For,
like every other general law of nature, the law of the transmission of light in
vacuo [in vacuum] must, according to the
principle of relativity, be the same for the railway carriage as reference-body
as when the rails are the body of reference. But, from our above consideration,
this would appear to be impossible. If every ray of light is propagated relative
to the embankment with the velocity c, then for this
reason it would appear that another law of propagation of light must necessarily
hold with respect to the carriage — a result contradictory to the principle of
relativity. [Einstein's previous description of a
general law of nature does not give the impression that the numerical value of a
velocity can be such a law. His illustration of the raven's flight suggests
exactly the opposite. PRS]
In view of this dilemma there appears to be nothing else for it than to
abandon either the principle of relativity or the simple law of the propagation
of light in vacuo. Those of you who have carefully followed the
preceding discussion are almost sure to expect that we should retain the
principle of relativity, which appeals so convincingly to the intellect because
it is so natural and simple. The law of the propagation of light in vacuo
would then have to be replaced by a more complicated law conformable to the
principle of relativity. The development of theoretical physics shows, however,
that we cannot pursue this course. The epoch-making theoretical investigations
of H. A. Lorentz on the electrodynamical and optical phenomena connected with
moving bodies show that experience in this domain leads conclusively to a theory
of electromagnetic phenomena, of which the law of the constancy of the velocity
of light in vacuo is a necessary consequence. Prominent theoretical
physicists were therefore more inclined to reject the principle of relativity,
in spite of the fact that no empirical data had been found which were
contradictory to this principle.
At this juncture the theory of relativity entered the arena. As a result of
an analysis of the physical conceptions of time and space, it became evident
that in reality there is not the least incompatibility between the principle
of relativity and the law of propagation of light, and that by
systematically holding fast to both these laws a logically rigid theory could be
arrived at. This theory has been called the special theory of relativity
to distinguish it from the extended theory, with which we shall deal later. In
the following pages we shall present the fundamental ideas of the special theory
of relativity.
Section 8
On the Idea of Time in Physics
Lightning has struck the rails on our railway embankment at two places A
and B far distant from each other. I make the
additional assertion that these two lightning flashes occurred simultaneously.
If I ask you whether there is sense in this statement, you will answer my
question with a decided "Yes." But if I now approach you with the
request to explain to me the sense of the statement more precisely, you find
after some consideration that the answer to this question is not so easy as it
appears at first sight.
After some time perhaps the following answer would occur to you: "The
significance of the statement is clear in itself and needs no further
explanation; of course it would require some consideration if I were to be
commissioned to determine by observations whether in the actual case the two
events took place simultaneously or not." I cannot be satisfied with this
answer for the following reason. Supposing that as a result of ingenious
considerations an able meteorologist were to discover that the lightning must
always strike the places A and B
simultaneously, then we should be faced with the task of testing whether or not
this theoretical result is in accordance with the reality. [Note
the use of the word "discover" here, common in "scientific"
circles, potentially confusing to anyone not in the know. A scientist
discovers something when he develops a theory or deduces a consequence of a
theory. PRS] We encounter the same
difficulty with all physical statements in which the conception
"simultaneous" plays a part. The concept does not exist for the physicist
until he has the possibility of discovering whether or not it is fulfilled in an
actual case. We thus require a definition of simultaneity such that this
definition supplies us with the method by means of which, in the present case,
he can decide by experiment whether or not both the lightning strokes occurred
simultaneously. As long as this requirement is not satisfied, I allow myself to
be deceived as a physicist (and of course the same applies if I am not a
physicist), when I imagine that I am able to attach a meaning to the statement
of simultaneity. (I would ask the reader not to proceed farther until he is
fully convinced on this point.)
After thinking the matter over for some time you then offer the following
suggestion with which to test simultaneity. By measuring along the rails, the
connecting line AB should be measured up and an
observer placed at the mid-point M of the distance AB.
This observer should be supplied with an arrangement (e.g. two mirrors
inclined at 90°) which allows him visually to observe both places A
and B at the same time. If the observer perceives the
two flashes of lightning at the same time, then they are simultaneous.
I am very pleased with this suggestion, but for all that I cannot regard the
matter as quite settled, because I feel constrained to raise the following
objection:
"Your definition would certainly be right, if only I knew
that the light by means of which the observer at M
perceives the lightning flashes travels along the length A → M
with the same velocity as along the length B → M.
But an examination of this supposition would only be
possible if we already had at our disposal the means of measuring time. It would
thus appear as though we were moving here in a logical circle."
After further consideration you cast a somewhat disdainful glance at me —
and rightly so — and you declare:
"I maintain my previous definition nevertheless, because
in reality it assumes absolutely nothing about light. There is only one demand
to be made of the definition of simultaneity, namely, that in every real case it
must supply us with an empirical decision as to whether or not the conception
that has to be defined is fulfilled. That my definition satisfies this demand is
indisputable. That light requires the same time to traverse the path A → M
as for the path B → M is in reality neither a supposition nor a
hypothesis about the physical nature of light, but a stipulation which I
can make of my own free will in order to arrive at a definition of
simultaneity."
[The importance of the last few
paragraphs can hardly be overstated. They imply the claim that a
scientist can define time (and simultaneity) in any way he chooses as long as
the definition is used consistently. There can therefore be no objection to his own
definition. In fact Einstein chose this particular
definition of simultaneity to make his analysis agree with previously
established observations (e.g. E = mc²). This definition has enormous
consequences and brings to mind Euler's warning "in our researches into the
phenomena of the visible world we are liable to weaknesses and inconsistencies
so humiliating that a Revelation was absolutely necessary to us and we ought to
avail ourselves of it with the most powerful veneration?" Referring to the
Revelation so highly venerated by Euler we see that God set the sun, moon and
stars in the firmament of heaven "to be for signs and for seasons, for days
and for years". We have the God-given definition of time in the movements
of the heavenly bodies. For many years it was universally accepted that solar
mean time referred to some meridian was the accepted measure. Since the Greenwich
meridian was the first accurately evaluated for mean time it became the world
standard. A simple addition gives time relative to any other meridian. If
Einstein had not "put words into our mouth", a more likely answer
might have been:- "if observers with clocks set accurately to GMT record
two events at the same reading of their clocks then the events are
simultaneous". Impeccable mathematics based on Einstein's bizarre
definition will lead to bizarre conclusions. Einstein recognises that he is
already in trouble from one of these. The fact that his definition results in
time being plastic makes it impossible to maintain that light moves at the same
velocity from A to M as from B to M, since velocity is defined in terms of time.
He is now forced to make it "a stipulation of his own free will". If he were
consistent he would have to admit that all his previous analyses were also
stipulations of his own free will, since they involved velocities etc.
which will no longer conform to their "classical" definitions. PRS]
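The mid-point test discussed in the preceding paragraphs can be sketched in a few lines of Python. This is only an illustration: the function names, the distances and the tolerance are hypothetical, and the single constant C encodes the stipulation that light takes the same time over A → M as over B → M.

```python
# Sketch of the mid-point test for simultaneity within one reference-body.
# Assumption (the stipulation in the text): light needs equal times for
# A -> M and B -> M, modelled here by one speed C in both directions.

C = 299_792_458.0  # velocity of light in vacuo, m/s


def arrival_time_at_midpoint(x_event, t_event, x_a, x_b):
    """Time at which light from an event at x_event reaches the mid-point M of AB."""
    x_m = (x_a + x_b) / 2.0
    return t_event + abs(x_event - x_m) / C


def simultaneous_by_midpoint_test(x_a, t_a, x_b, t_b, tol=1e-12):
    """True if the observer at M perceives both flashes at the same instant."""
    seen_a = arrival_time_at_midpoint(x_a, t_a, x_a, x_b)
    seen_b = arrival_time_at_midpoint(x_b, t_b, x_a, x_b)
    return abs(seen_a - seen_b) <= tol


# Two flashes 6 km apart, both at t = 0 on the embankment clocks.
print(simultaneous_by_midpoint_test(-3000.0, 0.0, 3000.0, 0.0))   # True
# The flash at B delayed by one microsecond.
print(simultaneous_by_midpoint_test(-3000.0, 0.0, 3000.0, 1e-6))  # False
```

The sketch makes the logical structure visible: the test returns a definite answer in every case, but only because the equality of the two light-travel times has been put in by hand.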
It is clear that this definition can be used to give an exact meaning not
only to two events, but to as many events as we care to choose, and
independently of the positions of the scenes of the events with respect to the
body of reference 1) (here
the railway embankment). We are thus led also to a definition of "time"
in physics. For this purpose we suppose that clocks of identical
construction are placed at the points A, B
and C of the railway line (co-ordinate system) and that
they are set in such a manner that the positions of their pointers are
simultaneously (in the above sense) the same. Under these conditions we
understand by the "time" of an event the reading (position of the
hands) of that one of these clocks which is in the immediate vicinity (in space)
of the event. In this manner a time-value is associated with every event which
is essentially capable of observation.
This stipulation contains a further physical hypothesis, the validity of
which will hardly be doubted without empirical evidence to the contrary. It has
been assumed that all these clocks go at the same rate if they are of
identical construction. Stated more exactly: When two clocks arranged at rest in
different places of a reference-body are set in such a manner that a particular
position of the pointers of the one clock is simultaneous (in the above
sense) with the same position of the pointers of the other clock, then
identical "settings" are always simultaneous (in the sense of the
above definition).
Footnotes
1) We suppose
further, that, when three events A, B
and C occur in different places in such a manner that A
is simultaneous with B and B
is simultaneous with C (simultaneous in the sense of
the above definition), then the criterion for the simultaneity of the pair of
events A, C is also satisfied.
This assumption is a physical hypothesis about the propagation of light:
it must certainly be fulfilled if we are to maintain the law of the constancy of
the velocity of light in vacuo.
Section 9
The Relativity of Simultaneity
Up to now our considerations have been referred to a particular body of
reference, which we have styled a "railway embankment." We suppose a
very long train travelling along the rails with the constant velocity v
and in the direction indicated in Fig. 1. People travelling in this train will
with advantage view the train as a rigid reference-body (co-ordinate system);
they regard all events in
reference to the train. Then every event which takes place along
the line also takes place at a particular point of the train. Also the
definition of simultaneity can be given relative to the train in exactly the
same way as with respect to the embankment. As a natural consequence, however,
the following question arises:
Are two events (e.g. the two strokes of lightning A
and B) which are simultaneous with reference to the
railway embankment also simultaneous relatively to the train? We
shall show directly that the answer must be in the negative.
When we say that the lightning strokes A and B
are simultaneous with respect to the embankment, we mean: the rays of light
emitted at the places A and B,
where the lightning occurs, meet each other at the mid-point M
of the length A → B
of the embankment. But the events A and B
also correspond to positions A and B
on the train. Let M1 be the mid-point of the
distance A → B
on the travelling train. Just when the flashes (as judged from the embankment)
of lightning occur, this point M1 naturally
coincides with the point M but it moves towards the
right in the diagram with the velocity v of the train.
If an observer sitting in the position M1 in
the train did not possess this velocity, then he would remain permanently at M,
and the light rays emitted by the flashes of lightning A
and B would reach him simultaneously, i.e.
they would meet just where he is situated. Now in reality (considered with
reference to the railway embankment) he is hastening towards the beam of light
coming from B, whilst he is riding on ahead of the beam
of light coming from A. Hence the observer will see the
beam of light emitted from B earlier than he will see
that emitted from A. Observers who take the railway
train as their reference-body must therefore come to the conclusion that the
lightning flash B took place earlier than the lightning
flash A. We thus arrive at the important result:
Events which are simultaneous with reference to the embankment
are not simultaneous with respect to the train, and vice versa
(relativity of simultaneity). Every reference-body (co-ordinate system) has its
own particular time; unless we are told the reference-body to which the
statement of time refers, there is no meaning in a statement of the time of an
event. [Note this is a direct consequence of his definition of
simultaneity. PRS]
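The embankment-frame part of this argument is simple arithmetic and can be sketched as follows. The function name and the numerical values are illustrative assumptions; the calculation gives only the times, judged from the embankment, at which the observer M1 meets each beam.

```python
# Embankment-frame bookkeeping for the train observer M1.
# Assumed set-up: flashes strike at x = -half_length (A) and x = +half_length (B)
# at t = 0; M1 is at x = 0 at that moment and moves towards B with speed v.

C = 299_792_458.0  # velocity of light in vacuo, m/s


def reception_times(half_length, v):
    """Return (t_from_a, t_from_b): embankment times at which M1 meets each beam."""
    t_from_b = half_length / (C + v)  # observer and beam approach each other
    t_from_a = half_length / (C - v)  # the beam from A has to catch the observer up
    return t_from_a, t_from_b


t_a, t_b = reception_times(half_length=3000.0, v=30.0)
print(t_b < t_a)   # True: the beam from B is met first
print(t_a - t_b)   # lag in seconds, roughly 2 * half_length * v / C**2
```

For v = 0 the two times coincide, which is the case of the observer who remains at M.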
Now before the advent of the theory of relativity it had always tacitly been
assumed in physics that the statement of time had an absolute significance, i.e.
that it is independent of the state of motion of the body of reference. But we
have just seen that this assumption is incompatible with the most natural
definition of simultaneity; if we discard this assumption, then the conflict
between the law of the propagation of light in vacuo and the principle
of relativity (developed in Section
7) disappears. [Einstein seems to have convinced
himself that his definition is "the most natural". Soddy found "it
reads, to say the least, somewhat curiously". It is certainly not the most
natural definition for one who holds Scriptural revelation in high esteem. PRS]
We were led to that conflict by the considerations of Section
6, which are now no longer tenable. In that section we concluded that the
man in the carriage, who traverses the distance w per
second relative to the carriage, traverses the same distance also with
respect to the embankment in each second of time. But, according to the
foregoing considerations, the time required by a particular occurrence with
respect to the carriage must not be considered equal to the duration of the same
occurrence as judged from the embankment (as reference-body). Hence it cannot be
contended that the man in walking travels the distance w
relative to the railway line in a time which is equal to one second as judged
from the embankment.
Moreover, the considerations of Section
6 are based on yet a second assumption, which, in the light of a strict
consideration, appears to be arbitrary, although it was always tacitly made even
before the introduction of the theory of relativity.
Section 10
On the Relativity of the Conception of Distance
Let us consider two particular points on the train 1)
travelling along the embankment with the velocity v,
and inquire as to their distance apart. We already know that it is necessary to
have a body of reference for the measurement of a distance, with respect to
which body the distance can be measured up. It is the simplest plan to use the
train itself as reference-body (co-ordinate system). An observer in the train
measures the interval by marking off his measuring-rod in a straight line (e.g.
along the floor of the carriage) as many times as is necessary to take him from
the one marked point to the other. Then the number which tells us how often the
rod has to be laid down is the required distance.
It is a different matter when the distance has to be judged from the railway
line. Here the following method suggests itself. If we call A1
and B1 the two points on the train whose
distance apart is required, then both of these points are moving with the
velocity v along the embankment. In the first place we
require to determine the points A and B
of the embankment which are just being passed by the two points A1
and B1 at a particular time t
— judged from the embankment. These points A and B
of the embankment can be determined by applying the definition of time given in Section
8. The distance between these points A and B
is then measured by repeated application of the measuring-rod along the
embankment.
A priori it is by no means certain that this last measurement will
supply us with the same result as the first. Thus the length of the train as
measured from the embankment may be different from that obtained by measuring in
the train itself. [In fact, Einstein's definition
guarantees it. PRS] This circumstance leads us to a second objection which must be
raised against the apparently obvious consideration of Section
6. Namely, if the man in the carriage covers the distance w
in a unit of time — measured from the train, — then this distance
— as measured from the embankment — is not necessarily also equal
to w.
Footnotes
1) e.g.
the middle of the first and of the hundredth carriage.
Section 11
The Lorentz Transformation
The results of the last three sections show that the apparent incompatibility
of the law of propagation of light with the principle of relativity (Section
7) has been derived by means of a consideration which borrowed two
unjustifiable hypotheses from classical mechanics; these are as follows:
(1) The time-interval (time) between two events is
independent of the condition of motion of the body of reference.
(2) The space-interval (distance) between two points of a
rigid body is independent of the condition of motion of the body of reference.
If we drop these hypotheses, then the dilemma of Section
7 disappears, because the theorem of the addition of velocities derived in Section
6 becomes invalid. The possibility presents itself that the law of the
propagation of light in vacuo may be compatible with the principle of
relativity, and the question arises: How have we to modify the considerations of
Section 6 in order to
remove the apparent disagreement between these two fundamental results of
experience? This question leads to a general one. In the discussion of Section
6 we have to do with places and times relative both to the train and to the
embankment. How are we to find the place and time of an event in relation to the
train, when we know the place and time of the event with respect to the railway
embankment? Is there a thinkable answer to this question of such a nature that
the law of transmission of light in vacuo does not contradict the
principle of relativity? In other words: Can we conceive of a relation between
place and time of the individual events relative to both reference-bodies, such
that every ray of light possesses the velocity of transmission c
relative to the embankment and relative to the train? This question leads to a
quite definite positive answer, and to a perfectly definite transformation law
for the space-time magnitudes of an event when changing over from one body of
reference to another.
Before we deal with this, we shall introduce the following incidental
consideration. Up to the present we have only considered events taking place
along the embankment, which had mathematically to assume the function of a
straight line. In the manner indicated in Section
2 we can imagine this reference-body supplemented laterally and in a
vertical direction by means of a framework of rods, so that an event which takes
place anywhere can be localised with reference to this framework.
Similarly, we can imagine the train travelling with the velocity v
to be continued across the whole of space, so that every event, no matter how
far off it may be, could also be localised with respect to the second framework.
Without committing any fundamental error, we can disregard the fact that in
reality these frameworks would continually interfere with each other, owing to
the impenetrability of solid bodies. In every such framework we imagine three
surfaces perpendicular to each other marked out, and designated as
"co-ordinate planes" ("co-ordinate system"). A
co-ordinate system K then corresponds to the
embankment, and a co-ordinate system K1 to the train.
An event, wherever it may have taken place, would be fixed in space with respect
to K by the three perpendiculars x,
y, z on the co-ordinate
planes, and with regard to time by a time value t.
Relative to K1, the same event
would be fixed in respect of space and time by corresponding values x1,
y1, z1, t1, which of course are not
identical with x, y, z, t. It has already been set
forth in detail how these magnitudes are to be regarded as results of physical
measurements.
Obviously our problem can be exactly formulated in the following manner. What
are the values x1, y1, z1, t1,
of an event with respect to K1, when the
magnitudes x, y, z, t, of the same event with respect
to K are given? The relations must be so chosen that
the law of the transmission of light in vacuo is satisfied for one and
the same ray of light (and of course for every ray) with respect to K
and K1. For the relative orientation in
space of the co-ordinate systems indicated in the diagram (Fig.
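2), this problem is solved by means of the equations:
$$x_1 = \frac{x - vt}{\sqrt{1 - \frac{v^2}{c^2}}}$$
$$y_1 = y$$
$$z_1 = z$$
$$t_1 = \frac{t - \frac{v}{c^2}x}{\sqrt{1 - \frac{v^2}{c^2}}}$$
This system of equations is known as the "Lorentz transformation." 1)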
[It is these relations of which Soddy remarked
"if any schoolboy were to commit such a cardinal crime in
maths as to cook his figures to get the answer right he would be held up to
obloquy to the whole school and probably spanked." His objection seems to
be that Einstein makes no admission that they come in just "to get the
answer right". Lorentz had proposed these factors within a plausible theory
in which bodies are shortened and clocks slow down for physical reasons.
Einstein's theory leads to the conclusion that time (not just the rate at which
clocks run) and space (not just the dimensions of physical objects) themselves are not
constant. Most concepts in physics (velocity, force, energy etc.) are defined in
terms of them as fundamental primitives. PRS]
If in place of the law of transmission of light we had taken as our basis the
tacit assumptions of the older mechanics as to the absolute character of times
and lengths, then instead of the above we should have obtained the following
equations:
x1 = x - vt
y1 = y
z1 = z
t1 = t
This system of equations is often termed the "Galilei
transformation." The Galilei transformation can be obtained from the
Lorentz transformation by substituting an infinitely large value for the
velocity of light c in the latter transformation.
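The limiting relation between the two transformations can be checked numerically. The sketch below assumes the standard form of the Lorentz transformation for motion along the x-axis; the function names and the sample values of x, t, v and c are hypothetical and chosen only for illustration.

```python
import math


def lorentz(x, t, v, c):
    """Lorentz transformation of (x, t) from K to K1 (relative velocity v along x)."""
    root = math.sqrt(1.0 - v**2 / c**2)
    return (x - v * t) / root, (t - v * x / c**2) / root


def galilei(x, t, v):
    """Galilei transformation of (x, t) from K to K1."""
    return x - v * t, t


x, t, v = 1000.0, 2.0, 30.0    # illustrative values: metres, seconds, m/s
for c in (1e3, 1e6, 1e12):     # hypothetical, ever larger values of c
    print(c, lorentz(x, t, v, c))
print("Galilei", galilei(x, t, v))
```

As c is made larger, the printed Lorentz values approach the Galilei values x - vt and t.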
Aided by the following illustration, we can readily see that, in accordance
with the Lorentz transformation, the law of the transmission of light in
vacuo is satisfied both for the reference-body K
and for the reference-body K1. A
light-signal is sent along the positive x-axis, and
this light-stimulus advances in accordance with the equation
x = ct,
i.e. with the velocity c.
According to the equations of the Lorentz transformation, this simple relation
between x and t involves a
relation between x1 and t1.
In point of fact, if we substitute for x the value ct
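in the first and fourth equations of the Lorentz transformation, we obtain:
$$x_1 = \frac{(c - v)t}{\sqrt{1 - \frac{v^2}{c^2}}}$$
$$t_1 = \frac{\left(1 - \frac{v}{c}\right)t}{\sqrt{1 - \frac{v^2}{c^2}}}$$
from which, by division, the expression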
x1 = ct1
immediately follows. If referred to the system K1,
the propagation of light takes place according to this equation. We thus see
that the velocity of transmission relative to the reference-body K1
is also equal to c. The same result is obtained for
rays of light advancing in any other direction whatsoever. Of course this is not
surprising, since the equations of the Lorentz transformation were derived
conformably to this point of view.
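The statement about rays in other directions can be illustrated numerically. The following sketch assumes the standard Lorentz transformation for motion along x (with y unchanged) and a ray that leaves the common origin of K and K1 at t = 0; the function names and the choice v = 0.5c are assumptions made for the example.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def lorentz(x, y, t, v):
    """Event (x, y, t) in K -> (x1, y1, t1) in K1, which moves with v along x."""
    root = math.sqrt(1.0 - (v / C) ** 2)
    return (x - v * t) / root, y, (t - v * x / C**2) / root


def ray_speed_in_k1(angle, v, t=1.0):
    """Speed in K1 of a light ray leaving the common origin at t = 0 and
    travelling in K at the given angle (radians) to the x-axis."""
    x, y = C * t * math.cos(angle), C * t * math.sin(angle)
    x1, y1, t1 = lorentz(x, y, t, v)
    return math.hypot(x1, y1) / t1


for degrees in (0, 60, 90, 180):
    # Ratio of the speed found in K1 to c for each direction of the ray.
    print(degrees, ray_speed_in_k1(math.radians(degrees), v=0.5 * C) / C)
```

Each printed ratio equals 1 up to rounding error, whatever the direction of the ray in K.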
Footnotes
1) A simple
derivation of the Lorentz transformation is given in Appendix
I.
Section 12
The Behaviour of Measuring-Rods and Clocks in Motion
Place a metre-rod in the x1-axis
of K1 in such a manner that one end (the
beginning) coincides with the point x1=0
whilst the other end (the end of the rod) coincides with the point x1 = 1.
What is the length of the metre-rod relatively to the system K?
In order to learn this, we need only ask where the beginning of the rod and the
end of the rod lie with respect to K at a particular
time t of the system K. By
means of the first equation of the Lorentz transformation the values of these
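two points at the time t = 0 can be shown to be
$$x_{\text{(beginning of rod)}} = 0 \cdot \sqrt{1 - \frac{v^2}{c^2}}$$
$$x_{\text{(end of rod)}} = 1 \cdot \sqrt{1 - \frac{v^2}{c^2}}$$
the distance between the points being $\sqrt{1 - v^2/c^2}$.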
But the metre-rod is moving with the velocity v
relative to K. It therefore follows that the length of
a rigid metre-rod moving in the direction of its length with a velocity v
is $\sqrt{1 - v^2/c^2}$ of a metre.
The rigid rod is thus shorter when in motion than when at rest,
and the more quickly it is moving, the shorter is the rod. For the velocity v=c
we should have $\sqrt{1 - v^2/c^2} = 0$,
and for still greater velocities the square-root becomes
imaginary. From this we conclude that in the theory of relativity the velocity c
plays the part of a limiting velocity, which can neither be reached nor exceeded
by any real body.
Of course this feature of the velocity c as a
limiting velocity also clearly follows from the equations of the Lorentz
transformation, for these become meaningless if we choose values of v
greater than c.
If, on the contrary, we had considered a metre-rod at rest in the x-axis
with respect to K, then we should have found that the
length of the rod as judged from K1 would
have been $\sqrt{1 - v^2/c^2}$;
this is quite in accordance with the principle of relativity
which forms the basis of our considerations.
A priori it is quite clear that we must be able to learn something
about the physical behaviour of measuring-rods and clocks from the equations of
transformation, for the magnitudes z, y, x, t, are
nothing more nor less than the results of measurements obtainable by means of
measuring-rods and clocks. If we had based our considerations on the Galileian
transformation we should not have obtained a contraction of the rod as a
consequence of its motion.
Let us now consider a seconds-clock which is permanently situated at the
origin (x1=0) of K1.
t1 = 0 and t1 = 1
are two successive ticks of this clock. The first and fourth equations of the
Lorentz transformation give for these two ticks :
t = 0
and
$$t = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
As judged from K, the clock is moving with the
velocity v; as judged from this reference-body, the
time which elapses between two strokes of the clock is not one second, but
$1/\sqrt{1 - v^2/c^2}$ seconds, i.e. a somewhat larger time. As a consequence
of its motion the clock goes more slowly than when at rest. Here also the
velocity c plays the part of an unattainable limiting
velocity.
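To give a feeling for the magnitudes involved, the two results of this section can be evaluated for a few velocities. This is a minimal sketch; the function names and the chosen fractions of c are illustrative assumptions, and the only physics used is the factor sqrt(1 - v²/c²) from the Lorentz transformation.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def root_factor(v):
    """sqrt(1 - v^2/c^2); undefined for real bodies once v reaches c."""
    if not 0.0 <= v < C:
        raise ValueError("v must satisfy 0 <= v < c")
    return math.sqrt(1.0 - (v / C) ** 2)


def moving_rod_length(rest_length, v):
    """Length, judged from K, of a rod at rest in K1 that moves lengthwise with v."""
    return rest_length * root_factor(v)


def moving_clock_period(rest_period, v):
    """Interval, judged from K, between two ticks of a clock at rest in K1."""
    return rest_period / root_factor(v)


for fraction in (0.001, 0.5, 0.9, 0.99):
    v = fraction * C
    print(fraction, moving_rod_length(1.0, v), moving_clock_period(1.0, v))
```

At one thousandth of the velocity of light both effects are below one part in a million, which is why they escape notice in everyday experience; the guard in root_factor reflects the role of c as a limiting velocity.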
Section 13
Theorem of the Addition of Velocities.
The Experiment of Fizeau
Now in practice we can move clocks and measuring-rods only with velocities
that are small compared with the velocity of light; hence we shall hardly be
able to compare the results of the previous section directly with the reality.
But, on the other hand, these results must strike you as being very singular,
and for that reason I shall now draw another conclusion from the theory, one
which can easily be derived from the foregoing considerations, and which has
been most elegantly confirmed by experiment.
In Section 6 we
derived the theorem of the addition of velocities in one direction in the form
which also results from the hypotheses of classical mechanics. This theorem can
also be deduced readily from the Galilei transformation (Section
11). In place of the man walking inside the carriage, we introduce a point
moving relatively to the co-ordinate system K1
in accordance with the equation
x1 = wt1
By means of the first and fourth equations of the Galilei transformation we
can express x1 and t1
in terms of x and t, and we
then obtain
x = (v + w)t
This equation expresses nothing else than the law of motion of the point with
reference to the system K (of the man with reference to
the embankment). We denote this velocity by the symbol W,
and we then obtain, as in Section 6,
W = v + w    (A)
But we can carry out this consideration just as well on the basis of the
theory of relativity. In the equation
x1 = wt1
we must then express x1 and t1
in terms of x and t, making
use of the first and fourth equations of the Lorentz transformation. Instead of
the equation (A) we then obtain the equation
$$W = \frac{v + w}{1 + \frac{vw}{c^2}} \qquad \text{(B)}$$
which corresponds to the theorem of addition for velocities in
one direction according to the theory of relativity. The question now arises as
to which of these two theorems is the better in accord with experience. On this
point we are enlightened by a most important experiment which the brilliant
physicist Fizeau performed more than half a century ago, and which has been
repeated since then by some of the best experimental physicists, so that there
can be no doubt about its result. The experiment is concerned with the following
question. Light travels in a motionless liquid with a particular velocity w.
How quickly does it travel in the direction of the arrow in the tube T
(see the accompanying diagram, Fig. 3) when the liquid
above mentioned is flowing through the tube with a velocity v
?
In accordance with the principle of relativity we shall certainly have to
take for granted that the propagation of light always takes place with the same
velocity w with respect to the liquid, whether
the latter is in motion with reference to other bodies or not. The velocity of
light relative to the liquid and the velocity of the latter relative to the tube
are thus known, and we require the velocity of light relative to the tube.
It is clear that we have the problem of Section 6 again before us. The tube
plays the part of the railway embankment or of the co-ordinate system K,
the liquid plays the part of the carriage or of the co-ordinate system K1,
and finally, the light plays the part of the
man walking along the carriage, or of the moving point in the
present section. If we denote the velocity of the light relative to the tube by W,
then this is given by the equation (A) or (B), according as the Galilei
transformation or the Lorentz transformation corresponds to the facts.
Experiment 1) decides in
favour of equation (B) derived from the theory of relativity, and the agreement
is, indeed, very exact. According to recent and most excellent measurements by
Zeeman, the influence of the velocity of flow v on the
propagation of light is represented by formula (B) to within one per cent.
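The difference between the two addition theorems in this experiment can be made concrete with a short calculation. The sketch below compares W = v + w (A), W = (v + w)/(1 + vw/c²) (B) and Fizeau's expression W = w + v(1 - 1/n²) with n = c/w. The refractive index (roughly that of water) and the flow velocity are illustrative assumptions, not the values of any particular apparatus.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s


def add_galilei(v, w):
    """Equation (A): W = v + w."""
    return v + w


def add_relativistic(v, w):
    """Equation (B): W = (v + w) / (1 + v*w/c^2)."""
    return (v + w) / (1.0 + v * w / C**2)


def fizeau_formula(v, n):
    """Fizeau's expression W = w + v*(1 - 1/n^2), with w = c/n."""
    return C / n + v * (1.0 - 1.0 / n**2)


n = 1.333   # refractive index of the liquid (approximately water; assumed)
v = 7.0     # flow velocity in m/s (illustrative value)
w = C / n   # velocity of light in the liquid at rest

drag_a = add_galilei(v, w) - w        # the light is carried along completely
drag_b = add_relativistic(v, w) - w   # the light is carried along only partially
drag_f = fizeau_formula(v, n) - w     # v * (1 - 1/n^2)
print(drag_a, drag_b, drag_f)
```

With these values (A) predicts that the light gains the whole flow velocity, whereas (B) and Fizeau's expression both give a gain of somewhat less than half of it and agree with each other far more closely than the one per cent quoted above.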
Nevertheless we must now draw attention to the fact that a theory of this
phenomenon was given by H. A. Lorentz long before the statement of the theory of
relativity. This theory was of a purely electrodynamical nature, and was
obtained by the use of particular hypotheses as to the electromagnetic structure
of matter. This circumstance, however, does not in the least diminish the
conclusiveness of the experiment as a crucial test in favour of the theory of
relativity, for the electrodynamics of Maxwell-Lorentz, on which the original
theory was based, in no way opposes the theory of relativity.
Rather has the
latter been developed from electrodynamics as an astoundingly simple combination
and generalisation of the hypotheses, formerly independent of each other, on
which electrodynamics was built. [Einstein admits that
Lorentz's theory
predicts the same result, yet he claims it as a conclusive proof of his
own theory. In fact Fresnel had predicted exactly this result by
considerations of the aether many years before Fizeau performed the experiment.
The experiment was taken as proof of Fresnel's theory of the partial entrainment of the
aether. Hoek derived the same expression without considering the aether half a
century before Einstein's relativity. We will see several more instances of results explained
equally well by other theories claimed to be definitive proofs of Einstein's theory
alone. In fact I know of no observation explainable by Einstein's theory which
is not explained by at least one other theory in a physically
understandable context. Einstein claims superiority on the grounds that his
theory is "an astoundingly simple combination and generalisation" of
previous ideas. But the earlier theories had attempted to give a physical
explanation of observations. Einstein presents a mathematically impeccable
generalisation, but its physics is obscure and the price to be paid is the
overthrow of the very basis of all the physics which had gone before. Whether
this is actually an advance or not is debatable. Chapter 2 of Barnes'
"Physics of the Future" contains the opinions of some outstanding
physicists who think not. PRS]
Footnotes
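1) Fizeau
found $W = w + v\left(1 - \frac{1}{n^2}\right)$, where $n = \frac{c}{w}$
is the index of refraction of the liquid. On the other hand,
owing to the smallness of $\frac{vw}{c^2}$ as compared with 1,
we can replace (B) in the first place by $W = (w + v)\left(1 - \frac{vw}{c^2}\right)$,
or to the same order of approximation by
$w + v\left(1 - \frac{1}{n^2}\right)$, which agrees with Fizeau's
result.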
Section 14
The Heuristic Value of the Theory of Relativity
Our train of thought in the foregoing pages can be epitomised in the
following manner. Experience has led to the conviction that, on the one hand,
the principle of relativity holds true and that on the other hand the velocity
of transmission of light in vacuo has to be considered equal to a
constant c. By uniting these two postulates we obtained
the law of transformation for the rectangular co-ordinates x,
y, z and the time t of the events which
constitute the processes of nature. In this connection we did not obtain the
Galilei transformation, but, differing from classical mechanics, the Lorentz
transformation.
The law of transmission of light, the acceptance of which is justified by our
actual knowledge, played an important part in this process of thought. Once in
possession of the Lorentz transformation, however, we can combine this with the
principle of relativity, and sum up the theory thus:
Every general law of nature must be so constituted that it is
transformed into a law of exactly the same form when, instead of the space-time
variables x, y, z, t of the original coordinate system K,
we introduce new space-time variables x1, y1,
z1, t1 of a co-ordinate system K1.
In this connection the relation between the ordinary and the accented magnitudes
is given by the Lorentz transformation. Or in brief : General laws of nature are
co-variant with respect to Lorentz transformations.
This is a definite mathematical condition that the theory of relativity
demands of a natural law, and in virtue of this, the theory becomes a valuable
heuristic aid in the search for general laws of nature. If a general law of
nature were to be found which did not satisfy this condition, then at least one
of the two fundamental assumptions of the theory would have been disproved. Let
us now examine what general results the latter theory has hitherto evinced.
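What "co-variant with respect to Lorentz transformations" means can be illustrated with the simplest law in the text, the law of transmission of light, x = ct. The sketch below assumes the standard form of the Lorentz transformation for motion along the x-axis and compares it with the Galilei transformation. The relative velocity of 0.6c is an arbitrary choice for the example, and the names x1, t1 follow the notation of the text.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def lorentz(x, t, v, c=C):
    """Lorentz transformation (assumed standard form) from K to K1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def galilei(x, t, v):
    """Galilei transformation from K to K1."""
    return x - v * t, t


v = 0.6 * C      # assumed velocity of K1 relative to K
t = 1.0          # one second after emission, judged from K
x = C * t        # an event on the light ray x = c*t

x1, t1 = lorentz(x, t, v)
xg, tg = galilei(x, t, v)

speed_lorentz = x1 / t1   # velocity of the same ray judged from K1
speed_galilei = xg / tg

print("Lorentz: x1/t1 =", speed_lorentz / C, "c")
print("Galilei: x1/t1 =", speed_galilei / C, "c")
```

Under the Lorentz transformation the law keeps its form (x1 = c t1); under the Galilei transformation it becomes x1 = (c - v) t1, a law of a different form.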
Section 15
General Results of the Theory
It is clear from our previous considerations that the (special) theory of
relativity has grown out of electrodynamics and optics. In these fields it has
not appreciably altered the predictions of theory, but it has considerably
simplified the theoretical structure, i.e. the derivation of laws, and
— what is incomparably more important — it has considerably reduced the
number of independent hypotheses forming the basis of theory. The special theory
of relativity has rendered the Maxwell-Lorentz theory so plausible, that the
latter would have been generally accepted by physicists even if experiment had
decided less unequivocally in its favour.
Classical mechanics required to be modified before it could come into line
with the demands of the special theory of relativity. For the main part,
however, this modification affects only the laws for rapid motions, in which the
velocities of matter v are not very small as compared
with the velocity of light. We have experience of such rapid motions only in the
case of electrons and ions; for other motions the variations from the laws of
classical mechanics are too small to make themselves evident in practice. [Note
that the analysis has been presented in the context of a railway carriage moving
along a railway embankment. We are given the impression that the analysis is for
such "real" bodies. However, Einstein notes that the predictions of the
theory have only ever been experimentally compared with hypothetical particles.
These experiments have never involved measurements of the hypothetical
particle's length by anything comparable to a measuring rod, nor its mass with
anything comparable to a balance or scale, but indirectly using considerations
involving the Maxwell-Lorentz theory. The number of confirmations of Einstein's
theory is impressive, but this confirmation comes from a very restricted area of
physics, and every one is predicted by other theories which leave the
fundamental undefinables (space and time) intact. PRS] We
shall not consider the motion of stars until we come to speak of the general
theory of relativity. In accordance with the theory of relativity the kinetic
energy of a material point of mass m is no longer given
by the well-known expression
$m\frac{v^2}{2}$,
but by the expression
$\frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}$.
This expression approaches infinity as the velocity v
approaches the velocity of light c. The velocity must
therefore always remain less than c, however great may
be the energies used to produce the acceleration. If we develop the expression
for the kinetic energy in the form of a series, we obtain
$mc^2 + m\frac{v^2}{2} + \frac{3}{8}m\frac{v^4}{c^2} + \dots$
When $\frac{v^2}{c^2}$ is small compared with unity, the
third of these terms is always small in comparison with the second,
which last is alone considered in classical mechanics. The first
term $mc^2$ does not contain the velocity, and
requires no consideration if we are only dealing with the question as to how the
energy of a point-mass depends on the velocity. We shall speak of its essential
significance later.
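The size of the successive terms can be seen from a short calculation. The sketch below compares the closed expression mc^2 / sqrt(1 - v^2/c^2) with the first three terms of its series, mc^2 + mv^2/2 + (3/8)mv^4/c^2, for a few velocities. The unit mass and the chosen values of v/c are assumptions made purely for illustration.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def energy_exact(m, v, c=C):
    """Closed expression: m*c**2 / sqrt(1 - v**2/c**2)."""
    return m * c**2 / math.sqrt(1.0 - v**2 / c**2)


def series_terms(m, v, c=C):
    """First three terms of the series expansion in v**2/c**2."""
    return (m * c**2, m * v**2 / 2.0, 3.0 / 8.0 * m * v**4 / c**2)


m = 1.0  # assumed mass, kg
results = {}
for beta in (0.01, 0.1, 0.5):          # assumed values of v/c
    v = beta * C
    first, second, third = series_terms(m, v)
    exact = energy_exact(m, v)
    rel_error = abs(exact - (first + second + third)) / exact
    results[beta] = (third / second, rel_error)
    print(f"v/c = {beta}: third/second = {third / second:.6f}, "
          f"relative error of three terms = {rel_error:.2e}")
```

The ratio of the third term to the second is (3/4) v^2/c^2, so for velocities small compared with c the classical term mv^2/2 accounts for essentially all of the velocity dependence.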
The most important result of a general character to which the special theory
of relativity has led is concerned with the conception of mass. Before the
advent of relativity, physics recognised two conservation laws of fundamental
importance, namely, the law of the conservation of energy and the law of the
conservation of mass; these two fundamental laws appeared to be quite independent
of each other. By means of the theory of relativity they have been united into
one law. We shall now briefly consider how this unification came about, and what
meaning is to be attached to it.
The principle of relativity requires that the law of the conservation of
energy should hold not only with reference to a co-ordinate system K,
but also with respect to every co-ordinate system K1
which is in a state of uniform motion of translation relative to K,
or, briefly, relative to every " Galileian " system of co-ordinates.
In contrast to classical mechanics, the Lorentz transformation is the deciding
factor in the transition from one such system to another.
By means of comparatively simple considerations we are led to draw the
following conclusion from these premises, in conjunction with the fundamental
equations of the electrodynamics of Maxwell: A body moving with the velocity v,
which absorbs 1) an amount
of energy E0 in the form of radiation
without suffering an alteration in velocity in the process, has, as a
consequence, its energy increased by an amount
$\frac{E_0}{\sqrt{1 - \frac{v^2}{c^2}}}$.
In consideration of the expression given above for the kinetic energy of the
body, the required energy of the body comes out to be
$\frac{\left(m + \frac{E_0}{c^2}\right)c^2}{\sqrt{1 - \frac{v^2}{c^2}}}$.
Thus the body has the same energy as a body of mass
$m + \frac{E_0}{c^2}$
moving with the velocity v. Hence we can
say: If a body takes up an amount of energy E0,
then its inertial mass increases by an amount
$\frac{E_0}{c^2}$;
the inertial mass of a body is not a constant but varies
according to the change in the energy of the body. The inertial mass of a system
of bodies can even be regarded as a measure of its energy. The law of the
conservation of the mass of a system becomes identical with the law of the
conservation of energy, and is only valid provided that the system neither takes
up nor sends out energy. Writing the expression for the energy in the form
$\frac{mc^2 + E_0}{\sqrt{1 - \frac{v^2}{c^2}}}$,
we see that the term $mc^2$, which has
hitherto attracted our attention, is nothing else than the energy possessed by
the body 2) before it
absorbed the energy E0.
A direct comparison of this relation with experiment is not possible at the
present time (1920; see Note, p. 48), owing to the
fact that the changes in energy E0 to which
we can subject a system are not large enough to make themselves perceptible as a
change in the inertial mass of the system.
$\frac{E_0}{c^2}$ is too small in comparison with the mass m,
which was present before the alteration of the energy. It is owing to this
circumstance that classical mechanics was able to establish successfully the
conservation of mass as a law of independent validity.
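A rough calculation shows how small the mass change E0/c^2 is for an everyday energy change. The example below is a sketch under stated assumptions: the body is one kilogram of water heated by 100 degrees, and the specific heat of about 4186 J/(kg K) is a textbook round figure, not a value taken from the text.

```python
C = 299_792_458.0  # velocity of light in vacuo, m/s


def mass_increase(e0, c=C):
    """Increase of inertial mass when a body takes up energy e0: e0 / c**2."""
    return e0 / c**2


mass = 1.0                   # assumed mass of water, kg
specific_heat = 4186.0       # assumed specific heat of water, J/(kg K)
temperature_rise = 100.0     # assumed heating, K

e0 = mass * specific_heat * temperature_rise   # energy taken up, J
dm = mass_increase(e0)

print("energy taken up E0:", e0, "J")
print("mass increase E0/c^2:", dm, "kg")
print("fraction of original mass:", dm / mass)
```

The increase comes out at a few parts in 10^12 of the original mass, far below anything a balance could register.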
Let me add a final remark of a fundamental nature. The success of the
Faraday-Maxwell interpretation of electromagnetic action at a distance resulted
in physicists becoming convinced that there are no such things as instantaneous
actions at a distance (not involving an intermediary medium) of the type of
Newton's law of gravitation. According to the theory of relativity, action at a
distance with the velocity of light always takes the place of instantaneous
action at a distance or of action at a distance with an infinite velocity of
transmission. This is connected with the fact that the velocity c
plays a fundamental role in this theory. In Part II we shall see in what way
this result becomes modified in the general theory of relativity.
Footnotes
1) E0
is the energy taken up, as judged from a co-ordinate system moving with the
body.
2) As judged
from a co-ordinate system moving with the body.
[Note] The
equation $E = mc^2$ has been thoroughly proved
time and again since this time.
Section 16
Experience and the Special Theory of Relativity
To what extent is the special theory of relativity supported by experience ?
This question is not easily answered for the reason already mentioned in
connection with the fundamental experiment of Fizeau. The special theory of
relativity has crystallised out from the Maxwell-Lorentz theory of
electromagnetic phenomena. Thus all facts of experience which support the
electromagnetic theory also support the theory of relativity. As being of
particular importance, I mention here the fact that the theory of relativity
enables us to predict the effects produced on the light reaching us from the
fixed stars. These results are obtained in an exceedingly simple manner, and the
effects indicated, which are due to the relative motion of the earth with
reference to those fixed stars are found to be in accord with experience. We
refer to the yearly movement of the apparent position of the fixed stars
resulting from the motion of the earth round the sun (aberration), and to the
influence of the radial components of the relative motions of the fixed stars
with respect to the earth on the colour of the light reaching us from them. The
latter effect manifests itself in a slight displacement of the spectral lines of
the light transmitted to us from a fixed star, as compared with the position of
the same spectral lines when they are produced by a terrestrial source of light
(Doppler principle). The experimental arguments in favour of the Maxwell-Lorentz
theory, which are at the same time arguments in favour of the theory of
relativity, are too numerous to be set forth here. In reality they limit the
theoretical possibilities to such an extent, that no other theory than that of
Maxwell and Lorentz has been able to hold its own when tested by experience.
[Maxwell and Lorentz (as well as Fresnel and
Fizeau) founded their theories on the aether, without which nothing in
electromagnetism seems to make sense. Einstein denies the existence of the
aether six paragraphs farther on - "there is no such thing as a " specially favoured
" (unique) co-ordinate system to occasion the introduction of the æther-idea
....". A few years later he had to admit that his General Theory is
"unthinkable without the aether" (see "Sidelights on
Relativity"). PRS]
But there are two classes of experimental facts hitherto obtained which can
be represented in the Maxwell-Lorentz theory only by the introduction of an
auxiliary hypothesis, which in itself — i.e. without making use of
the theory of relativity — appears extraneous. [Ptolemy's
method - as a purely mathematical model for predicting planetary positions -
could be presented as superior to Newton's because it does not require the
additional hypothesis of gravity. In addition, Ptolemy's method (recast in
modern notation as Fourier analysis) is far simpler (Newton's analysis needs
many auxiliary hypotheses to give comparable accuracy). I suspect that few would
claim that Ptolemy's model is superior physics though. It does not represent a
quest for understanding how nature actually works. Einstein is presenting a
largely physics-free mathematical model which gives many useful answers at the
cost of denying classical physics and common sense. Attempts to understand the
physics (auxiliary hypotheses) appear extraneous. PRS]
It is known that cathode rays and the so-called β-rays emitted by
radioactive substances consist of negatively electrified particles (electrons)
of very small inertia and large velocity. By examining the deflection of these
rays under the influence of electric and magnetic fields, we can study the law
of motion of these particles very exactly.
In the theoretical treatment of these electrons, we are faced with the
difficulty that electrodynamic theory of itself is unable to give an account of
their nature. For since electrical masses of one sign repel each other, the
negative electrical masses constituting the electron would necessarily be
scattered under the influence of their mutual repulsions, unless there are
forces of another kind operating between them, the nature of which has hitherto
remained obscure to us.1) If
we now assume that the relative distances between the electrical masses
constituting the electron remain unchanged during the motion of the electron
(rigid connection in the sense of classical mechanics), we arrive at a law of
motion of the electron which does not agree with experience. Guided by purely
formal points of view, H. A. Lorentz was the first to introduce the hypothesis
that the form of the electron experiences a contraction in the direction of
motion in consequence of that motion, the contracted length being proportional
to the expression
$\sqrt{1 - \frac{v^2}{c^2}}$.
This hypothesis, which is not justifiable by any
electrodynamical facts, supplies us then with that particular law of motion
which has been confirmed with great precision in recent years.
The theory of relativity leads to the same law of motion, without requiring
any special hypothesis whatsoever as to the structure and the behaviour of the
electron. We arrived at a similar conclusion in Section
13 in connection with the experiment of Fizeau, the result of which is
foretold by the theory of relativity without the necessity of drawing on
hypotheses as to the physical nature of the liquid.
The second class of facts to which we have alluded has reference to the
question whether or not the motion of the earth in space can be made perceptible
in terrestrial experiments. We have already remarked in Section
5 that all attempts of this nature led to a negative result. Before the
theory of relativity was put forward, it was difficult to become reconciled to
this negative result, for reasons now to be discussed. The inherited prejudices
about time and space did not allow any doubt to arise as to the prime importance
of the Galileian transformation for changing over from one body of reference to
another. Now assuming that the Maxwell-Lorentz equations hold for a
reference-body K, we then find that they do not hold
for a reference-body K1 moving uniformly
with respect to K, if we assume that the relations of
the Galileian transformation exist between the co-ordinates of K
and K1. It thus appears that, of all
Galileian co-ordinate systems, one (K) corresponding to
a particular state of motion is physically unique. This result was interpreted
physically by regarding K as at rest with respect to a
hypothetical æther of space. On the other hand, all coordinate systems K1
moving relatively to K were to be regarded as in motion
with respect to the æther. To this motion of K1
against the æther ("æther-drift " relative to K1)
were attributed the more complicated laws which were supposed to hold relative
to K1. Strictly speaking, such an æther-drift
ought also to be assumed relative to the earth, and for a long time the efforts
of physicists were devoted to attempts to detect the existence of an æther-drift
at the earth's surface.
In one of the most notable of these attempts Michelson devised a method which
appears as though it must be decisive. Imagine two mirrors so arranged on a
rigid body that the reflecting surfaces face each other. A ray of light requires
a perfectly definite time T to pass from one mirror to
the other and back again, if the whole system be at rest with respect to the
æther. It is found by calculation, however, that a slightly different time T1
is required for this process, if the body, together with the mirrors, be moving
relatively to the æther. And yet another point: it is shown by calculation that
for a given velocity v with reference to the æther,
this time T1 is different when the body is
moving perpendicularly to the planes of the mirrors from that resulting when the
motion is parallel to these planes. Although the estimated difference between
these two times is exceedingly small, Michelson and Morley performed an
experiment involving interference in which this difference should have been
clearly detectable. But the experiment gave a negative result — a fact very
perplexing to physicists. [Perplexing because they were
convinced that the earth does, indeed, have an absolute velocity through space
which the apparatus could not detect. PRS]. Lorentz and FitzGerald rescued the theory from this
difficulty by assuming that the motion of the body relative to the æther
produces a contraction of the body in the direction of motion, the amount of
contraction being just sufficient to compensate for the difference in time
mentioned above. Comparison with the discussion in Section
11 shows that also from the standpoint of the theory of relativity this
solution of the difficulty was the right one. But on the basis of the theory of
relativity the method of interpretation is incomparably more satisfactory.
According to this theory there is no such thing as a " specially favoured
" (unique) co-ordinate system to occasion the introduction of the æther-idea,
and hence there can be no æther-drift, nor any experiment with which to
demonstrate it. Here the contraction of moving bodies follows from the two
fundamental principles of the theory, without the introduction of particular
hypotheses ; and as the prime factor involved in this contraction we find, not
the motion in itself, to which we cannot attach any meaning, but the motion with
respect to the body of reference chosen in the particular case in point. Thus
for a co-ordinate system moving with the earth the mirror system of Michelson
and Morley is not shortened, but it is shortened for a co-ordinate system which
is at rest relatively to the sun. [Is it then actually
shortened or not? Is it only apparently shortened for someone at rest relatively
to the sun? Is it only apparently not shortened for someone moving with the
earth? Or is it both shortened and not shortened at the same time? Is this a
theory about appearances or reality? Or mathematics without a connection
to the real world? PRS]
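The calculation referred to in the Michelson discussion above can be sketched as follows. The code uses the classical æther-theory travel times for light going out and back between two mirrors: along the direction of motion, T = l/(c - v) + l/(c + v), and across it, T = 2l / sqrt(c^2 - v^2). These formulas are the standard textbook ones and are assumed here, since the text does not state them. The mirror separation of 11 m and the velocity of 30 km/s (roughly the earth's orbital velocity) are likewise illustrative assumptions.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def time_along_motion(length, v, c=C):
    """Out-and-back time when the body moves perpendicularly to the mirror planes."""
    return length / (c - v) + length / (c + v)


def time_across_motion(length, v, c=C):
    """Out-and-back time when the body moves parallel to the mirror planes."""
    return 2.0 * length / math.sqrt(c**2 - v**2)


arm = 11.0        # assumed distance between the mirrors, m
v = 30_000.0      # assumed velocity relative to the aether, m/s

t_along = time_along_motion(arm, v)
t_across = time_across_motion(arm, v)
difference = t_along - t_across

# Lorentz-FitzGerald hypothesis: the length along the motion shrinks
# by the factor sqrt(1 - v**2/c**2).
contracted = arm * math.sqrt(1.0 - v**2 / C**2)
t_along_contracted = time_along_motion(contracted, v)

print("expected time difference:", difference, "s")
print("with contraction, along: ", t_along_contracted, "s")
print("across (unchanged):      ", t_across, "s")
```

The expected difference is of the order of 10^-16 s, and the assumed contraction removes it exactly, which is why both the Lorentz-FitzGerald hypothesis and the theory of relativity are compatible with the negative result.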
Footnotes
1) The
general theory of relativity renders it likely that the electrical masses of an
electron are held together by gravitational forces.
Section 17
Minkowski's Four-Dimensional Space
The non-mathematician is seized by a mysterious shuddering when he hears of
"four-dimensional" things, by a feeling not unlike that awakened by
thoughts of the occult. And yet there is no more common-place statement than
that the world in which we live is a four-dimensional space-time continuum.
Space is a three-dimensional continuum. By this we mean that it is possible
to describe the position of a point (at rest) by means of three numbers (co-ordinates)
x, y, z, and that there is an indefinite number of
points in the neighbourhood of this one, the position of which can be described
by co-ordinates such as x1, y1, z1,
which may be as near as we choose to the respective values of the co-ordinates x,
y, z, of the first point. In virtue of the latter property we speak of a
" continuum," and owing to the fact that there are three co-ordinates
we speak of it as being " three-dimensional."
Similarly, the world of physical phenomena which was briefly called "
world " by Minkowski is naturally four dimensional in the space-time sense.
For it is composed of individual events, each of which is described by four
numbers, namely, three space co-ordinates x, y, z, and
a time co-ordinate, the time value t. The
"world" is in this sense also a continuum; for to every event there are as
many "neighbouring" events (realised or at least thinkable) as we care
to choose, the co-ordinates x1, y1, z1,
t1 of which differ by an indefinitely small amount from those
of the event x, y, z, t originally considered. That we
have not been accustomed to regard the world in this sense as a four-dimensional
continuum is due to the fact that in physics, before the advent of the theory of
relativity, time played a different and more independent role, as compared with
the space coordinates. It is for this reason that we have been in the habit of
treating time as an independent continuum. As a matter of fact, according to
classical mechanics, time is absolute, i.e. it is independent of the
position and the condition of motion of the system of co-ordinates. We see this
expressed in the last equation of the Galileian transformation (t1 = t).
The four-dimensional mode of consideration of the "world" is
natural on the theory of relativity, since according to this theory time is
robbed of its independence. This is shown by the fourth equation of the Lorentz
transformation:
$t_1 = \frac{t - \frac{v}{c^2}x}{\sqrt{1 - \frac{v^2}{c^2}}}$
Moreover, according to this equation the time difference Δt1
of two events with respect to K1 does not in
general vanish, even when the time difference Δt
of the same events with reference to K vanishes. Pure
" space-distance " of two events with respect to K
results in " time-distance " of the same events with respect to K1.
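A small numerical example makes this concrete. The sketch below assumes the fourth equation of the Lorentz transformation in its standard form, applied to differences: Δt1 = (Δt - v Δx / c^2) / sqrt(1 - v^2/c^2). The two events are taken to be simultaneous in K and one light-second apart, and K1 is given a velocity of 0.6c; all of these values are hypothetical.

```python
import math

C = 299_792_458.0  # velocity of light in vacuo, m/s


def time_difference_in_k1(dt, dx, v, c=C):
    """Fourth Lorentz equation (assumed standard form) applied to differences."""
    return (dt - v * dx / c**2) / math.sqrt(1.0 - v**2 / c**2)


dt = 0.0          # the two events are simultaneous with respect to K
dx = C * 1.0      # assumed space-distance in K: one light-second, in metres
v = 0.6 * C       # assumed velocity of K1 relative to K

dt1 = time_difference_in_k1(dt, dx, v)
print("time difference in K: ", dt, "s")
print("time difference in K1:", dt1, "s")
```

With the Galileian equation t1 = t the second figure would be zero as well. With the Lorentz equation it is not, although the events are simultaneous in K.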
But the discovery of Minkowski, which was of importance for the formal
development of the theory of relativity, does not lie here. It is to be found
rather in the fact of his recognition that the four-dimensional space-time
continuum of the theory of relativity, in its most essential formal properties,
shows a pronounced relationship to the three-dimensional continuum of Euclidean
geometrical space.1) In
order to give due prominence to this relationship, however, we must replace the
usual time co-ordinate t by an imaginary magnitude
$\sqrt{-1}\,ct$
proportional to it. Under these conditions, the natural laws satisfying the
demands of the (special) theory of relativity assume mathematical forms, in
which the time co-ordinate plays exactly the same role as the three space
co-ordinates. Formally, these four co-ordinates correspond exactly to the three
space co-ordinates in Euclidean geometry. It must be clear even to the
non-mathematician that, as a consequence of this purely formal addition to our
knowledge, the theory perforce gained clearness in no mean measure.
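The formal relationship to Euclidean geometry can be illustrated by a short calculation. If the time co-ordinate is replaced by the imaginary magnitude x4 = sqrt(-1) c t, then the sum of squares x^2 + y^2 + z^2 + x4^2 has the same form as the squared distance in Euclidean space, and it keeps its value under a Lorentz transformation. The sketch below checks this for one event. It works in units in which c = 1, and the event co-ordinates and the velocity are arbitrary assumptions for the example.

```python
import math


def lorentz(x, t, v, c=1.0):
    """Lorentz transformation along x (assumed standard form), K to K1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def sum_of_squares(x, y, z, t, c=1.0):
    """x**2 + y**2 + z**2 + x4**2 with the imaginary co-ordinate x4 = i*c*t."""
    x4 = 1j * c * t
    total = x**2 + y**2 + z**2 + x4**2
    return total.real   # the imaginary part is zero for real x, y, z, t


# Assumed event in K (units with c = 1) and assumed velocity of K1
x, y, z, t = 3.0, 1.0, 2.0, 5.0
v = 0.6

x1, t1 = lorentz(x, t, v)          # y and z are unchanged by this transformation
before = sum_of_squares(x, y, z, t)
after = sum_of_squares(x1, y, z, t1)

print("in K: ", before)
print("in K1:", after)
```

The two values agree, just as the squared distance between two points in Euclidean space is unchanged when the co-ordinate axes are rotated. The sum is negative here because the imaginary co-ordinate contributes -(ct)^2.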
These inadequate remarks can give the reader only a vague notion of the
important idea contributed by Minkowski. Without it the general theory of
relativity, of which the fundamental ideas are developed in the following pages,
would perhaps have got no farther than its long clothes. Minkowski's work is
doubtless difficult of access to anyone inexperienced in mathematics, but since
it is not necessary to have a very exact grasp of this work in order to
understand the fundamental ideas of either the special or the general theory of
relativity, I shall leave it here at present, and revert to it only towards the
end of Part 2.
[Minkowski's idea was a great step towards accepting
mathematics as reality in modern physics - and considering reasoning and common
sense as of little value. His construction is, undeniably, elegant,
mathematically "beautiful" and very powerful. But whether treating
"imaginary time" as behaving just like length, has anything to do with
a valid model of the world we actually live in needs serious
consideration. PRS]
Footnotes
1) Cf. the
somewhat more detailed discussion in Appendix
II.
======================================================================
Part II The General Theory of Relativity