How Einstein Uncovered the Path a Particle Traces Through Spacetime
In this physics mini lesson, we're going to continue our discussion of the principle of least action, following up on the lessons about the action in Newtonian mechanics and in special relativity. This time, we'll talk about the action for a particle in general relativity, Einstein's theory of gravity. We'll write down the action for a particle traveling through spacetime, and see how the particle is forced to traverse a very special kind of curve called a geodesic.
The basic idea of Einstein's theory is that a massive object like a star warps the geometry of spacetime around it. Then according to Einstein, something like a planet traveling along nearby doesn't really experience a gravitational force at all, it just keeps moving along the straightest and shortest path that it can through this curved geometry. And that's what a geodesic is: the straightest and shortest possible path through a curved space.
In special relativity, we learned how, as a particle travels around, it traces out a path through spacetime called the worldline of the particle. And then we set the action to be proportional to the length of the worldline, $S = - mc \int \sqrt{-\mathrm{d}s^2}$. Then by minimizing the action, we saw that a free particle travels along the shortest and straightest path that it can through spacetime, which of course is just a straight line. More precisely, the particle follows the path that maximizes its proper time, which is the time that's ticked off on a watch strapped to the particle.
$\mathrm{d}s^2 = - c^2 \mathrm{d}t^2 + \mathrm{d}x^2$ here was the Minkowski metric—it's what tells us how to measure distances in spacetime. It's that minus sign in front of the time term that makes the geometry of spacetime so bizarre compared to ordinary space like we're used to.
This Minkowski metric tells us about the geometry of flat spacetime. So if we had a particle soaring through empty outer space, far away from any stars or other objects, it would travel along a straight line through this flat spacetime. But about a decade after publishing his special theory of relativity, Einstein came back for the one-two punch and generalized his geometric framework for the universe by explaining the geometric origins of gravity. It's called general relativity, and it's probably the most beautiful physical theory that humans have ever written down.
Gravity is different than the other forces that we encounter. From Newton's law $F = ma$, we expect that the acceleration of a particle will in general depend on its mass $m$. But as you likely learned in the first week or two of your first physics class, a falling bowling ball drops at the same rate as a falling penny, despite the orders-of-magnitude difference in their masses.
Gravity is therefore universal; it affects all particles in the same way regardless of their mass. Einstein reasoned that we therefore shouldn't think of gravity as a force at all, but as a feature of the background spacetime on which particles move, and which consequently affects all particles in the same way.
A particle in a gravitational field isn't being accelerated at all, it's just traveling along on its merry way—it's the spacetime around the particle that has changed.
This is what led Einstein to the idea that gravity could be attributed to the shape of spacetime. Like I mentioned at the top, the gist is that the presence of a massive object like a star warps the spacetime around it, deforming it from the flat, Minkowski spacetime of special relativity into a curved spacetime. Then a particle (or planet) passing nearby still does its best to keep traveling along the straightest and shortest line that it can, but now it's tracing out a path in the curved geometry. These paths are the geodesics.
In particular, the action for a particle in general relativity is still going to be given by the same formula we wrote down before: the length of the worldline. Only now we need to replace the flat-space Minkowski metric $\mathrm{d}s^2$ of special relativity with the curved metric, and then applying the principle of least action produces the geodesic equation.
I'm going to tell you about how all this works in a little more detail. This is a pretty advanced subject though—physics students usually take their first general relativity class at the end of college or the beginning of grad school. But the ideas are so beautiful that I think it's definitely worth exploring a bit even if you're more of a beginner. So if you are a beginner, don't sweat the details of the equations too much—and definitely don't be scared away by them. If you keep studying physics then they'll make sense in time. For now I hope you'll at least come away with an appreciation for a few of the big ideas of general relativity, and the transformative way that Einstein reshaped the way we look at the universe.
Since general relativity is, like the name implies, a generalization of special relativity, let's start by reviewing what we learned about the action for a particle in special relativity last time. And we'll also introduce some new notation that will make the generalization from special relativity to general relativity more straightforward.
When we went to compute the length of the worldline of a particle in special relativity, we were confronted with the fact that the Minkowski metric $\mathrm{d}s^2$ is negative along the worldline of a massive particle. So instead of using $\mathrm{d}s = \sqrt{\mathrm{d}s^2}$ to measure the length of the worldline, we flipped the sign first, $\sqrt{-\mathrm{d}s^2}$. Then the length of the worldline is
$$\int \sqrt{-\mathrm{d}s^2} = \int \sqrt{c^2 \mathrm{d}t^2 - \mathrm{d}x^2} = c \int \mathrm{d}t \sqrt{1 - \dot x^2/c^2},$$
where $\dot x = \mathrm{d}x/\mathrm{d}t$. The integral on the right—i.e. the length of the worldline divided by $c$—is called the proper time $\tau$ of the particle. It's the time that's ticked off on the particle's watch as it moves through spacetime.
The length of the worldline is maximized along a straight line through spacetime—that it's a maximum instead of a minimum is one of those peculiar features of the Minkowski metric. That's why in the twin paradox, the twin who stays home on Earth winds up older than the twin who flies around outer space in a rocket ship before coming home. The worldline for the twin sitting at home is a straight line, and so the most time has elapsed on their watch. The twin in the rocket ship followed a curvy worldline through spacetime. So even though they begin and end at the same event, less proper time has elapsed on the rocket ship twin's watch, and when they get home they're younger.
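To make the twin paradox concrete, here is a minimal numerical sketch of the proper time integral $\tau = \int \mathrm{d}t \sqrt{1 - \dot x^2/c^2}$. The trip itself is an assumption made up for illustration: units with $c = 1$, a reunion after 10 years of Earth time, and a rocket that flies out at $0.6c$ and straight back at $0.6c$. The function name `proper_time` is likewise just a label chosen here.

```python
import math

def proper_time(velocity, t_start, t_end, steps=10_000, c=1.0):
    """Midpoint-rule estimate of tau = integral of sqrt(1 - v(t)^2 / c^2) dt."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_start + (i + 0.5) * dt
        v = velocity(t_mid)
        total += math.sqrt(1.0 - (v / c) ** 2) * dt
    return total

T = 10.0  # coordinate time between departure and reunion, in years (c = 1)

def stay_home(t):
    # the twin on Earth never moves in this frame
    return 0.0

def rocket(t):
    # hypothetical trip: outbound at 0.6c for the first half, inbound at 0.6c after
    return 0.6 if t < T / 2 else -0.6

tau_home = proper_time(stay_home, 0.0, T)
tau_rocket = proper_time(rocket, 0.0, T)
print(tau_home, tau_rocket)
```

With these made-up numbers the stay-at-home twin ages about 10 years and the rocket twin about 8, since $\sqrt{1 - 0.6^2} = 0.8$. The straight worldline wins.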
That led us to identify the action for a particle in special relativity with the length of the worldline, up to some factors:
$$S = - mc \int \sqrt{-\mathrm{d}s^2}.$$
The $mc$ has to be there to get the units right. And the minus sign is there because we want the action to be minimized, whereas the proper time along a straight line is maximized.
Now we want to extend this to general relativity. In fact, we don't have to change our action formula at all. The particle is still going to follow the straightest and "shortest" path through spacetime that it can—where again "shortest" really means the maximum proper time. The difference is that the Minkowski metric that describes flat spacetime gets replaced with the curved metric of a spacetime that's been warped by the presence of something like a star.
To describe a curved metric, it's convenient to introduce some new notation. Let's write the spacetime coordinates of the particle as $x^\mu$ ($\mu$ is the Greek letter "mu"). So $x^1$ will stand for the $x$ component, $x^2$ will stand for the $y$ component, and $x^3$ will stand for the $z$ component. (We've mostly been ignoring the $y$ and $z$ components so far to keep things simple.) As for the time component, we'll write that as $x^0 = ct$. So our spacetime coordinates are
$$x^\mu = \begin{pmatrix} x^0\\x^1\\x^2\\x^3\end{pmatrix}= \begin{pmatrix} ct\\x\\y\\z \end{pmatrix},\quad \mu = 0,1,2,3.$$
Note that those superscripts are labels, not exponents. Likewise, we can write the displacement vector as
$$\mathrm{d}x^\mu = \begin{pmatrix} c \mathrm{d}t\\ \mathrm{d}x\\ \mathrm{d}y\\ \mathrm{d}z\end{pmatrix}.$$
Next let's define a $4\times 4$ matrix with components $\eta_{\mu\nu}$ ($\eta$ is the Greek letter "eta" and $\nu$ is the Greek letter "nu") by:
$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\0 & 0 & 0 & 1 \end{pmatrix},\quad \mu,\nu = 0,1,2,3.$$
So in other words $\eta_{00} = -1$, and $\eta_{11}=\eta_{22}=\eta_{33} = 1$, and all the other components of the matrix are zero. Note again that we're counting the components from 0 here instead of from 1; it's just a convention to call the time direction the 0th component.
Now we can write the Minkowski metric in a nice and compact form
$$\begin{align} \mathrm{d}s^2 =& \sum_{\mu,\nu = 0}^3 \eta_{\mu\nu}\mathrm{d}x^\mu \mathrm{d}x^\nu \notag\\ =&\eta_{00}(\mathrm{d}x^0 )^2 + \eta_{11}(\mathrm{d}x^1)^2 +\eta_{22}(\mathrm{d}x^2)^2 +\eta_{33} (\mathrm{d}x^3)^2\notag\\ =& -(c\mathrm{d}t)^2 + (\mathrm{d}x)^2 + (\mathrm{d}y)^2 + (\mathrm{d}z)^2.\notag \end{align}$$
All this notation might seem like overkill, since the Minkowski metric isn't all that complicated to begin with. But it's going to be very convenient for the generalization to curved spacetime in a minute.
With our new notation, we can write the length of the worldline like this:
$$\int \sqrt{-\mathrm{d} s^2} = \int \sqrt{- \eta_{\mu\nu}\mathrm{d} x^\mu \mathrm{d} x^\nu}.$$
Notice that I didn't write the summation symbol $\sum_{\mu,\nu=0}^3$ here. Sums like this appear so often in relativity that it's very convenient to just declare the convention that any time the same index shows up twice in a term, we sum over it. (This is known as the Einstein summation convention.) So $A_\mu B^\mu$ is shorthand for $\sum_{\mu=0}^3 A_\mu B^\mu$, and likewise $\eta_{\mu\nu}\mathrm{d}x^\mu \mathrm{d}x^\nu$ stands for $\sum_{\mu,\nu=0}^3\eta_{\mu\nu}\mathrm{d}x^\mu \mathrm{d}x^\nu.$
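If it helps to see the double sum spelled out, here is a small sketch that evaluates $\eta_{\mu\nu}\mathrm{d}x^\mu \mathrm{d}x^\nu$ with two explicit loops. The displacement is an arbitrary example in units where $c = 1$, and the names `ETA` and `interval_squared` are just labels picked for this illustration.

```python
import math

# Minkowski metric eta_{mu nu}; rows and columns are indexed 0, 1, 2, 3
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def interval_squared(dx, metric=ETA):
    """ds^2 = sum over mu and nu of metric[mu][nu] * dx[mu] * dx[nu]."""
    return sum(
        metric[mu][nu] * dx[mu] * dx[nu]
        for mu in range(4)
        for nu in range(4)
    )

# example displacement (c dt, dx, dy, dz) with c = 1
dx = [5.0, 3.0, 0.0, 0.0]
ds2 = interval_squared(dx)
print(ds2)              # -16.0
print(math.sqrt(-ds2))  # 4.0
```

The result is negative, as it should be along the worldline of a massive particle, and $\sqrt{-\mathrm{d}s^2}$ gives the length of that little piece of worldline.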
Now remember, we're evaluating this integral along the worldline that the particle traces out through spacetime. We can specify the worldline by giving its coordinates $x^\mu(\lambda)$ as a function of some parameter $\lambda$. The particular parameter you pick doesn't matter—you can use any $\lambda$ that you like. For example, you might pick $\lambda = t$ to coincide with the time in the coordinate system that you've set up. Or you might set $\lambda = \tau$ equal to the proper time on the particle's watch.
Let's multiply and divide the integrand by $\mathrm{d}\lambda$ to get a more standard looking integral:
$$\int \sqrt{-\mathrm{d}s^2} = \int \mathrm{d}\lambda\sqrt{- \eta_{\mu\nu} \frac{\mathrm{d} x^\mu}{\mathrm{d} \lambda } \frac{\mathrm{d} x^\nu}{\mathrm{d} \lambda }}.$$
For example, if we pick $\lambda = t$ here, then we get
$$\begin{align} \int \mathrm{d}t\sqrt{- \eta_{\mu\nu} \frac{\mathrm{d} x^\mu}{\mathrm{d} t} \frac{\mathrm{d} x^\nu}{\mathrm{d} t}} =&\int \mathrm{d}t \sqrt{ -\eta_{00} \left(\frac{\mathrm{d} (ct) }{\mathrm{d} t }\right)^2 - \eta_{11} \left(\frac{\mathrm{d} x }{\mathrm{d} t }\right)^2 }\notag\\ =& \int \mathrm{d}t \sqrt{c^2 - \left(\frac{\mathrm{d} x }{\mathrm{d} t }\right)^2}\notag\\ =& c \int \mathrm{d}t \sqrt{1 - \dot x^2/c^2},\notag \end{align}$$
just like before. (I'm again dropping the $y$ and $z$ directions for simplicity here, but in general they contribute as well.)
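The claim that the choice of $\lambda$ doesn't matter can be checked numerically. The sketch below is only an illustration under assumptions of my own choosing: units with $c = 1$, a made-up worldline $x = t^2/4$ for $0 \le t \le 1$, and two different ways of labeling points along it. It adds up $\sqrt{\mathrm{d}t^2 - \mathrm{d}x^2}$ over many short chords.

```python
import math

def worldline_length(path, lam_start, lam_end, steps=20_000):
    """Sum sqrt(dt^2 - dx^2) over short chords of the path lam -> (t, x), with c = 1."""
    total = 0.0
    t_prev, x_prev = path(lam_start)
    for i in range(1, steps + 1):
        lam = lam_start + (lam_end - lam_start) * i / steps
        t, x = path(lam)
        total += math.sqrt((t - t_prev) ** 2 - (x - x_prev) ** 2)
        t_prev, x_prev = t, x
    return total

# the same hypothetical worldline x = t^2 / 4, labeled in two different ways
def by_coordinate_time(lam):
    # lambda is just the coordinate time t
    return lam, lam ** 2 / 4

def by_other_parameter(lam):
    # an arbitrary monotonic relabeling: t = lambda^3, with lambda still running 0..1
    t = lam ** 3
    return t, t ** 2 / 4

length_a = worldline_length(by_coordinate_time, 0.0, 1.0)
length_b = worldline_length(by_other_parameter, 0.0, 1.0)
print(length_a, length_b)
```

Both runs should agree to many decimal places, and both should match the exact value of $\int_0^1 \mathrm{d}t\,\sqrt{1 - t^2/4}$ for this particular curve.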
This notation makes it really straightforward to go from the flat spacetime of special relativity to the curved spacetime of general relativity. We just replace the constant matrix $\eta_{\mu\nu}$ with a general matrix $g_{\mu\nu}(x)$ that's a function of the coordinates:
$$\mathrm{d}s^2 =g_{\mu\nu}(x)\mathrm{d}x^\mu \mathrm{d}x^\nu.$$
In general, this is going to be the metric of a curved space. Roughly, the reason is that the coefficients $g_{\mu\nu}(x)$ depend on your position $x^\mu$ in spacetime. And so the distance between neighboring points $x^\mu$ and $x^\mu + \mathrm{d}x^\mu$ varies depending on where you are in the space.
So what led Einstein to think that gravity is related to the curvature of spacetime? Like I briefly mentioned in the introduction, the remarkable feature of gravity is that it's universal: it affects all particles in the same way, regardless of their mass. Galileo demonstrated this long ago for projectiles on Earth, supposedly by dropping balls of different masses from the top of the Leaning Tower of Pisa. They were all accelerated downward at the same rate and hit the ground at the same time, regardless of their mass.
So on Earth, we observe that the weight of an object is $F = -mg$, where $g \approx 9.8 ~\mathrm{m/s^2}$ is a constant. And so $F = ma$ for a falling object implies that the acceleration $a = -g$ is always the same constant, independent of its mass. Likewise, if we write Newton's inverse square law of gravity, e.g. between a star $M$ and a planet $m$, then $F = ma$ for the planet reads
$$-\frac{GMm}{r^2} \hat{{} r} = m \vec{{}a},$$
and once again the mass $m$ cancels out. (The big $M$ of the star's mass doesn't cancel—that's what sets the strength of gravity around an object of mass $M$. We're talking here about the acceleration of another object $m$ due to the presence of $M$.)
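Here is a quick numerical version of that cancellation. The numbers are assumptions brought in for illustration and are not part of the lesson: rounded textbook values for $G$ and for the Earth's mass and radius (playing the role of $M$), and rough guesses for the masses of a bowling ball and a penny.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_EARTH = 5.972e24   # mass of the Earth in kg (rounded)
R_EARTH = 6.371e6    # mean radius of the Earth in m (rounded)

def acceleration(m, big_m=M_EARTH, r=R_EARTH):
    """Magnitude of a = F / m, with F = G M m / r^2."""
    force = G * big_m * m / r ** 2
    return force / m

a_bowling_ball = acceleration(7.0)   # roughly 7 kg (illustrative)
a_penny = acceleration(0.0025)       # roughly 2.5 g (illustrative)
print(a_bowling_ball, a_penny)
```

Both accelerations come out the same, close to the familiar $9.8~\mathrm{m/s^2}$, even though the forces differ by a factor of a few thousand.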
The fact that gravity acts on all particles in the same way made Einstein suspect that it shouldn't really be attributed to a force at all in the sense of $F = ma$. Instead gravity is a feature of the background on which all particles are traveling along—i.e. spacetime—and it's the shape of spacetime that produces the effects we observe as gravity. Any particle, regardless of its mass, just does its best to travel along a straight line through spacetime, but the presence of a big mass like a star warps the geometry and deforms the particle's trajectory away from what it would have been in empty outer space.
The conceptual framework here is very similar to electromagnetism, which you may be more familiar with. Electric charges and currents create electric and magnetic fields according to Maxwell's equations, which then influence the motion of charged particles according to the Lorentz force law, $\vec {{}F} = q (\vec{{} E}+ \vec{{}v}\times \vec{{}B})$. In general relativity, massive objects warp spacetime, and then the shape of the spacetime tells massive particles how to move.
The way that massive objects warp the shape of spacetime is described mathematically by what are called "Einstein's field equations." They're the analog of Maxwell's equations for electromagnetism. I'm not going to get into the details of those equations right now, but the point is that if somebody hands us some distribution of mass like a big star, then we can try to solve Einstein's equations to figure out the curved metric $g_{\mu\nu}$ that results. After that, we write down the action for a particle traveling through this curved spacetime and minimize it to determine the trajectory that it will follow.
The action is just like we wrote down before, only this time we need to compute the length of the worldline using the curved metric:
$$S = - mc \int\mathrm{d}\lambda \sqrt{-g_{\mu\nu} \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda} \frac{\mathrm{d} x^\nu }{\mathrm{d} \lambda } }. $$
Again, the conceptual idea is more important than the detailed equation here: up to some factors, the action is just equal to the length of the particle's worldline through spacetime, which has been warped by the presence of e.g. a star. Then to minimize the action, the particle will take the shortest path that it can through spacetime—or more precisely it takes the path of maximum proper time.
Like we learned in the previous lessons, to apply the principle of least action we take a little variation of the trajectory $x^\mu(\lambda) \to x^\mu(\lambda) +\varepsilon^\mu(\lambda)$ and then insist that the action shouldn't change at leading order in $\varepsilon$. That condition will give us the equation of motion. It takes a little effort so bear with me! Let
$$ l = \sqrt{-g_{\mu\nu} \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda} \frac{\mathrm{d} x^\nu }{\mathrm{d} \lambda }} $$
stand for the integrand of the action. Then when we make the little variation of $x^\mu(\lambda)$, the change in $l$ is
$$\mathrm{d}l = -\frac{1}{2l}\left( 2g_{\mu\nu} \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda } \frac{\mathrm{d} \varepsilon^\nu }{\mathrm{d} \lambda } +\frac{\partial g_{\mu\nu} }{\partial x^\rho } \varepsilon^\rho \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda } \frac{\mathrm{d} x^\nu}{\mathrm{d} \lambda } \right).$$
The first term comes from the change in the $\mathrm{d}x/\mathrm{d}\lambda$ factors—that's what we would have had even in flat space. The second term is new in curved space: when the metric $g_{\mu\nu}(x)$ depends on $x$, then it also changes when you make a variation of $x$.
Now we integrate to get the change in the action, and we integrate by parts on the first term to pull out the common factor of $\varepsilon$:
$$\mathrm{d}S = \frac{1}{2}mc \int \mathrm{d}\lambda~\varepsilon^\rho \left( -\frac{\mathrm{d} }{\mathrm{d} \lambda }\left( \frac{2}{l}g_{\mu\rho} \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda } \right) +\frac{1}{l}\frac{\partial g_{\mu\nu} }{\partial x^\rho } \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda } \frac{\mathrm{d} x^\nu}{\mathrm{d} \lambda } \right) .$$
Since this is supposed to vanish for any variation $\varepsilon(\lambda)$, the quantity it multiplies has to vanish—that's the equation of motion. Expanding out the derivative and doing a little simplifying, we get
$$g_{\mu\rho}\frac{\mathrm{d}^2 x^\mu }{\mathrm{d} \lambda^2 }+\left(\frac{\partial g_{\mu\rho}}{\partial x^\nu} -\frac{1}{2}\frac{\partial g_{\mu\nu} }{\partial x^\rho }\right)\frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda }\frac{\mathrm{d} x^\nu}{\mathrm{d} \lambda } = \frac{1}{l} \frac{\mathrm{d} l }{\mathrm{d} \lambda } g_{\mu\rho} \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda } . $$
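In case the "little simplifying" isn't obvious, here is a sketch of the intermediate step, using the same symbols as above. The product rule, together with the chain rule $\mathrm{d}g_{\mu\rho}/\mathrm{d}\lambda = (\partial g_{\mu\rho}/\partial x^\nu)\,\mathrm{d}x^\nu/\mathrm{d}\lambda$, gives

$$\frac{\mathrm{d}}{\mathrm{d}\lambda}\left(\frac{2}{l}g_{\mu\rho}\frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda}\right) = \frac{2}{l}\left(g_{\mu\rho}\frac{\mathrm{d}^2 x^\mu}{\mathrm{d}\lambda^2} + \frac{\partial g_{\mu\rho}}{\partial x^\nu}\frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda}\frac{\mathrm{d}x^\nu}{\mathrm{d}\lambda}\right) - \frac{2}{l^2}\frac{\mathrm{d}l}{\mathrm{d}\lambda}g_{\mu\rho}\frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda}.$$

Plugging this into the quantity multiplying $\varepsilon^\rho$, setting it to zero, and multiplying through by $-l/2$ gives the equation above, with the $\mathrm{d}l/\mathrm{d}\lambda$ term moved to the right-hand side.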
We're looking for an equation for $\mathrm{d}^2 x^\mu/\mathrm{d}\lambda^2$, since that will be the generalization of Newton's second law $\ddot x = 0$ for a free particle in Newtonian mechanics. Then we need to get rid of the $g_{\mu\rho}$ in front. To do that, we just need to multiply by the inverse matrix, which it's conventional to indicate by raising up the indices: $g^{\nu\mu}g_{\mu\rho} = \delta^\nu{}_\rho$, where $\delta$ denotes the identity matrix with 1's along the diagonal. Anyway, after multiplying by the inverse matrix we get
$$ \frac{\mathrm{d}^2 x^\mu }{\mathrm{d} \lambda^2 } + g^{\mu\kappa} \left( \frac{\partial g_{\rho\kappa}}{\partial x^\sigma} -\frac{1}{2}\frac{\partial g_{\rho\sigma} }{\partial x^\kappa } \right)\frac{\mathrm{d} x^\rho }{\mathrm{d} \lambda } \frac{\mathrm{d} x^\sigma}{\mathrm{d} \lambda } =\frac{1}{l} \frac{\mathrm{d} l }{\mathrm{d} \lambda } \frac{\mathrm{d} x^\mu }{\mathrm{d} \lambda }. $$
That's looking a little bit better—still complicated, but a little better. It would be even nicer if the right-hand-side vanished. And in fact we can make it vanish, by remembering that $\lambda$ was just an arbitrary parameter that we get to pick. In particular, if we choose $\lambda$ to be equal to the proper time—i.e. $\mathrm{d}\lambda = \mathrm{d}\tau = \frac{1}{c}\sqrt{-g_{\mu\nu}\mathrm{d}x^\mu \mathrm{d}x^\nu}$—then we get
$$ l = \sqrt{-g_{\mu\nu} \frac{\mathrm{d} x^\mu }{\mathrm{d} \tau} \frac{\mathrm{d} x^\nu }{\mathrm{d} \tau }} = c .$$
So with this choice, $l$ is a constant, and so $\mathrm{d}l/\mathrm{d}\lambda = 0.$ Convenient! Then our equation of motion simplifies to
$$ \frac{\mathrm{d}^2 x^\mu }{\mathrm{d} \tau^2 } + g^{\mu\kappa} \left( \frac{\partial g_{\rho\kappa}}{\partial x^\sigma} -\frac{1}{2}\frac{\partial g_{\rho\sigma} }{\partial x^\kappa } \right) \frac{\mathrm{d} x^\rho }{\mathrm{d} \tau } \frac{\mathrm{d} x^\sigma}{\mathrm{d} \tau } =0. $$
This, at last, is the geodesic equation. Back in Minkowski spacetime, where the metric is constant, the second term vanishes. Then we're left with the equation of a straight line:
$$\frac{\mathrm{d}^2 x^\mu }{\mathrm{d}\tau^2 } = 0.$$
But in a curved space, the second term introduces a deformation of the straight line. A geodesic is as straight as you can get in a curved spacetime. By adding any wiggles to the curve, we would increase its length—or, rather, decrease the proper time—and therefore it wouldn't be an extremal path anymore.
There's one more manipulation we should make to put the equation of motion into the standard form in which people usually write the geodesic equation. Notice that, in the second term, the product of $\mathrm{d}x^\rho/\mathrm{d}\tau$ and $\mathrm{d}x^\sigma/\mathrm{d}\tau$ is symmetric in $\rho$ and $\sigma$: if you swap the two indices, the product doesn't change. That means in the thing in parentheses that's multiplying those derivatives, we can freely symmetrize in $\rho$ and $\sigma$. Effectively, we take the average of that expression with the one we get by exchanging the two indices. Then we can write the same equation as
$$ \frac{\mathrm{d}^2 x^\mu }{\mathrm{d} \tau^2 } +\frac{1}{2} g^{\mu\kappa} \left( \frac{\partial g_{\sigma\kappa}}{\partial x^\rho} + \frac{\partial g_{\rho\kappa}}{\partial x^\sigma} -\frac{\partial g_{\rho\sigma} }{\partial x^\kappa } \right) \frac{\mathrm{d} x^\rho }{\mathrm{d} \tau } \frac{\mathrm{d} x^\sigma}{\mathrm{d} \tau } =0. $$
The combination that appears here is called the Christoffel symbol,
$$\Gamma^\mu_{\rho\sigma} =\frac{1}{2} g^{\mu\kappa} \left( \frac{\partial g_{\sigma\kappa}}{\partial x^\rho} + \frac{\partial g_{\rho\kappa}}{\partial x^\sigma} -\frac{\partial g_{\rho\sigma} }{\partial x^\kappa } \right). $$
It's a very important object in the mathematics of a curved geometry, but for our purposes here we can just think of it as some matrix $\Gamma^\mu$ for each coordinate $\mu$, with components $(\Gamma^\mu)_{\rho\sigma}$.
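To see the formula for $\Gamma^\mu_{\rho\sigma}$ in action, here is a rough sketch that evaluates it numerically for a simple two-dimensional example. Everything specific in it is an assumption made for illustration: the example metric is the standard one for a unit sphere in angles $(\theta, \phi)$, namely $\mathrm{d}s^2 = \mathrm{d}\theta^2 + \sin^2\theta\,\mathrm{d}\phi^2$ (not a spacetime metric), the derivatives of the metric are estimated with finite differences, and the helper names are made up.

```python
import math

def sphere_metric(x):
    """Unit-sphere metric in coordinates x = (theta, phi): diag(1, sin^2 theta)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^{mu nu}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of d g_{ab} / d x^k at the point x."""
    x_plus = list(x)
    x_plus[k] += h
    x_minus = list(x)
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)] for a in range(n)]

def christoffel(metric, x):
    """gamma[mu][rho][sigma], built term by term from the formula in the text."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = d_k g_{ab}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for rho in range(n):
            for sigma in range(n):
                gamma[mu][rho][sigma] = 0.5 * sum(
                    g_inv[mu][kappa]
                    * (dg[rho][sigma][kappa] + dg[sigma][rho][kappa] - dg[kappa][rho][sigma])
                    for kappa in range(n)
                )
    return gamma

gamma = christoffel(sphere_metric, [math.pi / 4, 0.0])
print(gamma[0][1][1])  # Gamma^theta_{phi phi}, expected near -sin(theta) cos(theta)
print(gamma[1][0][1])  # Gamma^phi_{theta phi}, expected near cos(theta) / sin(theta)
```

At $\theta = \pi/4$ the two printed values should come out close to $-1/2$ and $1$. For a constant metric like $\eta_{\mu\nu}$, every derivative vanishes and so does every component of $\Gamma$.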
Anyway, at long last we wind up with the standard form of the geodesic equation,
$$\frac{\mathrm{d}^2x^\mu }{\mathrm{d} \tau^2 } + \Gamma^\mu_{\rho \sigma} \frac{\mathrm{d} x^\rho }{\mathrm{d} \tau } \frac{\mathrm{d} x^\sigma}{\mathrm{d} \tau } = 0.$$
That calculation got a little hairy, which is why I didn't include it in the video itself. If you're new to all this curved geometry business, don't worry too much about the details of these equations for right now. You can learn to unpack them all later on if you're interested in properly studying GR.
The geodesic equation describes the motion of a free particle in the presence of some other much more massive objects that created the warped geometry. It's the generalization of Newton's second law for a free particle, $F = m a = 0$, to general relativity. Note that the equation doesn't depend on the mass $m$ of the particle. All particles travel along geodesics, regardless of their mass.
The last thing I want to do is give you an intuitive idea of what geodesics are all about by describing what's probably the simplest example of a curved space that we can all picture: the surface of a sphere. These aren't directly relevant to the geodesics in spacetime that we encounter in general relativity, but they'll at least give you an idea that you can picture in your head to understand what a geodesic is.
So picture a sphere, and pick any two points on it. To find the geodesic between them, just draw an equator of the sphere that passes through the two endpoints. In other words, think of the sphere as an onion, and chop the onion in half so that your knife goes through both of the given points. Call one half the "northern" hemisphere and the other the "southern" hemisphere. The cut you made is along the "equator" (usually called a great circle), and it defines a geodesic between the two points (two, actually, one going the short way around and the other the long way).
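As a sanity check on the onion picture, here is a sketch that integrates the geodesic equation on a unit sphere and tests whether the path stays on a single "equator" (a great circle, meaning a circle cut out by a plane through the center). The ingredients are assumptions brought in for illustration: the standard nonzero Christoffel symbols of the unit sphere in angles $(\theta, \phi)$, a hand-rolled fourth-order Runge-Kutta stepper, and an arbitrary starting point and direction.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equation on the unit sphere.

    state = (theta, phi, dtheta, dphi). The nonzero Christoffel symbols assumed are
    Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cos(theta) / sin(theta).
    """
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the curve parameter."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def to_xyz(theta, phi):
    """Point on the unit sphere as ordinary (x, y, z) coordinates."""
    return (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

# start on the equator at phi = 0, heading north-east at unit speed
s = 1.0 / math.sqrt(2.0)
state = (math.pi / 2, 0.0, -s, s)
# normal to the plane through the center that contains the start point and start direction
normal = (0.0, -s, s)

steps = 1000
h = (math.pi / 2) / steps  # total parameter range: a quarter of the way around
max_off_plane = 0.0
for _ in range(steps):
    state = rk4_step(state, h)
    x, y, z = to_xyz(state[0], state[1])
    off_plane = abs(x * normal[0] + y * normal[1] + z * normal[2])
    max_off_plane = max(max_off_plane, off_plane)

print(max_off_plane)       # tiny: the path never leaves the cutting plane
print(state[0], state[1])  # close to pi/4 and pi/2 after a quarter turn
```

The path stays in the plane of the "knife cut" to within numerical error, which is the onion picture in equation form.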
Okay, that was a very quick introduction to a bunch of very challenging, but also hopefully very interesting ideas. So let me quickly summarize the key things we learned about.
Spacetime is the stage on which physical processes play out, and Einstein's theory of relativity might better be called the theory of spacetime, because it tells us how to understand the structure of spacetime. It's a framework for doing physics, and we can build on top of it additional features like particles and forces and fields.
Free particles basically travel along the straightest and shortest paths through spacetime that they can, with the caveat that, in spacetime, "shortest" actually means maximizing the proper time, which is the time that's ticked off on a watch that's strapped to the particle.
In special relativity, we ignore the effect of gravity (or, at least, we assume that it's weak). Then spacetime is flat, and a free particle literally travels along a straight line.
General relativity builds gravity into the structure of spacetime by warping the metric into a curved geometry. The way that works is governed by Einstein's equations, which we didn't talk much about here. Then a free particle follows the next-best-thing to a straight line: what we called a geodesic.
The action for a free particle in either special relativity or general relativity is the same: it's simply equal to the length of the worldline that the particle traces out as it moves through spacetime, up to some constant factors. Then the principle of least action says that the particle indeed follows the shortest path that it can in getting from one point to another.
See also:
Part 4: The Action for String Theory