Friday, 29 September 2017

The Sleeping Beauty problem

Continuing the theme of probability and self-locating uncertainty from my last post, there is a famous puzzle in philosophy called the Sleeping Beauty problem. It involves the following experiment that Sleeping Beauty volunteers to take part in.

On Sunday, Sleeping Beauty is put to sleep and then a fair coin is tossed. On Monday, Sleeping Beauty is awakened and interviewed. She is then put to sleep again with an amnesia-inducing drug that makes her forget that awakening. On Tuesday, if the coin landed tails she is awakened, interviewed and put to sleep once more. Otherwise, if the coin landed heads, she is left asleep. On Wednesday, she is awakened and the experiment ends.

When she is awakened and interviewed she does not know which day it is or whether she has been awakened before. During the interview, Beauty is asked, "What is the probability that the coin landed heads?" The experiment is illustrated below.

Sleeping Beauty problem (illustration by Stuart Armstrong)

There are two popular but opposing solutions that are commonly given by philosophers. The solution known as the thirder position is that Sleeping Beauty should report that the probability of heads is 1/3. [1] The intuition is that Sleeping Beauty cannot distinguish between the three awake states, so she should assign the same probability to each state. Also, if the experiment is run many times, Sleeping Beauty is awakened on average in a state where the coin has landed heads 1/3 of the time and in a state where the coin has landed tails 2/3 of the time.
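
The long-run frequency claim can be checked with a short simulation. This is only a minimal sketch: the function name `simulate_awakenings`, the run count and the fixed seed are illustrative assumptions, not part of the original problem statement.

```python
import random


def simulate_awakenings(runs, seed=0):
    """Return the fraction of awakenings at which the coin shows heads."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(runs):
        heads = rng.random() < 0.5  # fair coin toss on Sunday
        if heads:
            # Heads: Beauty is awakened on Monday only.
            heads_awakenings += 1
            total_awakenings += 1
        else:
            # Tails: Beauty is awakened on Monday and on Tuesday.
            total_awakenings += 2
    return heads_awakenings / total_awakenings


fraction = simulate_awakenings(100_000)
print(round(fraction, 2))  # should be close to 1/3
```

Note that the simulation counts awakenings, not coin tosses. Counting tosses instead would give a heads frequency close to 1/2, which is exactly the disagreement between the two positions.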

The halfer position is that Sleeping Beauty should report that the probability of heads is 1/2. [2] The intuition here is that Sleeping Beauty seems to learn no new information when she is awakened, so she should not update the probability that she held on the Sunday before the experiment began. That is, it is the coin toss that is the relevant event, not how many times she is awakened. [3]

So which position is correct?

Before the experiment starts, Sleeping Beauty knows that during the experiment she can be in one of the four following states with equal probability.

         Monday   Tuesday
Heads    1/4      1/4
Tails    1/4      1/4

She also knows that she will not be awakened on Tuesday if the coin lands heads, so the probability for that state is 0 when she is awakened. The question then is how the other probabilities should be updated.

The halfers distribute the probability to the remaining heads state, as follows:

         Monday   Tuesday
Heads    1/2      0
Tails    1/4      1/4

From the table:
P(Heads and Monday) = 1/2
P(Tails and Monday) = 1/4
P(Monday) = P(Heads and Monday) + P(Tails and Monday)
          = 1/2 + 1/4 = 3/4
Using the formula for conditional probability: [4][5]
P(Heads|Monday) = P(Heads and Monday) / P(Monday)
                = (1/2) / (3/4) = 2/3
This, I think, reveals the fatal flaw with the halfer position. If Sleeping Beauty is told it is Monday, then she will think the probability for the coin landing heads is 2/3!
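
The halfer arithmetic above can be reproduced with exact fractions. This is a minimal sketch: the dictionary `halfer` and the helper `conditional` are illustrative names, and the table values are simply the halfer assignment given above.

```python
from fractions import Fraction

# Halfer assignment of probabilities to (coin, day) states when Beauty is awake.
halfer = {
    ("Heads", "Monday"): Fraction(1, 2),
    ("Heads", "Tuesday"): Fraction(0),
    ("Tails", "Monday"): Fraction(1, 4),
    ("Tails", "Tuesday"): Fraction(1, 4),
}


def conditional(table, coin, day):
    """P(coin | day) = P(coin and day) / P(day)."""
    p_day = sum(p for (_, d), p in table.items() if d == day)
    return table[(coin, day)] / p_day


print(conditional(halfer, "Heads", "Monday"))  # 2/3
```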

The thirders distribute the probability evenly to all awake states, as follows:

         Monday   Tuesday
Heads    1/3      0
Tails    1/3      1/3

From the table:
P(Heads and Monday) = 1/3
P(Tails and Monday) = 1/3
P(Monday) = P(Heads and Monday) + P(Tails and Monday)
          = 1/3 + 1/3 = 2/3
Using the formula for conditional probability: [6]
P(Heads|Monday) = P(Heads and Monday) / P(Monday)
                = (1/3) / (2/3) = 1/2
So, if Sleeping Beauty is told it is Monday, then she will think the probability for the coin landing heads is 1/2, which intuitively seems correct.

But the thirder position seems to imply that Beauty has learnt new information since Sunday (when the probability for heads was 1/2) which causes her to update her probability for the coin landing heads to 1/3. What could that new information be?

That it is both Tuesday and that the coin landed heads is a possible state for Beauty to be in when she is asleep. So she learns that she is not in that state when she wakes up. As a consequence, she updates by excluding that state and normalizing the probabilities for the remaining awake states.
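
The exclude-and-normalize update can be written out explicitly. The sketch below assumes the uniform four-state prior described earlier; the names `prior` and `condition_on_awake` are illustrative.

```python
from fractions import Fraction

# Before the experiment: four equally likely (coin, day) states.
prior = {
    (coin, day): Fraction(1, 4)
    for coin in ("Heads", "Tails")
    for day in ("Monday", "Tuesday")
}


def condition_on_awake(table):
    """Drop the one asleep state (Heads, Tuesday) and renormalize the rest."""
    awake = {s: p for s, p in table.items() if s != ("Heads", "Tuesday")}
    total = sum(awake.values())
    return {s: p / total for s, p in awake.items()}


posterior = condition_on_awake(prior)
p_heads = sum(p for (coin, _), p in posterior.items() if coin == "Heads")
p_monday = sum(p for (_, day), p in posterior.items() if day == "Monday")
p_heads_given_monday = posterior[("Heads", "Monday")] / p_monday

print(p_heads, p_heads_given_monday)  # 1/3 1/2
```

Conditioning on being awake yields the thirder table, and a further conditioning on Monday returns the probability of heads to 1/2.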

One further test for the thirder position is to consider the probabilities when Sleeping Beauty is awakened on, say, 1000 consecutive days when the coin lands tails (instead of two consecutive days). [7] The probability of the coin landing heads when Beauty is awakened is then 1/1001. This gives:
P(Heads) = 1/1001
P(Tails) = 1000/1001
P(Heads and Monday) = 1/1001
P(Tails and Monday) = 1/1001
P(Monday) = P(Heads and Monday) + P(Tails and Monday)
          = 1/1001 + 1/1001 = 2/1001
P(Heads|Monday) = P(Heads and Monday) / P(Monday)
                = (1/1001) / (2/1001) = 1/2
As with the original experiment, if Sleeping Beauty is told that it is Monday, then she will think the probability for the coin landing heads is 1/2 which (as before) intuitively seems correct. Note that there are still 1000 states that Beauty can be in as a result of the coin landing heads. But she will just not be awake when she is in 999 of those states so they are excluded from her awake state calculations.
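
The same calculation works for any number of tails awakenings. The following is a minimal sketch under the thirder assumption that every awake state gets equal probability; the function name `thirder_probabilities` is illustrative.

```python
from fractions import Fraction


def thirder_probabilities(tails_awakenings):
    """Return (P(Heads), P(Heads | Monday)) when every awake state is equally likely."""
    awake_states = 1 + tails_awakenings  # one heads awakening plus the tails ones
    p_state = Fraction(1, awake_states)
    p_heads = p_state  # Heads occupies a single awake state (Monday)
    p_monday = 2 * p_state  # (Heads, Monday) and (Tails, Monday)
    p_heads_given_monday = p_state / p_monday
    return p_heads, p_heads_given_monday


for n in (2, 1000):
    print(n, *thirder_probabilities(n))
# 2 1/3 1/2
# 1000 1/1001 1/2
```

However many tails awakenings there are, P(Heads|Monday) stays at 1/2, because only the two Monday states enter that calculation.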

--

[1] The thirder position was initially introduced by Adam Elga in this paper.

[2] The halfer position was defended by David Lewis in his reply to Elga.

[3] The thirders interpret the problem as an experiment about awakening events (i.e., which branch Beauty will find herself on) whereas the halfers interpret the problem as an experiment about a coin toss event (i.e., which branch Beauty will be placed on). But both sides agree about what odds should be accepted if Beauty is offered a bet that the coin landed heads. If she is offered a bet whenever she is awakened, she should accept 1/3 odds. If she is only offered a bet once, she should accept 1/2 odds.

[4] The formula for calculating the conditional probability of A given B is:
P(A|B) = P(A and B) / P(B)
[5] The probability calculations for Sleeping Beauty when awake according to halfers:
P(Heads) = 1/2
P(Tails) = 1/2
P(Heads and Monday) = 1/2
P(Heads and Tuesday) = 0
P(Tails and Monday) = 1/4
P(Tails and Tuesday) = 1/4
P(Monday|Heads) = P(Heads and Monday) / P(Heads) = (1/2) / (1/2) = 1
P(Tuesday|Heads) = P(Heads and Tuesday) / P(Heads) = 0 / (1/2) = 0
P(Monday|Tails) = P(Tails and Monday) / P(Tails) = (1/4) / (1/2) = 1/2
P(Tuesday|Tails) = P(Tails and Tuesday) / P(Tails) = (1/4) / (1/2) = 1/2
P(Monday) = P(Heads and Monday) + P(Tails and Monday) = 1/2 + 1/4 = 3/4
P(Tuesday) = P(Heads and Tuesday) + P(Tails and Tuesday) = 0 + 1/4 = 1/4
P(Heads|Monday) = P(Heads and Monday) / P(Monday) = (1/2) / (3/4) = 2/3
P(Heads|Tuesday) = P(Heads and Tuesday) / P(Tuesday) = 0 / (1/4) = 0
P(Tails|Monday) = P(Tails and Monday) / P(Monday) = (1/4) / (3/4) = 1/3
P(Tails|Tuesday) = P(Tails and Tuesday) / P(Tuesday) = (1/4) / (1/4) = 1
[6] The probability calculations for Sleeping Beauty when awake according to thirders:
P(Heads) = 1/3
P(Tails) = 2/3
P(Heads and Monday) = 1/3
P(Heads and Tuesday) = 0
P(Tails and Monday) = 1/3
P(Tails and Tuesday) = 1/3
P(Monday|Heads) = P(Heads and Monday) / P(Heads) = (1/3) / (1/3) = 1
P(Tuesday|Heads) = P(Heads and Tuesday) / P(Heads) = 0 / (1/3) = 0
P(Monday|Tails) = P(Tails and Monday) / P(Tails) = (1/3) / (2/3) = 1/2
P(Tuesday|Tails) = P(Tails and Tuesday) / P(Tails) = (1/3) / (2/3) = 1/2
P(Monday) = P(Heads and Monday) + P(Tails and Monday) = 1/3 + 1/3 = 2/3
P(Tuesday) = P(Heads and Tuesday) + P(Tails and Tuesday) = 0 + 1/3 = 1/3
P(Heads|Monday) = P(Heads and Monday) / P(Monday) = (1/3) / (2/3) = 1/2
P(Heads|Tuesday) = P(Heads and Tuesday) / P(Tuesday) = 0 / (1/3) = 0
P(Tails|Monday) = P(Tails and Monday) / P(Monday) = (1/3) / (2/3) = 1/2
P(Tails|Tuesday) = P(Tails and Tuesday) / P(Tuesday) = (1/3) / (1/3) = 1
[7] A similar example is used by Nick Bostrom in this paper intending to show that the thirder position is counter-intuitive. I disagree, for the reasons I give in my conclusion. Bostrom also makes several other arguments against both the halfer and thirder positions and defends a hybrid position.

Thursday, 7 September 2017

Probability in a causal universe

Does God play dice with the universe?
Probability is a familiar idea. When we roll a die, we know there is a 1/6 chance of seeing any particular number from 1 to 6.

While we often say that the number we see came up by chance, we also know that this description really just reflects our lack of knowledge about the physical forces that acted on the die. If we knew precisely its initial orientation, how fast it was rolled, what the surface characteristics were and so on, we could accurately predict which number would come up.

However, the situation is more complex with quantum mechanics. An observer could have complete knowledge about a particular system, but not know which state they will measure it in. For example, suppose a photon is sent through a balanced beam splitter. The system evolves into a superposition of two relative states. In one state, the photon is on the reflection path; in the other, it is on the transmission path.

The Born rule states that the probability of measuring a system in a particular state is given by squaring the magnitude of the amplitude for the state. In the beam splitter example above, the amplitude for each relative state is √(1/2).[1] So the probability for measuring each state is 1/2.
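
As a small numerical illustration, the Born rule can be applied to a list of amplitudes. This is a minimal sketch: the helper `born_probabilities` is an illustrative name, and it normalizes the squared magnitudes so that non-normalized amplitudes can also be passed in.

```python
import math


def born_probabilities(amplitudes):
    """Square the magnitude of each amplitude and normalize so the results sum to 1."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


# Balanced beam splitter: amplitude sqrt(1/2) on each path.
balanced = born_probabilities([math.sqrt(1 / 2), math.sqrt(1 / 2)])
print([round(p, 3) for p in balanced])  # [0.5, 0.5]
```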

If the observer has complete knowledge about the system, it would appear that the probabilities are intrinsic to nature. It was this idea that Einstein disputed when he famously proclaimed that God does not play dice with the universe. As he put it in a letter to Max Born, "Quantum theory yields much, but it hardly brings us close to the Old One's secrets. I, in any case, am convinced He does not play dice with the universe."

So in many textbooks the Born rule is stated as a fundamental postulate of quantum mechanics and not as something that needs to be explained. But it need not be this way. The Everett (Many Worlds) interpretation of quantum mechanics takes Einstein's side on the issue and requires that the Born rule be derivable from quantum mechanics, not merely postulated. If this succeeds, it returns us to the original understanding where probability is a result of a lack of knowledge. The twist, though, is that the observer could still have complete knowledge of the system under observation, but just not of their own location with respect to that system. I'm going to outline a derivation below that draws from Carroll's and Sebens' derivation.

1. On the Everett interpretation, measurement leads to initial self-locating uncertainty. An observer can have complete knowledge about the relative states of the system, but not which particular state they have just measured (since there is a version of them that has measured each state). This raises the question of how to quantify their uncertainty in terms of probabilities.

2. If the state amplitudes are equal, the observer should initially be indifferent about which state they have measured. So the states can simply be counted to calculate the probability that a particular state has been measured.

3. If the state amplitudes are not equal, they can be mathematically factored into states that do have equal amplitudes. And again the states can be counted to calculate the probability. The number of factored states exactly tracks the square of the initial amplitude, so it is equivalent to applying the Born rule.

This last point is interesting. Suppose that the wave function for a particular beam splitter gives a (non-normalized)[2] amplitude of 1 for the reflection path state and an amplitude of 2 for the transmission path state. The Born rule says that the probability of observing the two states is in the ratio 1:4. That is, it is not correct to just count the states, or even just apportion the amplitude. Instead the amplitudes must be squared. But why should this be the rule?

This can be demonstrated by transforming the initial setup into a new setup with equal amplitude states as shown in Diagram 1. To do this, add a second beam splitter (with 1/2 probability of reflection and transmission) to the transmission path of the first beam splitter. Note: the amplitude for the reflection and transmission relative states is √(1/2) each.

Diagram 1: Factoring beam splitter states (amplitudes shown)
When a photon is sent through this setup, there are two new relative states on the transmission path with an amplitude of √2 each (i.e., 2 * √(1/2) = √2). Now add two more balanced beam splitters on each of those two paths. When a photon is sent through this setup, there are four new relative states on the initial transmission path with an amplitude of 1 each (i.e., √2 * √(1/2) = 1). Now the amplitudes for all five final relative states are equal to 1 and the ratio of the initial reflection path states to the initial transmission path states is 1:4 as required by the Born rule.
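
The factoring procedure can be sketched in a few lines. The function `factor_to_unit_amplitudes` is an illustrative name, and the sketch assumes the starting amplitude squares to a power of two (as 1 and 2 do here), so that repeated balanced splitting lands exactly on amplitude 1. Other amplitudes would need unbalanced splitters, which this sketch does not model.

```python
import math


def factor_to_unit_amplitudes(amplitude, tol=1e-9):
    """Split a path with balanced beam splitters until no branch exceeds amplitude 1."""
    branches = [amplitude]
    while any(a > 1 + tol for a in branches):
        next_branches = []
        for a in branches:
            if a > 1 + tol:
                # A balanced splitter multiplies the amplitude by sqrt(1/2) on each output.
                half = a * math.sqrt(1 / 2)
                next_branches.extend([half, half])
            else:
                next_branches.append(a)
        branches = next_branches
    return branches


reflection = factor_to_unit_amplitudes(1)
transmission = factor_to_unit_amplitudes(2)
print(len(reflection), len(transmission))  # 1 4
```

Counting the equal-amplitude branches gives the 1:4 ratio, which matches the squared amplitudes 1 and 4 of the original two paths.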

This can also be understood geometrically as an application of the Pythagorean Theorem as shown in Diagram 2.

Diagram 2: Factoring beam splitter states
(squared amplitudes shown)
The lengths of the sides of the green triangle at the far left represent the amplitudes for the first beam splitter states. The reflection path state has an amplitude of 1 and the transmission path state has an amplitude of 2. Since the relative states are orthogonal, the triangle is right-angled. The hypotenuse represents the superposition state which has an amplitude of √5. The red numbers represent the squares of the sides (which are the non-normalized probabilities of measuring the states).

The second green triangle represents the second beam splitter states with a hypotenuse of length 2. Since this beam splitter is balanced, the shorter sides have the same length. Therefore, by the Pythagorean Theorem, 2² = a² + a²; 4 = 2a²; 2 = a²; a = √2. Each of these shorter sides now becomes the hypotenuse for two further right-angled triangles with equal length short sides. Again, by the Pythagorean Theorem, (√2)² = a² + a²; 2 = 2a²; 1 = a²; a = 1. Thus the amplitudes are all equal to 1 with a reflection/transmission ratio of 1:4 as required.

--

[1] If the photon entered the front of the beam splitter then, due to a phase change, the amplitude for the reflection path would be -√(1/2). For the purpose of this post, assume that the photon enters the rear of the beam splitters and so both the transmission and reflection amplitudes will be positive.

[2] Amplitudes are usually normalized such that the probabilities (the squares of the amplitudes) of the relative states sum to 1.