Calculating the posterior probability with a quantum circuit.
Let’s look at quantum computing from the programmer’s perspective and work with qubits in a practical way.
If you know a little bit about probabilities, that’s enough. If not, here’s a brief recap.
There are different types of probabilities.
The Marginal Probability is the absolute probability of an event
The Joint Probability is the probability of two events occurring together
The Conditional Probability is the probability of one event given the knowledge that another event occurred
In the last post, we already learned how to calculate the marginal and joint probability.
In this post, we calculate the posterior probability given a prior and a modifier.
Graphically, the posterior probability is almost the same as the joint probability. In fact, the area representing the positive cases is the same. It is the overlap of event A and event B. But the base set is different. While we consider all possible cases when calculating the joint probability, we only consider the cases where one event occurred when calculating the posterior probability.
Image by author, Frank Zickert
Bayes’ Theorem tells us how to calculate the posterior probability. A conditional probability is a probability of an event (our hypothesis) given the knowledge that another event occurred (our evidence). Bayes tells us we can calculate the conditional probability of P(Hypothesis|Evidence) as the product of the marginal probability of the hypothesis (P(Hypothesis), called the prior probability) and a modifier. This modifier is the quotient of the “backward” conditional probability P(Evidence|Hypothesis) and the marginal probability of the new piece of information P(Evidence). The backward probability (the numerator of the modifier) answers the question “what is the probability of observing this evidence in a world where our hypothesis is true?”. The denominator is the probability of observing the Evidence on its own.
The following equation depicts Bayes’ Theorem mathematically:
P(Hypothesis|Evidence) = P(Hypothesis) ⋅ P(Evidence|Hypothesis) / P(Evidence)
The modifier can be any positive number. But most likely, it is a number close to 1. If the modifier was exactly 1, it would mean the prior is equal to the posterior probability. Then, the Evidence would not have provided any information.
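As a quick sanity check of the formula, here is a minimal Python sketch. The function name `posterior` and the example values for P(Evidence|Hypothesis) and P(Evidence) are illustrative assumptions, not numbers from this post. They are chosen so that the modifier comes out at 0.9, the value used in the first circuit further down.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' Theorem: P(H|E) = P(H) * P(E|H) / P(E)."""
    if not 0.0 < evidence <= 1.0:
        raise ValueError("P(Evidence) must be in (0, 1]")
    modifier = likelihood / evidence
    return prior * modifier

# Hypothetical numbers, chosen so that the modifier is 0.9
prior = 0.4          # P(Hypothesis)
likelihood = 0.45    # P(Evidence|Hypothesis), assumed for illustration
evidence = 0.5       # P(Evidence), assumed for illustration

print(likelihood / evidence)                   # the modifier
print(posterior(prior, likelihood, evidence))  # the posterior
```

With a modifier below 1, the evidence makes the hypothesis less likely than the prior suggested.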
Start your Jupyter Notebook and import Qiskit. If you haven’t set up your machine yet, here’s a tutorial on how to set up JupyterLab for quantum computing.
We will create and run quite a few circuits in this post. Therefore, here’s a helper function that takes a configured QuantumCircuit instance, runs it, and returns the histogram.
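# Note: this code targets Qiskit releases before 1.0, which still export Aer and execute from the qiskit package
from qiskit import QuantumCircuit, Aer, execute
from qiskit.visualization import plot_histogram
import matplotlib.pyplot as plt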
def run_circuit(qc, simulator='statevector_simulator', shots=1, hist=True):
    # Tell Qiskit how to simulate our circuit
    backend = Aer.get_backend(simulator)
    # Execute the circuit and collect the counts
    results = execute(qc, backend, shots=shots).result().get_counts()
    # Plot the results as a histogram, or return the raw counts
    return plot_histogram(results, figsize=(18, 4)) if hist else results
In a previous post, we introduced the RY-gate. It takes a parameter we can use to specify the exact probability. Because the RY-gate takes an angle θ as its parameter, not a probability, we need to convert the probability into an angle before we pass it to the gate. This is what the function prob_to_angle does for us.
from math import asin, sqrt
def prob_to_angle(prob):
    """
    Converts a given P(psi) value into an equivalent theta value.
    """
    return 2*asin(sqrt(prob))
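To see why this conversion works: applying RY(θ) to a qubit in state 0 yields a probability of sin²(θ/2) of measuring 1. The sketch below checks the round trip in plain Python. The helper `angle_to_prob` is a hypothetical name introduced only for this illustration.

```python
from math import asin, sin, sqrt

def prob_to_angle(prob):
    # Same conversion as in the post
    return 2 * asin(sqrt(prob))

def angle_to_prob(theta):
    # Hypothetical inverse helper: probability of measuring 1
    # after applying RY(theta) to a qubit in state 0
    return sin(theta / 2) ** 2

for p in (0.0, 0.25, 0.4, 0.9, 1.0):
    theta = prob_to_angle(p)
    print(p, round(theta, 4), round(angle_to_prob(theta), 4))
```

A probability of 1.0 corresponds to the angle π, a full flip from 0 to 1.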
Now, let’s start with the fun!
In the first case, let’s say the modifier is a number between 0.0 and 1.0. We can use the following quantum circuit to calculate a joint probability (see this post for a detailed explanation of this circuit).
# Specify the prior probability and the modifier
prior = 0.4
modifier = 0.9
qc = QuantumCircuit(4)
# Set qubit to prior
qc.ry(prob_to_angle(prior), 0)
# Apply the controlled RY-gate
qc.cry(prob_to_angle(modifier), 0, 1)
run_circuit(qc)
Image by author, Frank Zickert
Qubit 1 shows the resulting posterior probability of 0.36. Let’s have a look at what happens for a modifier greater than 1.0.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Set qubit to prior
qc.ry(prob_to_angle(prior), 0)
# Apply modifier
qc.cry(prob_to_angle(modifier), 0,1)
run_circuit(qc)
We get a math domain error. That is no surprise, because the function prob_to_angle is only defined for values between 0 and 1. For values greater than 1.0, the arcsine is not defined. The arcsine is the inverse of the sine function, and the sine never exceeds 1.0, so there is no real angle whose sine is greater than 1.0. Therefore, we can't extend the function to values greater than 1.0 in a meaningful way.
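The error can be reproduced without any quantum circuit. The following sketch wraps the conversion in a hypothetical helper `try_angle` (not part of the post) that returns None instead of raising.

```python
from math import asin, sqrt

def prob_to_angle(prob):
    return 2 * asin(sqrt(prob))

def try_angle(prob):
    # Hypothetical helper: return the angle, or None if prob is out of range
    try:
        return prob_to_angle(prob)
    except ValueError:
        # Python's math.asin raises ValueError ("math domain error")
        # for arguments outside [-1, 1]
        return None

print(try_angle(0.9))   # a valid angle
print(try_angle(1.2))   # None
```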
Let’s rethink our approach. If the modifier is greater than 1.0, it increases the probability. The resulting probability must be bigger than the prior probability. It must be greater by exactly (modifier−1)⋅prior.
The transformation gates let us cut the overall probability of 1.0 into pieces. Why don't we separate the prior not once but twice? Then, we apply the reduced modifier (modifier−1) on one of the two states representing the prior. The sum of the untouched prior and the applied reduced modifier should be the posterior.
In the following code, we apply the prior to qubit 0 and to qubit 1. Then, we apply the reduced modifier to qubit 2 through an RY-gate controlled by qubit 0.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.ry(prob_to_angle(prior), 1)
# Apply modifier to qubit 2
qc.cry(prob_to_angle(modifier-1), 0,2)
run_circuit(qc)
Image by author, Frank Zickert
We get six different states. Our posterior probability should be the sum of the states where qubit 1 is 1 plus the sum of the states where qubit 2 is 1. These are the four states on the right-hand side. Let's add them: 0.240+0.128+0.048+0.032=0.448
This didn't work. The expected result is 0.4+0.4∗0.2=0.48
What happened?
The problem is the case where both qubits 1 and 2 are 1. This is the state 0111. In order to get the correct posterior probability, we would need to count this state twice: 0.448+0.032=0.48.
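The six state probabilities and the double counting can be checked with plain arithmetic. This is a classical sketch of the probabilities only, not a quantum simulation. It assumes the Qiskit labelling in which qubit 0 is the rightmost character of a state.

```python
from itertools import product

prior, modifier = 0.4, 1.2
reduced = modifier - 1

# q0 and q1 are each set to the prior, q2 is set to the reduced
# modifier only in the cases where q0 is 1.
states = {}
for q0, q1, q2 in product((0, 1), repeat=3):
    p = (prior if q0 else 1 - prior) * (prior if q1 else 1 - prior)
    if q0:
        p *= reduced if q2 else 1 - reduced
    elif q2:
        continue          # qubit 2 stays 0 unless qubit 0 is 1
    states[f"0{q2}{q1}{q0}"] = p

# Count every state with qubit 1 or qubit 2 set exactly once ...
once = sum(p for s, p in states.items() if s[2] == "1" or s[1] == "1")
# ... versus adding the two marginals, which counts 0111 twice
twice = (sum(p for s, p in states.items() if s[2] == "1")
         + sum(p for s, p in states.items() if s[1] == "1"))

print(len(states), round(once, 3), round(twice, 3))
```

The difference between the two sums is exactly the probability of the state 0111.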
This problem originated when we applied the prior probability for the second time. We aimed at two states, each representing the prior. When we look at the result, we can see that, in fact, the probability of measuring qubit 0 as 1 is 0.4 (the prior) and the probability of measuring qubit 1 as 1 is 0.4, too. But we also see that these two events are not mutually exclusive. They overlap in the state 0011.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.ry(prob_to_angle(prior), 1)
run_circuit(qc)
Image by author, Frank Zickert
When we apply the prior to qubit 1, we need to leave the states where qubit 0 is 1 untouched. Have a look.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
qc.x(0)
run_circuit(qc)
Image by author, Frank Zickert
There are three lines that do the trick:
qc.x(0)
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
qc.x(0)
Let’s go through these lines step by step. In the first step, we apply the NOT-gate to qubit 0. It switches the probabilities of the states where qubit 0 is 0 with those where qubit 0 is 1.
The following figure depicts the state after the first NOT-gate.
Image by author, Frank Zickert
We set the prior (0.4) as the probability of measuring qubit 0 as 1. The NOT-gate reverses this. Now, we have the probability of 0.4 of measuring qubit 0 as 0.
This also means we measure the remainder (0.6) when qubit 0 is 1. Simply put, the NOT-gate is our way of saying: "Let's proceed to work with the remainder, not the prior".
This is the preparation for our next step, the controlled RY-gate.
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
We only apply a rotation of qubit 1 when qubit 0 is 1. This is the case only for the remainder. Because the remainder is not 1.0 but 1.0-prior, we modify the probability we use in the controlled RY-gate. By specifying the size of the remainder as the denominator, we account for the smaller size.
The figure below depicts the state after the controlled RY-gate.
Image by author, Frank Zickert
The controlled RY-gate splits the remainder into two parts. The one part (state 0011) represents the prior. So does the state 0000 we separated in the very first step. There is no more overlap between these two states. To keep things ordered, we apply the NOT-gate on qubit 0 again. The state 0000 becomes 0001 and vice versa, and the state 0011 becomes 0010. It leaves us with the qubits 0 and 1 each representing the prior probability without overlap.
Image by author, Frank Zickert
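If you want to verify these three steps without Qiskit, a tiny state-vector simulation is enough. The sketch below is a stand-in written for this illustration, not Qiskit's simulator. The helpers `ry` and `x` are hypothetical, and they only track real amplitudes, which suffices here because the RY-gate and the NOT-gate have real matrices.

```python
from math import asin, cos, sin, sqrt

def prob_to_angle(prob):
    return 2 * asin(sqrt(prob))

N = 4  # number of qubits; bit k of a basis-state index is qubit k

def ry(state, theta, target, control=None):
    """Apply RY(theta) to `target`, optionally only where `control` is 1."""
    c, s = cos(theta / 2), sin(theta / 2)
    new = list(state)
    for i in range(len(state)):
        if (i >> target) & 1:
            continue  # handle each amplitude pair once, from its target=0 member
        if control is not None and not (i >> control) & 1:
            continue
        j = i | (1 << target)
        a0, a1 = state[i], state[j]
        new[i] = c * a0 - s * a1
        new[j] = s * a0 + c * a1
    return new

def x(state, target):
    """Apply the NOT-gate: swap amplitudes of states that differ in `target`."""
    return [state[i ^ (1 << target)] for i in range(len(state))]

prior = 0.4
state = [1.0] + [0.0] * (2 ** N - 1)          # start in 0000
state = ry(state, prob_to_angle(prior), 0)    # prior on qubit 0
state = x(state, 0)
state = ry(state, prob_to_angle(prior / (1 - prior)), 1, control=0)
state = x(state, 0)

# Probabilities are the squared amplitudes; labels put qubit 0 rightmost
probs = {format(i, "04b"): round(a * a, 3)
         for i, a in enumerate(state) if a * a > 1e-12}
print(probs)
```

The states 0001 and 0010 each carry the prior, and the state 0011, where both would overlap, no longer appears.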
We’re now prepared to apply the reduced modifier to one of the priors.
We can now cut the part of the modifier out of one of the priors. Again, we choose the lower qubit so that we have the resulting ones at the right.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
qc.x(0)
# Apply the modifier to qubit 2
qc.cry(prob_to_angle(modifier-1), 0,2)
run_circuit(qc)
Image by author, Frank Zickert
This looks better. Now, the states 0010 and 0101 add up to the posterior probability. Let’s clean this up a little more.
Wouldn’t it be nice to have a single qubit representing the posterior?
First, we apply the CNOT-gate on qubits 1 and 3 with qubit 1 as the control qubit.
qc.cx(1,3)
If qubit 1 is 1 it applies the NOT-gate on qubit 3.
The following figure depicts the state after this gate.
Image by author, Frank Zickert
As usual, the CNOT-gate does not change the probabilities we see. It only changes the states representing them. In this case, the state 0010 was the only state where qubit 1 is 1. This state has now changed to 1010. The only difference is that qubit 3 is 1 in the given case now, too.
Next, we want to do the same for state 0101. Since this state is the only state where qubit 2 is 1 we can use the CNOT-gate again to set qubit 3 to 1 if qubit 2 is 1. The following code contains all the steps.
# Specify the prior probability and the modifier
prior = 0.4
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
qc.x(0)
# Apply the modifier to qubit 2
qc.cry(prob_to_angle(modifier-1), 0,2)
# Make qubit 3 represent the posterior
qc.cx(1,3)
qc.cx(2,3)
run_circuit(qc)
Image by author, Frank Zickert
We applied two CNOT-gates. The size of the bars did not change. But the states representing them did. Now, we measure qubit 3 as 1 with the posterior probability.
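Because a CNOT-gate only relabels basis states, its effect on the measurement probabilities can be sketched as a plain dictionary transformation. The function `cx` below is a hypothetical classical helper, not Qiskit's gate, and the starting distribution is derived by hand from the prior of 0.4 and the modifier of 1.2.

```python
def cx(dist, control, target):
    """Relabel a {bitstring: probability} distribution as a CNOT-gate would.

    Bitstrings follow the Qiskit convention: qubit 0 is the rightmost character.
    """
    out = {}
    for bits, p in dist.items():
        chars = list(bits)
        if chars[-1 - control] == "1":
            t = -1 - target
            chars[t] = "0" if chars[t] == "1" else "1"
        key = "".join(chars)
        out[key] = out.get(key, 0.0) + p
    return out

# State probabilities before the two CNOT-gates (prior 0.4, modifier 1.2)
dist = {"0000": 0.2, "0001": 0.32, "0010": 0.4, "0101": 0.08}

dist = cx(dist, 1, 3)
dist = cx(dist, 2, 3)

# Probability of measuring qubit 3 (the leftmost character) as 1
posterior = sum(p for bits, p in dist.items() if bits[0] == "1")
print(dist)
print(posterior)
```

The four probabilities are unchanged. Only the labels 0010 and 0101 turn into 1010 and 1101.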
So far, so good. There’s but one problem. This approach does not work for a prior greater than 0.5 because we only have a total probability of 1.0 to work with. If the prior is greater than 0.5, we can't have two non-overlapping states representing it.
Have a look at what happens.
# Specify the prior probability and the modifier
prior = 0.6
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(prior/(1-prior)), 0, 1)
qc.x(0)
# Apply the modifier to qubit 2
qc.cry(prob_to_angle(modifier-1), 0,2)
# Make qubit 3 represent the posterior
qc.cx(1,3)
qc.cx(2,3)
run_circuit(qc)
Again, we get a math domain error. Mathematically, it fails when calculating prior/(1-prior) because the term would be greater than 1 and thus, it is not a valid input for the prob_to_angle function. For instance: 0.6/(1.0−0.6)=0.6/0.4=1.5
Solving this situation is a little tricky. I'd argue it is even a hack.
If you’re a mathematician, I’m quite sure you won’t like it. If you’re a programmer, you might accept it. Let’s have a look, first. Then, it’s open for criticism.
When the prior is greater than 0.5 and the modifier is greater than 1.0, the trick with using the prior twice does not work because our overall probability must not exceed 1.0.
Of course, we could use the prior to adjust the remaining probability so that we can precisely apply the modifier afterwards. But in this case, we would need to know the prior when we apply the modifier. This would be no different from initializing the qubit with the product prior∗modifier in the first place.
But we aim for a qubit system that represents a given prior and that lets us apply a modifier without knowing the prior. So, we need to prepare the remainder (1−prior) in a way that lets us work with it (i.e. apply the reduced modifier) without knowing the prior.
Rather than using the prior when we apply the modifier to the remainder, we pre-apply the prior to the remainder with some auxiliary steps. For instance, we set aside a part that is 0.3 of the prior.
We can do this in the same way we set aside the full prior earlier.
# Specify the prior probability
prior = 0.6
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply 0.3*prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(0.3*prior/(1-prior)), 0, 1)
run_circuit(qc)
Image by author, Frank Zickert
As a result, the state 0000 represents the prior probability (0.6) and the state 0011 represents 0.3∗prior=0.18.
We can now apply the reduced modifier to this state without knowing the prior.
Let’s have a look.
# Specify the prior probability and the modifier
prior = 0.6
modifier = 1.2
qc = QuantumCircuit(4)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Apply 0.3*prior to qubit 1
qc.x(0)
qc.cry(prob_to_angle(0.3*prior/(1-prior)), 0, 1)
# Apply the modifier to qubit 2
qc.cry(prob_to_angle((modifier-1)/0.3), 1,2)
# Make qubit 3 represent the posterior
qc.x(0)
qc.cx(0,3)
qc.cx(2,3)
run_circuit(qc)
Image by author, Frank Zickert
Up to and including the step that sets aside 0.3 of the prior, there’s nothing new. The important part is the controlled RY-gate that applies the modifier:
qc.cry(prob_to_angle((modifier-1)/0.3), 1,2)
We apply a controlled RY-gate. Thus, we only change states where qubit 1 is 1. This is the case for the state 0011 that represents 0.3 of the prior. The important part is that we adjust our reduced modifier to the fraction 0.3 by dividing by it. Because the portion is only 0.3 of the prior, we need to separate a correspondingly greater share of it.
The remaining code (the final NOT-gate and the two CNOT-gates) changes the states so that we get the resulting posterior by measuring qubit 3.
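The expected outcome of this circuit can be worked out by following the probabilities branch by branch. This is a classical sketch of the bookkeeping only. The variable names `fraction`, `part`, and `boost` are chosen for this illustration.

```python
prior, modifier, fraction = 0.6, 1.2, 0.3

# After the NOT-gate, qubit 0 is 1 in the remainder branch
remainder = 1 - prior

# The first controlled RY-gate sets qubit 1 inside the remainder,
# so that P(qubit 1 is 1) equals fraction * prior
part = remainder * (fraction * prior / remainder)

# The second controlled RY-gate sets qubit 2 inside that part,
# so that P(qubit 2 is 1) equals prior * (modifier - 1)
boost = part * ((modifier - 1) / fraction)

# The prior branch and the boost branch never overlap, so they add up
posterior = prior + boost
print(round(part, 3), round(boost, 3), round(posterior, 3))
```

The sum equals prior∗modifier, which is 0.72 for these values.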
There’s a caveat, though. Of course there is. You may have wondered how I came up with 0.3. The part we set aside (the fraction times the prior) must not be greater than the remaining probability (1−prior). If it was greater, we would exceed the overall probability of 1.0 again. But the fraction must not be smaller than the effect the modifier has (modifier−1), either. If it is too small, we can't separate a part of it that accounts for the effect the modifier has on the prior.
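Both conditions translate into bounds on the fraction, because each controlled RY-gate needs an input of at most 1.0. The following sketch computes them. The function `fraction_bounds` is a hypothetical helper written for this illustration.

```python
def fraction_bounds(prior, modifier):
    """Return (lowest, highest) usable fraction of the prior, or None if none fits."""
    lowest = modifier - 1            # (modifier - 1) / fraction must not exceed 1
    highest = (1 - prior) / prior    # fraction * prior / (1 - prior) must not exceed 1
    return (lowest, highest) if lowest <= highest else None

print(fraction_bounds(0.6, 1.2))   # 0.3 lies inside this interval
print(fraction_bounds(0.9, 1.2))   # None, because prior * modifier exceeds 1.0
```

The interval is empty exactly when prior∗modifier is greater than 1.0, so a valid fraction exists for every meaningful input.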
So, when settling for the best value, we need to know both the prior and the modifier. This is where the solution becomes a hack. Since we don’t want to work with the prior and the modifier at the same time, we do not set aside one specific fraction of the prior. Instead, we set aside many of them. The loop below sets aside the portions from 0.1 to 0.9 in steps of 0.1. This way, we are prepared for any modifier up to 1.9.
for i in range(1,10):
    qc.cry(prob_to_angle(min(1, (i*0.1)*prior/(1-prior))), 0,i)
In order not to feed the prob_to_angle-function with a value greater than 1.0, we limit the input with the min function. So, whenever the part we want to set aside is bigger than the remainder, we only set aside the remainder. However, this means that this part is useless. It does not represent the corresponding portion of the prior anymore.
When we apply the modifier, we need to select the right part. The right part is the smallest possible one that is big enough to contain the effect of the modifier.
We find the position of that part by multiplying the reduced modifier by 10 (the inverse of the step size we chose above) and rounding the result up with the ceil function. This gives us the index of the next greater part.
pos = ceil((modifier-1)*10)
qc.cry(prob_to_angle((modifier-1)/(pos*0.1)), pos,11)
But what if we chose a part that does not correctly represent the corresponding portion of the prior? Technically, we get the wrong result. However, this mostly happens when the actual result (prior∗modifier) exceeds 1.0. Such a result would not make any sense in the first place. It would imply that observing a certain event would cause another event to occur with a probability greater than 1. In that case, we would need to question our input data.
Depending on the step size we choose, there is also a small area close to 1 where the resulting probability is not calculated correctly.
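To get a feeling for where the prepared parts work and where they fail, the selection logic can be mimicked classically. The function `circuit_posterior` below is a hypothetical model of the probabilities, written for this illustration under the assumption that the prior is smaller than 1. It is not the circuit itself.

```python
from math import ceil

def circuit_posterior(prior, modifier, step=0.1):
    """Classically mimic the prepared-parts approach for a modifier above 1.0."""
    remainder = 1 - prior
    pos = ceil((modifier - 1) / step)                  # index of the part to use
    share = min(1, (pos * step) * prior / remainder)   # clamped as in the circuit
    part = remainder * share                           # probability that qubit `pos` is 1
    boost = part * (modifier - 1) / (pos * step)
    return prior + boost

for prior, modifier in [(0.6, 1.2), (0.85, 1.15)]:
    print(prior, modifier,
          round(circuit_posterior(prior, modifier), 4),   # what the approach yields
          round(prior * modifier, 4))                     # the exact posterior
```

For a prior of 0.6 and a modifier of 1.2, both values agree. For a prior of 0.85 and a modifier of 1.15, the exact posterior is still below 1.0, but the selected part gets clamped and the approach underestimates the result.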
So, let’s have a look at the final code. Due to the number of qubits we’re using for the prepared parts, we exceed the limits of what can be represented in the histogram. Rather than showing all the states, we include a measurement in the circuit. We measure qubit 11, which holds the result.
A measured qubit is either 0 or 1. We receive only a single number as output, not the probability. But we can run the circuit several times (here 1000 shots) to calculate the resulting probability. Because we reconstruct the probability empirically, it is not perfectly accurate, though.
from math import ceil
from qiskit import ClassicalRegister, QuantumRegister
# Specify the prior probability and the modifier
prior=0.6
modifier=1.2
# Prepare the circuit with qubits and a classical bit to hold the measurement
qr = QuantumRegister(12)
cr = ClassicalRegister(1)
qc = QuantumCircuit(qr, cr)
# Apply prior to qubit 0
qc.ry(prob_to_angle(prior), 0)
# Separate parts of the prior
qc.x(0)
for i in range(1,10):
    qc.cry(prob_to_angle(min(1, (i*0.1)*prior/(1-prior))), 0,i)
# Apply the modifier
pos = ceil((modifier-1)*10)
qc.cry(prob_to_angle((modifier-1)/(pos*0.1)), pos,11)
# Make qubit 11 represent the posterior
qc.x(0)
qc.cx(0,11)
# measure the qubit
qc.measure(qr[11], cr[0])
run_circuit(qc, simulator = 'qasm_simulator', shots = 1000 )
Image by author, Frank Zickert
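How far off can an estimate from 1000 shots be? Each shot is a yes/no outcome, so the estimate follows binomial sampling statistics. The sketch below simulates the shots with Python's random module instead of a quantum backend. The helpers `estimate` and `standard_error` are hypothetical names used only here.

```python
import random
from math import sqrt

def estimate(p, shots, seed=None):
    """Simulate `shots` single-bit measurements that yield 1 with probability p."""
    rng = random.Random(seed)
    ones = sum(rng.random() < p for _ in range(shots))
    return ones / shots

def standard_error(p, shots):
    """Standard deviation of the estimated probability under binomial sampling."""
    return sqrt(p * (1 - p) / shots)

p = 0.72   # exact posterior for a prior of 0.6 and a modifier of 1.2
print(estimate(p, 1000, seed=1))
print(round(standard_error(p, 1000), 4))
```

With 1000 shots, the estimate typically deviates from the exact value by a bit more than one percentage point. Quadrupling the number of shots halves that deviation.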
Conclusion
The circuit correctly calculates the posterior probability given a prior and a modifier. We have seen that it gets quite tricky to calculate the posterior for a prior greater than 0.5 and a modifier greater than 1.0.
Source: towardsdatascience