# Noise and Quantum Computation¶

## Modeling Noisy Quantum Gates¶

### Pure States vs. Mixed States¶

Errors in quantum computing can introduce classical uncertainty in what the underlying state is. When this happens we sometimes need to consider not only wavefunctions but also probabilistic sums of wavefunctions when we are uncertain as to which one we have. For example, if we think that an X gate was accidentally applied to a qubit with a 50-50 chance then we would say that there is a 50% chance we have the \(\ket{0}\) state and a 50% chance that we have a \(\ket{1}\) state. This is called an “impure” or “mixed”state in that it isn’t just a wavefunction (which is pure) but instead a distribution over wavefunctions. We describe this with something called a density matrix, which is generally an operator. Pure states have very simple density matrices that we can write as an outer product of a ket vector \(\ket{\psi}\) with its own bra version \(\bra{\psi}=\ket{\psi}^\dagger\). For a pure state the density matrix is simply

The expectation value of an operator for a mixed state is given by

where \(\tr{\cdot}\) is the trace of an operator, which is the sum of its diagonal elements, which is independent of choice of basis. Pure state density matrices satisfy

which you can easily verify for \(\rho_\psi\) assuming that the state is normalized. If we want to describe a situation with classical uncertainty between states \(\rho_1\) and \(\rho_2\), then we can take their weighted sum

where \(p\in [0,1]\) gives the classical probability that the state is \(\rho_1\).

Note that classical uncertainty in the wavefunction is markedly different from superpositions. We can represent superpositions using wavefunctions, but use density matrices to describe distributions over wavefunctions. You can read more about density matrices here [DensityMatrix].

[DensityMatrix] | https://en.wikipedia.org/wiki/Density_matrix |

### Quantum Gate Errors¶

For a quantum gate given by its unitary operator \(U\), a “quantum gate error” describes the scenario in which the actually induces transformation deviates from \(\ket{\psi} \mapsto U\ket{\psi}\). There are two basic types of quantum gate errors:

**coherent errors**are those that preserve the purity of the input state, i.e., instead of the above mapping we carry out a perturbed, but unitary operation \(\ket{\psi} \mapsto \tilde{U}\ket{\psi}\), where \(\tilde{U} \neq U\).**incoherent errors**are those that do not preserve the purity of the input state, in this case we must actually represent the evolution in terms of density matrices. The state \(\rho := \ket{\psi}\bra{\psi}\) is then mapped as\[\rho \mapsto \sum_{j=1}^n K_j\rho K_j^\dagger,\]where the operators \(\{K_1, K_2, \dots, K_m\}\) are called Kraus operators and must obey \(\sum_{j=1}^m K_j^\dagger K_j = I\) to conserve the trace of \(\rho\). Maps expressed in the above form are called Kraus maps. It can be shown that every physical map on a finite dimensional quantum system can be represented as a Kraus map, though this representation is not generally unique. You can find more information about quantum operations here

In a way, coherent errors are *in principle* amendable by more precisely
calibrated control. Incoherent errors are more tricky.

### Why Do Incoherent Errors Happen?¶

When a quantum system (e.g., the qubits on a quantum processor) is not perfectly isolated from its environment it generally co-evolves with the degrees of freedom it couples to. The implication is that while the total time evolution of system and environment can be assumed to be unitary, restriction to the system state generally is not.

**Let’s throw some math at this for clarity:** Let our total Hilbert
space be given by the tensor product of system and environment Hilbert
spaces: \(\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_E\). Our
system “not being perfectly isolated” must be translated to the
statement that the global Hamiltonian contains a contribution that
couples the system and environment:

where \(V\) non-trivally acts on both the system and the environment. Consequently, even if we started in an initial state that factorized over system and environment \(\ket{\psi}_{S,0}\otimes \ket{\psi}_{E,0}\) if everything evolves by the Schrödinger equation

the final state will generally not admit such a factorization.

### A Toy Model¶

**In this (somewhat technical) section we show how environment
interaction can corrupt an identity gate and derive its Kraus map.** For
simplicity, let us assume that we are in a reference frame in which both
the system and environment Hamiltonian’s vanish \(H_S = 0, H_E = 0\)
and where the cross-coupling is small even when multiplied by the
duration of the time evolution
\(\|\frac{tV}{\hbar}\|^2 \sim \epsilon \ll 1\) (any operator norm
\(\|\cdot\|\) will do here). Let us further assume that
\(V = \sqrt{\epsilon} V_S \otimes V_E\) (the more general case is
given by a sum of such terms) and that the initial environment state
satisfies \(\bra{\psi}_{E,0} V_E\ket{\psi}_{E,0} = 0\). This turns
out to be a very reasonable assumption in practice but a more thorough
discussion exceeds our scope.

Then the joint system + environment state \(\rho = \rho_{S,0} \otimes \rho_{E,0}\) (now written as a density matrix) evolves as

Using the Baker-Campbell-Hausdorff theorem we can expand this to second order in \(\epsilon\)

We can insert the initially factorizable state \(\rho = \rho_{S,0} \otimes \rho_{E,0}\) and trace over the environmental degrees of freedom to obtain

where the coefficient in front of the second part is by our initial assumption very small \(\gamma := \frac{\epsilon t^2}{2\hbar^2}\tr{V_E^2 \rho_{E,0}} \ll 1\). This evolution happens to be approximately equal to a Kraus map with operators \(K_1 := I - \frac{\gamma}{2} V_S^2, K_2:= \sqrt{\gamma} V_S\):

This agrees to \(O(\epsilon^{3/2})\) with the result of our derivation above. This type of derivation can be extended to many other cases with little complication and a very similar argument is used to derive the Lindblad master equation.

## Noisy Gates on the Rigetti QVM¶

As of today, users of our Forest SDK can annotate their QUIL programs by certain pragma statements that inform the QVM that a particular gate on specific target qubits should be replaced by an imperfect realization given by a Kraus map.

The QVM propagates **pure states** — so how does it simulate noisy gates?
It does so by yielding the correct outcomes **in the average over many
executions of the QUIL program**: When the noisy version of a gate
should be applied the QVM makes a random choice which Kraus operator is
applied to the current state with a probability that ensures that the
average over many executions is equivalent to the Kraus map. In
particular, a particular Kraus operator \(K_j\) is applied to
\(\ket{\psi}_S\)

with probability \(p_j:= \bra{\psi}_S K_j^\dagger K_j \ket{\psi}_S\). In the average over many execution \(N \gg 1\) we therefore find that

where \(j_n\) is the chosen Kraus operator label in the \(n\)-th trial. This is clearly a Kraus map itself! And we can group identical terms and rewrite it as

where \(N_{\ell}\) is the number of times that Kraus operator label \(\ell\) was selected. For large enough \(N\) we know that \(N_{\ell} \approx N p_\ell\) and therefore

which proves our claim. **The consequence is that noisy gate simulations
must generally be repeated many times to obtain representative
results**.

### Getting Started¶

Come up with a good model for your noise. We will provide some examples below and may add more such examples to our public repositories over time. Alternatively, you can characterize the gate under consideration using Quantum Process Tomography or Gate Set Tomography and use the resulting process matrices to obtain a very accurate noise model for a particular QPU.

Define your Kraus operators as a list of numpy arrays

`kraus_ops = [K1, K2, ..., Km]`

.For your QUIL program

`p`

, call:p.define_noisy_gate("MY_NOISY_GATE", [q1, q2], kraus_ops)

where you should replace

`MY_NOISY_GATE`

with the gate of interest and`q1, q2`

the indices of the qubits.

**Scroll down for some examples!**

```
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom
import matplotlib.colors as colors
%matplotlib inline
```

```
from pyquil.quil import Program, MEASURE
from pyquil.api import QVMConnection
from pyquil.gates import CZ, H, I, X
from scipy.linalg import expm
```

```
cxn = QVMConnection()
```

### Example 1: Amplitude Damping¶

Amplitude damping channels are imperfect identity maps with Kraus operators

where \(p\) is the probability that a qubit in the \(\ket{1}\) state decays to the \(\ket{0}\) state.

```
def damping_channel(damp_prob=.1):
"""
Generate the Kraus operators corresponding to an amplitude damping
noise channel.
:params float damp_prob: The one-step damping probability.
:return: A list [k1, k2] of the Kraus operators that parametrize the map.
:rtype: list
"""
damping_op = np.sqrt(damp_prob) * np.array([[0, 1],
[0, 0]])
residual_kraus = np.diag([1, np.sqrt(1-damp_prob)])
return [residual_kraus, damping_op]
def append_kraus_to_gate(kraus_ops, g):
"""
Follow a gate `g` by a Kraus map described by `kraus_ops`.
:param list kraus_ops: The Kraus operators.
:param numpy.ndarray g: The unitary gate.
:return: A list of transformed Kraus operators.
"""
return [kj.dot(g) for kj in kraus_ops]
def append_damping_to_gate(gate, damp_prob=.1):
"""
Generate the Kraus operators corresponding to a given unitary
single qubit gate followed by an amplitude damping noise channel.
:params np.ndarray|list gate: The 2x2 unitary gate matrix.
:params float damp_prob: The one-step damping probability.
:return: A list [k1, k2] of the Kraus operators that parametrize the map.
:rtype: list
"""
return append_kraus_to_gate(damping_channel(damp_prob), gate)
```

```
%%time
# single step damping probability
damping_per_I = 0.02
# number of program executions
trials = 200
results = []
outcomes = []
lengths = np.arange(0, 201, 10, dtype=int)
for jj, num_I in enumerate(lengths):
print("{}/{}, ".format(jj, len(lengths)), end="")
p = Program(X(0))
# want increasing number of I-gates
p.inst([I(0) for _ in range(num_I)])
p.inst(MEASURE(0, [0]))
# overload identity I on qc 0
p.define_noisy_gate("I", [0], append_damping_to_gate(np.eye(2), damping_per_I))
cxn.random_seed = int(num_I)
res = cxn.run(p, [0], trials=trials)
results.append([np.mean(res), np.std(res) / np.sqrt(trials)])
results = np.array(results)
```

```
0/21, 1/21, 2/21, 3/21, 4/21, 5/21, 6/21, 7/21, 8/21, 9/21, 10/21, 11/21, 12/21, 13/21, 14/21, 15/21, 16/21, 17/21, 18/21, 19/21, 20/21, CPU times: user 138 ms, sys: 19.2 ms, total: 157 ms
Wall time: 6.4 s
```

```
dense_lengths = np.arange(0, lengths.max()+1, .2)
survival_probs = (1-damping_per_I)**dense_lengths
logpmf = binom.logpmf(np.arange(trials+1)[np.newaxis, :], trials, survival_probs[:, np.newaxis])/np.log(10)
```

```
DARK_TEAL = '#48737F'
FUSCHIA = "#D6619E"
BEIGE = '#EAE8C6'
cm = colors.LinearSegmentedColormap.from_list('anglemap', ["white", FUSCHIA, BEIGE], N=256, gamma=1.5)
```

```
plt.figure(figsize=(14, 6))
plt.pcolor(dense_lengths, np.arange(trials+1)/trials, logpmf.T, cmap=cm, vmin=-4, vmax=logpmf.max())
plt.plot(dense_lengths, survival_probs, c=BEIGE, label="Expected mean")
plt.errorbar(lengths, results[:,0], yerr=2*results[:,1], c=DARK_TEAL,
label=r"noisy qvm, errorbars $ = \pm 2\hat{\sigma}$", marker="o")
cb = plt.colorbar()
cb.set_label(r"$\log_{10} \mathrm{Pr}(n_1; n_{\rm trials}, p_{\rm survival}(t))$", size=20)
plt.title("Amplitude damping model of a single qubit", size=20)
plt.xlabel(r"Time $t$ [arb. units]", size=14)
plt.ylabel(r"$n_1/n_{\rm trials}$", size=14)
plt.legend(loc="best", fontsize=18)
plt.xlim(*lengths[[0, -1]])
plt.ylim(0, 1)
```

### Example 2: Dephased CZ-gate¶

Dephasing is usually characterized through a qubit’s \(T_2\) time. For a single qubit the dephasing Kraus operators are

where \(p = 1 - \exp(-T_2/T_{\rm gate})\) is the probability that the qubit is dephased over the time interval of interest, \(I_2\) is the \(2\times 2\)-identity matrix and \(\sigma_Z\) is the Pauli-Z operator.

For two qubits, we must construct a Kraus map that has *four* different
outcomes:

- No dephasing
- Qubit 1 dephases
- Qubit 2 dephases
- Both dephase

The Kraus operators for this are given by

where we assumed a dephasing probability \(p\) for the first qubit and \(q\) for the second.

Dephasing is a *diagonal* error channel and the CZ gate is also
diagonal, therefore we can get the combined map of dephasing and the CZ
gate simply by composing \(U_{\rm CZ}\) the unitary representation
of CZ with each Kraus operator

**Note that this is not always accurate, because a CZ gate is often
achieved through non-diagonal interaction Hamiltonians! However, for
sufficiently small dephasing probabilities it should always provide a
good starting point.**

```
def dephasing_kraus_map(p=.1):
"""
Generate the Kraus operators corresponding to a dephasing channel.
:params float p: The one-step dephasing probability.
:return: A list [k1, k2] of the Kraus operators that parametrize the map.
:rtype: list
"""
return [np.sqrt(1-p)*np.eye(2), np.sqrt(p)*np.diag([1, -1])]
def tensor_kraus_maps(k1, k2):
"""
Generate the Kraus map corresponding to the composition
of two maps on different qubits.
:param list k1: The Kraus operators for the first qubit.
:param list k2: The Kraus operators for the second qubit.
:return: A list of tensored Kraus operators.
"""
return [np.kron(k1j, k2l) for k1j in k1 for k2l in k2]
def append_kraus_to_gate(kraus_ops, g):
"""
Follow a gate `g` by a Kraus map described by `kraus_ops`.
:param list kraus_ops: The Kraus operators.
:param numpy.ndarray g: The unitary gate.
:return: A list of transformed Kraus operators.
"""
return [kj.dot(g) for kj in kraus_ops]
```

```
%%time
# single step damping probabilities
ps = np.linspace(.001, .5, 200)
# number of program executions
trials = 500
results = []
for jj, p in enumerate(ps):
corrupted_CZ = append_kraus_to_gate(
tensor_kraus_maps(
dephasing_kraus_map(p),
dephasing_kraus_map(p)
),
np.diag([1, 1, 1, -1]))
print("{}/{}, ".format(jj, len(ps)), end="")
# make Bell-state
p = Program(H(0), H(1), CZ(0,1), H(1))
p.inst(MEASURE(0, [0]))
p.inst(MEASURE(1, [1]))
# overload identity I on qc 0
p.define_noisy_gate("CZ", [0, 1], corrupted_CZ)
cxn.random_seed = jj
res = cxn.run(p, [0, 1], trials=trials)
results.append(res)
results = np.array(results)
```

```
0/200, 1/200, 2/200, 3/200, 4/200, 5/200, 6/200, 7/200, 8/200, 9/200, 10/200, 11/200, 12/200, 13/200, 14/200, 15/200, 16/200, 17/200, 18/200, 19/200, 20/200, 21/200, 22/200, 23/200, 24/200, 25/200, 26/200, 27/200, 28/200, 29/200, 30/200, 31/200, 32/200, 33/200, 34/200, 35/200, 36/200, 37/200, 38/200, 39/200, 40/200, 41/200, 42/200, 43/200, 44/200, 45/200, 46/200, 47/200, 48/200, 49/200, 50/200, 51/200, 52/200, 53/200, 54/200, 55/200, 56/200, 57/200, 58/200, 59/200, 60/200, 61/200, 62/200, 63/200, 64/200, 65/200, 66/200, 67/200, 68/200, 69/200, 70/200, 71/200, 72/200, 73/200, 74/200, 75/200, 76/200, 77/200, 78/200, 79/200, 80/200, 81/200, 82/200, 83/200, 84/200, 85/200, 86/200, 87/200, 88/200, 89/200, 90/200, 91/200, 92/200, 93/200, 94/200, 95/200, 96/200, 97/200, 98/200, 99/200, 100/200, 101/200, 102/200, 103/200, 104/200, 105/200, 106/200, 107/200, 108/200, 109/200, 110/200, 111/200, 112/200, 113/200, 114/200, 115/200, 116/200, 117/200, 118/200, 119/200, 120/200, 121/200, 122/200, 123/200, 124/200, 125/200, 126/200, 127/200, 128/200, 129/200, 130/200, 131/200, 132/200, 133/200, 134/200, 135/200, 136/200, 137/200, 138/200, 139/200, 140/200, 141/200, 142/200, 143/200, 144/200, 145/200, 146/200, 147/200, 148/200, 149/200, 150/200, 151/200, 152/200, 153/200, 154/200, 155/200, 156/200, 157/200, 158/200, 159/200, 160/200, 161/200, 162/200, 163/200, 164/200, 165/200, 166/200, 167/200, 168/200, 169/200, 170/200, 171/200, 172/200, 173/200, 174/200, 175/200, 176/200, 177/200, 178/200, 179/200, 180/200, 181/200, 182/200, 183/200, 184/200, 185/200, 186/200, 187/200, 188/200, 189/200, 190/200, 191/200, 192/200, 193/200, 194/200, 195/200, 196/200, 197/200, 198/200, 199/200, CPU times: user 1.17 s, sys: 166 ms, total: 1.34 s
Wall time: 1min 49s
```

```
Z1s = (2*results[:,:,0]-1.)
Z2s = (2*results[:,:,1]-1.)
Z1Z2s = Z1s * Z2s
Z1m = np.mean(Z1s, axis=1)
Z2m = np.mean(Z2s, axis=1)
Z1Z2m = np.mean(Z1Z2s, axis=1)
```

```
plt.figure(figsize=(14, 6))
plt.axhline(y=1.0, color=FUSCHIA, alpha=.5, label="Bell state")
plt.plot(ps, Z1Z2m, "x", c=FUSCHIA, label=r"$\overline{Z_1 Z_2}$")
plt.plot(ps, 1-2*ps, "--", c=FUSCHIA, label=r"$\langle Z_1 Z_2\rangle_{\rm theory}$")
plt.plot(ps, Z1m, "o", c=DARK_TEAL, label=r"$\overline{Z}_1$")
plt.plot(ps, 0*ps, "--", c=DARK_TEAL, label=r"$\langle Z_1\rangle_{\rm theory}$")
plt.plot(ps, Z2m, "d", c="k", label=r"$\overline{Z}_2$")
plt.plot(ps, 0*ps, "--", c="k", label=r"$\langle Z_2\rangle_{\rm theory}$")
plt.xlabel(r"Dephasing probability $p$", size=18)
plt.ylabel(r"$Z$-moment", size=18)
plt.title(r"$Z$-moments for a Bell-state prepared with dephased CZ", size=18)
plt.xlim(0, .5)
plt.legend(fontsize=18)
```

## Adding Decoherence Noise¶

In this example, we investigate how a program might behave on a
near-term device that is subject to *T1*- and *T2*-type noise using the convenience function
`pyquil.noise.add_decoherence_noise()`

. The same module also contains some other useful
functions to define your own types of noise models, e.g.,
`pyquil.noise.tensor_kraus_maps()`

for generating multi-qubit noise processes,
`pyquil.noise.combine_kraus_maps()`

for describing the succession of two noise processes and
`pyquil.noise.append_kraus_to_gate()`

which allows appending a noise process to a unitary
gate.

```
from pyquil.quil import Program
from pyquil.paulis import PauliSum, PauliTerm, exponentiate, exponential_map, trotterize
from pyquil.gates import MEASURE, H, Z, RX, RZ, CZ
import numpy as np
```

### The Task¶

We want to prepare \(e^{i \theta XY}\) and measure it in the \(Z\) basis.

```
from numpy import pi
theta = pi/3
xy = PauliTerm('X', 0) * PauliTerm('Y', 1)
```

### The Idiomatic PyQuil Program¶

```
prog = exponential_map(xy)(theta)
print(prog)
```

```
H 0
RX(pi/2) 1
CNOT 0 1
RZ(2*pi/3) 1
CNOT 0 1
H 0
RX(-pi/2) 1
```

### The Compiled Program¶

To run on a real device, we must compile each program to the native gate set for the device. The high-level noise model is similarly constrained to use a small, native gate set. In particular, we can use

- \(I\)
- \(RZ(\theta)\)
- \(RX(\pm \pi/2)\)
- \(CZ\)

For simplicity, the compiled program is given below but generally you will want to use a compiler to do this step for you.

```
def get_compiled_prog(theta):
return Program([
RZ(-pi/2, 0),
RX(-pi/2, 0),
RZ(-pi/2, 1),
RX( pi/2, 1),
CZ(1, 0),
RZ(-pi/2, 1),
RX(-pi/2, 1),
RZ(theta, 1),
RX( pi/2, 1),
CZ(1, 0),
RX( pi/2, 0),
RZ( pi/2, 0),
RZ(-pi/2, 1),
RX( pi/2, 1),
RZ(-pi/2, 1),
])
```

### Scan Over Noise Parameters¶

We perform a scan over three levels of noise each at 20 theta points.

Specifically, we investigate T1 values of 1, 3, and 10 us. By default, T2 = T1 / 2, 1 qubit gates take 50 ns, and 2 qubit gates take 150 ns.

In alignment with the device, \(I\) and parametric \(RZ\) are noiseless while \(RX\) and \(CZ\) gates experience 1q and 2q gate noise, respectively.

```
from pyquil.api import QVMConnection
cxn = QVMConnection()
```

```
t1s = np.logspace(-6, -5, num=3)
thetas = np.linspace(-pi, pi, num=20)
t1s * 1e6 # us
```

```
array([ 1. , 3.16227766, 10. ])
```

```
from pyquil.noise import add_decoherence_noise
records = []
for theta in thetas:
for t1 in t1s:
prog = get_compiled_prog(theta)
noisy = add_decoherence_noise(prog, T1=t1).inst([
MEASURE(0, 0),
MEASURE(1, 1),
])
bitstrings = np.array(cxn.run(noisy, [0,1], 1000))
# Expectation of Z0 and Z1
z0, z1 = 1 - 2*np.mean(bitstrings, axis=0)
# Expectation of ZZ by computing the parity of each pair
zz = 1 - (np.sum(bitstrings, axis=1) % 2).mean() * 2
record = {
'z0': z0,
'z1': z1,
'zz': zz,
'theta': theta,
't1': t1,
}
records += [record]
```

### Plot the Results¶

Note that to run the code below you will need to install the pandas and seaborn packages.

```
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns
sns.set(style='ticks', palette='colorblind')
```

```
import pandas as pd
df_all = pd.DataFrame(records)
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12,4))
for t1 in t1s:
df = df_all.query('t1 == @t1')
ax1.plot(df['theta'], df['z0'], 'o-')
ax2.plot(df['theta'], df['z1'], 'o-')
ax3.plot(df['theta'], df['zz'], 'o-', label='T1 = {:.0f} us'.format(t1*1e6))
ax3.legend(loc='best')
ax1.set_ylabel('Z0')
ax2.set_ylabel('Z1')
ax3.set_ylabel('ZZ')
ax2.set_xlabel(r'$\theta$')
fig.tight_layout()
```

## Modeling Readout Noise¶

Qubit-Readout can be corrupted in a variety of ways. The two most relevant error mechanisms on the Rigetti QPU right now are:

- Transmission line noise that makes a 0-state look like a 1-state or
vice versa. We call this
**classical readout bit-flip error**. This type of readout noise can be reduced by tailoring optimal readout pulses and using superconducting, quantum limited amplifiers to amplify the readout signal before it is corrupted by classical noise at the higher temperature stages of our cryostats. - T1 qubit decay during readout (our readout operations can take more
than a µsecond unless they have been specially optimized), which
leads to readout signals that initially behave like 1-states but then
collapse to something resembling a 0-state. We will call this
**T1-readout error**. This type of readout error can be reduced by achieving shorter readout pulses relative to the T1 time, i.e., one can try to reduce the readout pulse length, or increase the T1 time or both.

### Qubit Measurements¶

This section provides the necessary theoretical foundation for accurately modeling noisy quantum measurements on superconducting quantum processors. It relies on some of the abstractions (density matrices, Kraus maps) introduced in our notebook on gate noise models.

The most general type of measurement performed on a single qubit at a single time can be characterized by some set \(\mathcal{O}\) of measurement outcomes, e.g., in the simplest case \(\mathcal{O} = \{0, 1\}\), and some unnormalized quantum channels (see notebook on gate noise models) that encapsulate 1. the probability of that outcome 2. how the qubit state is affected conditional on the measurement outcome.

Here the *outcome* is understood as classical information that has been
extracted from the quantum system.

#### Projective, Ideal Measurement¶

The simplest case that is usually taught in introductory quantum mechanics and quantum information courses are Born’s rule and the projection postulate which state that there exist a complete set of orthogonal projection operators

i.e., one for each measurement outcome. Any projection operator must
satisfy \(\Pi_x^\dagger = \Pi_x = \Pi_x^2\) and for an *orthogonal*
set of projectors any two members satisfy

and for a *complete* set we additionally demand that
\(\sum_{x\in\mathcal{O}} \Pi_x = 1\). Following our introduction to
gate noise, we write quantum states as density matrices as this is more
general and in closer correspondence with classical probability theory.

With these the probability of outcome \(x\) is given by \(p(x) = \tr{\Pi_x \rho \Pi_x} = \tr{\Pi_x^2 \rho} = \tr{\Pi_x \rho}\) and the post measurement state is

which is the projection postulate applied to mixed states.

If we were a sloppy quantum programmer and accidentally erased the measurement outcome then our best guess for the post measurement state would be given by something that looks an awful lot like a Kraus map:

The completeness of the projector set ensures that the trace of the post measurement is still 1 and the Kraus map form of this expression ensures that \(\rho_{\text{post measurement}}\) is a positive (semi-)definite operator.

#### Classical Readout Bit-Flip Error¶

Consider now the ideal measurement as above, but where the outcome
\(x\) is transmitted across a noisy classical channel that produces
a final outcome \(x'\in \mathcal{O}' = \{0', 1'\}\) according to
some conditional probabilities \(p(x'|x)\) that can be recorded in
the *assignment probability matrix*

Note that this matrix has only two independent parameters as each column must be a valid probability distribution, i.e. all elements are non-negative and each column sums to 1.

This matrix allows us to obtain the probabilities \(\mathbf{p}' := (p(x'=0), p(x'=1))^T\) from the original outcome probabilities \(\mathbf{p} := (p(x=0), p(x=1))^T\) via \(\mathbf{p}' = P_{x'|x}\mathbf{p}\). The difference relative to the ideal case above is that now an outcome \(x' = 0\) does not necessarily imply that the post measurement state is truly \(\Pi_{0} \rho \Pi_{0} / p(x=0)\). Instead, the post measurement state given a noisy outcome \(x'\) must be

where

where we have exploited the cyclical property of the trace \(\tr{ABC}=\tr{BCA}\) and the projection property \(\Pi_x^2 = \Pi_x\). This has allowed us to derive the noisy outcome probabilities from a set of positive operators

that must sum to 1:

The above result is a type of generalized **Bayes’ theorem** that is
extremely useful for this type of (slightly) generalized measurement and
the family of operators \(\{E_{x'}| x' \in \mathcal{O}'\}\) whose
expectations give the probabilities is called a **positive operator
valued measure** (POVM). These operators are not generally orthogonal
nor valid projection operators but they naturally arise in this
scenario. This is not yet the most general type of measurement, but it
will get us pretty far.

#### How to Model \(T_1\) Error¶

T1 type errors fall outside our framework so far as they involve a
scenario in which the *quantum state itself* is corrupted during the
measurement process in a way that potentially erases the pre-measurement
information as opposed to a loss of purely classical information. The
most appropriate framework for describing this is given by that of
measurement instruments, but for the practical purpose of arriving at a
relatively simple description, we propose describing this by a T1
damping Kraus map followed by the noisy readout process as described
above.

#### Further Reading¶

Chapter 3 of John Preskill’s lecture notes http://www.theory.caltech.edu/people/preskill/ph229/notes/chap3.pdf

## Working with Readout Noise¶

Come up with a good guess for your readout noise parameters \(p(0|0)\) and \(p(1|1)\), the off-diagonals then follow from the normalization of \(P_{x'|x}\). If your assignment fidelity \(F\) is given, and you assume that the classical bit flip noise is roughly symmetric, then a good approximation is to set \(p(0|0)=p(1|1)=F\).

For your QUIL program

`p`

, and a qubit index`q`

call:p.define_noisy_readout(q, p00, p11)

where you should replace

`p00`

and`p11`

with the assumed probabilities.

**Scroll down for some examples!**

```
from __future__ import print_function, division
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from pyquil.quil import Program, MEASURE, Pragma
from pyquil.api.qvm import QVMConnection
from pyquil.gates import I, X, RX, H, CNOT
from pyquil.noise import (estimate_bitstring_probs, correct_bitstring_probs,
bitstring_probs_to_z_moments, estimate_assignment_probs)
DARK_TEAL = '#48737F'
FUSCHIA = '#D6619E'
BEIGE = '#EAE8C6'
cxn = QVMConnection()
```

### Example 1: Rabi Sequence with Noisy Readout¶

```
%%time
# number of angles
num_theta = 101
# number of program executions
trials = 200
thetas = np.linspace(0, 2*np.pi, num_theta)
p00s = [1., 0.95, 0.9, 0.8]
results_rabi = np.zeros((num_theta, len(p00s)))
for jj, theta in enumerate(thetas):
for kk, p00 in enumerate(p00s):
cxn.random_seed = hash((jj, kk))
p = Program(RX(theta, 0))
# assume symmetric noise p11 = p00
p.define_noisy_readout(0, p00=p00, p11=p00)
p.measure(0, 0)
res = cxn.run(p, [0], trials=trials)
results_rabi[jj, kk] = np.sum(res)
```

```
CPU times: user 1.2 s, sys: 73.6 ms, total: 1.27 s
Wall time: 3.97 s
```

```
plt.figure(figsize=(14, 6))
for jj, (p00, c) in enumerate(zip(p00s, [DARK_TEAL, FUSCHIA, "k", "gray"])):
plt.plot(thetas, results_rabi[:, jj]/trials, c=c, label=r"$p(0|0)=p(1|1)={:g}$".format(p00))
plt.legend(loc="best")
plt.xlim(*thetas[[0,-1]])
plt.ylim(-.1, 1.1)
plt.grid(alpha=.5)
plt.xlabel(r"RX angle $\theta$ [radian]", size=16)
plt.ylabel(r"Excited state fraction $n_1/n_{\rm trials}$", size=16)
plt.title("Effect of classical readout noise on Rabi contrast.", size=18)
```

```
<matplotlib.text.Text at 0x104314250>
```

### Example 2: Estimate the Assignment Probabilities¶

Here we will estimate \(P_{x'|x}\) ourselves! You can run some simple experiments to estimate the assignment probability matrix directly from a QPU.

**On a perfect quantum computer**

```
estimate_assignment_probs(0, 1000, cxn, Program())
```

```
array([[ 1., 0.],
[ 0., 1.]])
```

**On an imperfect quantum computer**

```
cxn.seed = None
header0 = Program().define_noisy_readout(0, .85, .95)
header1 = Program().define_noisy_readout(1, .8, .9)
header2 = Program().define_noisy_readout(2, .9, .85)
ap0 = estimate_assignment_probs(0, 100000, cxn, header0)
ap1 = estimate_assignment_probs(1, 100000, cxn, header1)
ap2 = estimate_assignment_probs(2, 100000, cxn, header2)
```

```
print(ap0, ap1, ap2, sep="\n")
```

```
[[ 0.84967 0.04941]
[ 0.15033 0.95059]]
[[ 0.80058 0.09993]
[ 0.19942 0.90007]]
[[ 0.90048 0.14988]
[ 0.09952 0.85012]]
```

### Example 3: Correct for Noisy Readout¶

#### 3a) Correcting the Rabi Signal from Above¶

```
ap_last = np.array([[p00s[-1], 1 - p00s[-1]],
[1 - p00s[-1], p00s[-1]]])
corrected_last_result = [correct_bitstring_probs([1-p, p], [ap_last])[1] for p in results_rabi[:, -1] / trials]
```

```
plt.figure(figsize=(14, 6))
for jj, (p00, c) in enumerate(zip(p00s, [DARK_TEAL, FUSCHIA, "k", "gray"])):
if jj not in [0, 3]:
continue
plt.plot(thetas, results_rabi[:, jj]/trials, c=c, label=r"$p(0|0)=p(1|1)={:g}$".format(p00), alpha=.3)
plt.plot(thetas, corrected_last_result, c="red", label=r"Corrected $p(0|0)=p(1|1)={:g}$".format(p00s[-1]))
plt.legend(loc="best")
plt.xlim(*thetas[[0,-1]])
plt.ylim(-.1, 1.1)
plt.grid(alpha=.5)
plt.xlabel(r"RX angle $\theta$ [radian]", size=16)
plt.ylabel(r"Excited state fraction $n_1/n_{\rm trials}$", size=16)
plt.title("Corrected contrast", size=18)
```

```
<matplotlib.text.Text at 0x1055e7310>
```

We find that the corrected signal is fairly noisy (and sometimes exceeds the allowed interval \([0,1]\)) due to the overall very small number of samples \(n=200\).

#### 3b) Corrupting and Correcting GHZ State Correlations¶

In this example we will create a GHZ state \(\frac{1}{\sqrt{2}}\left[\left|000\right\rangle + \left|111\right\rangle \right]\) and measure its outcome probabilities with and without the above noise model. We will then see how the Pauli-Z moments that indicate the qubit correlations are corrupted (and corrected) using our API.

```
ghz_prog = Program(H(0), CNOT(0, 1), CNOT(1, 2),
MEASURE(0, 0), MEASURE(1, 1), MEASURE(2, 2))
print(ghz_prog)
results = cxn.run(ghz_prog, [0, 1, 2], trials=10000)
```

```
H 0
CNOT 0 1
CNOT 1 2
MEASURE 0 [0]
MEASURE 1 [1]
MEASURE 2 [2]
```

```
header = header0 + header1 + header2
noisy_ghz = header + ghz_prog
print(noisy_ghz)
noisy_results = cxn.run(noisy_ghz, [0, 1, 2], trials=10000)
```

```
PRAGMA READOUT-POVM 0 "(0.85 0.050000000000000044 0.15000000000000002 0.95)"
PRAGMA READOUT-POVM 1 "(0.8 0.09999999999999998 0.19999999999999996 0.9)"
PRAGMA READOUT-POVM 2 "(0.9 0.15000000000000002 0.09999999999999998 0.85)"
H 0
CNOT 0 1
CNOT 1 2
MEASURE 0 [0]
MEASURE 1 [1]
MEASURE 2 [2]
```

##### Uncorrupted probability for \(\left|000\right\rangle\) and \(\left|111\right\rangle\)¶

```
probs = estimate_bitstring_probs(results)
probs[0, 0, 0], probs[1, 1, 1]
```

```
(0.50419999999999998, 0.49580000000000002)
```

As expected the outcomes `000`

and `111`

each have roughly
probability \(1/2\).

##### Corrupted probability for \(\left|011\right\rangle\) and \(\left|100\right\rangle\)¶

```
noisy_probs = estimate_bitstring_probs(noisy_results)
noisy_probs[0, 0, 0], noisy_probs[1, 1, 1]
```

```
(0.30869999999999997, 0.3644)
```

The noise-corrupted outcome probabilities deviate significantly from their ideal values!

##### Corrected probability for \(\left|011\right\rangle\) and \(\left|100\right\rangle\)¶

```
corrected_probs = correct_bitstring_probs(noisy_probs, [ap0, ap1, ap2])
corrected_probs[0, 0, 0], corrected_probs[1, 1, 1]
```

```
(0.50397601453064977, 0.49866843912900716)
```

The corrected outcome probabilities are much closer to the ideal value.

##### Estimate \(\langle Z_0^{j} Z_1^{k} Z_2^{\ell}\rangle\) for \(jkl=100, 010, 001\) from non-noisy data¶

*We expect these to all be very small*

```
zmoments = bitstring_probs_to_z_moments(probs)
zmoments[1, 0, 0], zmoments[0, 1, 0], zmoments[0, 0, 1]
```

```
(0.0083999999999999631, 0.0083999999999999631, 0.0083999999999999631)
```

##### Estimate \(\langle Z_0^{j} Z_1^{k} Z_2^{\ell}\rangle\) for \(jkl=110, 011, 101\) from non-noisy data¶

*We expect these to all be close to 1.*

```
zmoments[1, 1, 0], zmoments[0, 1, 1], zmoments[1, 0, 1]
```

```
(1.0, 1.0, 1.0)
```

##### Estimate \(\langle Z_0^{j} Z_1^{k} Z_2^{\ell}\rangle\) for \(jkl=100, 010, 001\) from noise-corrected data¶

```
zmoments_corr = bitstring_probs_to_z_moments(corrected_probs)
zmoments_corr[1, 0, 0], zmoments_corr[0, 1, 0], zmoments_corr[0, 0, 1]
```

```
(0.0071476770049732075, -0.0078641261685578612, 0.0088462563282706852)
```

##### Estimate \(\langle Z_0^{j} Z_1^{k} Z_2^{\ell}\rangle\) for \(jkl=110, 011, 101\) from noise-corrected data¶

```
zmoments_corr[1, 1, 0], zmoments_corr[0, 1, 1], zmoments_corr[1, 0, 1]
```

```
(0.99477496902638118, 1.0008376440216553, 1.0149652015905912)
```

Overall the correction can restore the contrast in our multi-qubit observables, though we also see that the correction can lead to slightly non-physical expectations. This effect is reduced the more samples we take.