Mobile QR Code QR CODE

  1. (Department of AI Convergence, Pukyong National University, Pusan 48513, Korea
  2. (AI Department of Computer & Information Technology, Incheon Jaeneung University, Incheon, Korea )
  3. (Department of Computer Engineering, Pukyong National University, Pusan 48513, Korea

Qubit mapping, Noisy intermediate scale quantum (NISQ) computer, Quantum circuit

1. Introduction

In the past several years, advanced quantum technologies have been explored and developed rapidly in terms of hardware and software. Many approval quantum algorithms are solving tasks significantly faster than classical ones, such as the Glover, Deutsch, and Shor algorithm. In this study, we examine how quantum computing improves other fields, such as quantum cryptography (which strengthens digital security) [1], artificial intelligence (AI) [2], and quantum chemistry [3]. We also explore the early-stage quantum computing of commercial use cases, such as quantum simulation, quantum-assisted optimization, and quantum sampling.

All of these faster computing fields can empower beneficial new machine learning, finance, and health care [4]. Consequently, major industrial and start-up companies are racing to build quantum computers, including IBM [5], Intel [6], Rigetti [7], IonQ [8], Psi-Quantum [9], and Google [10]. We are in a new era where we can run quantum algorithms on a physical quantum machine rather than a quantum simulator.

However, there is a gap between quantum hardware and quantum algorithms because of the limited number of qubits, circuit compilation, and noise that can interfere with the computation. The number of qubits is the key to building a practical quantum application. For example, a quantum computer would require 100 million qubits to factorize a 2,000-bit number [4]. Google plans to reach one million qubits in 10 years [11].

Noise and decoherence are significant problems in building a stable and reliable quantum computer, and it also disturbs the fidelity of quantum algorithms. Accordingly, solving this problem tends to require more qubits for error correction. However, having sufficiently powerful quantum hardware alone cannot achieve milestones (quantum supremacy). We require a compiler that can manipulate the physical qubits significantly enough to improve the fidelity and execution time.

Compiling quantum programs for execution on quantum hardware still requires a classical device to map qubits in logic to physical qubits in devices, which is called $\textit{qubit allocation}$. The qubit allocation problem is a concern because in a quantum program, there is a special operation (entanglement state) that acts on two qubits. Moreover, quantum computers, such as the noisy intermediate scale quantum (NISQ) type, enable this special operation only if physical qubits are nearby or interact with each other. Therefore, a quantum program cannot be perfectly executed directly on an NISQ device. It is impossible to copy the state of qubits due to the no-cloning theorem [12]. Furthermore, many solutions can exist with different numbers of gates or final state fidelities. Mapping logical to physical qubits has been proven as NP-complete problem [13].

Many technological methods addressing the allocation problem have been proposed. In this paper, we examine useful methods in two categories: (1) depth optimal and (2) variation-aware qubit allocation. Zulehner et al. proposed a solution using combinatorial optimization called the A* algorithm [14], which ran faster than the IBM solution available at that time. A year later, the $\textit{sabre}$ algorithm was introduced by Li et al. [15] and outperformed the A* algorithm. $\textit{Sabre}$ provides a search algorithm and design heuristic cost function with look-ahead ability. A good solution may be hidden behind high barriers that are difficult to overcome, so the HA algorithm fostered a new optimum cost function for $\textit{sabre}$ that can process 28% more efficiently than $\textit{sabre}$ in terms of reducing circuit size [16].

Simultaneously, Zhu et al. [17] proposed a dynamic look-ahead mapping and expansion-from-center scheme that helps to overcome the qubit allocation problem. In a different methodology, Siraichi et al. used graph-based algorithms such as subgraph isomorphism and token swapping, labeling it the BMT algorithm [18]. Optimization algorithms from QURE [19] help to improve the fidelity of the circuit by finding isomorphic subgraphs or using QURE’s greedy strategy.

The rest of this paper is organized as follows. In Section 2, we briefly describe quantum computing, and in Section 3, we describe the qubit mapping problem. Section 4 presents depth optimal qubit allocation (DOQA) methods, and variation-aware qubit allocation (VQA) methods are detailed in Section 5. Finally, we discuss the time complexity of all six methods in Section 6.

2. Background

2.1 Foundation of Quantum Computing

The basic unit of classical computing is a bit, with the concept of switching off and on (zeros and ones). Similarly, quantum computing uses a quantum bit called a qubit. A qubit can be off, on, or both simultaneously, the so-called superposition state. These qubits can be written using sophisticated Dirac notation as $\left.|0\right\rangle $ and $\left.|1\right\rangle $. The qubit state can be described as a linear superposition:

ϕ =   α 0 + β 1 = α 1 0 + β 0 1

The amplitudes $\alpha $ and $\beta $ are complex numbers, and $\alpha ^{2}+\beta ^{2}=1$. This qubit can be measured by $\left| \alpha \right| ^{2}$ and $\left| \beta \right| ^{2}$. The values $\left.|0\right\rangle $ and $\left.|1\right\rangle $ are the basis states that can also be written as two-dimensional vectors.

A classical computer applies logic gates to bits, including AND, OR, NOT, NOR, NAND, or XOR gates. Likewise, a quantum gate is expressed with a single unitary matrix and applied on a qubit. For instance, a Pauli-X gate or NOT gate (denoted as X in a quantum circuit) rotates the Bloch sphere 180$^{\circ}$ on the x-axis. The Pauli-X gate turns the qubit in the opposite direction: $X\left.|0\right\rangle \rightarrow \left.|1\right\rangle $. The Hadamard-Walsh gate H is a well-known single gate that modifies the basis state to the superposition state.

Quantum gates are reversible, such that $H\otimes H$ = I (identity matrix). For example, the first two H gates on $q_{3}$ in Fig. 1 can be removed. The CNOT gate is an entangling gate that acts on two different qubits, CNOT$\left.|ab\right\rangle $. If the control qubit $a$ is $\left.|1\right\rangle $, the target qubit $b$ will be flipped to the opposite direction when applying the Pauli-X gate. Hence, if the control qubit is $\left.|0\right\rangle $, the gate has no actions on target qubit $b$. As shown in Fig. 1, the control and target qubits are connected by a vertical line and denoted as ${\bullet}$ and $\otimes $, respectively. An elementary CNOT gate and a set of single-qubit gate compositions can be expressed as an arbitrary quantum circuit [20].

2.2 Quantum Circuit

A quantum circuit is a digital computation board (Fig. 1) that accumulates quantum gates to represent how an algorithm or quantum program is executed. A quantum circuit is executed from left to right like a musical score that plays an algorithm as reading along. In Fig. 1, the square box represents a quantum gate that is interconnected by a quantum wire (horizontal line). Each quantum wire line represents a qubit with an initial state. The vertical line with ${\bullet}$ and $\otimes $ is a CNOT gate.

The two concepts of quantum circuits are the size and depth of the circuit. Quantum circuit size is the number of elementary gates, and circuit depth refers to the number of time steps (time complexity) executed from input to output gates [21]. As shown in Fig. 1, the circuit size is 17, and the circuit depth is 8. The qubits existing in the quantum circuit are commonly known as logical qubits.

Fig. 1. Quantum circuit.

2.3 NISQ Hardware

There are various quantum architecture systems, such as NISQ [22], fault tolerant (FT) [23], and quantum annealing systems [24]. An NISQ device refers to a quantum computer that does not have enough qubits to correct the error (noise) but is sufficiently scalable to perform computational tasks [15,22]. However, it is still limited by the number of applicable qubits and accuracy. Quantum computers are very constrained by the limited number of qubits available with poor qubit connectivity and noise problems. These noise problems cause a quantum computer to malfunction, including gate error, T1 relaxation, and T2 dephasing [19]. Consequently, if the quantum system can store and process information reliably, the system must be kept perfectly isolated from the outside world [22].

The NISQ machine enables only local communications between qubits that must be physically connected. The qubits used in hardware are called $\textit{physical qubits}$. Connectivity, such as CNOT gates between physical qubits, is restricted by the hardware itself. There are two types of physical qubit connections in NISQ devices: directed/bidirected coupling graphs [15] and full connections (all-to-all connection) [25]. In IBM QX, quantum computer models used to be designed on a directed coupling graph in which the communication of qubits is expressed by an arrow point. In Fig. 2(a), $Q_{3}$ can be directly connected with $Q_{4}$, but not vice versa. In a recent IBM computer model, two coupled qubits are connected by a bidirectional arrow, where all nearby qubits can interact with each other, as shown in Fig. 2(b).

The distance between these qubits’ connectivity and the imperfection of operation gates also affects the fidelity of quantum states. Fidelity measures two relationships between two quantum states. If one state is a significant distance away, the fidelity will decrease. However, higher fidelity means a higher probability of correct output [15,19].

Fig. 2. Directed (a) and Bidirected, (b) coupling graph.
Fig. 3. Transformation gates.

3. Qubit Mapping Problem

Qubit allocation is a mapping algorithm that allocates a physical qubit in quantum hardware to a logical qubit of a quantum circuit [13]. Qubit allocation solves the problem that occurs when two qubits are not directly connected in the quantum hardware but need to be entangled in a quantum circuit. A perfect mapping algorithm does not exist for an arbitrary quantum circuit on NISQ devices because of hardware constraints, although it could improve over time.

Currently, however, the qubit allocation problem has not been unsolvable: the simple solution is to move the logical qubit or transform the gate by inserting transformation gates. A transformation gate composed of elementary gates is used to emulate semantic CNOT relations or switch the physical qubit state. The well-known transformation gates in qubit mapping solutions are swap, reversal, and bridge gates. These gates have different behaviors on qubits [13].

The swap gate swaps two logical qubits’ states. As shown in Fig. 3(a), a swap gate can be decomposed into three CNOT gates, one of which is in the opposite direction. To make it the same direction, the swap gate can also be decomposed into three CNOT gates and four Hadamard gates. The reversal and bridge gates perform a similar job. The reversal gate in Fig. 3(b) emulates the CNOT ($q_{0},q_{1}$) to CNOT ($q_{1},q_{0}$) with an extra two Hadamard gates. The bridge gate also emulates the CNOT gate that skips a qubit nearby, as shown in Fig. 3(c), which is decomposed into four CNOT gates.

These transformation gates are used to address the qubit allocation problem. Assume the quantum circuit Fig. 4(a) map is directed to the coupling graph in Fig. 2(a). In the initial state, logical qubits $q_{0},~ q_{1},~ q_{2},~ ~ \text{and~ }q_{3}$ are assigned to $Q_{0},Q_{1},Q_{2},\,\,\,\mathrm{and}\,\,Q_{3}$, respectively. In the first gate $g_{1}$, the physical qubits $Q_{0}$ and $Q_{1}$ are in opposite directions, so the reversal gate is applied. The second gate $g_{2}$ is located on $Q_{1}$ and $Q_{4}$, which are not near each other, so a bridge gate is needed to build a connection of $Q_{1}\rightarrow Q_{2}\rightarrow Q_{4}$. Furthermore, for gate $g_{3}$, $Q_{2}$ and $Q_{1}$ are not neighbors. Thus, a swap gate must be inserted before $g_{3}$ to swap logical qubits $q_{0}$ and $q_{1}$. As depicted in Fig. 4(b), there are 12 new gates added to the quantum circuit, and the circuit size is 15.

NISQ qubit allocation algorithms are considered to be in two categories: variation-aware qubit allocation (VQA) policies and depth optimal qubit allocation (DOQA). VQA concerns qubit allocation algorithms that focus on high fidelity. In contrast, DOQA focuses on lower depth. Consequently, VQA will choose higher fidelity even if the circuit is a long path, and DOQA will choose lower depth (short paths) and does not consider higher fidelity.

Fig. 4. Solving qubit allocation by reversal, bridge and swap gate.

4. Depth Optimal Qubit Allocation

4.1 Naive Algorithm

We can solve the qubit allocation problem by identifying a transformation gate (swap gate) that can be inserted before the unresolved gate or by replacing it (with a reversal or bridge gate). However, searching through this space is challenging. One gate-mapping problem must have at least one solution. In a naive algorithm, we verify each logical qubit against all qubits that satisfy the coupling graph. As an example, a quantum circuit is shown in Fig. 1, and a bidirected coupling graph is shown in Fig. 2(b).

We assume that the initialization maps logical qubits $q_{0}$, $q_{1}$, $q_{2}$, and $q_{3}~ $ to physical qubits $Q_{0}$, $Q_{1}$, $Q_{2}$, and $Q_{3}$, respectively. In the physical qubits, the corresponding CNOT ($q_{1},q_{2}$) gates are not nearby, so a swap operation is needed. There are many possible solutions, such as moving $q_{1}$ close to $q_{2}$, moving $q_{2}$ to $q_{1}$, and moving them both to be close to each other. In the coupling graph in Fig. 2(b), we can only move $q_{1}$ to $q_{2}$ or $q_{2}$ to $q_{1}$. Moving from $q_{2}$ to $q_{1}$ has two solutions: SWAP ($q_{2},q_{3}$) and SWAP ($q_{2},q_{0}$). Moving from $q_{1}$ to $q_{2}$ also has two solutions: SWAP ($q_{1},q_{0}$) and SWAP ($q_{1},q_{3}$). Therefore, the solution of this qubit mapping problem for CNOT ($q_{1}$,$q_{2}$) has four swap candidates.

Without any optimization, we just select one swap gate among them. This swap gate will affect the compilation of the subsequent gates, so we must know how much this one swap gate can increase or decrease the number of gates in the circuit. Moreover, fidelity and execution time are concerns. The process of searching for the optimal swap can require exponential time. In pursuit of the best circuit depth or size, many researchers have applied heuristic search algorithms from graph theory to qubit mapping problems.

4.2 Optimization

The well-known solution for qubit allocation is referred to as $\textit{jku}$, which was presented by Zulehner et al. [14]. The $\textit{jku}$ algorithm first partitions the circuit into layers. Each layer $l_{i}$ contains only gates that do not overlap with another gate qubit. In Fig. 5(a), which is generated from Fig. 1, gates $g_{1}$ and $g_{3}$ overlap on qubit $q_{1}$, so the first layer obtains only gates $g_{0}$ and $g_{1}$. After portioning, if logical qubits of the gate do not satisfy the physical coupling graph, $\textit{jku}$ finds and inserts additional swap gates before layer $l_{i}$. The sequence of swap gates is referred to as permutation layer $\pi _{i}$.

Instead of initializing mapping, $\textit{jku}$ will choose a physical qubit with the lowest cost for a qubit of a gate that has not been mapped yet. $\textit{jku}$ explores through space and attempts to minimize the number of swap gates in $\pi _{i}$. In the worst case, the time complexity is $\mathrm{\mathcal{O}}\left(m!/\left(m-n\right)!\right)$, where $m$ and $n$ are the numbers of physical and logical qubits, respectively.

Because of time complexity, the A* algorithm is applied to prevent searching of the entire search space. In each step, the A* algorithm chooses the swap based on a value of $f\left(x\right)$. $f$ is a value from the sum of two parameters $g\left(x\right)$ and $h\left(x\right)$, while $g$ is the number of swaps that have been added, and $h$ is the heuristic cost. $h$ is determined by the fidelity cost and distance of a physical qubit in the coupling graph with respect to the logical qubit mapping. For optimization, a look-ahead scheme incorporates information of the next layer into the cost function $f\left(x\right)$ of the A* algorithm. This means that it adds a term that estimates the minimal cost of depth for swap gates added to the cost of the A* algorithm.

The $\textbf{$\textit{sabre}$}$ algorithm [15] is a promising algorithm for qubit mapping with $\mathrm{\mathcal{O}}\left(N^{2.5}g\right)$ time complexity. In preprocessing, $\textit{sabre}$ uses the All-Pairs Shortest Path (APSP) algorithm to obtain the distance of all pairs in a coupling graph and represents it as a 2D-array matrix. Next, it represents a two-qubit gate of the quantum circuit using a Directed Acyclic Graph (DAG).

As shown in Fig. 5(b), the two-qubit (CNOT) gates in each layer act on dependent logical qubits using the same approach as $\textit{jku}$’s partitioning. The two-qubit gates that do not have predecessors are stored in front layers $F$. In a swap-based heuristic search process, $\textit{sabre}$ checks emitting the successor of the gates in $F$. If these gates can be executed on hardware (resolvable gates), then they will be moved to executable gate lists that contain only the gates that can compile on a coupling graph.

Next, $\textit{sabre}$ searches for swap-qubit-mapping solutions and chooses the optimal one based on the heuristic cost function $H$ of each solution. $H$ in $\textit{sabre}$ is designed to select a solution with three considerations: the front layer, extent layers, and the decay effect. For consideration of the front layer, $\textit{sabre}$ computes the average distance of these physical qubits using the Nearest Neighbor Cost Function (NNC), as shown in Eq. (2), where $D$ is a function of NNC and computes the cost of gates applied to $q_{1}$ and $q_{2}$.

$H_{F}=\frac{1}{\left| F\right| }\sum _{gate\in F}D\left[\pi \left(gate,q_{1}\right)\right]\left[\pi \left(gate,q_{2}\right)\right]$

To determine how this impacts the next layer, $\textit{sabre}$ computes $H_{E}$ of the extent layer $E_{x}$ that contains the closest gate to the front layer. This computation is the same as $H_{F}$ except that $gate\in E_{x}$. Another consideration of $\textit{sabre}$ is the $\textit{decay effect}$ that is introduced to choose non-overlapping swaps and increase parallelism. If qubit $q_{i}$ has been involved in a swap recently, this decay parameter increases. The cost function $H$ of $\textit{sabre}$ is shown in Eq. (3), where $W$ indicates the extent to which layers are considered, and its value can be tuned from 0 to 1. \uppercase{t}his $H$ can be improved using different methodologies.

$H=\max \left(swap\left(q_{1}\right),swap\left(q_{2}\right)\right)\times \left(H_{F}+W\times H_{E}\right)$

For the initialization of qubit mapping, $\textit{sabre}$ performs random initial mapping for the first time and then computes the reverse traversal technique after all the gates are solved. This technique uses the final mapping of the original circuit as the initial mapping to compute the reverse circuit, and the final mapping of the reverse circuit is used as the initial mapping for the original circuit. The reverse circuit reforms the gates of the original circuit from the last gate $g_{i}$ to $g_{0}$. The initialized mapping can impact the quality of the entire quantum circuit [15]. Thus, the reverse traversal technique can significantly increase the initial mapping quality [15,16].

The Bounded Mapping Tree ($\textbf{BMT}$) algorithm [18] is also designed based on graph theory algorithms. $\textit{BMT}$ employs subgraph isomorphism and a token swapping algorithm. It uses the machine architecture (coupling graph) and input program (quantum circuit) as inputs. In the program-partitioning phase, $\textit{BMT}$ performs subgraph isomorphism to find a viable mapping for each instruction (CNOT gates) to the coupling graph to obtain maximum-isomorphism sub-lists.

Fig. 6 shows that CNOT ($q_{1},q_{0}$) is mapped to a coupling graph with three possible mappings to physical qubits: ($Q_{0},Q_{1}$), ($Q_{0},Q_{3}$), and ($Q_{0},Q_{2}$). If ($q_{1},q_{0}$) is mapped to ($Q_{0},Q_{3}$), $Q_{3}$ is called a dead-end, which does not have another connection. However, if it is mapped to ($Q_{0},Q_{1}$), $Q_{0}$ is still ``alive,'' which indicates we can map another next logical qubit to a neighbor of $Q_{0}$, such as $Q_{2}$ and $Q_{3}$.

For the next instruction ($q_{2},q_{0}$), $q_{0}$ has been mapped already, so there is only $q_{2}$ left. If $(Q_{0},Q_{1})$ has been chosen previously, then there are two possible mappings of $q_{2}$: $Q_{2}$ or $Q_{3}$. We perform this mapping until all instructions are mapped. As a result of the program partitioning process, the dotted box in Fig. 6 indicates the sub-list with maximum subgraph isomorphism as a partition. More precisely, a logical qubit is never mapped to a different physical qubit in the same partition. Note that finding the maximum subgraph isomorphism requires exponential time, so $\textit{BMT}$ constrains the time by manually limiting the quantity of finding the mapping for each qubit. Thus, if the time for finding it is greater than the limitation goal, a random mapping is adopted.

In the combing mapping step, token swapping [26] is performed to find the optimal swap ($\pi _{i}$) to connect each partition together $\left(f_{prev}\rightarrow f\right)$. As shown in Fig. 7, the token swapping processes from $f_{prev}$ by swapping $\pi _{i}$ to reach $f$. The cost of the token swapping algorithm is $\mathrm{\mathcal{O}}\left(\left| Q\right| ^{3}\right)$, where $Q$ is the number of physical qubits. The last phase is code generation that will be assembling the partitions with swap. This phase is also an alternative way to find a swap by replacing $\pi _{i}$ using the four-approximative $\Delta $ algorithm to solve the colored token swapping problem because $\Delta $ may produce a lower cost for swap. Then, $\textit{BMT}$ will choose the least expensive solution between the first solution and $\Delta $.

Fig. 5. Example of front layer and DAG generation.
Fig. 6. BMT - Program partitioning
Fig. 7. BMT - combining mapping.
Fig. 8. QURE - Finding isomorphic subgraph.

5. Variation-aware Qubit Allocation

5.1 Greedy Approach

Qubit re-allocation (QURE) [19] is an algorithm that is performed after qubit mapping to improve the fidelity. QURE is applied to quantum circuits that can be executed directly on target quantum hardware after applying any DOQA algorithm. $\textit{QURE}$ takes the quantum circuit from DOQA as an input and provides two algorithms: a greedy approach and a subgraph search.

$\textit{QURE}$’s greedy approach ranks all physical qubits in a coupling graph in descending order based on the average two-qubit gate fidelities of individual qubits. Furthermore, $\textit{QURE}$ ranks the physical qubits by the highest-fidelity score fetched from the quantum hardware. As shown in Fig. 9, it is assumed that a physical qubits are ranked as follows: $Q_{5}$, $Q_{0}$, $Q_{3}$, $Q_{7}$, $Q_{8}$, $Q_{4}$, $Q_{1}$, $Q_{6}$, and $Q_{2}$ ($Q_{5}$ is the optimal qubit with highest fidelity). In logical qubits, it ranks all qubits in descending order based on how many times the two-qubit gate is applied to that qubit. Thus, the ranking list is $q_{2}$, $q_{1}$, $q_{3}$, and $q_{0}$. After ranking the best qubits, $\textit{QURE}$ starts to select the best physical qubit ($Q_{5}$) and logical qubit ($q_{2}$) and then assigns (maps) $q_{2}$${\rightarrow}$$Q_{5}$.

Next, $\textit{QURE}$ looks around at the neighbors of $Q_{5}$ ($Q_{8}$, $Q_{4}$, and $Q_{2}$) and selects the optimal one among them. The same occurs for the logical qubits, among which the neighbors of $q_{2}$ are $q_{1}$ and $q_{0}$, and then $q_{1}$ is selected as an optimal one. In this step, $q_{1}$ is mapped to $Q_{5}$. The selected logical and physical qubits will not be chosen anymore. $\textit{QURE}$ continues looking and assigning until all logical qubits are mapped. Consequently, $\textit{QURE}$ obtains the optimal qubit mapping, as shown in Fig. 9.

Another method from $\textit{QURE}$ is subgraph search, which finds a quantum circuit’s isomorphic subgraph from the coupling graph. This method also performs after logical qubits of the circuit are allocated, but it will be re-allocated to increase fidelity. Accordingly, $\textit{QURE}$ extracts an arbitrary subgraph from the coupling graph (physical qubits) for logical qubits to map, as indicated by the dotted box in Fig. 9. Next, $\textit{QURE}$ computes this mapping with a physical qubit to obtain a fidelity score based on the gate error of each qubit.

As shown in Fig. 9, if the dotted box moves to the left, there is another possible mapping. Moreover, the other two possible mappings are in the dotted box rotated by 90$^{\circ}$. There are four possible mappings in Fig. 9. However, each mapping, such as the vertical ($V$) mapping, can be computed by flipping horizontal ($V_{2}$), flipping vertical ($V_{4}$), and rotating 180$^{\circ}$ ($V_{3}$).

The same occurs for horizontal mapping. More formally, all possible mapping $A$ can be formulated as in Eq. (4). $Q_{v}$ and $q_{v}$ are many physical and logical qubits for vertical mapping, respectively. $Q_{H}$ and $q_{H}$ are for horizontal mapping. Therefore, Fig. 9 shows that there will be $4\times \left(16-2+1\right)\times \left(16-2+1\right)$ or 900 possible mappings. $\textit{QURE}$’s subgraph search will compute all these possible mappings and then select the optimal mapping (high fidelity).

$A=4\times \left(Q_{v}-q_{v}+1\right)\times \left(Q_{H}-q_{H}+1\right)$
Fig. 9. QURE - Greedy approach.
Fig. 10. Label logical qubit edge.

5.2 Optimization

A hardware-aware heuristic ($\textit{HA}$) mapping algorithm [16] inspired by $\textit{sabre}$ efficiently reduces additional gates by 28% and improves fidelity as well. $\textit{HA}$ performs both swap and bridge gates for mapping. Like $\textit{sabre}$, $\textit{HA}$ also constructs a quantum circuit to DAG and layers, as shown in Fig. 5. $\textit{HA}$ follows the $\textit{sabre}$ heuristic search algorithm but simply modifies the cost function $H$.

$\textit{HA}$ introduces three matrices to estimate the cost of a swap: a swap matrix (number of swaps) $S$, a swap error matrix $E$, and a swap execution time matrix $T$. Along with these three matrices, there are three weights ($\alpha _{1},\alpha _{2},\alpha _{3}$) that can be tuned. More precisely, $S$, $E$, and $T$ are computed using the Floyd-Warshall algorithm that finds the shortest paths of all pairs in a directed graph. Computing the Floyd-Warshall algorithm requires using other weights for each matrix ($W_{E}$,$W_{T}$,$W_{S}$). $W_{E}$ is computed using $e$, an error rate given in the calibration data.

For example, $e\left(Q_{0},Q_{1}\right)=5$ indicates that the minimum error rate between the logical qubits corresponding to $Q_{0}$ moving to $Q_{1}$ is 5. As shown in equation (5), $e_{ij}$ and $e_{ji}$ refer to the error rates of $Q_{i},Q_{j}$ and $Q_{j},Q_{i}$ , respectively.

$W_{E}=1-e_{ij}+e_{ji}+\max \left(e_{ij},e_{ji}\right)$

The execution time cost $W_{T}$ based on the $t$ which is also extracted from hardware and indicates the execution time between $Q_{i}$ to $Q_{j}$ and vice versa.

$W_{T}=t_{ij}+t_{ji}+\min \left(t_{ij},t_{ji}\right)$

The last width is $W_{S}$. It is calculated by 1 cost for 1 Swap. For instance, it requires a minimum of 1 swap gate to move the state of $Q_{0}$ to $Q_{3}$ in Fig. 2(b). After computing the Floyd-Warshall algorithm, $\textit{HA}$ calculates the sum of three matrices $E$, $S$, and $T$, which is referred to as the $\textit{distance matrix}$ $D$ in Eq. (7).

$D=\alpha _{1}\times S+\alpha _{2}\times E+\alpha _{3}\times T$

Another component is used to find the optimal solution between the bridge and swap gates. Thus, to choose the optimal solution, $\textit{HA}$ computes the $\textit{Effect}$ function, as shown in Eq. (8), where the distance matrix of a bridge ($D_{$\textit{bridge}$}$) is subtracted from $D_{swap}$.

E f f e c t b r i d g e = g a t e E D s w a p D b r i d g e

Recently, Zhu et al. introduced two core algorithms: an expansion-from-center scheme ($\textbf{EFC}$) and the maximum consecutive positive effect ($\textbf{MCPE}$) of a swap gate as the heuristic cost function [27]. The $\textit{EFC}$ scheme is used in the initial mapping process. In the first $\textit{EFC}$ stage, the gate index is labeled at the edge of the logical qubit graph depending on the first applied gate.

As illustrated in Fig. 10, the gate $g_{0}$ firstly acts on $q_{0}$ and $q_{2}$, so these qubits’ edge is the index of $g_{0}$ or 0. These labels are used as weights in the breadth-first search to arrange all logical qubits in a queue starting from the center logical qubit in the graph topology. This center logical qubit is then mapped to the center physical one, and each logical qubit in the queue is mapped to the physical qubits one by one with the closest physical qubit.

In preparatory work for the swap-based heuristic algorithm, the dependency list is generated by counting the dependency gates of each logical qubit, the so-called DAG. Next, the cost distance matrix is computed as a 2D array like $W_{S}$ in $\textit{HA}$ (the number of Swaps from $Q_{i}$ to $Q_{j}$). The core algorithm is used to compute the difference of distance of a gate applied to ($Q_{i},~ Q_{j}$) before and after applying swaps, the so-called $\textit{Effect}$ in Eq. (9). The $dist$ function is obtained from the cost distance matrix.

Assume that the circuit in Fig. 5(a) is mapped to the coupling graph in Fig. 2(b) by initializing mapping from qubit index 0 to 5. For gate $g_{1}$, the distance neighbor from $q_{1}$ to $q_{2}$ in the coupling graph is 1, but after applying SWAP ($q_{0},q_{1}$), the distance is 0. Thus, the effect of this swap operation is 1 minus 0 or 1:

E f f e c t g = d i s t g , π b e f o r e d i s t g , π a f t e r

The $\textit{mcpe}$ function computes the sum of the cost effect of every gate on qubits to which the swap operation is applied until $\textit{effect}(g_{i}$) is negative. The range from the first gate to the gate with a negative effect is called the $\textit{window}$. For instance, for the $\textit{mcpe}$ of SWAP($q_{2},q_{0}$) on $q_{2}$, there are three gates: $g_{1}$, $g_{4}$, and $g_{5}$. The effect on $g_{1}$ is 1 (positive), and that on $g_{4}$ is -1 (negative). Thus, in $\textit{window}$, the list contains only $g_{1}$. If there is no positive effective gate, the value of $\textit{mcpe}$ is 0 by default.

m c p e S w a p , q i = g w i n d o w E f f e c t g

A dynamic look-ahead algorithm is introduced as the $\textit{MCPE}$ cost function. $\textit{MCPE}$ illustrates how much SWAP ($Q_{i}$, $Q_{j}$) can affect all the gates on $Q_{i}$ and $Q_{j}$.

$MCPE\left(Swap\right)=~ mcpe\left(Swap,q_{i}\right)+mcpe\left(Swap,q_{j}\right)$

As an example, we consider the same previous assumption of using the heuristic search algorithm to solve gate $g_{1}$. There are four solutions: SWAP($q_{0},q_{1}$), SWAP($q_{2},q_{0}$), SWAP($q_{2},$ $q_{3}$), and SWAP($q_{1}$,$q_{3}$). First, we consider the cost function MCPE of SWAP($q_{0},q_{1}$), which corresponds to $Q_{0}$ and $Q_{1}$.

For $mcpe$(SWAP,$q_{0}$), two gates are applied to $q_{0}$: gates $g_{3}$ and $g_{8}$. The nearest-distance qubit in $g_{3}$ before swapping is 0 (nearby), and after applying SWAP($q_{0}$, $q_{1}$), it is 0, so the effect on $g_{3}$ is still 0 in Eq. (9). The distance of $g_{8}$ is 1, and after swapping, it is 0; thus, the effect cost is 1. Therefore, $\textit{mcpe}$(SWAP, $q_{0}$) is 1 in Eq. (10). For $mcpe$(SWAP,$q_{1}$), the cost effect is 2 with three CNOT gates ($g_{1},g_{3},g_{5}$). Therefore, $\textit{MCPE}$(SWAP ($q_{0},q_{1}$)) is 2 + 1 or 3. Both $\textit{MCPEs}$ of SWAP($q_{2},q_{0}$) and SWAP($q_{1},q_{3}$) are 1, and that of SWAP($q_{2},q_{3}$) is 2. Consequently, SWAP ($q_{0},q_{1}$) is selected to be inserted before $g_{1}$.

Table 1. Summary of qubit allocation algorithms.



Quantum Architecture


Algorithm Details





Reduce the number of gates by 23% compared with baseline from IBM's solution.

- Partition the circuit into layers.

- Search the optimal solution by the A* algorithm.

- Searching space is $\mathrm{O }\left(\exp \left(N\right)\right).$



IBM Q20 Tokyo

Small size reduces 98.7% and large size reduces 10% compared with JKU.

- Compute All-Pairs Shortest Path with $\mathrm{O }\left(N^{3}\right)$ time.

- Initial mapping by reverse traversal technique.

- Swap-based search space with $\mathrm{O }\left(N\right)$ time.

- Time complexity is $\mathrm{O }\left(N^{2.5}\cdot g\right)$.



Reduce the number of gates by 16% and 35% for depth, compared with Sabre

- Partition circuit using subgraph isomorphism.

- Combine mapping using Token Swapping,

- Compute 4-approximate algorithm.





Improve fidelity 1.54x times than the baseline from Revlib.

- Greedy approach: rank optimal fidelity of qubits, then map qubits focus on the highest fidelity qubits.

- Sub-graph search: compute all possible isomorphic subgraph.





IBM Q20 Tokyo

Reduce the number of gates by 22.92% compared with Sabre.

- Use expansion-from-center scheme to determine the initial mapping.

- Compute dynamic look-ahead cost function.

- Time complexity is $\mathrm{O }\left(N^{1.5}\cdot ~ g\right).$



IBM Q Valencia

8%, 19% and 38% are superior to Sabre in terms of improved fidelity, reduced execution time and number of gates

- Follow heuristic search of Sabre

- Allow swap and bridge for mapping

- Compute cost function with three matrices: number of swap matrix, swap error matrix, and execution time matrix, using Floyd-Warshall algorithm.

- Mapping requires $\mathrm{O }\left(N^{2.5}\cdot ~ g\right)$ time.

IBM Q Almaden

Reduce the number of gates by 28% against Sabre.

6. Performance Evaluation

Table 1 summarizes six qubit allocation algorithms. The first well-known algorithm $\textit{jku}$ [14] divides gates into independent layers and searches all possible solutions to minimize the sum of the distance between the physical qubits in the layer. It then applies the A* algorithm to minimize the cost of the depth circuit. The $\textit{jku}$ method reduces the number of gates by 23% compared with IBM’s approach targeting on IBM QX3 with 16 qubits; however, the time complexity is still exponential. $\textit{Sabre}$ [15] is an efficient method that reduces additional gates by 97% on average for small circuits and 10% for large circuits compared with $\textit{jku}$. The $\textit{sabre}$ approach computes a swap-based bidirectional heuristic search with a cost function that ensures flexibility, controllability, scalability, and high-quality initial mapping.

The $\textit{BMT}$ algorithm [17] exploits subgraph isomorphism for partitioning the circuit and substituting token swapping with the four-approximate algorithm to find the optimal swap. $\textit{BMT}$ outperforms $\textit{sabre}$ by 16% in decreasing the number of gates and by 35% in reducing the depth of the circuit. For increasing the fidelity of the final state, $\textit{QURE}$ [19] performs reallocation on the top of the compiler when the quantum circuit satisfies the coupling constraint. $\textit{QURE}$ improves the fidelity of the final state by 1.54x times over the baseline from Revlib (benchmark resource).

$\textit{MCPE/EFC}$ [27] uses the $\textit{EFC}$ scheme to determine the initial mapping and compute the look-ahead cost function that dynamically estimates cost to update the mapping. $\textit{MCPE/EFC}$ outperforms $\textit{Sabre}$ by 22.92% in reducing the number of gates evaluated for the IBM Q20 Tokyo. It can also be applied to other bidirectional connectivity constraints of the quantum architecture.

$\textit{HA}$ qubit mapping [16] inherits the same mapping time complexity of $\mathrm{\mathcal{O}}\left(N^{2.5}g\right)$ from sabre. $\textit{HA}$ is designed as a quantum mapping transition method using calibration data from a quantum architecture and selects either a swap or bridge gate for qubit movement. $\textit{HA}$ significantly outperforms $\textit{sabre}$ for IBM Q Valencia by 8% in fidelity improvement, 19% in reducing execution time, and 38% in reducing the number of additional gates. For the evaluation in IBM Q Almaden, $\textit{HA}$ decreased the number of gates by 28%.

7. Conclusion

The arena of quantum computing will soon reach supremacy by the rapidly increasing use of quantum algorithm applications. There is a gap between quantum circuit execution and imperfect NISQ hardware. In this review paper, we presented six algorithms that try to solve the qubit allocation problem caused by physical qubit hardware constraints. The algorithms can be briefly summarized as follows: $\textit{jku}$ uses the A* algorithm, $\textit{sabre}$ uses a cost function with a distance qubit matrix and reverse traversal initialization mapping, $\textit{BMT}$ solves the NP-complete problem using NP-solutions such as subgraph isomorphism and the token swapping algorithm, and $\textit{MCPE/EFC}$ determines efficient initial mapping by $\textit{EFC}$. $\textit{HA}$ uses swap and bridge gates to satisfy the qubit constraint. Both $\textit{MCPE/EFC}$ and $\textit{HA}$ designed a new cost function from $\textit{sabre}$ that can improve both fidelity and reduce the number of gates. All algorithms were evaluated on NISQ devices with various types of quantum architectures.


This work was supported by a Research Grant of Pukyong National University (2019).


Bhatt A. P., 2019, Quantum Cryptography for Internet of Things Securitya, Vol. 17, No. 3, pp. 8DOI
Taylor R. D., Jul 2020, Quantum Artificial Intelligence: A “precautionary” U.S. approach?, Telecommuni-cations Policy, Vol. 44, No. 6, pp. 101909DOI
McClean J. R., et al. , Sep 2020, Discontinuous Galerkin discretization for quantum simulation of chemistry, New J. Phys., Vol. 22, No. 9, pp. 093015URL
Mohseni M., Read P., Neven H., accessed Feb 20, 2021, Commercialize early quantum technologies, pp. 5Google Search
Gomes L., Apr. 2018, Quantum computing: Both here and not here, IEEE Spectr., Vol. 55, No. 4, pp. 42-47DOI
accessed Feb. 19, 2021, ‘CES 2018: Intel’s 49-Qubit Chip Shoots for Quantum Supremacy - IEEE Spectrum’Google Search
Sete E. A., Zeng W. J., Rigetti C. T., A functional architecture for scalable quantum computing, in 2016 IEEE International Conference on Rebooting Computing (ICRC), San Diego, CA, USA, Oct. 2016, pp. 1-6DOI
accessed Feb. 19, 2021, ‘IonQ plans to launch a rack-mounted quantum computer for data centers in 2023', TechCrunchGoogle Search
Murgia M., accessed Feb. 19, 2021, British quantum computing experts leave for Silicon Valley, Jun. 23, 2019Google Search
Courtland R., Jun 2017, Google aims for quantum computing supremacy [News], IEEE Spectr., Vol. 54, No. 6, pp. 9-10DOI
A. ChoSep. 15, 2020, and 5:45 Pm , accessed Feb. 20, 2021, IBM promises 1000-qubit quantum computer—a milestone—by 2023, Science | AAAS, Sep. 15, 2020Google Search
Wootters W. K., Zurek W. H., Oct. 1982, A single quantum cannot be cloned, Nature, Vol. 299, No. 5886, pp. 802-803DOI
Siraichi M. Y., dos Santos V. F., Collange S., Pereira F. M. Q., Feb. 2018, Qubit allocation, in Proceedings of the 2018 International Symposium on Code Generation and Optimization, Vienna Austria, pp. 113-125DOI
Zulehner A., Paler A., Wille R., Mar. 2018, Efficient mapping of quantum circuits to the IBM QX architectures, in 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, pp. 1135-1138Google Search
Li G., Ding Y., Xie Y., 2019, Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices, pp. 14Google Search
Niu S., Suau A., Staffelbach G., Todri-Sanial A., 2020, A Hardware-Aware Heuristic for the Qubit Mapping Problem in the NISQ Era, IEEE Trans. Quantum Eng., Vol. 1, pp. 1-14DOI
Zhu P., Cheng X., Guan Z., Nov. 2020, An exact qubit allocation approach for NISQ architectures, Quantum Inf Process, Vol. 19, No. 11, pp. 391-DOI
Siraichi M. Y., dos Santos V. F., Collange C., Pereira F. M. Q., Oct. 2019, Qubit allocation as a combination of subgraph isomorphism and token swapping, Proc. ACM Program. Lang., Vol. 3, No. oopsla, pp. 1-29DOI
Ash-Saki A., Alam M., Ghosh S., Jun. 2019, QURE: Qubit Re-allocation in Noisy Intermediate-Scale Quantum Computers, in Proceedings of the 56th Annual Design Automation Conference 2019, Las Vegas NV USA, pp. 1-6DOI
Barenco A., et al. , Nov. 1995, Elementary gates for quantum computation, Phys. Rev. A, Vol. 52, No. 5, pp. 3457-3467DOI
Gyongyosi L., Imre S., Dec. 2020, Circuit Depth Reduction for Gate-Model Quantum Computers, Sci Rep, Vol. 10, No. 1, pp. 11229DOI
Preskill J., Aug. 2018, Quantum Computing in the NISQ era and beyond, Quantum, Vol. 2, pp. 79DOI
Van Meter R., Ladd T. D., Fowler A. G., Yamamoto Y., Feb. 2010, Distributed Quantum Computation Architecture Using Semiconductor Nanophotonics, Int. J. Quantum Inform., Vol. 08, No. 01n02, pp. 295-323DOI
McGeoch C. C., Harris R., Reinhardt S. P., Bunyk P. I., Jun. 2019, Practical Annealing-Based Quantum Computing, Computer, Vol. 52, No. 6, pp. 38-46DOI
accessed Feb. 02, 2021, The World’s Highest Performing Quantum Computer is HereGoogle Search
Miltzow T., Narins L., Okamoto Y., Rote G., Thomas A., Uno T., Aug. 2016, Approximation and Hardness for Token SwappingDOI
Zhu P., Guan Z., Cheng X., Dec 2020, A Dynamic Look-Ahead Heuristic for the Qubit Mapping Problem of NISQ Computers, IEEE Trans. Integr. Circuits Syst., Vol. 39, No. 12, pp. 4721-4735DOI


Sengthai Heng

Sengthai Heng received his Bachelor's degree in Information Technology from the University of Cambodia, Phnom Penh, Cambodia, in 2019. He presently is a master's degree student of AI convergence at Pukyong National University (PKNU), Pusan, South Korea. His current research interests include quantum computing, high-performance computing, and AI.

Taekyung Kim

Taekyung Kim received his ME and PhD from the Department of Information Industry Engineering, Chungbuk National University, Cheongju, South Korea, in 2005 and 2010, respectively. He was a director at the Korea Software HRD Center, Phnom Penh, Cambodia, from 2013 to 2019. He is currently an assistant professor in the AI Department of Computer & Information Technology, Incheon Jaeneung University, South Korea. His research interests include big data, AI, and SW education.

Youngsun Han

Youngsun Han received the B.S. and Ph.D. degrees in electrical engineering from the Korea University, Seoul, South Korea, in 2003 and 2009, respectively. He was a senior engineer at the System LSI, Samsung Electronics, Suwon, South Korea, from 2009 to 2011. He was an assistant/associate professor with the Department of Electronic Engineering, Kyungil University, Gyeongsan-si, South Korea, from 2011 to 2019. He is currently an associate professor with the Department of Computer Engineering, Pukyong National University, Busan, South Korea. His research interests include microarchitecture, high-performance computing, and SoC design.