Quantum Classification Using a Real-World Dataset
Dr. Vishal Sharma, a Postdoctoral Research Fellow, recently asked me via LinkedIn messaging about quantum classification. I won’t reveal anything about his 5-column, 26,156-row dataset other than its size — 130,780 total data points — but I will reveal the modified algorithm that I shared with him.
Quantum computing and machine learning are two distinct fields. Although they complement each other nicely, it is possible to be familiar with one but not the other. Therefore, before briefly introducing quantum classification, it may be beneficial to briefly introduce classification itself.
An oversimplified description of classification is that we have a number of data points divided into classes (groups), and we want to assign new data to one of those existing groups. Typically, the new data is a single data point, just like the points that are already grouped, although it is also possible to match an entire group of new data points to an existing group.
Quantum classification, therefore, is the use of quantum processors, or simulations thereof, to perform this task. Because both fields are rooted in linear algebra, quantum computing is expected to excel at machine learning tasks.
My initial intent was to run this algorithm on both a NISQ processor and a quantum computing simulator, and to compare runtimes, but it quickly became evident that this would not be possible.
This circuit compares one test state to one class. Because of the limited number of qubits available, it would have to run one time per class.
The first five qubits each represent one feature of one class from the dataset, and the second five qubits each represent one feature from the test data point. The last qubit is the ancilla qubit for the SWAP Test, to determine how close the test data is to the class data. Each Fredkin gate (controlled-SWAP) uses the ancilla qubit as its control and the qubits that represent the same feature as the target qubits.
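The ideal behavior of each SWAP Test can be worked out classically. The sketch below assumes, as described later in the article, that each feature is encoded as an RY rotation of the normalized value multiplied by pi; the angles in the example are hypothetical, not from Dr. Sharma's dataset:

```python
import math

def swap_test_p0(theta_a, theta_b):
    # Ancilla P(|0>) for a SWAP Test comparing RY(theta_a)|0> with
    # RY(theta_b)|0>: P(0) = 1/2 + 1/2 * |<a|b>|^2, and the overlap
    # of two RY-encoded states is cos((theta_a - theta_b) / 2).
    overlap = math.cos((theta_a - theta_b) / 2)
    return 0.5 + 0.5 * overlap ** 2

print(swap_test_p0(math.pi / 3, math.pi / 3))  # identical states -> 1.0
print(swap_test_p0(0.0, math.pi))              # orthogonal states -> 0.5
```

This models a single feature pair; with all five Fredkin gates sharing one ancilla, the overlap of the two five-qubit states is the product of the per-feature overlaps.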
This is what the transpiled circuit looks like. Not only do Fredkin gates add significant circuit depth during transpilation, but the limited connectivity of the processor also requires many additional SWAP operations.
Because of the overall circuit depth, the results are worthless. The SWAP Test measures |0> with a probability of 1 when the states are identical and with a probability of 0.5 when the states are orthogonal, but the probability of measuring |0> was well under 0.5. That is theoretically impossible, which means noise overwhelmed the computation.
Because the 32-qubit IBM Q Experience simulator allows reset gates — NISQ processors do not — it is possible to compare the test data to multiple classes within one circuit. Reset operations also reduce the number of qubits that need to be simulated, which can reduce runtime considerably for large circuits.
The first round doesn’t use reset gates because the qubits are freshly initialized, but here is how each class comparison works:
- initialize or reset the class data and test data qubits
- map the data to the respective qubits using RY rotations; because the test data remains the same for each class comparison, the “set” block is a subroutine simplifying the OpenQASM code
- compare each test feature to each class feature; the “test” block is a subroutine that performs the same Hadamard operations and SWAP Tests as shown in the ibmq_16_melbourne circuit above
- not shown in this cropped circuit: measure the control qubit “ax,” which is the ancilla qubit for the SWAP Tests
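The ideal statistics of one comparison round can be modeled classically. This is a sketch, not the actual OpenQASM: `round_p0` is a hypothetical helper, the product over features reflects the shared ancilla, and all angles below are made up:

```python
import math

def round_p0(test_angles, class_angles):
    # Analytic outcome of one comparison round: after the "set" rotations
    # and the "test" block (Hadamard, Fredkin per feature, Hadamard), the
    # ancilla measures |0> with probability 1/2 + 1/2 * |<test|class>|^2,
    # where the overlap is the product of the per-feature overlaps.
    overlap = 1.0
    for t, c in zip(test_angles, class_angles):
        overlap *= math.cos((t - c) / 2)
    return 0.5 + 0.5 * overlap ** 2

# One round per class, re-preparing (reset) the data qubits each time;
# the angles are hypothetical normalized values already scaled by pi.
test = [0.2 * math.pi, 0.9 * math.pi, 0.5 * math.pi]
classes = {"class 0": [0.1 * math.pi, 0.8 * math.pi, 0.6 * math.pi],
           "class 1": [0.9 * math.pi, 0.2 * math.pi, 0.1 * math.pi]}
for label, angles in classes.items():
    print(label, round(round_p0(test, angles), 3))
```

In this made-up example the test point sits near "class 0", so that round's ancilla is far more likely to measure |0> than the "class 1" round.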
This algorithm requires a modest amount of classical preprocessing. Because I received the data in a CSV file, I opened it up in Excel and did the computation there. Functions such as MIN, MAX, and AVERAGE are straightforward to use, and even the normalization function is merely subtraction and division.
- Calculate the global minimum and global maximum for each feature
- Calculate the mean value (average) per feature per class
- Select testing data
- Use the global minima and maxima to normalize the average values and test data points from 0 to 1 using the formula (( input - min ) / ( max - min ))
- Rotate qubits around the y axis by the normalized values multiplied by pi; use one qubit per feature for the class data and one qubit per feature for the test data
- Use the SWAP Test to compare the test data to the class data, feature by feature
- The SWAP Test with the highest probability of measuring |0> is the closest class to the test data
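The steps above can be sketched end to end. This is a minimal sketch under the same assumptions, with made-up numbers rather than Dr. Sharma's data, and `classify` is a hypothetical helper name:

```python
import math

def normalize(x, lo, hi):
    # (( input - min ) / ( max - min )), mapping each feature onto [0, 1]
    return (x - lo) / (hi - lo)

def classify(test_point, class_means, mins, maxs):
    # Normalize, convert to RY angles (normalized value * pi), then score
    # each class with the analytic SWAP-Test probability of measuring |0>.
    t_angles = [normalize(x, lo, hi) * math.pi
                for x, lo, hi in zip(test_point, mins, maxs)]
    best_label, best_p0 = None, -1.0
    for label, means in class_means.items():
        c_angles = [normalize(m, lo, hi) * math.pi
                    for m, lo, hi in zip(means, mins, maxs)]
        overlap = 1.0
        for t, c in zip(t_angles, c_angles):
            overlap *= math.cos((t - c) / 2)
        p0 = 0.5 + 0.5 * overlap ** 2  # SWAP Test P(|0>)
        if p0 > best_p0:
            best_label, best_p0 = label, p0
    return best_label, best_p0

# Hypothetical two-feature example: global minima/maxima and per-class means.
mins, maxs = [0.0, 0.0], [10.0, 10.0]
class_means = {"A": [2.0, 3.0], "B": [8.0, 7.0]}
print(classify([7.5, 6.0], class_means, mins, maxs))  # closest class is "B"
```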
You could do all this in Python, of course, but I prefer to do everything related to quantum computing in OpenQASM so that the focus is on the circuit.
Understanding the Data
Because of my familiarity with SWAP Tests and my confidence that the algorithm itself was sound, I was able to identify an issue with the data that may be helpful when pursuing other machine learning approaches. In a nutshell, the normalized class averages were far too close together. Consequently, any test data point could have been classified as any of the classes, as if the algorithm were generating results randomly. Not one test result was accurate.
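A quick way to see the problem: when the normalized class averages nearly coincide, every SWAP Test returns a probability close to 1, and the gap between classes is smaller than the shot noise of a few thousand measurements. The angle differences below are hypothetical:

```python
import math

def p0(delta):
    # Single-feature SWAP-Test P(|0>) for an angle difference of delta.
    return 0.5 + 0.5 * math.cos(delta / 2) ** 2

# Two classes whose normalized averages barely differ from the test point
# (made-up values): both scores crowd toward 1.0, so the winner is
# effectively decided by sampling noise rather than by the data.
print(round(p0(0.02 * math.pi), 4))  # 0.9995
print(round(p0(0.03 * math.pi), 4))  # 0.9989
```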
Fortunately, it was possible to select a subset of the data that resolved this issue without compromising the results. This allowed the normalized values to have a little bit of space between them, and I finally began to see accurate classifications.
Results and Conclusion
I normally include histograms in my articles, but this project required far too much testing for that. Instead, I will summarize my findings.
If you visually inspect the test data and the class data, you will see that this algorithm accurately classifies test data that obviously belongs to a certain class. The accuracy diminishes when the test values fall between the class values. Some data looks like it could be classified one way, and yet it is labelled a different way. I don’t know enough about the data to explain why, but I don’t consider that an algorithm problem: those data points are legitimately closer to other labels than to their own.
Dr. Sharma initially inquired about Quantum Restricted Boltzmann Machines (QRBM), but I am far more familiar with SWAP Tests, so I presented this approach to him. It would, of course, be interesting to compare my approach to a QRBM implementation, as well as a scikit-learn implementation. A simple QRBM, it seems, might even work on a NISQ processor, and that would certainly be interesting.
All images are from IBM Q Experience.