Beijing Institute of Technology research group makes important progress in optical convolutional neural networks and machine learning
Release date: 2024-01-31 Contributed by: School of Physics Photography: School of Physics
Editor: Wang Lirong Reviewer: Ke Chen
Recently, Professor Zhang Xiangdong's research group at the School of Physics, Beijing Institute of Technology, realized an optical convolutional network with a "quantum acceleration" function by exploiting the correlation properties of classical light. The results were published in Light: Science & Applications under the title "Correlated Optical Convolutional Neural Network with Quantum Speedup" [Light Sci. Appl. 13, 36 (2024)]. Sun Yifan, associate researcher at the School of Physics, Beijing Institute of Technology, is the first author of the paper, and Professor Zhang Xiangdong of the same school is the corresponding author. Kong Lingjun, researcher at the School of Physics, and Dr. Li Gan also made important contributions to the work. The research was supported by the National Natural Science Foundation of China.
In recent years, with the support of modern computer technology, artificial intelligence based on machine learning algorithms has developed rapidly. These technologies have brought unprecedented efficiency to tasks such as image recognition, natural language generation and processing, and object detection [Nature 521, 436 (2015)]. Notably, this year ChatGPT, a language-processing program based on deep neural network algorithms, was named the English word of the year in many countries and regions, which shows its great influence. The software can answer general questions and edit text almost in real time, and can even perform high-quality semantic summarization, multi-language translation, and related tasks. However, such capabilities require a huge amount of computing power. According to official sources, ChatGPT has more than 170 billion model parameters, and the entire training process requires expensive equipment and a great deal of time. In fact, these computing demands are already close to the limit of what current hardware can provide. How to effectively reduce the training cost of machine learning models and improve their training efficiency is therefore an important problem for the field.
To address this problem, in addition to developing and improving classical neural network algorithms, researchers have made bold attempts in two directions: optical neural networks and quantum neural networks. An optical neural network is a classical optical information processing scheme that uses advanced optical control methods to implement machine learning algorithms. In this scheme, the algorithm is executed mainly by manipulating optical states. Compared with traditional electronic equipment, optical information processing equipment has unique advantages, such as weak coupling to the surrounding environment, optical parallelism, and high-speed transmission. Optical neural networks built this way therefore offer low energy consumption, low crosstalk, and low transmission delay [Nature 589, 44 (2021); Nature 589, 52 (2021)]. Unfortunately, optical neural networks have not so far been able to accelerate the algorithm structure itself, for example by making the model converge faster. The other potential improvement is the quantum neural network, a neural network algorithm based on quantum computing theory. Recent studies have shown that quantum neural networks can achieve a structural acceleration of the algorithm by exploiting the special correlation properties of quantum states. A typical example is the recent theoretical analysis of quantum convolutional networks [Nat. Phys. 15, 1273 (2019)]. Numerical simulations of that network show that its loss function converges faster during training, which means that the total training time can be effectively shortened. However, because of current technological limitations, such neural network algorithms are difficult to execute on a large scale in hardware, which makes it hard for them to contribute to practical problems.
To sum up, on the one hand, optical neural networks can already be implemented at a certain scale and have demonstrated specific advantages; on the other hand, quantum neural networks have been shown theoretically to accelerate the algorithm structure, but are difficult to implement on a large scale at present. Recently, the researchers constructed a new type of classical optical neural network: the correlated optical convolutional neural network. It combines the advantages of both kinds of neural network: it accelerates the algorithm structure, and it can be implemented relatively easily. The work is described below in two parts, theoretical construction and experimental implementation.
The first highlight of the research is the theoretical design of a correlated optical convolutional network with a "quantum acceleration" function
First, using classical optical correlation, the researchers theoretically construct an optical neural network structure that corresponds to the quantum convolutional network, and call it the correlated optical convolutional network. Their scheme is illustrated in Figure 1. The network consists of four parts: the light source, the convolution layers, the pooling layers, and the final detection part. Among these, the light source is the most fundamental, since it is what allows the whole scheme to correspond to a quantum convolutional network; it is shown as the labeled light beams on the left side of Figure 1. Unlike previous optical neural networks, the researchers use a special classical optical state as the information carrier, namely the multimode polarization state. By constructing orthogonal relationships between the different modes of this state, the state can effectively simulate the multi-qubit states of quantum computing. The researchers had already pointed out this property of multimode polarized light in earlier work [Annalen der Physik, 534, 2200360 (2022)].
The light source is followed by the convolution layers, shown as the blue area in Figure 1. Their function is to perform a unitary transformation of the correlated optical state. The transformation is a combination of a series of modules, each of which performs a unitary transformation on two beams of multimode polarized light. These modules are indicated by the blue cuboids in Figure 1, and their details are shown in the blue dashed box. Each module is made up of a series of wave plates and nonlinear elements, and its structure and function are consistent with a general two-qubit operation in quantum computing.
Next come the pooling layers, shown as the brown area in Figure 1. The pooling operation designed by the researchers is essentially a beam-merging operation based on nonlinear optics. By merging, the information carried by several polarized beams is partially encoded into fewer beams, which effectively reduces the number of correlated beams involved in the "calculation". Like the pooling layer in a traditional convolutional network, this operation reduces the data dimension. The difference is that the method presented in the paper reduces the dimension exponentially, similar to measuring part of the qubits to obtain a subspace in quantum computing.
Finally, after repeated convolution and pooling, the correlated light is read out by the detection part, shown on the far right of Figure 1. Here, balanced homodyne detection is first used to perform a "projective measurement" of the polarization state of each outgoing beam, and the correlations among the projection results of all output beams are then evaluated statistically. The magnitude of this correlation is taken as the output. The outstanding feature of the proposed convolutional network is that classical optical correlation is the basic carrier of information, and the information processing is carried out through the modulation, statistics, and final measurement of that correlation. As a result, the correlated optical convolutional network corresponds closely to the quantum convolutional network model.
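The correspondence described above, between orthogonal two-level polarization modes and qubits and between each convolution module and a two-qubit unitary, can be illustrated numerically. The following is only a minimal sketch of the quantum picture, not the researchers' optical implementation. The function names, the four example polarization inputs, and the random 4x4 unitary are all hypothetical choices made for illustration.

```python
import numpy as np

def product_state(jones_vectors):
    """Tensor product of normalized two-component polarization (Jones) vectors."""
    state = np.array([1.0 + 0.0j])
    for v in jones_vectors:
        v = np.asarray(v, dtype=complex)
        state = np.kron(state, v / np.linalg.norm(v))
    return state

def apply_two_mode_unitary(state, unitary, i, n):
    """Apply a 4x4 unitary to neighbouring two-level modes i and i+1 out of n."""
    psi = state.reshape([2] * n)
    # Index order of u: (out_i, out_j, in_i, in_j)
    u = np.asarray(unitary, dtype=complex).reshape(2, 2, 2, 2)
    psi = np.tensordot(u, psi, axes=([2, 3], [i, i + 1]))
    # tensordot puts the two output axes first; move them back to i and i+1
    psi = np.moveaxis(psi, [0, 1], [i, i + 1])
    return psi.reshape(-1)

def random_unitary(rng):
    """Hypothetical stand-in for a trained module: a random 4x4 unitary via QR."""
    a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    q, _ = np.linalg.qr(a)
    return q

def convolution_layer(state, unitary, n):
    """Apply the same two-mode unitary to the pairs (0,1), (2,3), ..."""
    for i in range(0, n - 1, 2):
        state = apply_two_mode_unitary(state, unitary, i, n)
    return state

rng = np.random.default_rng(0)
n = 4
# Assumed example inputs: horizontal, vertical, diagonal, circular polarization
inputs = [[1, 0], [0, 1], [1, 1], [1, 1j]]
state = product_state(inputs)
out = convolution_layer(state, random_unitary(rng), n)
# n two-level modes span a 2**n-dimensional space; a unitary layer keeps the norm at 1
print(state.size, np.linalg.norm(out))
```

Because n two-level modes span a 2**n-dimensional space, removing one mode halves the dimension. This is the sense in which a pooling step that reduces the number of beams reduces the dimension exponentially.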
Figure 1. Schematic diagram of the correlated optical convolutional network scheme.
To further confirm the performance of the correlated optical convolutional network, the researchers also carried out numerical studies, the results of which are shown in Figure 2. First, they compared the training performance of the correlated optical convolutional network with that of a classical convolutional network on specific data sets, as shown in Figures 2 (a) and 2 (b). Figure 2 (a) shows the convergence of the loss functions of the two networks in a binary classification task, and Figure 2 (b) shows the same for a four-class classification task. In both cases, the correlated optical convolutional network converges faster than the traditional convolutional network, which is consistent with the behavior of quantum convolutional networks. In addition, the researchers classified the ground states of the Haldane model using a correlated optical convolutional network, as shown in Figure 2 (c). They first encode the ground state of the Haldane model using optical correlation, then feed the resulting correlated state (that is, the multimode polarized light described above) into the network, and finally analyze the output to obtain the polyline marked by red triangles in Figure 2 (c). This polyline coincides with the boundary obtained by the standard method (shown by the colored background), again in agreement with the results for quantum convolutional networks. These results further show that correlated optical convolutional networks reproduce the basic properties of quantum convolutional networks.
Figure 2. (a) Loss function versus number of training steps for the correlated optical convolutional network and a traditional convolutional network on a binary classification task. (b) The same comparison for a four-class classification task. (c) Ground-state phase transition boundary of the Haldane model determined with the correlated optical convolutional network.
The second highlight of the research is the experimental verification of the correlated optical convolutional network
To verify the functionality and feasibility of the correlated optical convolutional network, the researchers also present an experimental implementation in their work. A schematic diagram of the experimental setup is shown in Figure 3. To make the experiment easier to carry out, the researchers simplified the original scheme to some extent. They chose the spatial modes of a laser as the orthogonal modes of the multimode polarization state, and selected a special state space, based on the mathematical properties of the optical correlation states, in which to realize the network. The network shown in Figure 3 simulates a quantum convolutional network for state classification with three qubits as input. As shown in the figure, it contains only one convolutional layer, and the function of the pooling layer is integrated into the detection process. The different functional units of the convolutional layer are indicated by different colors in the figure.
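In the quantum language that this setup mirrors, folding the pooling step into detection can be pictured as reading out only one of the three two-level modes and summing over the other two, which is a partial trace. The sketch below assumes this correspondence and does not model the optics. The function name and the two example states are hypothetical.

```python
import numpy as np

def reduced_density_matrix(state, keep, n):
    """Trace out every two-level mode except `keep`; returns a 2x2 matrix."""
    psi = np.asarray(state, dtype=complex).reshape([2] * n)
    # Bring the kept mode to the front and flatten the discarded modes together
    psi = np.moveaxis(psi, keep, 0).reshape(2, -1)
    # rho[a, b] = sum over discarded indices of psi[a, rest] * conj(psi[b, rest])
    return psi @ psi.conj().T

# Assumed example 1: a three-mode state with maximal correlation, (|000> + |111>)/sqrt(2)
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)
rho_ghz = reduced_density_matrix(ghz, keep=0, n=3)

# Assumed example 2: an uncorrelated product state |0> (|0> + i|1>)/sqrt(2) |1>
middle = np.array([1, 1j]) / np.sqrt(2)
product = np.kron(np.array([1, 0]), np.kron(middle, np.array([0, 1])))
rho_prod = reduced_density_matrix(product, keep=1, n=3)

# Real and imaginary parts of the single-mode "density matrix"
print(rho_ghz.real)
print(rho_prod.real)
print(rho_prod.imag)
```

For the correlated example the single-mode matrix is half the identity, while for the product state it is the pure state of the retained mode. The information discarded by the readout therefore depends on how strongly the modes are correlated.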
Figure 3. Schematic diagram of the experimental implementation of the correlated optical convolutional network.
With this experimental setup, the researchers first studied the network output for different input states, as shown in Figure 4. They selected ten different correlated optical states as inputs, corresponding to ten different three-qubit quantum states. Because of the structure of the neural network, the output is a projective measurement of a single beam of multimode polarized light, corresponding to a single-qubit quantum state. Taking into account the correspondence between the classical optical states and qubits, the researchers present the output in the "density matrix" representation shown in Figure 4. The different panels of Figure 4 correspond to different input states. In each panel, the height of the outlined bar represents the theoretical result, and the height of the colored fill inside it represents the experimental result. The two agree very well. In addition, the researchers used the same setup to process the ground state of the Haldane Hamiltonian on three lattice sites and to identify the topological phase to which it belongs. The results are shown in Figure 5. In the left panel of Figure 5, the red dots are the outputs measured directly in the experiment and the blue line is the theoretical curve; the x and y coordinates correspond to the parameters of the Hamiltonian. First, the data in the left panel show that the theoretical and experimental values agree. More importantly, a clear phase boundary can be obtained by taking the second derivative of the curve, corresponding to the result in Figure 2 (c). These results verify the correctness and feasibility of the theoretical scheme proposed by the researchers.
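The second-derivative step mentioned above can be sketched with finite differences. The text does not state which feature of the second derivative marks the boundary. The sketch assumes it is the zero crossing at the steepest part of the curve, that is, the inflection point. The curve is synthetic and is not experimental data, and the function name and all parameter values are hypothetical.

```python
import numpy as np

def boundary_from_second_derivative(x, y):
    """Estimate where the second derivative of y(x) changes sign at the steepest point."""
    d1 = np.gradient(y, x)   # first derivative by finite differences
    d2 = np.gradient(d1, x)  # second derivative
    positive = d2 > 0
    crossings = np.where(positive[:-1] != positive[1:])[0]
    if crossings.size == 0:
        raise ValueError("the second derivative does not change sign")
    # Among all sign changes, keep the one where the curve is steepest
    i = crossings[np.argmax(np.abs(d1[crossings]))]
    # Linear interpolation of the zero between x[i] and x[i + 1]
    t = d2[i] / (d2[i] - d2[i + 1])
    return x[i] + t * (x[i + 1] - x[i])

# Synthetic stand-in for a network output that switches between two phases
x = np.linspace(0.0, 1.0, 201)
true_boundary = 0.372  # assumed value, chosen only for this illustration
y = 0.5 * (1.0 + np.tanh((x - true_boundary) / 0.1))

estimate = boundary_from_second_derivative(x, y)
print(estimate)
```

With noisy measured data, the curve would normally need smoothing or fitting before differentiation, since finite differences amplify noise.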
Figure 4. Output of the correlated optical convolutional network for different input states. In each panel, the left side shows the real part and the right side shows the imaginary part.
Figure 5. Experimental results for identifying the topological phases of the Haldane Hamiltonian ground states using the correlated optical convolutional network.
The research team designed a new optical convolutional network based on classical optical correlation, namely the correlated optical convolutional network. This network exhibits properties corresponding to those of quantum convolutional networks, including faster convergence on certain classification problems and the ability to classify quantum states. In addition, the team studied the network on an experimental platform, verifying both their theoretical results and the feasibility of the network. This result is an important advance in optical information processing and provides a new approach to realizing more efficient optical neural networks.
Paper link: http://doi.org/10.1038/s41377-024-01376-7