Multispectral characterization of the multicomponent systems using machine learning algorithms


From 18.11.2019 to 01.12.2021
Grant holder: Alexander Guda
Members: Yury Rusalev, Victor Shapovalov, Ilia Pankin, Yulia Podkovirina, Mikhail Soldatov, Antonina Kravtsova, Andrei Tereshchenko, Sergey Guda

The main goal of the project is to develop a methodology for the spectral analysis of multicomponent systems. Determining the number of phases in a sample and their concentrations requires a simultaneous quantitative analysis of spectral data from different energy ranges. In particular, we will jointly analyze the three regions of the X-ray absorption spectrum (pre-edge, XANES, EXAFS) and extend this approach to elastic scattering and infrared spectroscopy data in order to refine the structure of multicomponent systems. The listed techniques are complementary. EXAFS is used to determine coordination numbers and bond lengths. The XANES region is sensitive to bond angles and the charge state of the absorbing atom. The pre-edge region contains information about the symmetry of the local environment and the charge state of the atoms under study. The frequencies and intensities of the vibrational peaks in infrared spectra characterize the organic fragments surrounding the 3d metal atom, which is valuable because X-ray absorption spectra are less sensitive to higher coordination spheres. The project is motivated by the complexity of real systems, as opposed to reference examples. Samples usually contain several phases at once, and the task is to obtain information about their properties in operando mode and then interpret the observed processes at the atomic and molecular level. Recent years have seen rapid development of synchrotron radiation (SR) sources and X-ray free-electron lasers. A decision to modernize existing SR sources and build new ones has been made in Russia as well. Experimental stations have appeared that offer combined techniques: XAS-XRD, XAS-FTIR-Raman, XAS-XRD-XRF and even XAS-FTIR-XRD-XRF with submicron resolution.
Combining several techniques yields complementary data, which leads to more complete and reliable results. However, the data are often processed and interpreted by separate groups, each expert in the experimental techniques of a specific spectral region. To date, no methodology or software exists anywhere in the world for the simultaneous quantitative analysis of spectral data from different energy ranges. This is especially evident in X-ray absorption spectroscopy, where the quantitative analysis of the pre-edge, the near-edge (XANES) and the extended (EXAFS) regions is carried out with fundamentally different theoretical methods and by different scientific groups. The separation by specialization is even deeper when moving from X-ray to optical spectroscopy. Combining several techniques in one quantitative analysis will enable a breakthrough in the analysis of complex multicomponent systems. Instead of adjusting the experimental conditions to vary the phase concentrations in the sample, it will become possible to identify several phases under the given conditions from spectral data of different energy ranges. The success of the project is determined by the scientific novelty of the proposed solutions and the ability of the team to implement them. For the first time, we will develop a machine-learning method that will make it possible (a) to estimate the amount of structural information contained in the measured spectral data and (b) to determine the parameters of the local atomic and electronic structure of the reference substance together with their uncertainties. The quality of the analysis will be ensured by advanced methods (including gradient boosting of random trees and neural networks) and by the most comprehensive training set possible. The training set will be built from theoretical spectra calculated from first principles and from databases of experimental spectra.
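The machine-learning step described above can be illustrated with a minimal sketch. It is not the project's method: a plain k-nearest-neighbours regressor stands in for gradient boosting or neural networks, and `toy_spectrum` is a hypothetical analytic model (an edge step plus one peak that shifts with a bond-length-like parameter), not a first-principles calculation. The sketch trains on simulated noise-free spectra, predicts the parameter for noisy "measured" spectra, and reports the held-out error as a rough uncertainty estimate.

```python
import math
import random


def toy_spectrum(bond_length, energies):
    """Hypothetical toy model (assumption): an edge step plus one peak whose
    position shifts with the structural parameter. Not a physical calculation."""
    peak_pos = 20.0 + 50.0 * (bond_length - 2.0)
    return [
        0.5 + math.atan(e - 10.0) / math.pi
        + math.exp(-((e - peak_pos) ** 2) / 8.0)
        for e in energies
    ]


def add_noise(spectrum, sigma, rng):
    """Add Gaussian noise to mimic a measured spectrum."""
    return [y + rng.gauss(0.0, sigma) for y in spectrum]


def knn_predict(spectrum, train_x, train_y, k=5):
    """Average the parameter of the k training spectra closest in squared L2 distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(spectrum, x)), y)
        for x, y in zip(train_x, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


rng = random.Random(0)
energies = [0.5 * i for i in range(80)]  # arbitrary energy grid

# Training set: simulated spectra with known structural parameter.
train_y = [rng.uniform(1.8, 2.2) for _ in range(300)]
train_x = [toy_spectrum(r, energies) for r in train_y]

# Held-out "measurements": the same model plus noise.
test_y = [rng.uniform(1.85, 2.15) for _ in range(50)]
test_x = [add_noise(toy_spectrum(r, energies), 0.02, rng) for r in test_y]

pred = [knn_predict(x, train_x, train_y) for x in test_x]
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, test_y)) / len(test_y))
print(f"held-out RMSE of the predicted parameter: {rmse:.4f}")
```

In a realistic setting the training spectra would come from first-principles calculations, and the held-out error would be evaluated per structural parameter.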
The algorithm will be trained in the IR range on existing complete databases. For X-ray absorption spectra, a standardized database will be compiled for the first time as part of the project. Special attention will be paid to the influence of experimental artifacts on the accuracy of structural analysis. To suppress experimental errors, we will search for the best comparison metrics between theoretical and experimental spectra, that is, metrics that reach their extremum at the true values of the structural parameters. This will compensate for effects such as self-absorption and sample thickness. The technique is of practical importance for a wide range of applications. In particular, we will apply the quantitative analysis of spectral data from different energy ranges to gain a detailed understanding of how catalysts based on single atoms and molecules operate. For example, the global demand for polyethylene (PE) resins in 2018 was almost 10^8 tons, which corresponds to a commercial value of more than 150 million US dollars, with an expected growth of 4.0% in 2019. In this market, the Phillips catalyst (Cr/SiO2) covers almost 40% of the global demand for high-density polyethylene. Despite the industrial significance of this catalyst and the fact that it was patented almost seven decades ago, the nature of the active centers and the oxidation state of the active chromium species are still debated. We will show how the quantitative analysis of a set of multispectral data improves the accuracy and reliability of structural information about the catalytic chromium centers compared with an independent analysis of each spectrum.
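The idea of searching for a comparison metric whose extremum sits at the true structural parameters can be sketched on a toy case. Everything here is an assumption made for illustration: `toy_spectrum` is a hypothetical one-peak model, and the experimental artifact is mimicked by a simple rescaling and offset of the spectrum, which is only a crude stand-in for real self-absorption or thickness effects. The sketch compares a plain L2 distance with a correlation-based distance that is insensitive to such rescaling.

```python
import math


def toy_spectrum(p, energies):
    """Hypothetical toy model (assumption): one peak whose height and
    position both depend on the structural parameter p."""
    center = 20.0 + 2.0 * (p - 1.0)
    return [p * math.exp(-((e - center) ** 2) / 8.0) for e in energies]


def l2_distance(a, b):
    """Plain Euclidean distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """1 - Pearson correlation: unchanged by rescaling or offsetting either input."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return 1.0 - num / den


def best_parameter(metric, measured, grid, energies):
    """Grid search: the parameter whose theoretical spectrum minimizes the metric."""
    return min(grid, key=lambda p: metric(measured, toy_spectrum(p, energies)))


energies = [0.5 * i for i in range(80)]
grid = [i / 100 for i in range(50, 151)]  # candidate parameters 0.50 .. 1.50
true_p = 1.0

# "Measured" spectrum: the true spectrum damped and offset (artifact stand-in).
measured = [0.7 * y + 0.05 for y in toy_spectrum(true_p, energies)]

p_l2 = best_parameter(l2_distance, measured, grid, energies)
p_corr = best_parameter(correlation_distance, measured, grid, energies)
print(f"true p = {true_p}, L2 minimum at {p_l2}, correlation-distance minimum at {p_corr}")
```

In this toy case the L2 minimum is pulled away from the true parameter by the damping, while the correlation-based distance still has its minimum there. Such a metric discards amplitude information, so which metric is best depends on which parameters and artifacts matter in a given experiment.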