Physics Asked by Cavenfish on February 19, 2021
This is sort of a cross-post from my post on Stack Overflow, although rather than seeking help with the code, I'm here asking for help with the physical/mathematical/statistical problem behind the code.
I'm trying to cluster infrared (IR) spectra so that the spectra most similar to each other are grouped and processed differently. To do this I am running a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm on a 2D set of data representing the spectra. Since the spectra in a cluster need to have similar shape and vertical height (the substance is absorbing the same amount of light for all spectra in the cluster), I have adopted a 2D system with the following parameters:
The mean of each spectrum (the average % transmission, which should correlate with the vertical height) and the mean of the first derivative of each spectrum (which should correlate with the shape). These two parameters give a 2D system on which I can run DBSCAN to cluster the IR spectra. It works, but it does not cluster them perfectly. Note that DBSCAN typically clusters scatter-plot points by their Euclidean distance from each other. This is why I have the two parameters, which you can imagine plotted on an X (mean) and Y (mean of derivative) plane for the clustering algorithm to group.
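As a rough illustration of the two parameters described above, here is a minimal Python sketch. The function names `spectrum_features` and `feature_distance`, the use of forward finite differences for the derivative, and the toy spectra are all assumptions made for this example; the post does not show the actual code.

```python
import math


def spectrum_features(transmission, wavenumbers):
    """Return (mean, mean of first derivative) for one spectrum.

    Hypothetical helper: the derivative is approximated with forward
    finite differences between neighbouring sample points.
    """
    n = len(transmission)
    if n < 2 or n != len(wavenumbers):
        raise ValueError("need at least two points and matching axes")
    mean_t = sum(transmission) / n
    slopes = [
        (transmission[i + 1] - transmission[i])
        / (wavenumbers[i + 1] - wavenumbers[i])
        for i in range(n - 1)
    ]
    mean_slope = sum(slopes) / len(slopes)
    return mean_t, mean_slope


def feature_distance(p, q):
    """Euclidean distance between two (mean, mean-derivative) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Toy data, invented for illustration only.
nu = [1000.0, 1001.0, 1002.0, 1003.0]
spec_a = [90.0, 80.0, 70.0, 60.0]
spec_b = [50.0, 40.0, 30.0, 20.0]

point_a = spectrum_features(spec_a, nu)
point_b = spectrum_features(spec_b, nu)
separation = feature_distance(point_a, point_b)
```

Each spectrum becomes one point in the (mean, mean of derivative) plane, and `separation` is the kind of distance a Euclidean DBSCAN would compare against its neighbourhood radius.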
I can already see that the average of the derivative will not be an accurate measure of the shape of the spectral line, but I cannot figure out a better method. How can I get a better parameter for representing the shape of a spectral line?
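One way to make that suspicion concrete, assuming the spectrum is a differentiable function $T(\nu)$ on a range $[\nu_1, \nu_N]$ (symbols introduced here for illustration, not taken from the post): by the fundamental theorem of calculus, the mean of the first derivative reduces to the slope between the two endpoints.

$$
\overline{T'} \;=\; \frac{1}{\nu_N - \nu_1}\int_{\nu_1}^{\nu_N} \frac{dT}{d\nu}\,d\nu \;=\; \frac{T(\nu_N) - T(\nu_1)}{\nu_N - \nu_1}
$$

Under that assumption the parameter carries no information about anything between the endpoints, so two spectra with very different peaks but the same end values would get the same value. For sampled data on a uniform grid, the finite-difference sum telescopes in the same way.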
To me, an IR spectrum consists of resonances, which are described by a Lorentzian distribution. Therefore, the "shape" of a resonance is Lorentzian, and each resonance has three parameters: (1) the location $\mu$, (2) the spread (FWHM), and (3) the amplitude. Where the first derivative comes into play is rather unclear to me. Are you trying to decompose the resonance into its moments?
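For reference, a single Lorentzian resonance with those three parameters can be sketched in Python as follows. The function name `lorentzian` and the convention that the amplitude is the peak height (rather than the integrated area) are assumptions for this example.

```python
def lorentzian(nu, mu, fwhm, amplitude):
    """Lorentzian line shape evaluated at position nu.

    mu        -- location of the resonance
    fwhm      -- full width at half maximum
    amplitude -- peak height at nu == mu (assumed convention)
    """
    half_width = fwhm / 2.0
    return amplitude * half_width**2 / ((nu - mu) ** 2 + half_width**2)


# Illustrative values only, not taken from any real spectrum.
peak = lorentzian(1700.0, mu=1700.0, fwhm=20.0, amplitude=0.8)
half = lorentzian(1710.0, mu=1700.0, fwhm=20.0, amplitude=0.8)
```

With this convention the curve reaches `amplitude` at `mu` and falls to half of it at `mu ± fwhm / 2`. Fitting such a function to each resonance would yield the three parameters per peak.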
As you wrote, in order to cluster the resonances you will have to define a scoring function that quantifies the "distance" between two resonances or spectra. While defining a score for each of the three fit parameters is straightforward, there is no natural choice for how to combine the three scores. It really depends on your particular problem or task.
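Purely as an illustration of one possible combination, the sketch below uses a scaled Euclidean distance over the three fit parameters. The `Resonance` class, the `resonance_distance` function, and the `scales` argument are hypothetical; the scale values decide how much each parameter matters and would have to be chosen for the task at hand.

```python
import math
from dataclasses import dataclass


@dataclass
class Resonance:
    mu: float         # location
    fwhm: float       # full width at half maximum
    amplitude: float  # peak height


def resonance_distance(a, b, scales=(1.0, 1.0, 1.0)):
    """Scaled Euclidean distance between two resonances.

    scales -- hypothetical per-parameter length scales (mu, fwhm,
              amplitude); a larger scale makes that parameter count less.
    """
    d_mu = (a.mu - b.mu) / scales[0]
    d_fwhm = (a.fwhm - b.fwhm) / scales[1]
    d_amp = (a.amplitude - b.amplitude) / scales[2]
    return math.sqrt(d_mu**2 + d_fwhm**2 + d_amp**2)


# Invented example values.
r1 = Resonance(mu=1700.0, fwhm=20.0, amplitude=0.8)
r2 = Resonance(mu=1703.0, fwhm=24.0, amplitude=0.8)

unscaled = resonance_distance(r1, r2)
rescaled = resonance_distance(r1, r2, scales=(3.0, 4.0, 1.0))
```

Other combinations (for example a weighted sum of absolute differences) would be just as defensible; this is only one option among many.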
Answered by Semoi on February 19, 2021