Use of Experimental Design and Multivariate Analysis for solving industrial problems

FARININI, EMANUELE
2024-03-26

Abstract

Over the past few decades, data analytics has become increasingly important for many professional roles in industry and academic research. In chemistry, this role is played by chemometrics, a branch of the discipline that serves multiple purposes, including helping researchers gain insight into systems and processes across a wide range of complexity. Chemometrics solves chemical problems by means of well-established mathematical and statistical tools. Although many chemometric methods are well established in chemistry, scientists in both academia and industry often either do not employ them or use them with a limited understanding of their underlying principles and recognized statistical properties, the so-called black-box approach. This thesis aims to fill this knowledge gap by demonstrating different approaches applied at different stages of the data analysis pipeline. It highlights how crucial it is to select a suitable and interpretable method, regardless of the complexity of the case study, with the primary focus on problem solving. The studies presented in this thesis cover a wide range of chemical cases and were conducted in collaboration with international research groups and industries. To this end, chemometric tools from the fields of Experimental Design, or Design of Experiments (DoE), and Multivariate Analysis were used extensively. DoE enabled the systematic exploration of the experimental domain, so that high-quality information could be extracted from a series of planned experiments and used to build a mathematical model that quantifies the effects of the factors acting on the system and, when necessary, makes predictions. Multivariate Analysis, in turn, was used to extract meaningful insights and recognize patterns within complex, high-dimensional data.
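As a rough illustration of the DoE idea described above, the following minimal sketch fits a model with main effects and an interaction to a two-factor, two-level full factorial design. It is not taken from the thesis: the factors, the response values, and the helper names `model_matrix` and `predict` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical 2^2 full factorial design in coded units (-1 = low, +1 = high).
design = np.array([
    [-1.0, -1.0],
    [ 1.0, -1.0],
    [-1.0,  1.0],
    [ 1.0,  1.0],
])

# Made-up responses, one per planned experiment (illustrative numbers only).
response = np.array([60.0, 72.0, 54.0, 68.0])


def model_matrix(x):
    """Columns: intercept, x1, x2, and the x1*x2 interaction."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return np.column_stack(
        [np.ones(len(x)), x[:, 0], x[:, 1], x[:, 0] * x[:, 1]]
    )


# Least-squares estimate of the coefficients b0, b1, b2, b12.
coefficients, *_ = np.linalg.lstsq(model_matrix(design), response, rcond=None)


def predict(x):
    """Predict the response at one or more points of the coded domain."""
    return model_matrix(x) @ coefficients


for name, value in zip(["b0", "b1", "b2", "b12"], coefficients):
    print(f"{name} = {value:.2f}")
print("prediction at the center point:", predict([0.0, 0.0]))
```

In coded units each coefficient is directly comparable with the others, which is what makes the effect of each factor easy to quantify; with four experiments and four coefficients this particular model has no residual degrees of freedom, so replicates or center points would be needed to estimate experimental error.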
Experimental Design and Multivariate Analysis are deeply interconnected throughout the manuscript. Chemometric tools are essential for instrumental analytical techniques that generate a set of numbers rather than a single value. This applies to chromatographic and spectroscopic techniques, for instance vibrational spectroscopy (mid-infrared (MIR), near-infrared (NIR), and Raman), ultraviolet-visible spectroscopy (UV-vis), nuclear magnetic resonance (NMR), X-ray spectroscopy, etc. In these cases, the analytical approach is not always focused on identifying or quantifying specific chemicals in a sample. Instead, it aims to provide an overall characterization of the sample, like a unique fingerprint. These non-specific techniques have several advantages over traditional methods. They are usually faster and cheaper, require minimal or no sample preparation, do not destroy the sample, do not necessarily require highly skilled personnel, can be implemented for real-time or online analysis, and are easy to automate and transport. Because these techniques are non-specific, the use of chemometrics with them is well established, through methods such as multivariate calibration, classification, or multivariate class-modeling. However, many other characterization techniques, such as granulometry (particle size distribution, PSD), rheology, and calorimetry, also generate an "analytical profile", and multivariate analysis is often overlooked in these cases. Typically, only a few so-called descriptors are extracted from this profile. These descriptors carry very limited information and do not give a global, unambiguous picture. A further aim of this thesis is to use multivariate analysis to develop rapid and effective methods for interpreting the output of all these techniques. Another important aspect of this thesis is the use of multi-block analysis in production processes.
Thanks to technological progress and the growing availability of powerful tools, it is now possible to gather vast amounts of data, ranging from process variables (temperatures, flow rates, etc.) to determinations made with the analytical techniques mentioned above. The aim is to analyze these extensive data using data-fusion approaches and to uncover the relationships between different blocks of data, eventually building predictive models. Techniques such as Principal Properties and innovative multi-block analysis methods such as SO-PLS are employed for this purpose.
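To make the data-fusion idea more concrete, the sketch below shows one simple option, low-level fusion followed by PCA. It is an assumption-laden illustration and not the SO-PLS or Principal Properties approach used in the thesis: the two blocks are synthetic stand-ins for process variables and a spectral profile, and `block_scale` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20

# Synthetic stand-ins for two blocks measured on the same samples:
# a few process variables and a 50-point spectral profile.
latent = rng.normal(size=(n_samples, 1))
process_block = latent @ rng.normal(size=(1, 3)) + 0.1 * rng.normal(size=(n_samples, 3))
spectral_block = latent @ rng.normal(size=(1, 50)) + 0.1 * rng.normal(size=(n_samples, 50))


def block_scale(block):
    """Autoscale each variable, then divide by sqrt(number of variables)
    so that every block carries the same total variance."""
    centered = block - block.mean(axis=0)
    autoscaled = centered / block.std(axis=0, ddof=1)
    return autoscaled / np.sqrt(block.shape[1])


# Low-level fusion: concatenate the scaled blocks side by side.
fused = np.hstack([block_scale(process_block), block_scale(spectral_block)])

# PCA through the SVD of the fused, column-centered matrix.
u, s, vt = np.linalg.svd(fused, full_matrices=False)
scores = u * s                       # sample coordinates on the components
loadings = vt.T                      # variable contributions, one column per component
explained = s**2 / np.sum(s**2)      # fraction of variance per component

print("explained variance, first 3 components:", explained[:3].round(3))
```

Block scaling is the key design choice here: without it, the 50 spectral variables would dominate the 3 process variables simply because there are more of them. Sequential methods such as SO-PLS handle the blocks one at a time instead of concatenating them, so this sketch should be read only as the simplest point of comparison.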
Chemometrics; Design of Experiments; DoE; QbD; Multivariate Analysis; PCA; PLS; SO-PLS; Quality Control; Optimization; Process monitoring; Multi-block analysis; Data fusion; Variable selection; Spectroscopy; NIR; Granulometry; Rheology; Calorimetry
Files in this item:
phdunige_3904453.pdf (open access)
Description: Doctoral thesis of Emanuele Farinini
Type: Doctoral thesis
Size: 9.28 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1167637