Alberto Paolo Tonda

French National Institute for Agriculture, Food, and Environment (INRAE) | INRAE · Department TRANSFORM (Food, bioproducts and waste)

Ph.D.

About

Publications: 196
Reads: 66,415
Citations: 1,574
Introduction
My main research interest concerns the application of evolutionary computation and stochastic optimization to real-world problems. I am currently working on semi-supervised modeling of food processes resorting to stochastic meta-heuristics.
Additional affiliations
March 2014 - present
French National Institute for Agriculture, Food, and Environment (INRAE)
Position
  • Master Class "When Nature Inspires Engineers"
Description
  • Class for Master's students, consisting of a survey of machine learning methods applied to the agri-food chain
May 2011 - December 2013
French National Institute for Agriculture, Food, and Environment (INRAE)
Position
  • DREAM European Project
Description
  • EU project focused on development of reliable models for food and agricultural processes. http://dream.aaeuropae.org/
May 2011 - July 2012
Institut des Systèmes Complexes, Paris Île-de-France
Position
  • PostDoc Position

Publications (196)
Preprint
Full-text available
Federated Learning (FL), a privacy-aware approach in distributed deep learning environments, enables many clients to collaboratively train a model without sharing sensitive data, thereby reducing privacy risks. However, enabling human trust and control over FL systems requires understanding the evolving behaviour of clients, whether beneficial or d...
Article
Full-text available
Background: In recent years, human microbiome studies have received increasing attention as this field is considered a potential source for clinical applications. With the advancements in omics technologies and AI, research focused on the discovery of potential biomarkers in the human microbiome using machine learning tools has produced positive ou...
Preprint
Full-text available
Background: In recent years, human microbiome studies have received increasing attention as this field is considered a potential source for clinical applications. With the advancements in omics technologies and AI, research focused on the discovery of potential biomarkers in the human microbiome using machine learning tools has produced positive outcom...
Article
Full-text available
Insect value chains are a complex system with non-linear links between many economic, environmental, and social variables. Multi-objective optimization (MOO) algorithms for finding optimal options for complex system functioning can provide valuable insight into the development of sustainable insect chains. This review proposes a framework for MOO a...
Article
Full-text available
Background: Not being well controlled by therapy with inhaled corticosteroids and long‐acting β2 agonist bronchodilators is a major concern for severe‐asthma patients. The current treatment option for these patients is the use of biologicals such as anti‐IgE treatment, omalizumab, as an add‐on therapy. Despite the accepted use of omalizumab, patient...
Article
Full-text available
Odor is central to food quality. Still, a major challenge is to understand how the odorants present in a given food contribute to its specific odor profile, and how to predict this olfactory outcome from the chemical composition. In this proof-of-concept study, we seek to develop an integrative model that combines expert knowledge, fuzzy logic, and...
Article
Full-text available
As the COVID-19 pandemic winds down, it leaves behind the serious concern that future, even more disruptive pandemics may eventually surface. One of the crucial steps in handling the SARS-CoV-2 pandemic was being able to detect the presence of the virus in an accurate and timely manner, to then develop policies counteracting the spread. Nevertheles...
Article
Full-text available
Microbiome data predictive analysis within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommenda...
Chapter
Particle Swarm Optimisation (PSO) and Evolutionary Algorithms (EAs) differ in various ways, in particular with respect to information sharing and diversity management, making their scopes of applications very diverse. Combining the advantages of both approaches is very attractive and has been successfully achieved through hybridisation. Another pos...
Preprint
Full-text available
In this paper, we present a test of an interactive modelling scheme in real conditions. The aim is to use this scheme to identify the physiological responses of microorganisms at different scales in a real industrial application context. The originality of the proposed tool, Biosys-LiDeOGraM, is to generate through a human-machine cooperation a con...
Chapter
Nature-inspired optimization algorithms (NIOAs) are nowadays a popular choice for community detection in social networks. The community detection problem in social networks is treated as an optimization problem, where the objective is to either maximize the connections within the community or minimize connections between the communities. To apply NIOAs,...
Article
Food safety is a common concern at the household level, with important variations across different countries and cultures. Nevertheless, identifying the factors that best explain similarities and differences in consumer awareness pertaining to this topic is not straightforward. Starting from a questionnaire administered in seven countries from four...
Preprint
Full-text available
Deep learning methods are highly accurate, yet their opaque decision process prevents them from earning full human trust. Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts. However, state-of-the-art concept-based models rely on high-dimensional concept embedding representations which la...
Preprint
Full-text available
Explainable AI (XAI) aims to answer ethical and legal questions associated with the deployment of AI models. However, a considerable number of domain-specific reviews highlight the need of a mathematical foundation for the key notions in the field, considering that even the term "explanation" still lacks a precise definition. These reviews also adv...
Chapter
Full-text available
Evolutionary ensemble learning methods with Genetic Programming have achieved remarkable results on regression and classification tasks by employing quality-diversity optimization techniques like MAP-Elites and Neuro-MAP-Elites. The MAP-Elites algorithm uses dimensionality reduction methods, such as variational auto-encoders, to reduce the high-dim...
Article
Full-text available
Context: Managing land use to promote an ecosystem service (ES) without reducing others is challenging. The spatial scale at which no-loss constraints are imposed is relevant. Objectives: We examined the influence of the spatial scale of no-loss constraints on ESs when one ES was optimised. Specifically, we investigated how carbon sequestration coul...
Article
Full-text available
To meet current societal demand for more sustainable transformation processes and bioresources, these processes must be optimized and new ones developed. The evolution of various systems (raw material, food, or process attributes) can be predicted to optimize the uses of biomass for better quality, safety, economic benefit, and sustainability. Pred...
Preprint
Full-text available
Nature-inspired optimization algorithms (NIOAs) are nowadays a popular choice for community detection in social networks. The community detection problem in social networks is treated as an optimization problem, where the objective is to either maximize the connections within the community or minimize connections between the communities. To apply NIOAs, eit...
Article
Grape berry ripening is a complex process, and predicting the quality of wine starting from the ripening kinetics of grape berries is a challenging task. To tackle this problem, we present a decision-support system based on coupling expert know-how with probability laws encapsulated in a probabilistic model, a dynamic Bayesian network. The proposed...
Article
Digital Collectible Card Games such as Hearthstone have become a very prolific test-bed for Artificial Intelligence algorithms. Most research has focused on the implementation of autonomous agents (bots) able to effectively play the game. However, this environment is also very attractive for the use of Data Mining (DM) and Machine Learning...
Article
Full-text available
Gear backlash is a serious problem in industrial robots: it causes vibrations and impairs the robot's positioning accuracy. Backlash estimation allows targeted maintenance interventions, preserving robot performance and avoiding unforeseen equipment breakdowns. However, a direct measure of the backlash is hard to obtain, and dedicated auxiliar...
Conference Paper
Full-text available
This paper investigates to what extent food safety is perceived as a concern at the household level in different countries. It aims to identify the factors that best explain food safety concern, among the various food-related questions asked through a survey. To do so, a machine learning approach is used. The results show that the most significant e...
Chapter
We propose an unsupervised, model-agnostic, wrapper method for feature selection. We assume that if a feature can be predicted using the others, it adds little information to the problem, and therefore could be removed without impairing the performance of whatever model will be eventually built. The proposed method iteratively identifies and remove...
Preprint
Full-text available
As the COVID-19 pandemic continues to affect the world, a new variant of concern, B.1.1.529 (Omicron), has been recently identified by the World Health Organization. At the time of writing, there are still no available primer sets specific to the Omicron variant, and its identification is only possible by using multiple targets, checking for specif...
Chapter
This chapter presents three examples of data-based machine learning (ML) on time series. The common denominator of these case studies is the sparseness of data, making ML results fragile and inaccurate. We show how human expertise can be effectively mobilized for building useful systems, for instance useful decision support systems, able to better...
Article
Social networks are one of the main sources of information transmission nowadays. However, not all nodes in social networks are equal: in fact, some nodes are more influential than others, i.e., their information tends to spread more. Finding the most influential nodes in a network—the so-called Influence Maximization problem—is an NP-hard problem wit...
Chapter
For several medical treatments, it is possible to observe transcriptional variations in gene expressions between responders and non-responders. Modelling the correlation between such variations and the patient’s response to drugs as a system of Ordinary Differential Equations could be invaluable to improve the efficacy of treatments and would repre...
Preprint
Full-text available
As the COVID-19 pandemic persists, new SARS-CoV-2 variants with potentially dangerous features have been identified by the scientific community. Variant B.1.1.7 lineage clade GR from Global Initiative on Sharing All Influenza Data (GISAID) was first detected in the UK, and it appears to possess an increased transmissibility. At the same time, South...
Article
Full-text available
In recent years, modelling techniques have become more frequently adopted in the field of food processing, especially for cereal-based products, which are among the most consumed foods in the world. Predictive models and simulations make it possible to explore new approaches and optimize proceedings, potentially helping companies reduce costs and l...
Chapter
In the field of machine learning, coresets are defined as subsets of the training set that can be used to obtain a good approximation of the behavior that a given algorithm would have on the whole training set. Advantages of using coresets instead of the training set include improving training speed and allowing for a better human understanding of...
Article
In this paper, deep learning is coupled with explainable artificial intelligence techniques for the discovery of representative genomic sequences in SARS-CoV-2. A convolutional neural network classifier is first trained on 553 sequences from the National Genomics Data Center repository, separating the genome of different virus strains from the Coro...
Preprint
Full-text available
The SARS-CoV-2 variant B.1.1.7 lineage, also known as clade GR from Global Initiative on Sharing All Influenza Data (GISAID), Nextstrain clade 20B, or Variant Under Investigation in December 2020 (VUI - 202012/01), appears to have an increased transmissibility in comparison to other variants. Thus, to contain and study this variant of the SARS-CoV-...
Conference Paper
Full-text available
Insect value chains in Europe are evolving into large-scale industrial systems overcoming economic and environmental challenges. SUSINCHAIN, an H2020 EU-funded project, aims to define the leverages and solutions for sustainable insect value chains from multiple perspectives: economic, environmental, safety, nutritional, etc. Such perspectives have dif...
Conference Paper
Full-text available
The interaction between food systems and the environment has been a research focus for many years. In order to explain this interaction, scholars have developed and used various approaches to modelling and understanding this phenomenon. This paper gives an overview of three main perspectives in analyzing this issue and provides some future perspectives asso...
Article
Full-text available
Circulating microRNAs (miRNA) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as biomarkers for tumors to both assess their presence and to predict their type and subtype. Recently, thanks to the availability of miRNAs datasets, machi...
Preprint
As machine learning becomes more and more available to the general public, theoretical questions are turning into pressing practical issues. Possibly, one of the most relevant concerns is the assessment of our confidence in trusting machine learning predictions. In many real-world cases, it is of utmost importance to estimate the capabilities of a...
Chapter
Feature selection is the process of choosing, or removing, features to obtain the most informative feature subset of minimal size. Such subsets are used to improve performance of machine learning algorithms and enable human understanding of the results. Approaches to feature selection in literature exploit several optimization algorithms. Multi-obj...
Article
Insufficient cleaning in the food industry can create serious hygienic risks. However, when attempting to avoid these risks, food-processing plants frequently tend to clean for too long, at extremely high temperatures, or with too many chemicals, resulting in high cleaning costs and severe environmental impacts. Therefore, the optimization of clean...
Preprint
Full-text available
One of the reasons for the fast spread of SARS-CoV-2 is the lack of accuracy in detection tools in the clinical field. Molecular techniques, such as quantitative real-time RT-PCR and nucleic acid sequencing methods, are widely used to identify pathogens. For this particular virus, however, they have an overall unsatisfying detection rate, due to it...
Preprint
A coreset is a subset of the training set, using which a machine learning algorithm obtains performance similar to what it would deliver if trained over the whole original data. Coreset discovery is an active and open line of research as it allows improving training speed for the algorithms and may help humans understand the results. Building on...
Chapter
Machine learning agents learn to take decisions by extracting information from training data. When similar inferences can be obtained using a small subset of the same training set of samples, the subset is called a coreset. Coreset discovery is an active line of research as it may be used to reduce training time as well as to allow human experts t...
Chapter
In the field of artificial intelligence, agents learn how to take decisions by fitting their parameters on a set of samples called training set. Similarly, a core set is a subset of the training samples such that, if an agent exploits this set to fit its parameters instead of the whole training set, then the quality of the inferences does not chang...
Chapter
Industrial manipulators are robots used to replace humans in dangerous or repetitive tasks. Also, these devices are often used for applications where high precision and accuracy are required. The increase of backlash caused by wear, that is, the increase of the amount by which teeth space exceeds the thickness of gear teeth, might be a significant p...
Book
Full-text available
In the context of global warming and environmental pressure, food chains must adapt to new production conditions while satisfying the evolving consumer demand. Livestock production is known for its negative ecological footprint, bringing forward the question of a possible transition towards more plant-based diets. Citizens' demand evolves at differ...
Article
Full-text available
Background: MicroRNAs (miRNAs) are noncoding RNA molecules heavily involved in human tumors, a few of which circulate in the human body. Finding a tumor-associated signature of miRNA, that is, the minimum miRNA entities to be measured for discriminating both different types of cancer and normal tissues, is of utmost importance. Feature select...
Article
Full-text available
Digital collectible card games are not only a growing part of the video game industry, but also an interesting research area for the field of computational intelligence. This game genre allows researchers to deal with hidden information, uncertainty and planning, among other aspects. This paper proposes the use of evolutionary algorithms (EAs) to d...
Article
Full-text available
This paper gives an overview of the scientific challenges that occur when performing life cycle assessment (LCA) in the food chain. In order to evaluate these risks, the Failure Mode and Effect Analysis tool has been used. Challenges related to setting the goal and scope of LCA reveal four hot spots: system boundaries of LCA; functional units used; typ...
Conference Paper
Full-text available
When a machine learning algorithm is able to obtain the same performance given a complete training set, and a small subset of samples from the same training set, the subset is termed coreset. As using a coreset improves training speed and allows human experts to gain a better understanding of the data, by reducing the number of samples to be examin...
Conference Paper
Full-text available
In machine learning a coreset is defined as a subset of the training set using which an algorithm obtains performances similar to what it would deliver if trained over the whole original data. Advantages of coresets include improving training speed and easing human understanding. Coreset discovery is an open line of research as limiting the trainin...
Article
Full-text available
This data article contains annotation data characterizing MultiCriteria Assessment (MCA) Methods proposed in the agri-food sector by researchers from INRA, Europe's largest agricultural research institute (INRA, http://institut.inra.fr/en). MCA can be used to assess and compare agricultural and food systems, and support multi-actor decision making a...
Article
Full-text available
Mathematical modelling plays an important role in food engineering, with various mathematical models tailored for different food topics. However, little information is available on the application of these models in food companies. This paper aims to discuss the extent and the conditions surrounding the usage of mathematical models in th...
Article
A better understanding of protein fouling during the thermal treatment of whey protein concentrate (WPC) solutions is critical for better fouling control. In order to understand the impact of various parameters on the total whey protein fouling mass, a dimensional analysis was applied to the experimental data obtained from a pilot scale plate heat...
Poster
Full-text available
In an optimization problem, a coreset can be defined as a subset of the input points, such that a good approximation to the optimization problem can be obtained by solving it directly on the coreset, instead of using the whole original input. In machine learning, coresets are exploited for applications ranging from speeding up training time, to hel...
Chapter
Full-text available
In an optimization problem, a coreset can be defined as a subset of the input points, such that a good approximation to the optimization problem can be obtained by solving it directly on the coreset, instead of using the whole original input. In machine learning, coresets are exploited for applications ranging from speeding up training time, to hel...
Article
Full-text available
Multi-criteria reverse engineering (MRE) has arisen from the cross-fertilization of advances in mathematics and shifts in social demand. MRE, thus, marks a progressive switch (a) from empirical to formal approaches able to simultaneously factor in diverse parameters, such as environment, economics, and health; (b) from mono-criterion optimization t...
Article
Full-text available
Reducing the effort required by humans in countering malware is of utmost practical value. We describe a scalable, semi-supervised framework to dig into massive datasets of Android applications and identify new malware families. Up to the 2010s, the industrial standard for the detection of malicious applications has been mainly based on signatures;...
Article
Full-text available
VALIS is an effective and robust classification algorithm with a focus on understandability. Its name stems from Vote-ALlocating Immune System, as it evolves a population of artificial antibodies that can bind to the input data, and performs classification through a voting process. At the beginning of training, VALIS generates a set of random c...
Conference Paper
Full-text available
One of the most relevant problems in social networks is influence maximization, that is, the problem of finding the set of the most influential nodes in a network, for a given influence propagation model. As the problem is NP-hard, recent works have attempted to solve it by means of computational intelligence approaches, for instance Evolutionary Al...
Preprint
Full-text available
The role of microRNAs (miRNAs) in cellular processes has captured the attention of many researchers, since their dysregulation is shown to affect the cancer disease landscape by sustaining proliferative signaling, evading programmed cell death, and inhibiting growth suppressors. Thus, miRNAs have been considered important diagnostic and prognostic biomark...
Chapter
Full-text available
Exploiting the availability of the largest collection of Patient-Derived Xenografts from metastatic colorectal cancer annotated for response to therapies, this manuscript aims to characterize the biological phenomenon from a mathematical point of view. In particular, we design an experiment in order to investigate how genes interact with each other...
Chapter
The apparent simplicity of food processes often hides complex systems, where physical, chemical and living organisms’ processes co-exist and interact to create the final product. Data can be plagued by uncertainty; heterogeneity of available information is likely; qualitative and quantitative data may also coexist in the same process, from expert p...
Preprint
Full-text available
Genetic Programming is a powerful optimization technique, able to deliver high-quality results in several real-world problems. One of its most successful applications is symbolic regression, where the objective is to find a suitable expression to model the underlying relationship between data points, with no aprioristic assumptions. In this paper,...
Conference Paper
Full-text available
In the context of social networks, maximizing influence means contacting the largest possible number of nodes starting from a set of seed nodes, and assuming a model for influence propagation. The real-world applications of influence maximization are of utmost importance, and range from social studies to marketing campaigns. Building on a previo...
Article
Full-text available
Collectible card games have been among the most popular and profitable products of the entertainment industry since the early days of Magic: The Gathering™ in the nineties. Digital versions have also appeared, with HearthStone: Heroes of WarCraft™ being one of the most popular. In Hearthstone, every player can play as a hero, from a set of nine,...
Article
The diversity of food systems and their interaction with the environment has been a research topic for many years. Scientists use various models to explain environmental issues of food systems. This paper gives an overview of the main streams in analyzing this topic. A literature review was performed by analyzing published scientific papers on environmen...
Article
Cancer diagnosis is currently undergoing a paradigm shift with the incorporation of molecular biomarkers as part of the routine diagnostic panel. This breakthrough discovery directs researchers to examine the role of microRNA in cancer, since its deregulation is often associated with almost all human tumors. Such differences frequently recur in tumor-sp...
Book
Building complex models from available data is a challenge in many domains, and in particular in food science. Numerical data are often not structured enough, or simply insufficient to elucidate complex structures: human choices thus have a major impact at various levels. LIDeOGraM is an interactive modelling framework adapted to cases where numerica...
Book
The apparent simplicity of food processes often hides complex systems, where physical, chemical and living organisms' processes co-exist and interact to create the final product. Data can be plagued by uncertainty; heterogeneity of available information is likely; qualitative and quantitative data may also coexist in the same process, from expert p...

Questions (8)
Question
I am comparing different machine learning techniques for learning dynamical systems (e.g. a system of ordinary differential equations), and so far I've used Long Short-Term Memory (LSTM) networks and other variations of Recurrent Neural Networks, Dynamic Bayesian Networks, and Symbolic Regression.
However, I know only a part of this fascinating domain, so I wanted to ask the community: Can you suggest other state-of-the-art machine learning techniques for learning dynamical systems? Black-box or white-box, it's not important; I am more focused on getting good data fitting for my application.
Thanks in advance for any suggestion :-)
Question
Imagine you have a sequence of models, for example each one being an equation (or a system of equations): they are connected to each other so that the outputs of a model are used as inputs for one or more other models.
Is there a specific terminology to call this structure? I am trying to find literature on the subject, but I realized I am probably missing some keywords. I tried with "model chains", "model networks", and similar names, but I don't feel that's the right nomenclature.
I think there is a specific terminology for sub-categories of this structure: for example, Bayesian networks could be considered a network of models, each node being a probabilistic model described by a conditional probability table. But what if the model inside one of the nodes was deterministic? What would you call the structure, then?
Sorry if the question is naive; in the beginning I thought it would be easy to find an answer, but I found myself skimming through tons of literature without finding anything promising.
Thank you in advance for any help you can provide!
Question
This is a question I stumbled upon while doing something unrelated, and I think it might be interesting for the community at large.
Given the bibliography someone gathered for a certain paper, is there a way to evaluate whether the bibliography is "good"? Or, more generally, to evaluate its "goodness"?
How would you do that? Surely, if it's missing some fundamental citations it might not be good. But citing too many papers without a good reason is also not very appealing. Is there (gasp!) a metric to assess the quality of a bibliography?
Question
I recently started studying the behavior of the Pepsin enzyme. As far as I understand, when it interacts with proteins, it starts cutting the links between amino acids: generally speaking, its activity is really high at the beginning of the process, then it slows down following a pattern resembling a logarithmic function, and after a while the cutting almost stops completely.
What I would like to know (and so far I was not able to find in the literature I am studying) is how much of this behavior is due to the size of the proteins, and how much is due to the pepsin itself.
In other words: does the pepsin "slow down" because the chains of amino acids become smaller and smaller; or does it slow down because pepsin's activity simply lowers over time?
For the moment, I read some papers about the degree of hydrolysis of the pepsin when it comes into contact with proteins containing 500-700 amino acids; but what would happen if we used the pepsin with smaller proteins (e.g. 20-30 amino acids)? Would it present the same "logarithmic" behavior? Or would it just be the last part of the logarithm, so very few "cuts" from the start?
I hope this question is not too naive...thank you for your time :-)
