Nicholas Lubbers's research while affiliated with Los Alamos National Laboratory and other places

Publications (93)

Article
Full-text available
Reconstructing complex, high-dimensional global fields from limited data points is a challenge across various scientific and industrial domains. This is particularly important for recovering spatio-temporal fields using sensor data from, for example, laboratory-based scientific experiments, weather forecasting, or drone surveys. Given the prohibiti...
Preprint
Full-text available
Coarse-graining is a molecular modeling technique in which an atomistic system is represented in a simplified fashion that retains the most significant system features that contribute to a target output, while removing the degrees of freedom that are less relevant. This reduction in model complexity allows coarse-grained molecular simulations to re...
Article
Full-text available
Modeling effective transport properties of 3D porous media, such as permeability, at multiple scales is challenging as a result of the combined complexity of the pore structures and fluid physics—in particular, confinement effects which vary across the nanoscale to the microscale. While numerical simulation is possible, the computational cost is pr...
Article
Full-text available
Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP...
Preprint
Advances in machine learning have given rise to a plurality of data-driven methods for estimating chemical properties from molecular structure. For many decades, the cheminformatics field has relied heavily on structural fingerprinting, while in recent years much focus has shifted to leveraging highly parameterized deep neural networks which usually m...
Article
Full-text available
Background Correlation metrics are widely utilized in genomics analysis and often implemented with little regard to assumptions of normality, homoscedasticity, and independence of values. This is especially true when comparing values between replicated sequencing experiments that probe chromatin accessibility, such as assays for transposase-accessi...
Article
Full-text available
The reconstruction of complex time-evolving fields from sensor observations is a grand challenge. Frequently, sensors have extremely sparse coverage and low-resource computing capacity for measuring highly nonlinear phenomena. While numerical simulations can model some of these phenomena using partial differential equations, the reconstruction prob...
Article
Full-text available
Throughout computational science, there is a growing need to utilize the continual improvements in raw computational horsepower to achieve greater physical fidelity through scale-bridging over brute-force increases in the number of mesh elements. For instance, quantitative predictions of transport in nanoporous media, critical to hydrocarbon extrac...
Article
Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types, new many-body interactions. Another important limitation is the spatial locality assumption in...
Article
Full-text available
Background and Aims Beneficial plant–microbe interactions can improve plant performance under drought; however, we know less about how drought-induced shifts in microbial communities affect plant traits. Methods We cultivated Zea mays in fritted clay with soil microbiomes originating from contrasting environments (agriculture or forest) under two...
Preprint
Full-text available
Methodologies for training machine learning potentials (MLPs) to quantum-mechanical simulation data have recently seen tremendous progress. Experimental data has a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a trai...
Preprint
Full-text available
The development of machine learning models has led to an abundance of datasets containing quantum mechanical (QM) calculations for molecular and material systems. However, traditional training methods for machine learning models are unable to leverage the plethora of data available as they require that each dataset be generated using the same QM me...
Article
Full-text available
Interactions between stressed organisms and their microbiome environments may provide new routes for understanding and controlling biological systems. However, microbiomes are a form of high-dimensional data, with thousands of taxa present in any given sample, which makes untangling the interaction between an organism and its microbial environment...
Preprint
Full-text available
When molecules are strongly coupled to an optical cavity, a new light-matter hybrid quasiparticle, called a polariton, is formed. Recent experiments have shown that polariton chemistry can be used to manipulate chemical reactions. Polariton chemistry is a collective phenomenon, and its effect increases with the number of molecules in a cavity. Howe...
Article
Full-text available
Typical generative diffusion models rely on a Gaussian diffusion process for training the backward transformations, which can then be used to generate samples from Gaussian noise. However, real world data often takes place in discrete-state spaces, including many scientific applications. Here, we develop a theoretical formulation for arbitrary disc...
Preprint
Full-text available
Typical generative diffusion models rely on a Gaussian diffusion process for training the backward transformations, which can then be used to generate samples from Gaussian noise. However, real world data often takes place in discrete-state spaces, including many scientific applications. Here, we develop a theoretical formulation for arbitrary disc...
Article
Extended Lagrangian Born-Oppenheimer molecular dynamics (XL-BOMD) in its most recent shadow potential energy version has been implemented in the semiempirical PyTorch-based software PySeQM. The implementation includes finite electronic temperatures, canonical density matrix perturbation theory, and an adaptive Krylov subspace approximation for the...
Article
Atomistic machine learning focuses on the creation of models that obey fundamental symmetries of atomistic configurations, such as permutation, translation, and rotation invariances. In many of these schemes, translation and rotation invariance are achieved by building on scalar invariants, e.g., distances between atom pairs. There is growing inter...
Preprint
Full-text available
Background Correlation metrics are widely utilized in genomics analysis and often implemented with little regard to assumptions of normality, homoscedasticity, and independence of values. This is especially true when comparing values between replicated sequencing experiments that probe chromatin accessibility, such as assays for transposase-accessi...
Article
Full-text available
Highly energetic electron-hole pairs (hot carriers) formed from plasmon decay in metallic nanostructures promise sustainable pathways for energy-harvesting devices. However, efficient collection before thermalization remains an obstacle for realization of their full energy generating potential. Addressing this challenge requires detailed understand...
Preprint
Full-text available
Reactive chemistry atomistic simulation has a broad range of applications from drug design to energy to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive quantum chemistry simulations. In practice, developing reactive MLIPs requires prior knowledge of reaction netw...
Article
Full-text available
Machine learning (ML) models, if trained to data sets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool to iteratively generate diverse data sets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configurat...
Preprint
Full-text available
Extended Lagrangian Born-Oppenheimer molecular dynamics (XL-BOMD) in its most recent shadow potential energy version has been implemented in the semiempirical PyTorch-based software PySeQM. The implementation includes finite electronic temperatures, canonical density matrix perturbation theory, and an adaptive Krylov Subspace Approximation for the...
Article
Full-text available
In recent years, deep learning approaches have shown much promise in modeling complex systems in the physical sciences. A major challenge in deep learning of partial differential equations is enforcing physical constraints and boundary conditions. In this work, we propose a general framework to directly embed the notion of an incompressible fluid i...
Article
Full-text available
Many scientific applications are inherently multiscale in nature. Such complex physical phenomena often require simultaneous execution and coordination of simulations spanning multiple time and length scales. This is possible by combining expensive small-scale simulations (such as molecular dynamics simulations) with larger scale simulations (such...
Preprint
Full-text available
Reactive chemistry atomistic simulation has a broad range of applications from drug design to energy to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive quantum chemistry simulations. In practice, developing reactive MLIPs requires prior knowledge of reaction netw...
Preprint
Full-text available
Atomistic machine learning focuses on the creation of models which obey fundamental symmetries of atomistic configurations, such as permutation, translation, and rotation invariances. In many of these schemes, translation and rotation invariance are achieved by building on scalar invariants, e.g., distances between atom pairs. There is growing inte...
Preprint
Full-text available
Reactive chemistry atomistic simulation has a broad range of applications from drug design to energy to materials discovery. Machine learning interatomic potentials (MLIP) have become an efficient alternative to computationally expensive quantum chemistry simulations. In practice, reactive MLIPs require refitting to extensive datasets for each new...
Conference Paper
Full-text available
Julia is a state-of-the-art tool for scientific computing with good support for automatic differentiation. PyTorch is a leading framework for machine learning. We describe how to perform automatic differentiation across the language boundary and connect these two ecosystems. By using the automatic differentiation ecosystems in each language, the ti...
Preprint
Full-text available
Reactive chemistry atomistic simulation has a broad range of applications from drug design to energy to materials discovery. Machine learning interatomic potentials (MLIP) have become an efficient alternative to computationally expensive quantum chemistry simulations. In practice, reactive MLIPs require refitting to extensive datasets for each new...
Conference Paper
Full-text available
The reconstruction of complex time-evolving fields from a small number of sensor observations is a grand challenge in a wide range of scientific and industrial applications. Frequently, sensors have very sparse spatial coverage, and report noisy observations from highly non-linear phenomena. While numerical simulations can model some of these pheno...
Article
Full-text available
Physical processes that occur within porous materials have wide-ranging applications including - but not limited to - carbon sequestration, battery technology, membranes, oil and gas, geothermal energy, nuclear waste disposal, and water resource management. The equations that describe these physical processes have been studied extensively; however, app...
Preprint
Full-text available
Machine learning (ML) models, if trained to datasets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool to iteratively generate diverse datasets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configuratio...
Preprint
Full-text available
Throughout computational science, there is a growing need to utilize the continual improvements in raw computational horsepower to achieve greater physical fidelity through scale-bridging over brute-force increases in the number of mesh elements. For instance, quantitative predictions of transport in nanoporous media, critical to hydrocarbon extrac...
Article
Full-text available
Advances in machine learning (ML) have enabled the development of interatomic potentials that promise the accuracy of first principles methods and the low-cost, parallel efficiency of empirical potentials. However, ML-based potentials struggle to achieve transferability, i.e., provide consistent accuracy across configurations that differ from those...
Article
Machine learning (ML) is becoming a method of choice for modelling complex chemical processes and materials. ML provides a surrogate model trained on a reference dataset that can be used to establish a relationship between a molecular structure and its chemical properties. This Review highlights developments in the use of ML to evaluate chemical pr...
Article
Full-text available
Conventional machine-learning (ML) models in computational chemistry learn to directly predict molecular properties using quantum chemistry only for reference data. While these heuristic ML methods show quantum-level accuracy with speeds several orders of magnitude faster than traditional quantum chemistry methods, they suffer from poor extensibili...
Article
We propose a data-driven method to describe consistent equations of state (EOS) for arbitrary systems. Complex EOS are traditionally obtained by fitting suitable analytical expressions to thermophysical data. A key aspect of EOS is that the relationships between state variables are given by derivatives of the system free energy. In this work, we mo...
Preprint
Full-text available
Advances in machine learning (ML) techniques have enabled the development of interatomic potentials that promise both the accuracy of first principles methods and the low-cost, linear scaling, and parallel efficiency of empirical potentials. Despite rapid progress in the last few years, ML-based potentials often struggle to achieve transferability,...
Article
Full-text available
The permeability of complex porous materials is of interest to many engineering disciplines. This quantity can be obtained via direct flow simulation, which provides the most accurate results, but is very computationally expensive. In particular, the simulation convergence time scales poorly as the simulation domains become less porous or more hete...
Preprint
Full-text available
We propose a method to describe consistent equations of state (EOS) for arbitrary systems. Complex EOS are traditionally obtained by fitting suitable analytical expressions to thermophysical data. A key aspect of EOS is that the relationships between state variables are given by derivatives of the system free energy. In this work, we model the fre...
Article
Full-text available
Machine learning (ML) plays a growing role in the design and discovery of chemicals, aiming to reduce the need to perform expensive experiments and simulations. ML for such applications is promising but difficult, as models must generalize to vast chemical spaces from small training sets and must have reliable uncertainty quantification metrics to...
Article
Full-text available
Machine learning (ML) is quickly becoming a premier tool for modeling chemical processes and materials. ML-based force fields, trained on large data sets of high-quality electron structure calculations, are particularly attractive due to their unique combination of computational efficiency and physical accuracy. This Perspective summarizes some recent...
Preprint
Full-text available
Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab-initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling...
Preprint
Full-text available
Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab-initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling...
Article
Full-text available
Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling po...
Article
Full-text available
The Hückel Hamiltonian is an incredibly simple tight-binding model known for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials. Part of its simplicity arises from using only two types of empirically fit physics-motivated parameters: the first describes the orbital energies on each ato...
Article
This paper presents a new deep learning data-driven model for predicting structure dependent pore-fluid velocity fields in rock. The model is based on a Convolutional Auto-Encoder (CAE) artificial neural network capable of learning from image data generated by direct numerical simulations of fluid flow through pore-structures, such as by Lattice Bo...
Preprint
Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab-initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling...
Article
Full-text available
During carbon sequestration, CO2 migration is affected by many uncertainties. • Numerical simulations of multi-phase fluid dynamics are computationally expensive. • The combined effects of capillary pressure and relative permeability are explored. • The application of Machine Learning provides a huge computational speed-up. • Capillary pressure i...
Article
Full-text available
Machine learning, trained on quantum mechanics (QM) calculations, is a powerful tool for modeling potential energy surfaces. A critical factor is the quality and diversity of the training dataset. Here we present a highly automated approach to dataset construction and demonstrate the method by building a potential for elemental aluminum (ANI-Al). I...
Article
Full-text available
Machine learning is an extremely powerful tool for the modern theoretical chemist since it provides a method for bypassing costly algorithms for solving the Schrödinger equation. Already, it has proven able to infer molecular and atomic properties such as charges, enthalpies, dipoles, excited state energies, and others. Most of these machine learni...
Preprint
Full-text available
The permeability of complex porous materials can be obtained via direct flow simulation, which provides the most accurate results, but is very computationally expensive. In particular, the simulation convergence time scales poorly as simulation domains become tighter or more heterogeneous. Semi-analytical models that rely on averaged structural pro...
Preprint
Full-text available
The permeability of complex porous materials is of interest to many engineering disciplines. This quantity can be obtained via direct flow simulation, which provides the most accurate results, but is very computationally expensive. In particular, the simulation convergence time scales poorly as simulation domains become tighter or more heterogeneou...
Presentation
Full-text available
Successful geologic CO2 storage projects depend on numerical simulations to predict reservoir performance during site selection, injection verification, and post-injection monitoring phases of the project. These numerical simulations solve non-linear sets of coupled partial differential equations, while accounting for multi-phase fluid dynamics on...
Preprint
The exascale race is at an end with the announcement of the Aurora and Frontier machines. This next generation of supercomputers utilizes diverse hardware architectures to achieve their compute performance, providing an added onus on the performance portability of applications. An expanding fragmentation of programming models would provide a compoun...
Article
Predicting the spatial configuration of gas in nanopores of is relevant in applications such as fluid flow forecasting and hydrocarbon reserves estimation. For example, shale reservoirs have suffered from computationally intractable multiscale problems, since fluid properties such as viscosity, density, and adsorption must be calculated by using expensive...
Article
Molecular dynamics (MD) simulations are a powerful tool for the calculation of transport properties in mixtures. Not only are MD simulations capable of treating multicomponent systems, they are also applicable over a wide range of temperatures and densities. In plasma physics, this is particularly important for applications such as inertial confine...
Article
Predicting the functional properties of many molecular systems relies on understanding how atomistic interactions give rise to macroscale observables. However, current attempts to develop predictive models for the structural and thermodynamic properties of condensed-phase systems often rely on extensive parameter fitting to empirically selected fun...
Article
Full-text available
Plasma flows encountered in high-energy-density experiments display features that differ from those of equilibrium systems. Nonequilibrium approaches such as kinetic theory (KT) capture many, if not all, of these phenomena. However, KT requires closure information, which can be computed from microscale simulations and communicated to KT. We present...
Article
Full-text available
Fine-scale models that represent first-principles physics are challenging to represent at larger scales of interest in many application areas. In nanoporous media such as tight-shale formations, where the typical pore size is less than 50 nm, confinement effects play a significant role in how fluids behave. At these scales, fluids are under confine...
Article
Full-text available
A new open-source high-performance implementation of Born Oppenheimer Molecular Dynamics based on semi-empirical quantum mechanics models using PyTorch called PYSEQM is presented. PYSEQM was designed to provide researchers in computational chemistry with an open-source, efficient, scalable, and stable quantum-based molecular dynamics engine. In par...
Preprint
Machine learning models, trained on data from ab initio quantum simulations, are yielding molecular dynamics potentials with unprecedented accuracy. One limiting factor is the quantity of available training data, which can be expensive to obtain. A quantum simulation often provides all atomic forces, in addition to the total energy of the system. T...
Preprint
Full-text available
Predicting the spatial configuration of gas molecules in nanopores of shale formations is crucial for fluid flow forecasting and hydrocarbon reserves estimation. The key challenge in these tight formations is that the majority of the pore sizes are less than 50 nm. At this scale, the fluid properties are affected by nanoconfinement effects due to t...
Article
Determining the structural properties of condensed phase systems is a fundamental problem in theoretical statistical mechanics. Here, we present a machine learning method that is able to predict structural correlation functions with significantly improved accuracy in comparison to traditional approaches. The usefulness of this ex machina (from the...
Article
Full-text available
Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models. In chemistry, ML has been used to develop models for predicting molecular properties, for example quantum mechanics (QM) calculated potential energy surfaces and atomic charge models. The ANI-1x and ANI-1ccx ML-based general-purpose...
Preprint
Atomistic molecular dynamics simulation is an important tool for predicting materials properties. Accuracy depends crucially on the model for the interatomic potential. The gold standard would be quantum mechanics (QM) based force calculations, but such a first-principles approach becomes prohibitively expensive at large system sizes. Efficient mac...
Preprint
Full-text available
In recent years, deep learning approaches have shown much promise in modeling complex systems in the physical sciences. A major challenge in deep learning of PDEs is enforcing physical constraints and boundary conditions. In this work, we propose a general framework to directly embed the notion of an incompressible fluid into Convolutional Neur...
Preprint
Full-text available
Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models. In chemistry, ML has been used to develop models for predicting molecular properties, for example quantum mechanics (QM) calculated potential energy surfaces and atomic charge models. The ANI-1x and ANI-1ccx ML-based general-purpos...
Preprint
Full-text available
The Hückel Hamiltonian is an incredibly simple tight-binding model famed for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials. Part of its simplicity arises from using only two types of empirically fit physics-motivated parameters: the first describes the orbital energies on each a...
Article
Full-text available
Computational modeling of chemical and biological systems at atomic resolution is a crucial tool in the chemist's toolset. The use of computer simulations requires a balance between cost and accuracy: quantum-mechanical methods provide high accuracy but are computationally expensive and scale poorly to large systems, while classical force fields ar...
Article
We use machine learning to enable large-scale molecular dynamics (MD) of a correlated electron model under the Gutzwiller approximation scheme. This model exhibits a Mott transition as a function of on-site Coulomb repulsion U. The repeated solution of the Gutzwiller self-consistency equations would be prohibitively expensive for large-scale MD sim...
Article
Machine learning regression can predict macroscopic fault properties such as shear stress, friction, and time to failure using continuous records of fault zone acoustic emissions. Here we show that a similar approach is successful using event catalogs derived from the continuous data. Our methods are applicable to catalogs of arbitrary scale and ma...
Preprint
Full-text available
We use machine learning to enable large-scale molecular dynamics (MD) of a correlated electron model under the Gutzwiller approximation scheme. This model exhibits a Mott transition as a function of on-site Coulomb repulsion $U$. Repeated solution of the Gutzwiller self-consistency equations would be prohibitively expensive for large-scale MD simul...
Preprint
Machine learning regression can predict macroscopic fault properties such as shear stress, friction, and time to failure using continuous records of fault zone acoustic emissions. Here we show that a similar approach is successful using event catalogs derived from the continuous data. Our methods are applicable to catalogs of arbitrary scale and ma...
Preprint
We explore the use of deep neural networks for nonlinear dimensionality reduction in climate applications. We train convolutional autoencoders (CAEs) to encode two temperature field datasets from pre-industrial control runs in the CMIP5 first ensemble, obtained with the CCSM4 model and the IPSL-CM5A-LR model, respectively. With the latter dataset, c...
Article
Partial atomic charge assignment is of immense practical value to force field parametrization, molecular docking, and cheminformatics. Machine learning has emerged as a powerful tool for modeling chemistry at unprecedented computational speeds given accurate reference data. However, certain tasks, such as charge assignment, do not have a unique sol...
Article
Full-text available
The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this work, we present a fully automated approach for the generation of datasets with the in...
Preprint
Full-text available
Partial atomic charge assignment is of immense practical value to force field parametrization, molecular docking, and cheminformatics. Machine learning has emerged as a powerful tool for modeling chemistry at unprecedented computational speeds given ground-truth values, but for the task of charge assignment, the choice of ground-truth may not be...
Preprint
Full-text available
Computer simulations are foundational to theoretical chemistry. Quantum-mechanical (QM) methods provide the highest accuracy for simulating molecules but have difficulty scaling to large systems. Empirical interatomic potentials (classical force fields) are scalable, but lack transferability to new systems and are hard to systematically improve...
Article
Full-text available
We introduce the Hierarchically Interacting Particle Neural Network (HIP-NN) to model molecular properties from datasets of quantum calculations. Inspired by a many-body expansion, HIP-NN decomposes properties, such as energy, as a sum over hierarchical terms. These terms are generated from a neural network--a composition of many nonlinear transfor...
Article
Full-text available
We apply machine learning to data sets from shear laboratory experiments, with the goal of identifying hidden signals that precede earthquakes. Here we show that by listening to the acoustic signal emitted by a laboratory fault, machine learning can predict the time remaining before it fails with great accuracy. These predictions are based solely o...
Article
Full-text available
We apply recent advances in machine learning and computer vision to a central problem in materials informatics: The statistical representation of microstructural images. We use activations in a pre-trained convolutional neural network to provide a high-dimensional characterization of a set of synthetic microstructural images. Next, we use manifold...
Article
Full-text available
We extend and explore the general non-relativistic effective theory of dark matter (DM) direct detection. We describe the basic non-relativistic building blocks of operators and discuss their symmetry properties, writing down all Galilean-invariant operators up to quadratic order in momentum transfer arising from exchange of particles of spin 1 or...
Article
Following the construction of the general effective theory for dark matter direct detection in 1203.3542, we perform an analysis of the experimental constraints on the full parameter space of elastically scattering dark matter. We review the prescription for calculating event rates in the general effective theory and discuss the sensitivity of vari...

Citations

... Machine learning (ML) has recently found purchase in computational chemistry. 13,[23][24][25][26][27][28][29][30][31] Graph-based neural networks (GNNs) [32][33][34][35][36][37] in particular have enabled transferable model development across diverse chemistries and multiple prediction tasks by encoding known physical symmetries into the GNN architecture [38][39][40] and leveraging data enhancement techniques. 41 ML has also begun to enable CG representation identification for structural reproduction, 22 calculation of the mapping entropy, [42][43][44] and the matching of human-generated CG maps. ...
... However, limitations in the DFT reference dataset or the approximation can lead MLIPs to predict overstructured liquid structures [57], leading to a deviation with respect to experimentally measured melt structures. Matin et al. [58] introduced a novel method to refine machine learning potentials by incorporating experimental observations, specifically focusing on the melt phase of pure aluminium. This method leverages iterative Boltzmann inversion, allowing for the integration of experimental radial distribution function data at specific temperatures to act as a correction to DFT-trained MLIPs. ...
... Raw reads from ATAC-seq assays were processed using methods described in ref. 63. Briefly, reads were trimmed and filtered with fastp 64 to remove Nextera adaptors and reads with repetitive sequences. ...
... These networks employ the attention mechanism to focus on the relevant parts of the input data, which enables them to effectively handle long-range dependencies and complex data relationships with improved performance in tasks like language translation and image recognition [30][31][32]. Recently, attention-based neural networks have also demonstrated great performance in field reconstruction tasks [33]. These networks employ positional encoding to accurately map sensor positions, eliminating the need for a Cartesian grid-based data structure or fine-tuned graph structures. ...
... In fact, combinations have been investigated in the last couple of years, in which ML is used to learn the parameters within semiempirical MO theory. 51,52 Furthermore, reparametrization of the PM6 semiempirical method has recently also been incorporated in the construction of QFFs for large molecular systems such as polycyclic aromatic hydrocarbons. 53,54 Table 3 shows exemplary calculations, in which the 1D and 2D terms of an n-mode expansion PES have been obtained at the levels presented in Table 2 for the multilevel scheme, but different approximations were used for the large number of grid points required in constructing the 3D and 4D terms. ...
... However, the composition and structure of microbial communities are affected by environmental conditions and edaphic properties (Pascual et al. 2018) and the response of the plant-microbiome complex to extreme events such as floods and droughts is still not clearly understood (Francioli et al. 2021). In respect to drought, some authors showed that soils afflicted by a history of droughts exhibit consistently lower microbial respiration rates, altered microbial community composition and functional response distributions (Veach and Zeglin 2020) but, on the other hand, it is possible to develop the soil microbiome through soil management to positively influence plant drought performance (Carter et al. 2023) and the use of microbes to improve ecosystem functions for specific functions has merited increased attention (de Vries et al. 2020). A combination of drought and heat stress might have a stronger negative influence on plant production (Cohen et al. 2021). ...
... Although both discrete and continuous data are compatible with diffusion models, their generic pipeline has an inductive bias of continuous data [14]. The inherent data difference requires effort to adapt the transition chain to improve the modeling accuracy for different data types [71], [88]. Some efforts focus on converting discrete data to continuous representations before they are calculated [89], [90]. ...
... 13–17 In this particular case, the time-reversible extrapolation is augmented by the inclusion of a dissipative term, which serves to reduce the numerical fluctuations. XLBO can be seen as an intermediate strategy between Car–Parrinello-like approaches and extrapolation techniques for BOMD, as it indeed propagates an auxiliary density matrix that can either be used directly in a CPMD spirit, 18,19 possibly after refining the density using an approximate SCF solver, or be used as a guess for the SCF. 13 Here, we focus on the latter approach. ...
... In this paper, we present an ML CG workflow to construct force fields based on the force-matching (or multi-scale coarse-graining) approach 47 using the Hierarchically Interacting Particle Neural Network with Tensor Sensitivity (HIP-NN-TS) 48,49 architecture, which has previously only been applied to AA systems. We show that this workflow is robust in that it is able to consistently build a large number of accurate CG models for a variety of chemical physics systems across many thermodynamic state points. ...
... [99] The latest ANI-1 version (ANI-1xnr) was recently used to study a small set of chemical processes in condensed-phase. [100] Furthermore, different MLPs have been successfully applied to predict reaction dynamics of Diels-Alder reactions, [101] to describe the reaction network of methane combustion, [9] to screen the thermal half-lives of thousands of azobenzene derivatives, [102] and to predict gas-phase activation energies, [103,104] and transition states (Figure 15A). [105] The applicability domain of MLP is restricted by the diversity of the initial training set (for example the initial ANI iteration was trained for a set of few elements, namely C, H, N and O). ...