Accelerating Materials Development via Automation, Machine Learning, and High-Performance Computing
Juan Pablo Correa-Baena1, Kedar Hippalgaonkar2, Jeroen van Duren3, Shaffiq Jaffer4, Vijay R.
Chandrasekhar5, Vladan Stevanovic6, Cyrus Wadia7, Supratik Guha8, Tonio Buonassisi1*
1Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2Institute of Materials Research and Engineering (IMRE), A*STAR (Agency for Science, Technology
and Research), Innovis, Singapore
3Intermolecular Inc., San Jose, CA 95134, USA
4TOTAL American Services, Inc., 82 South Street, Hopkinton, MA 01748, USA
5Institute for Infocomm Research (I2R), A*STAR (Agency for Science, Technology and Research), #21-
01 Connexis (South Tower), Singapore
6Colorado School of Mines, Golden, CO 80401, USA
7Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
8Center for Nanoscale Materials, Argonne National Laboratory, Argonne, IL 60439, USA
*Corresponding author: Tonio Buonassisi, buonassisi@mac.com
Successful materials innovations can transform society. However, materials research often
involves long timelines and low success probabilities, dissuading investors who have
expectations of shorter times from bench to business. A combination of emergent
technologies could accelerate the pace of novel materials development by 10x or more,
aligning the timelines of stakeholders (investors and researchers), markets, and the
environment, while increasing return-on-investment. First, tool automation enables rapid
experimental testing of candidate materials. Second, high-throughput computing (HPC)
concentrates experimental bandwidth on promising compounds by predicting and inferring
bulk, interface, and defect-related properties. Third, machine learning connects the former
two, where experimental outputs automatically refine theory and help define next
experiments. We describe state-of-the-art attempts to realize this vision and identify resource
gaps. We posit that over the coming decade, this combination of tools will transform the way
we perform materials research. There are considerable first-mover advantages at stake,
especially for grand challenges in energy and related fields, including computing, healthcare,
urbanization, water, food, and the environment.
The development of novel materials has long been stymied by a mismatch of time constants
(Figure 1). Materials development typically occurs over a 15–25-year time horizon, sometimes
requiring synthesis and characterization of millions of samples. However, corporate and
government funders desire tangible results within the residency time of their leadership, typically
2–5 years. The residency time for postdocs and students in a research laboratory is usually 2–5
years; when a project outlasts the residency of a single individual, seamless continuity of
motivation and intellectual property is often the exception, not the rule. Market drivers of novel
materials development, informed by business competition and environmental considerations, often
demand solutions within a shorter time horizon. This mismatch in time constants results in a
historically poor return-on-investment of energy-materials (cleantech) research relative to
comparable investments in medical or software development.1
Figure 1. Timelines for materials discovery and development. Timelines of example
technologies (blue area), typical academic funding grants (orange), development capacity (green),
and deployment of sustainable energy (e.g., via solar cells) to fulfill the 2030 climate targets.
To bridge this mismatch in time horizons and increase the success rate of materials research, both
public- and private-sector actors endeavor to develop new paradigms for materials development.
The U.S. Materials Genome Initiative focused on three “missing links”: computational tools to
focus experimental efforts in the most promising directions, data repositories to aggregate
learnings and identify trends, and higher-throughput experimental tools.2 This call to action was
mirrored in industry and by university- and laboratory-led consortia, many focused on simulation-
based inverse design and discovery and properties databases. As these tools matured, the
throughput of materials prediction often vastly outstripped experimentalists’ ability to screen for
materials with low rates of false negatives.
Today, a new paradigm is emerging for experimental materials research, which promises
to enable more rapid discovery of novel materials.3,4 Figure 2 illustrates one such prototypical
vision, entitled “accelerated materials development and manufacturing”: rapid, automated
feedback loops guided by machine learning, with an emphasis on value creation through end
products and industry transfer. There is a unique opportunity today to develop these capabilities in
testbed fashion, with considerable improvements in research productivity and first-mover
advantages at stake.
Figure 2. Schematic of the accelerated materials discovery process. The automated feedback loop,
guided by machine learning, drives process improvement. The theory, synthesis, and device
processes take advantage of high-performance computing and materials databases. Icons from Ref.
31.
As is often the case with convergent technologies, one observes significant advances in individual
“silos” before the leveraged ensemble effect bears its full impact. A historical example is three-
dimensional printing, wherein 3D computer-aided design (CAD), computer-to-hardware interface
protocols, and ink-jet printing technologies evolved individually, before being combined by Prof.
Ely Sachs and his MIT team into the first 3D printer. The ability to observe emergent technologies
within individual silos, and assemble them into an ensemble that is greater than the sum of its
parts, mirrors the challenge in novel materials development today. The following paragraphs
describe the discrete, emergent innovations in “siloed” domains that are presently converging, and
promise to enable this paradigm shift within the next decade.
Theory: Today, the rate of theoretical prediction vastly outstrips the rate of experimental synthesis,
characterization, and validation.5 This emergence is enabled by three trends: faster computation,
more efficient and accurate theoretical approaches and simulation tools, and the ability to screen
large databases quickly, such as MaterialsProject.org. To bridge the growing gap between theory
and experiment, researchers are increasingly focusing efforts on predictive materials synthesis
routes, especially synthesis routes that consider environmental factors (e.g., humidity), reaction-
energy barriers, and kinetic limitations (so-called “non-equilibrium” synthesis).17 In parallel,
theorists seek to rationally design materials with combinations of properties — first, by predicting
combinations of properties (e.g., chemical, microstructural, interface, surface…) in one simulation
framework and/or database, then connecting material predictions with device performance &
reliability predictions, then extending this framework to both known and not-yet-discovered
compounds, and ultimately, solving the inverse problem.
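As a concrete illustration of the database-screening step described above, the snippet below filters a list of computed entries by descriptor thresholds (band gap, hull distance, effective mass). The formulas, values, and cutoffs are invented placeholders, not records from MaterialsProject.org; a minimal sketch:

```python
# Minimal sketch of descriptor-based screening over computed-materials data.
# All entries and thresholds below are illustrative placeholders.
candidates = [
    {"formula": "A2BX6", "band_gap_eV": 1.4, "hull_dist_eV": 0.01, "effective_mass": 0.3},
    {"formula": "ABX3",  "band_gap_eV": 1.6, "hull_dist_eV": 0.00, "effective_mass": 0.2},
    {"formula": "AX2",   "band_gap_eV": 3.2, "hull_dist_eV": 0.15, "effective_mass": 1.1},
]

def passes_screen(entry):
    """Keep thermodynamically stable compounds with a solar-relevant gap
    and light carriers (proxy descriptors for photovoltaic performance)."""
    return (1.1 <= entry["band_gap_eV"] <= 1.8
            and entry["hull_dist_eV"] <= 0.05
            and entry["effective_mass"] <= 0.5)

shortlist = [e["formula"] for e in candidates if passes_screen(e)]
print(shortlist)  # only the stable, low-gap, light-mass entries survive
```

In practice such filters run over tens of thousands of computed entries, and the surviving shortlist is what gets handed to the experimental pipeline.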
High-Throughput Materials, Device, and Systems Synthesis: Historically, slow vacuum-based
deposition methods have inhibited materials development. Modern vacuum-based tools, including
combinatorial approaches and large-scale, fast serial deposition/reactions, enable meaningful rate
increases for materials and device synthesis.29,30 Variants of existing deposition methods (e.g.,
close-space sublimation) offer higher growth rates, point-defect control, and precise stoichiometry
and impurity control for process-compatible materials. Solution synthesis has gained acceptance
with the emergence of higher-quality precursors and materials, including CdS quantum dots,
polymer solar cells, and lead-halide perovskites.5,6 The growing diversity of precursors (from
molecular to nanoparticle), synthesis control (including solvent engineering), and thin-film
synthesis methods (lab-based spin-coating to industrially-compatible large-area printing) makes
this a powerful and flexible platform to deposit a range of new materials. Emergence of 3D printed
materials provides another ubiquitous alternative. At laboratory scale, throughputs for such rapid
synthesis routes5,7 can be up to an order of magnitude greater than vacuum-based techniques, and
remain to be explored for multinary materials with novel microstructures. With declining
component costs and greater adoption of standards, the ability to rapidly combine discrete devices
into components and systems in a modular and flexible manner is emerging.
Defect Tolerance & Engineering: Often, theoretical predictions are made for “ideal” materials
systems. However, real samples contain defects (impurities, structural defects…), which can harm
(or, occasionally, benefit) bulk and interface properties. To mitigate the risk of defect-induced
false negatives during high-throughput materials screening, it is desirable to identify classes of
materials less adversely affected by defects (so-called “defect-tolerant”8,9), and rapidly diagnose
& decouple the effects of defects on material performance. A notable recent example is the
serendipitous discovery of lead-halide perovskites for optoelectronic applications.6,7 In addition to
being amenable to high-throughput solution-phase deposition, lead-halide perovskites also
required orders of magnitude less research effort to achieve similar performance improvements to
traditional inorganic thin-film materials (Figure 3). It is suspected that this ease of improvement
is owed in part to the increased defect tolerance of lead-halide perovskites, resulting in
improved bulk-transport properties. Determining the underlying physics of and developing design
rules for defect tolerance may inform screening criteria for new materials, especially with new
computational tools such as Generative Adversarial Networks (GANs) that are state-of-the-art in
anomaly detection.22,23 The next step lies in focusing experimental effort on candidates capable
of rapid performance improvements during early screening and development, and wider process
tolerance in manufacturing. In relation to the beneficial aspects of defects and impurities, recent
theory advancements15 in combination with computational tools to rapidly assess and predict
solubility and electrical properties of defects16 allow high-throughput screening of materials for
applications where the desired functionality is enabled by the defects and/or dopants (e.g.,
thermoelectrics, transparent electronics…).
Figure 3. A case study of fast materials development based on photovoltaic applications. a.
certified power conversion efficiency (PCE) over time for CdTe and perovskite solar cells. b.
Number of J-V sweeps measured per percentage point of efficiency gained during the
device development of CdTe and perovskite solar cells. Three orders of magnitude fewer J-V
sweeps per percentage efficiency improvement were needed to advance perovskite efficiencies
relative to traditional thin-film solar cell materials. We hypothesize that this difference is partially
due to greater “defect tolerance” of perovskites, enabling a faster and more economical materials
development process.
High-Throughput Diagnosis: Characterization tools have also benefitted from high-throughput
computing, automation, and machine learning. For instance, one high-resolution X-ray
photoelectron spectroscopy spectrum could take an entire day with technology from the 1970s,
while the same measurement today requires less than an hour. Today, advanced statistics and
machine learning promise to further accelerate the rate of learning. Tools now exist that can
acquire multiple XPS spectra on a single sample (e.g., with composition gradients), and automated
spectral analysis of large datasets is now possible, enabling identification of unknown materials in a
compositional map. Others seek to replace spectroscopy with rapid non-destructive testing; several
bulk and interface properties can be simultaneously diagnosed by using Bayesian inference in
combination with non-destructive device testing, enabling 10x faster (and in certain cases, more
precise) diagnosis vis-à-vis traditional characterization tools.10 This kind of parameter estimation
can be applied to finished components, devices, and systems, and has the potential to not only
enable faster troubleshooting, but also to accurately estimate ultimate performance potential, thus
informing the decision to pursue or abandon further investment in a given candidate material even
at early stages of materials screening.
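The Bayesian parameter estimation idea above (cf. Ref. 10) can be illustrated with a toy example: a hidden material parameter is inferred from noisy, indirect measurements by evaluating a forward model over a grid of candidate values. The exponential-decay model, noise level, and parameter values are invented for illustration:

```python
import numpy as np

# Toy Bayesian parameter estimation: infer a hidden parameter (here, a
# carrier lifetime "tau") from a fast, indirect measurement via a forward
# model. Model form and numbers are invented for illustration.
rng = np.random.default_rng(0)

def forward_model(tau, t):
    return np.exp(-t / tau)  # e.g., a normalized transient decay

true_tau, noise = 2.0, 0.02
t = np.linspace(0.1, 5.0, 40)
data = forward_model(true_tau, t) + noise * rng.standard_normal(t.size)

taus = np.linspace(0.5, 5.0, 500)  # grid over the unknown parameter
# Gaussian likelihood with a flat prior -> posterior on the grid
log_post = np.array([-0.5 * np.sum((data - forward_model(tau, t))**2) / noise**2
                     for tau in taus])
post = np.exp(log_post - log_post.max())
post /= post.sum()

estimate = taus[np.argmax(post)]
print(round(estimate, 2))  # close to the true value of 2.0
```

The same grid (or sampling-based) machinery extends to several simultaneous parameters, which is what enables decoupling multiple bulk and interface properties from a single non-destructive device measurement.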
Machine Learning comprises a broad class of approaches, which may play several different roles
in the future materials-development cycle. First, a common application of machine learning is for
materials selection, in which historical experimental observations are used to inform predictions
of future properties (attributes) of unknown compounds, or discover new ones.24 Such an approach
has been realized to help discover novel active layers in organic solar cells11 and light-emitting
diodes12, and metal alloys13,27, among many others.28 Second, machine learning tools can help
extract greater and more accurate information from diagnosis, as detailed in the previous section.
Third, machine learning tools may help close the automation loop between diagnosis and synthesis,
shown in Figure 2, by reducing the degree of human intervention and reliance on heuristics. For
example, when relationships between experimental inputs and diagnosis outputs can be inferred
by neural networks, detailed process and device models may no longer be needed to predict
outcomes and optimize processes. All three applications of machine learning to the materials
development cycle benefit from the availability of more data, to train and sharpen the predictive
capacity of such tools.
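The loop-closing role of machine learning can be sketched in a few lines: fit past (process input, measured output) pairs with a surrogate model, then propose the next experiment at the surrogate's predicted optimum. The process variable, response values, and quadratic surrogate below are hypothetical stand-ins for a trained neural network:

```python
import numpy as np

# Sketch of a surrogate model closing the synthesis-diagnosis loop. The
# process variable (anneal temperature) and response (efficiency) are
# hypothetical; a quadratic stands in for a learned model.
anneal_T = np.array([100., 150., 200., 250., 300.])   # past process inputs
efficiency = np.array([5.0, 8.5, 10.2, 9.1, 6.0])     # measured outputs

coeffs = np.polyfit(anneal_T, efficiency, deg=2)      # fit the surrogate
grid = np.linspace(100., 300., 201)                   # candidate settings
pred = np.polyval(coeffs, grid)                       # predicted response

next_T = grid[np.argmax(pred)]                        # propose next experiment
print(round(next_T))  # near the interior optimum around 200
```

Each new measurement is appended to the training data and the surrogate is refit, so the loop runs without a detailed process or device model.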
Achieving predictability without losing physical insights is an emergent challenge and
research opportunity. Such methods may also increase learning from diagnosis, by consolidating
research output in singular databases, drawing automated inferences from the data, and in the
future perhaps aggregating the experience and knowledge base via natural language processing of
existing research papers and materials property databases.
Envisioning the “Hardware Cloud”: Materials synthesis equipment today is becoming
increasingly remotely operable—enabling research and operation by an investigator who is not
physically present at the deposition equipment. This opens up two related opportunities with far-
reaching consequences. Large, expensive, synthesis equipment can be grouped together with
massively parallel characterization equipment to form synthesis centers of the future, which are
operated by remote users and researchers and managed by an on-site professional staff. Akin to
the Software Cloud, where one’s computing and data are stored across machines
worldwide in a seamless manner, a Hardware Cloud would enable a user to deposit, measure and
carry out research (with real time feedback through in-situ characterization tools) across a number
of networked materials processing systems distributed nationally or internationally in a seamless
manner. This also leads to the second opportunity: to be able to store, curate, access, process and
diagnose all data gathered in these networked experiments in Public or Private Clouds. (Protocols
and formats for such science data collectives will be discussed in the following paragraphs.) This
will help address two emerging needs: (a) increasing the efficient availability of data across a
wide number of experiments and experimental platforms for post-analysis; and (b) making
available for analysis data that indicates “what did not work” — this is not easily available but is
instrumental in the learning process, and has its own value in increasing the collective efficiency
of research progress.
Infrastructure Investments Toward Accelerated Materials Development and
Manufacturing: Realizing the vision shown in Figure 2 requires a sustained commitment over
several years to develop software, hardware, and human resources, and to connect these new
capabilities in testbed fashion.
Investments in applied machine learning: Supported by ample investments into machine-learning
methods development, a pressing challenge is how to down-select and apply the most appropriate
machine-learning methods to enable the “automated feedback loop” shown in Figure 2. Compared
to other widely recognized applications of machine learning today (e.g., vision recognition,
natural-language processing, and board gaming), materials research often involves sparse data sets
(e.g., small sample sizes and number of experimental inputs & outputs, for training and fitting)
and less well-constrained “rules” (e.g., complex physics and chemistry, non-binary inputs and
outputs, large experimental errors, uncontrolled input variables, and incomplete characterization
of outputs, to name a few). These realities make the typical materials-science problem (e.g., layer-
by-layer atomic assembly of a thin film) decidedly more complex and less well defined than a
match of “Go,” where the rules and playing board are constrained. Deep machine learning (DML)
appears well poised to address this complexity. Computation speed can be improved by developing
“pre-trained” neural networks that incorporate the underlying physics and chemistry common to
materials synthesis, performance, and defects, bringing DML within reach of commonly available
hardware and software.
A balance must be found between achieving actionable results and inferring physical
insight from “black-box” computational methods, to advance both engineering and scientific
objectives, and minimize unintended consequences. There is a need to apply “white box” (i.e.,
opposite of black box) machine learning methods to materials science problems. One possible
approach may be application of semi-supervised deep learning algorithms, which learn from large
amounts of unlabeled data and very little labeled data.25
Lastly, the ability of machine-learning tools to adapt to uncontrolled and changing
experimental conditions is essential. Promising developments include online deep learning, which
builds neural networks on the fly, gradually adding neurons (e.g., as baseline experimental
conditions change, or as new physics becomes dominant).26
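A cartoon of such a growing model, not a faithful reimplementation of the cited work: a toy radial-basis-function regressor that adds a unit whenever its error on a newly observed sample exceeds a threshold, so capacity grows only where the data demand it:

```python
import numpy as np

# Toy "growing" regressor: add a radial-basis unit whenever the prediction
# error on a new sample exceeds a tolerance. A cartoon of online/growing
# networks, with invented width and tolerance values.
centers, weights = [], []
width, tol = 0.5, 0.1

def predict(x):
    if not centers:
        return 0.0
    phi = np.exp(-((x - np.array(centers))**2) / (2 * width**2))
    return float(phi @ np.array(weights))

def observe(x, y):
    err = y - predict(x)
    if abs(err) > tol:        # model inadequate here: grow it
        centers.append(x)
        weights.append(err)   # new unit cancels the local error

xs = np.linspace(0, 3, 30)
for x in xs:
    observe(x, np.sin(x))     # stream samples of an unknown curve

print(len(centers))  # far fewer units than samples
```

Because each new unit is centered at the offending sample, the model tracks drifting baselines without retraining from scratch.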
Investment in standards governing data formatting and storage would facilitate data entry into
machine-learning software. Standards embed contextual know-how, hierarchy, and rational thought.
Some communities have implemented standards governing raw and processed data, e.g.,
crystallography, genetics, and geography. However, in most materials-research communities, there
are no universally accepted and implemented data standards. Several materials databases have
been created, often specialized by material class or application, and with varying protocols for
updating information and enforcing hygiene. Furthermore, these databases often lack the ability to
quickly & accurately predict device-relevant combinations of properties (e.g., chemical,
mechanical, optoelectronic, microstructural, surface, interface…). Several data standards have
been proposed19–21; widespread adoption may hinge on that of the data-management
systems described in the next paragraph.
burden of data aggregation will shift onto natural language processors18, i.e., computer programs
designed to extract relevant data from available media (e.g., publications, reports, presentations,
and theses).
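A toy illustration of that literature-mining idea: a regular expression pulls (material, band gap) pairs from free text. Real extraction pipelines18 are far more sophisticated, and the sentence below is invented:

```python
import re

# Toy literature mining: extract (material, band gap) pairs from free text.
# The sentence is invented; real pipelines handle far messier phrasing.
text = ("MAPbI3 has a band gap of 1.55 eV, and CdTe has a band gap of "
        "1.45 eV.")

pattern = re.compile(r"(\w+) has a band gap of ([\d.]+)\s*eV")
pairs = {m.group(1): float(m.group(2)) for m in pattern.finditer(text)}
print(pairs)
```

Scaled across thousands of papers, the resulting structured records are exactly the kind of input a materials database, and the machine-learning tools above it, would consume.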
Investments in data-management tools (e.g., informatics systems) are needed to manage data
obtained from lab equipment and store records, coordinate tasks, and enforce protocols. On one
hand, such systems have been shown to be of high value for well-defined research problems and
tool sets. For early-stage materials research, data management tools require a deft balance between
flexibility and standardization, and the ability to accommodate non-standard workflows, multiple
participants, and equipment spread across multiple sites, including shared-use facilities, in an
elegant and seamless manner. When implemented well, data-management systems can increase
the quality, uniformity, and accessibility of data serving as inputs into machine-learning tools;
when implemented too inflexibly, data-management systems can cause frictions to researcher
workflow and stimulate their resistance. It is possible that, as suggested by Rafael Jaramillo (MIT),
metadata-based distributed data-management systems may warrant strong consideration for early-
stage materials research; a challenge will be how to capture metadata in an automated, accurate,
thorough, and comprehensive manner.
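One possible shape for such an automatically captured experiment record, sketched as JSON; all field names and values are hypothetical, not a proposed standard:

```python
import json
import platform
import time

# Hypothetical automated metadata record for one synthesis step, as a
# metadata-based data-management system might store it. Field names and
# values are illustrative only.
record = {
    "sample_id": "S-0042",
    "process": {"step": "spin_coat", "spin_rpm": 4000, "solvent": "DMF"},
    "environment": {"captured_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
                    "host": platform.node()},  # captured without user input
    "measurement": {"tool": "XRD", "file": "S-0042_xrd.csv"},
}
print(json.dumps(record, indent=2)[:120])
```

The point of the sketch is that every field except `sample_id` can be populated by the tool itself, sidestepping the accuracy and thoroughness problems of manual entry.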
Investments in infrastructure are needed, to increase throughput of synthesis, device-fabrication,
and diagnosis tools. The potential of automation must be realized, without sacrificing material
quality and offsetting the advantages of higher throughput with an increase in false negatives. The
emergence of multi-parameter estimation methodologies, including Bayesian inference and
Design of Experiments (DoE) algorithms, invites the invention of new non-destructive diagnostic
apparatus designed to take full advantage of these new methodologies.
There are significant challenges associated with producing and analyzing large quantities
of data. New tools being developed by machine learning specialists invite the possibility of
modifying hardware design to take advantage of machine-learning tools, rather than the other way
around.
Revised policies at institution, funding agency, and government levels may accelerate or
stymie the required ongoing investments at levels large and small, and invite consideration of how
export control laws, import duties, grant purchasing restrictions, overhead rates, auditing, and
claw-back clauses affect the equipment investments required to enable this transformation.
Human-Capital Investments Toward Accelerated Materials Development and
Manufacturing:
Investments in human capital are required to prepare researchers to leverage these new tools. The
transition from being “data-poor” to being “data-rich” invites changes in how we think, how we
incentivize, and how we teach.
How we think: In a “data-poor” world, the time and cost of conducting each experiment is relatively
large, and a risk-averse mindset is advantageous. In a “data-rich” world, a larger number of
unique experiments can be conducted per unit time, meaning that failure of any given experiment
will have lesser negative impact on a researcher’s milestones and publication record. This will
enable researchers to experiment with greater creativity and risk-taking. This has three important
implications for “how we think”: First, a greater premium will be placed on experimental concept
and design, as researchers who design experiments amenable to new tools will be rewarded.
Second, a decreasing cost per experiment may lower the barriers for junior researchers to
establish themselves and reduce the premium on initial investment, prompting new as well as
established researchers to explore new fields.
Third, an accelerated materials development framework invites a system-level
perspective14 that mirrors the new tools. Greater experimental throughput suggests that devices
and systems may increasingly be analyzed holistically in lieu of isolated sub-components, test
structures, and proxies. A “data-rich” world will allow us to analyze complex systems more
directly, with less need to break them into sub-components or impose a priori simplifications,
even without complete visibility into each sub-component. Wielding these new computer-based
tools to greatest effect requires that researchers learn to “think” like machine-learning
algorithms, appreciating the nuances and trade-offs of different approaches; this mindset change
provides an opportunity to identify weak links faster, focusing effort on those parameters with
the highest returns on investment.
Incentives: Efforts to encourage the mindset change and transitions mentioned in the previous
section will encounter friction, governed in part by the professional incentives of decades-old
institutions. Young researchers will be encouraged to take proactive steps if they are
rewarded by hiring committees, promotion committees, fellowship & awards committees, journal
editors, and conference committees. Funding agencies could encourage open-source development
of equipment that enables integration of high throughput synthesis of materials with data
management. Industries may see value in funding solution-driven system-level approaches to
accelerate their development timelines.
Community: Realizing this future requires merging domain expertise currently resident in robotics,
software, computer science, electronics, materials, and design silos, each with their own language
/ acronyms, and academic conferences. The learning curve to become even a generalist in these
different domains remains very steep. Reducing barriers to communication and achieving
percolation of ideas across domains may be facilitated via cross-cutting conferences, workshops,
and creation of funded research centers. Adoption of best practices across various fields can be
encouraged via these percolation pathways of ideas.
Education and Up-Skilling: Public opinion of (read: support for or opposition to) ML/AI is
influenced by whether citizens can envision a hopeful future that includes their employment and
empowers society. First, these transformations require individuals at all levels and employment
types to be willing to up-skill. Educators at all levels have an opportunity to revamp their curricula,
considering both technical and societal impacts. Online tools and courses for machine-learning /
artificial intelligence are growing in availability, but direct applications to materials science and
systems engineering are needed. Second, we are invited to consider whether how we teach reflects the most
suitable skills and mindsets to harness the full potential of accelerated materials development &
manufacturing platforms. Domain expertise in supporting fields, including advanced statistics, will
increase in utility with the mainstreaming of system-level design of experiments. Third, the
scientific method will still be valid, and the premium will only increase for asking the right
questions, designing good experiments, and disseminating results well.
Conclusions
The convergence of high-performance computing, automation, and machine learning promises to
accelerate the rate of materials discovery, better aligning investor and stakeholder timelines. These
new tools are set to become an indispensable part of the scientific process. >10x faster synthesis,
device fabrication, and diagnostics in a (semi-)automated feedback loop are distinctly possible in the
near future. Discrete advances in theory, high-throughput materials, device, systems synthesis,
diagnostics, the understanding of defects and defect tolerance, and machine learning are enabling
this transition. There are several infrastructure and human-capital needs to enable this future,
including greater emphasis on appropriate applications of existing methods to materials-relevant
problems, adoption of data and metadata standards, data-management tools, and laboratory
infrastructure, including both decentralized and centralized facilities. Integrating these tools into
R&D ecosystems depends in part on several human elements — namely, the time needed to
evolve incentive structures, community support, education & up-skilling offerings, and researcher
mindsets, as our field transitions from thinking “data poor” to thinking “data rich.” We envision a
scientific laboratory where the process of materials discovery continues without disruptions, aided
by computational power augmenting the human mind, and freeing the latter to perform research
closer to the speed of imagination, addressing societal challenges in market-relevant timeframes.
Acknowledgements: The ideas represented herein evolved in discussion with numerous
individuals, including but not limited to Riley Brandt, Danny Ren, Felipe Oviedo, Daniil Kitchaev,
Rachel Kurchin, I. Marius Peters, Shijing Sun, Rafael Jaramillo, Gang Chen, and Anantha
Chandrakasan of MIT/SMART; Benjamin Gaddy of Clean Energy Trust; Rolf Stangl, Chaobin
He, and Anthony Cheetham of NUS; Dirk Weiss and Raffi Garabedian of First Solar; BJ Stanbery
of Siva Power; Karthik Kumar, Sir John O’Reilly, Pavitra Krishnaswamy, Alfred Huan, and Cedric
Troadec of A*STAR; Andrij Zakutayev, Stephan Lany, Dave Ginley, Greg Wilson, and William
Tumas of NREL; and Lydia Wong, Shuzhou Li, Tim White, and Subbu Venkatraman of NTU,
among many others.
References:
1. B.E. Gaddy, V. Sivaram, T.B. Jones, L. Wayman, “Venture capital and cleantech: The wrong
model for energy innovation,” Energy Policy 102, 385–395 (2017).
2. A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D.
Skinner, G. Ceder and K. A. Persson, “Commentary: The Materials Project: A materials
genome approach to accelerating materials innovation,” APL Materials 1, 011002 (2013).
3. N. Nosengo, “The Material Code: Machine-learning techniques could revolutionize how
materials science is done,” Nature 533, 22 (2016).
4. P. De Luna, J. Wei, Y. Bengio, A. Aspuru-Guzik, E. Sargent, “Use machine learning to find
energy materials,” Nature 552, 23 (2017).
5. E. O. Pyzer-Knapp, C. Suh, R. Gómez-Bombarelli, J. Aguilera-Iparraguirre and A. Aspuru-
Guzik, “What is high-throughput virtual screening? A perspective from organic materials
discovery,” Annual Review of Materials Research 45, 195–216 (2015).
6. A. Kojima, K. Teshima, Y. Shirai and T. Miyasaka, “Organometal halide perovskites as
visible-light sensitizers for photovoltaic cells,” Journal of the American Chemical Society 131,
6050–6051 (2009).
7. M. Graetzel, R. A. J. Janssen, D. B. Mitzi and E. H. Sargent, “Materials interface engineering
for solution-processed photovoltaics,” Nature 488, 304–312 (2012).
8. W.-J. Yin, T. Shi and Y. Yan, “Unusual defect physics in CH3NH3PbI3 perovskite solar cell
absorber,” Applied Physics Letters 104, 063903 (2014)
9. R. E. Brandt, V. Stevanović, D. S. Ginley and T. Buonassisi, “Identifying defect-tolerant
semiconductors with high minority-carrier lifetimes: beyond hybrid lead halide perovskites,”
MRS Communications 5, 265–275 (2015).
10. R. E. Brandt, R. C. Kurchin, V. Steinmann, D. Kitchaev, C. Roat, S. Levcenco, G. Ceder, T.
Unold and T. Buonassisi, “Rapid photovoltaic device characterization through Bayesian
parameter estimation,” Joule 1, 843-856 (2017).
11. S. A. Lopez, B. Sanchez-Lengeling, J. de Goes Soares and A. Aspuru-Guzik, “Design
principles and top non-fullerene acceptor candidates for organic photovoltaics,” Joule 1, 857-
870 (2017).
12. R. Gómez-Bombarelli, J. Aguilera-Iparraguirre, T.D. Hirzel, D. Duvenaud, D. Maclaurin,
M.A. Blood-Forsythe, H.S. Chae, M. Einzinger, D.-G. Ha, T. Wu, G. Markopoulos, S. Jeon,
H. Kang, H. Miyazaki, M. Numata, S. Kim, W. Huang, S.I. Hong, M. Baldo, R.P. Adams, A.
Aspuru-Guzik, “Design of efficient molecular organic light-emitting diodes by a high-
throughput virtual screening and experimental approach,” Nature Materials 15, 1120–1127
(2016).
13. B.D. Conduit, N.G. Jones, H.J. Stone, G.J. Conduit, “Design of a nickel-base superalloy using
a neural network,” Materials & Design 131, 358–365 (2017).
20
14. C.R. Cox, J.Z. Lee, D.G. Nocera, T. Buonassisi, “Ten-percent solar-to-fuel conversion with
nonprecious materials,” Proceedings of the National Academy of Sciences 111, 14057–14061
(2014)
15. C. Freysoldt, B. Grabowski, T. Hickel, J. Neugebauer, G. Kresse, A. Janotti, and C. G. Van de
Walle, “First-principles calculations for point defects in solids” Reviews of Modern Physics
86, 253 (2014).
16. A. Goyal, P. Gorai, H. Peng, S. Lany, and V. Stevanovic, “A Computational Framework for
Automation of Point Defect Calculations”, Computational Materials Science 130, 1 (2017).
17. C.L. Phillips and P. Littlewood, “Preface: Special Topic on Materials Genome,” APL Materials
4, 053001 (2016).
18. E. Kim, K. Huang, A. Saunders, A. McCallum, G. Ceder, E. Olivetti, “Materials Synthesis
Insights from Scientific Literature via Text Extraction and Machine Learning,” Chemistry of
Materials 29, 9436–9444 (2017).
19. J. Hill, A. Mannodi-Kanakkithodi, R. Ramprasad, B. Meredig, “Materials Data Infrastructure
and Materials Informatics” in D. Shin, J. Saal (eds) Computational Materials System Design.
Springer (2017). ISBN 978-3-319-68280-8
20. R. Ananthakrishnan, K. Chard, I. Foster, S. Tuecke, “Globus platform-as-a-service for
collaborative science applications,” Concurrency and Computation.: Practice and Experience
27, 290–305 (2015)
21. K. Chard, E. Dart, I. Foster, D. Shifflett, S. Tuecke, J. Williams, “The Modern Research Data
Portal: a design pattern for networked, data-intensive science,” PeerJ Computer Science 4,
e144 (2018).
21
22. H. Zenati, C. Foo, B. Lecouat, G. Manek, V. Chandrasekhar, “Efficient GAN-based Anomaly
Detection,” arxiv
23. G.L. Guimaraes, B. Sanchez-Lengeling, C. Outeiral, P.L.C. Farias, A. Aspuru-Guzik,
“Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation
Models”, arxiv (2017)
24. L. Ward, A. Agrawal, A. Choudhary, C. Wolverton, “A general-purpose machine learning
framework for predicting properties of inorganic materials,” npj Computational Materials 2,
16028 (2016).
25. B. Lecouat, C. Foo, H. Zenati, V. Chandrasekhar, “Semi-supervised Deep Learning with
GANs: Revisiting Manifold Regularization,” arxiv
26. S. Ramasamy, K. Rajaraman, P. Krishnaswamy, V. Chandrasekhar, “Online Deep Learning:
Growing RBM on the fly,” arxiv
27. O.N. Senkov, J.D. Miller, D.B. Miracle, C. Woodward, “Accelerated exploration of multi-
principal element alloys with solid solution phases,” Nature Communications 6, 6529 (2015).
28. M. L. Green, C. L. Choi, J. R. Hattrick-Simpers, A. M. Joshi, I. Takeuchi, S. C. Barron, E.
Campo, T. Chiang, S. Empedocles, J. M. Gregoire, A. G. Kusne, J. Martin, A. Mehta, K.
Persson, Z. Trautt, J. Van Duren, and A. Zakutayev, “Fulfilling the promise of the materials
genome initiative with high-throughput experimental methodologies,” Applied Physics
Reviews 4, 011105 (2017).
29. J. Eid, H. Liang, I. Gereige, S. Lee, J. Van Duren, “Combinatorial study of NaF addition in
CIGSe films for high efficiency solar cells,” Progress in Photovoltaics 23, 269–280 (2015).
22
30. M.K. Jeon, J.S. Cooper, P.J. McGinn," “Investigation of PtCoCr/C catalysts for methanol
electro-oxidation identified by a thin film combinatorial method,” Journal of Power Sources
192, 391–395 (2009).
31. Icons in Figure 2 are freeware, and were made by Freepik from www.flaticon.com