NumPy's API and array protocols expose new arrays to the ecosystem. In this example, NumPy's 'mean' function is called on a Dask array. The call succeeds by dispatching to the appropriate library implementation (in this case, Dask) and results in a new Dask array. Compare this code to the example code in Fig. 1g.
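The pattern the caption describes can be reproduced in a few lines; the array shape and chunking below are illustrative assumptions, not the values used in the figure.

```python
import numpy as np
import dask.array as da

# Build a chunked Dask array (shape and chunk size are arbitrary here).
x = da.random.random((10000, 10000), chunks=(1000, 1000))

# Calling the NumPy function on the Dask array dispatches, via NumPy's
# array protocols, to Dask's own implementation of mean.
result = np.mean(x)

print(type(result))      # dask.array.core.Array, not numpy.ndarray
print(result.compute())  # evaluation is lazy and happens only on request
```

The key point is that the result stays a Dask array, so downstream NumPy-based code keeps working without modification.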

Source publication
Article
Full-text available
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astrono...

Contexts in source publication

Context 1
... NumPy has developed a culture of using time-tested software engineering practices to improve collaboration and reduce error [30]. This culture is not only adopted by leaders in the project but also enthusiastically taught to newcomers. The NumPy team was early to adopt distributed revision control and code review to improve collaboration (Fig. ...
Context 2
... facilitate this interoperability, NumPy provides 'protocols' (or contracts of operation) that allow specialized arrays to be passed to NumPy functions (Fig. 3). NumPy, in turn, dispatches operations to the originating library as required. Over four hundred of the most popular NumPy functions are supported. The protocols are implemented by widely used libraries such as Dask, CuPy, xarray and PyData/Sparse. Thanks to these developments, users can now, for example, scale their computation from ...
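As a rough sketch of how such a protocol works in practice, the class below implements NumPy's __array_function__ dispatch hook (NEP 18) for a toy array type; the class and the single handled function are illustrative, not taken from the paper.

```python
import numpy as np

class DiagonalArray:
    """Toy 'duck array' that stores only the diagonal of an n x n matrix."""

    def __init__(self, n, value):
        self._n = n
        self._value = value

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook when a DiagonalArray is passed to one of its
        # functions; the class decides how (or whether) to handle the call.
        if func is np.sum:
            return self._n * self._value
        return NotImplemented  # decline anything we do not implement

d = DiagonalArray(5, 2.0)
print(np.sum(d))  # 10.0 -- computed by DiagonalArray, not by NumPy itself
```

In the same way, Dask, CuPy, xarray and PyData/Sparse handle the NumPy functions they support and return arrays of their own type.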

Citations

... For all experiments we relied heavily on NumPy [71] and SciPy [72]. To obtain samples from the weight posterior we use Hamiltonian Monte Carlo with the no-u-turn sampler (see, e.g., [73,74]) implemented in numpyro [75,76] with jax [77] as a back-end; the model itself is implemented using the front-end provided by pymc [78]. ...
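For context, a minimal NUTS run with numpyro and jax looks like the sketch below; the toy model is purely illustrative and is not the model of the cited work, which is specified through the pymc front-end.

```python
import jax.random as random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def model(x, y=None):
    # Toy regression weight with a standard normal prior (illustrative only).
    w = numpyro.sample("w", dist.Normal(0.0, 1.0))
    numpyro.sample("obs", dist.Normal(w * x, 0.1), obs=y)

# Hamiltonian Monte Carlo with the No-U-Turn Sampler, as referenced above.
mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=1000)
mcmc.run(random.PRNGKey(0), x=1.0, y=2.0)
posterior_w = mcmc.get_samples()["w"]
```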
Preprint
    Neural networks possess the crucial ability to generate meaningful representations of task-dependent features. Indeed, with appropriate scaling, supervised learning in neural networks can result in strong, task-dependent feature learning. However, the nature of the emergent representations, which we call the 'coding scheme', is still unclear. To understand the emergent coding scheme, we investigate fully-connected, wide neural networks learning classification tasks using the Bayesian framework, where learning shapes the posterior distribution of the network weights. Consistent with previous findings, our analysis of the feature learning regime (also known as the 'non-lazy', 'rich', or 'mean-field' regime) shows that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity. In linear networks, an analog coding scheme of the task emerges. Despite the strong representations, the mean predictor is identical to the lazy case. In nonlinear networks, spontaneous symmetry breaking leads to either redundant or sparse coding schemes. Our findings highlight how network properties such as scaling of weights and neuronal nonlinearity can profoundly influence the emergent representations.
    ... ODEs executed in COPASI were specified using generated SBML from Antimony specification. Parameter sampling for ODE simulation with libRoadRunner was performed using methods provided in SciPy v1.11.3 (Virtanen et al., 2020). The GSW ABM NetLogo implementation was executed in NetLogo v6. ...
    Preprint
    Full-text available
      Reproducibility is a foundational standard for validating scientific claims in computational research. Stochastic computational models are employed across diverse fields such as systems biology, financial modelling and environmental sciences. Existing infrastructure and software tools support various aspects of reproducible model development, application, and dissemination, but do not adequately address independently reproducing simulation results that form the basis of scientific conclusions. To bridge this gap, we introduce the Empirical Characteristic Function Equality Convergence Test (EFECT), a data-driven method to quantify the reproducibility of stochastic simulation results. EFECT employs empirical characteristic functions to compare reported results with those independently generated by assessing distributional inequality, termed EFECT error, a metric to quantify the likelihood of equality. Additionally, we establish the EFECT convergence point, a metric for determining the required number of simulation runs to achieve an EFECT error value of a priori statistical significance, setting a reproducibility benchmark. EFECT supports all real-valued and bounded results irrespective of the model or method that produced them, and accommodates stochasticity from intrinsic model variability and random sampling of model inputs. We tested EFECT with stochastic differential equations, agent-based models, and Boolean networks, demonstrating its broad applicability and effectiveness. EFECT standardizes stochastic simulation reproducibility, establishing a workflow that guarantees reliable results, supporting a wide range of stakeholders, and thereby enhancing validation of stochastic simulation studies, across a model's lifecycle. To promote future standardization efforts, we are developing open source software library libSSR in diverse programming languages for easy integration of EFECT.
      ... Our experiments were carried out using PyTorch [58] and the TransformersLens library [52]. We performed our data analysis using NumPy [30] and Pandas [77]. Our figures were made using Plotly [60]. ...
      Preprint
      Full-text available
        Despite their widespread use, the mechanisms by which large language models (LLMs) represent and regulate uncertainty in next-token predictions remain largely unexplored. This study investigates two critical components believed to influence this uncertainty: the recently discovered entropy neurons and a new set of components that we term token frequency neurons. Entropy neurons are characterized by an unusually high weight norm and influence the final layer normalization (LayerNorm) scale to effectively scale down the logits. Our work shows that entropy neurons operate by writing onto an unembedding null space, allowing them to impact the residual stream norm with minimal direct effect on the logits themselves. We observe the presence of entropy neurons across a range of models, up to 7 billion parameters. On the other hand, token frequency neurons, which we discover and describe here for the first time, boost or suppress each token's logit proportionally to its log frequency, thereby shifting the output distribution towards or away from the unigram distribution. Finally, we present a detailed case study where entropy neurons actively manage confidence in the setting of induction, i.e. detecting and continuing repeated subsequences.
        ... To process all analyses, several Python and machine learning libraries needed to be declared at the beginning of the code. In general, libraries like Sklearn, NumPy, Pandas, and Matplotlib were declared in the first step [48][49][50][51]. ...
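A typical import block matching the libraries named in that passage might look like the following (illustrative only; the cited study's exact imports are not shown):

```python
# Declared once at the top of the analysis script.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
```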
        Article
        Full-text available
        Perception is essential for robotic systems, enabling effective interaction with their surroundings through actions such as grasping and touching. Traditionally, this has relied on integrating various sensor systems, including tactile sensors, cameras, and acoustic sensors. This study leverages commercially available tactile sensors for hardness classification, drawing inspiration from the functionality of human mechanoreceptors in recognizing complex object properties during grasping tasks. Unlike previous research using customized sensors, this study focuses on cost-effective, easy-to-install, and readily deployable sensors. The approach employs a qualitative method, using the Shore hardness taxonomy to select objects and evaluate the performance of commercial off-the-shelf (COTS) sensors. The analysis includes data from both individual sensors and their combinations, analysed using multiple machine learning approaches, with accuracy as the primary evaluation metric. The findings illustrate that increasing the number of classification classes impacts accuracy, achieving 92% in binary classification, 82% in ternary, and 80% in quaternary scenarios. Notably, the performance of commercially available tactile sensors is comparable to those reported in the literature, which range from 50% to 98% accuracy, achieving 92% accuracy with a limited data set. These results highlight the capability of COTS tactile sensors in hardness classification, giving accuracy levels of 92%, while being cost-effective and easier to deploy than customized tactile sensors.
        ... To the best of our knowledge, our implementation is the first open-source toolkit for simulating quantum homomorphic encryption. Our software implementation relies heavily on the NumPy library [15]. For the HE component of our implementation, we selected a Python implementation of the Brakerski-Fan-Vercauteren (BFV) levelled classical homomorphic encryption scheme in [16], that is based on the works [17], [18]. ...
        Preprint
        Full-text available
          Quantum homomorphic encryption (QHE), allows a quantum cloud server to compute on private data as uploaded by a client. We provide a proof-of-concept software simulation for QHE, according to the "EPR" scheme of Broadbent and Jeffery, for universal quantum circuits. We demonstrate the near-term viability of this scheme and provide verification that the additional cost of homomorphic circuit evaluation is minor when compared to the simulation cost of the quantum operations. Our simulation toolkit is an open-source Python implementation, that serves as a step towards further hardware applications of quantum homomorphic encryption between networked quantum devices.
          ... equation, V_box denotes the total volume of the simulation box, and Ĩ represents the Fourier transform of the simulated line intensity projected onto a 3D grid cell (voxel), I_cell. To calculate Ĩ, we utilize the NumPy FFT module (Harris et al. 2020). The intensity of each grid cell can be written as (Dumitru et al. 2019) ..., where V_cell represents the volume of the cell. Similarly, we calculate the cross-power spectrum from the simulation box using the equation (Dumitru et al. 2019) ...
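As a rough illustration of the step that uses the NumPy FFT module, the helper below estimates a spherically averaged auto power spectrum from a gridded intensity cube; the function name, binning, and normalisation conventions are assumptions, not taken from the cited paper.

```python
import numpy as np

def auto_power_spectrum(intensity_cube, box_size, n_bins=20):
    """Spherically averaged P(k) of a cubic intensity grid (illustrative)."""
    n = intensity_cube.shape[0]
    v_box = box_size ** 3       # total volume of the simulation box
    v_cell = v_box / n ** 3     # volume of a single voxel

    # Fourier transform of the gridded intensity; the V_cell factor makes
    # the discrete FFT approximate the continuous transform.
    i_tilde = np.fft.fftn(intensity_cube) * v_cell
    power_3d = np.abs(i_tilde) ** 2 / v_box

    # Magnitude of the wavevector for every voxel of the FFT grid.
    k = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()

    # Average |I~(k)|^2 in spherical shells of |k|.
    bins = np.linspace(k_mag[k_mag > 0].min(), k_mag.max(), n_bins + 1)
    which = np.digitize(k_mag, bins)
    p_k = np.array([
        power_3d.ravel()[which == i].mean() if np.any(which == i) else np.nan
        for i in range(1, n_bins + 1)
    ])
    return 0.5 * (bins[:-1] + bins[1:]), p_k
```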
          Article
          Full-text available
          Line intensity mapping (LIM) serves as a potent probe in astrophysics, relying on the statistical analysis of integrated spectral line emissions originating from distant star-forming galaxies. While LIM observations hold the promise of achieving a broad spectrum of scientific objectives, a significant hurdle for future experiments lies in distinguishing the targeted spectral line emitted at a specific redshift from undesired line emissions originating at different redshifts. The presence of these interloping lines poses a challenge to the accuracy of cosmological analyses. In this study, we introduce a novel approach to quantify line–line cross-correlations (LIM-LLX), enabling us to investigate the target signal amid instrumental noise and interloping emissions. For example, at a redshift of z ∼ 3.7, we observed that the measured auto-power spectrum of C ii 158 exhibited substantial bias, from interloping line emission. However, cross-correlating C ii 158 with CO(6–5) lines using an FYST-like experiment yielded a promising result, with a signal-to-noise ratio of ∼10. This measurement is notably unbiased. Additionally, we explore the extensive capabilities of cross-correlation by leveraging various CO transitions to probe the tomographic Universe at lower redshifts through LIM-LLX. We further demonstrate that incorporating low-frequency channels, such as 90 and 150 GHz, into FYST’s EoR-Spec-like experiment can maximize the potential for cross-correlation studies, effectively reducing the bias introduced by instrumental noise and interlopers.
          ... Chroma [79], QPhiX [58], and GLU [57] were instrumental in the calculations of this work, with CPS [80], Grid [81], and QLua [82] playing important roles during code development. Wolfram Mathematica [83], NumPy [84,85], SciPy [86], GVAR [87], LSQFIT [88], and PANDAS [89,90] were used for data analysis. Figures were produced using Matplotlib and SEABORN [91,92]. ...
          Article
          Full-text available
          Neutrinoless double-beta (0νββ) decay is a heretofore unobserved process which, if observed, would imply that neutrinos are Majorana particles. Interpretations of the stringent experimental constraints on 0νββ-decay half-lives require calculations of nuclear matrix elements. This work presents the first lattice quantum chromodynamics (LQCD) calculation of the matrix element for 0νββ decay in a multinucleon system, specifically the nn → ppee transition, mediated by a light left-handed Majorana neutrino propagating over nuclear-scale distances. This calculation is performed with quark masses corresponding to a pion mass of mπ = 806 MeV at a single lattice spacing and volume. The statistically cleaner Σ⁻ → Σ⁺ee transition is also computed in order to investigate various systematic uncertainties. The prospects for matching the results of LQCD calculations onto a nuclear effective field theory to determine a leading-order low-energy constant relevant for 0νββ decay with a light Majorana neutrino are investigated. This work, therefore, sets the stage for future calculations at physical values of the quark masses that, combined with effective field theory and nuclear many-body studies, will provide controlled theoretical inputs to experimental searches of 0νββ decay. Published by the American Physical Society 2024
          ... The included screenshots were created using the Open3D [26] library for illustrative purposes. All calculations in the work were vectorized using NumPy [41] and SciPy [42] and processed in tabular form using pandas [43]. ...
          Preprint
          Full-text available
          Skilled labor shortages and the growing trend for customized products are increasing the complexity of manufacturing systems. Automation is often proposed to address these challenges, but industries operating under the engineer-to-order, lot-size-one production model often face significant limitations due to the lack of relevant data. This study investigates a laser-based optical worker assistance system in control cabinet manufacturing to demonstrate how geometric deep learning can enable the digitization and automation of complex manufacturing processes. An approach is presented for the extraction of assembly-relevant information, using only vendor-independent STEP files, and the integration and validation of this information in an industrial use case. This approach improves data quality and facilitates data transferability to components not listed in leading ECAD databases, suggesting broader potential for generalization across different components and use cases. In addition, an end-to-end inference pipeline without proprietary formats ensures high data integrity while approximating the surface of the underlying topology, making it suitable for small and medium-sized companies with limited computing resources. The study not only achieves the accuracy required for full automation, but also introduces the Spherical Boundary Score (SBS), a metric for evaluating the quality of assembly-relevant information and its application in real-world scenarios.
          Highlights:
          - Assembly-relevant information can be extracted from digital representations
          - Neural network predictions enable better results than information from the manufacturer
          - Assessment of three different mathematical approaches for feature detection
          - Evaluation of metrics to measure the quality of the detected assembly information
          - Application of the Spherical Boundary Score in an industrial information use case
          ... Most of the CBXPy implementation uses basic Python functionality, and the agents are handled as an array-like structure. For certain specific features, like broadcasting behaviour, array copying, and index selection, we fall back to the numpy implementation (Harris et al., 2020). ConsensusBasedX.jl has been almost entirely written in native Julia (with the exception of a single call to LAPACK).
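The three NumPy fallbacks mentioned there correspond to standard array idioms; the consensus-style update below is only a toy illustration and does not reflect CBXPy's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)
agents = rng.normal(size=(50, 3))                  # ensemble of agents, one per row
weights = np.exp(-np.linalg.norm(agents, axis=1))  # toy weighting

# Broadcasting: a (3,) consensus point is subtracted from the (50, 3) ensemble.
consensus = (weights[:, None] * agents).sum(axis=0) / weights.sum()
drift = agents - consensus

# Array copying: snapshot the previous state before updating in place.
previous = agents.copy()

# Index selection: move only the agents that are far from the consensus point.
far = np.linalg.norm(drift, axis=1) > 1.0
agents[far] -= 0.1 * drift[far]
```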
          ... DICOM files were loaded using pydicom, and then converted to NIfTI format using dcm2niix (Li et al., 2016). NiBabel and numpy (Harris et al., 2020) were used to load and manipulate NIfTI files, and Project MONAI (Cardoso et al., 2022) was used to resample, resize, and normalise each image. ...
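A minimal version of the NiBabel/NumPy loading step might look like the sketch below; the file name is a placeholder, and the Project MONAI resampling, resizing, and normalisation used in the cited pipeline is not reproduced here.

```python
import nibabel as nib
import numpy as np

# Placeholder path; in the cited pipeline the NIfTI file comes from a
# dcm2niix conversion of a DICOM series.
img = nib.load("volume.nii.gz")
data = np.asarray(img.get_fdata(), dtype=np.float32)

# Simple z-score intensity normalisation with NumPy.
data = (data - data.mean()) / (data.std() + 1e-8)
```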