Comment
https://doi.org/10.1038/s41467-024-45834-7
Fudging the volcano-plot without dredging
the data
Thomas Burger
Selecting omic biomarkers using both their effect size and their differential status significance (i.e., selecting the “volcano-plot outer spray”) has long been equally biologically relevant and statistically troublesome. However, recent proposals are paving the way to resolving this dilemma.
In their recent Nature Communications article, Bayer et al. present the
tool CurveCurator1 to select biomarkers according to their dose-
response profiles, with well-established statistical guarantees. To
conveniently blend the effect size and the significance of the dose-
response curve into a single relevance score, they revisit the so-called
fudge factor introduced in the SAM test2. Moreover, to overcome the
risk of involuntary data dredging inherent to “fudging” the differential
analysis3, they propose a new approach inspired by the target-decoy
competition framework (TDC4). The principle of TDC is to add coun-
terfactual amino acid sequences (termed decoys) to a (target) data-
base of real amino acid sequences, so as to mimic erroneous matches in a
peptide identification task. Despite its original empirical-only justifi-
cations (peptide matches involving decoy sequences should be as
probable as mismatches involving target sequences), TDC has long
been used in mass spectrometry-based proteomics to validate peptide
identifications according to a False Discovery Rate (FDR5) threshold.
Accordingly, Bayer et al. claim FDR control guarantees regardless of
the fudge factor tuning. Several recent works in selective inference (a
subfield of high-dimensional statistics) have provided theoretical
support to their intuition6,7, which justifies its generalization to a variety of similar situations. Concretely, this comment asserts that essentially any omics data analysis involving a volcano-plot is concerned, be it transcriptomics, metabolomics, proteomics or any other, at either bulk or single-cell resolution. Therefore, elaborating on Bayer et al.'s visionary proposal should lead to new user-tailored computational
omic tools, with sweeping consequences from the application
standpoint.
Issues pertaining to the fudge factor
While the fudge factor was originally introduced as a small positive constant (denoted s0) to improve the independence between the test-statistic variance and the omic feature expression, tuning it to a larger value has been observed to yield a user-defined weighting of the significance and the effect size. Concomitantly, the permutation-based procedure of the SAM test has sometimes been replaced by classical p-value adjustment, as prescribed by the Benjamini-Hochberg (BH) procedure for FDR control5. Applying these two tricks simultaneously
enhances volcano-plot interpretation: the biomarkers selected are
located in the outer spray of the volcano-plot, with selection
boundaries following hyperbolic contours (see Fig. 1). Unfortunately, doing so jeopardizes the statistical guarantees: briefly, too large an s0 value distorts the p-values, as well as the subsequent adjusted p-values calculated in the BH procedure. To cope with this, it is necessary either to constrain the tuning of s0 (at the cost of a less flexible selection of the outer spray) or to replace the BH procedure with another FDR control method that does not require any p-value adjustment. Although the permutation-based procedure associated with the SAM test is an option, it does not strictly control the FDR (see Table 1). Bayer et al. have thus explored another option inspired by TDC, which emerged nearly twenty years ago in proteomics, in the absence of p-values to assess the significance of peptide identifications.
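To make the interplay concrete, here is a minimal sketch (with hypothetical function and variable names; it is not the SAM or CurveCurator implementation) of how the fudge factor enters a SAM-style moderated statistic:

```python
import numpy as np

def sam_statistic(group_a, group_b, s0=0.0):
    """SAM-style moderated statistic: the fudge factor s0 is added to the
    per-feature standard error, damping the significance of features whose
    variance is small but whose effect size (mean difference) is also small.
    group_a, group_b: arrays of shape (n_features, n_replicates)."""
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    n_a, n_b = group_a.shape[1], group_b.shape[1]
    diff = group_a.mean(axis=1) - group_b.mean(axis=1)  # effect size
    pooled_var = ((group_a.var(axis=1, ddof=1) * (n_a - 1)
                   + group_b.var(axis=1, ddof=1) * (n_b - 1))
                  / (n_a + n_b - 2))
    se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))  # standard error of diff
    return diff / (se + s0)  # s0 = 0 recovers the ordinary t-statistic
```

With s0 = 0 this is the ordinary two-sample t-statistic; increasing s0 progressively down-weights low-variance, low-fold-change features, which is what bends the selection boundary into the hyperbolic contours of Fig. 1.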
Competition-based alternatives to control for the FDR
Although published a decade later, the most convincing theoretical support of TDC to date has been knock-off filters (or KO)6,7. In spite of minor discrepancies with TDC8, KO mathematically justifies TDC's general approach to FDR control, as well as its main computational steps.
Notably, it demonstrates that FDR can be controlled on a biomarker
selection task by thresholding a contrast of relevance scores, which
results from a pairwise competition between the real putative
Fig. 1 | A typical volcano-plot. A significance measure is depicted on the Y-axis (here, -log10(p-value)) and an effect size is depicted on the X-axis (here, the logarithmized fold-change). The blue lines represent the contours of the relevance score and the points highlighted in red are those selected according to a knockoff procedure.
nature communications (2024) 15:1392 | 1
biomarkers and fictionalized ones, respectively referred to as decoys and knock-offs in the proteomics and statistics parlances. Intuitively, the proportion of fictionalized features selected should be a
decent proxy of the ratio of false discoveries [Nota Bene: In KO theory, this proportion is corrected by adding 1 to the ratio numerator to cope with a bias issue. Although this bias is still under investigation9, this suggests correcting Eq. 16 in ref. 1 by adding 1 to the numerator too.], as long as the
decision is made symmetrically (i.e., their relevance score is attributed
regardless of their real/fictional status). However, despite conceptual
similarities, the problems solvable by TDC and KO differ: for the former, features are classically amino acid sequences, while for the latter,
a quantitative dataset describing biomolecular expression levels in
response to various experimental conditions is classically considered.
In this context, the TDC extension proposed in CurveCurator to pro-
cess quantitative dose-response curves constitutes a nice bridge
between the TDC and KO kingdoms.
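The competition mechanism itself fits in a few lines. The sketch below (hypothetical names; a deliberate simplification of both TDC and the KO filter) thresholds a contrast of relevance scores using the +1-corrected decoy ratio discussed above:

```python
import numpy as np

def competition_fdr_threshold(target_scores, decoy_scores, fdr=0.05):
    """Competition-based FDR control (TDC/knockoff flavour): for a score
    cutoff t, the number of decoys scoring >= t serves as a proxy for the
    number of false targets scoring >= t. The +1 in the numerator is the
    bias correction discussed in the text. Returns the smallest cutoff whose
    estimated FDR is below the requested level, or None if none qualifies."""
    target_scores = np.asarray(target_scores, dtype=float)
    decoy_scores = np.asarray(decoy_scores, dtype=float)
    for t in np.sort(target_scores):  # ascending: first admissible t is smallest
        n_targets = np.sum(target_scores >= t)
        n_decoys = np.sum(decoy_scores >= t)
        if (1 + n_decoys) / max(n_targets, 1) <= fdr:
            return t
    return None
```

Selected biomarkers are then the targets whose score meets or exceeds the returned cutoff; the symmetric-decision requirement is respected as long as the scoring function never looks at the real/fictional label.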
Generalizing the CurveCurator approach
With this in mind, the pragmatic fallout of Bayer et al.'s work becomes striking. Any data analyst wishing to select omic biomarkers with a relevance score tracing hyperbolic contours on a volcano-plot (see Fig. 1) can easily adapt the CurveCurator approach to their own case, by following this procedure:
(1) Perform statistical tests to obtain a p-value for each putative biomarker, assessing the significance of its differential status,
(2) Likewise, compute the biomarker fold-change, as a measure of the
effect size, and construct the volcano-plot,
(3) Tune s0 to blend the significance of the differential status and the effect size into a single relevance score,
(4) Acknowledge that the relevance score looks like a p-value even though it may not be valid to use it as such, depending on the s0 chosen,
(5) Rely on the KO framework (e.g., using the “knockoff” R package (https://cran.r-project.org/web/packages/knockoff/index.html)), as well as on the numerous tutorials available (https://web.stanford.edu/group/candes/knockoffs/software/knockoffs/), to control the FDR on the biomarkers selected according to the relevance score, in a way similar to that of CurveCurator.
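For illustration, the five steps can be sketched end-to-end under two stated assumptions: the relevance score is taken as |log2 fold-change| divided by (pooled standard error + s0), and the fictionalized counterparts come from a crude per-feature permutation of replicate labels, which does not carry the theoretical guarantees of proper knockoff generation (step (5) should use a dedicated KO package in practice). All names are hypothetical:

```python
import numpy as np

def select_outer_spray(cond_a, cond_b, s0=0.1, fdr=0.05, seed=0):
    """Illustrative end-to-end selection on the volcano-plot outer spray.
    cond_a, cond_b: log2 expression matrices, shape (n_features, n_replicates).
    Returns the indices of the selected putative biomarkers."""
    rng = np.random.default_rng(seed)

    def relevance(a, b):
        # Steps (1)-(3): blend effect size and significance into one score.
        n_a, n_b = a.shape[1], b.shape[1]
        fc = a.mean(axis=1) - b.mean(axis=1)  # (2) log2 fold-change
        pooled = (a.var(axis=1, ddof=1) * (n_a - 1)
                  + b.var(axis=1, ddof=1) * (n_b - 1)) / (n_a + n_b - 2)
        se = np.sqrt(pooled * (1 / n_a + 1 / n_b))  # (1) significance via SE
        return np.abs(fc) / (se + s0)  # (3) fudge factor s0 bends the contours

    target = relevance(cond_a, cond_b)
    # Fictionalize: per-feature shuffle of replicate labels (crude decoys).
    pooled_data = np.concatenate([cond_a, cond_b], axis=1)
    shuffled = rng.permuted(pooled_data, axis=1)
    decoy = relevance(shuffled[:, :cond_a.shape[1]],
                      shuffled[:, cond_a.shape[1]:])
    # Step (5): competition-based FDR threshold, with the +1 bias correction.
    for t in np.sort(target):
        if (1 + np.sum(decoy >= t)) / max(np.sum(target >= t), 1) <= fdr:
            return np.flatnonzero(target >= t)
    return np.array([], dtype=int)
```

Step (4)'s caution is implicit here: the relevance score is never converted into a p-value, so no BH adjustment is ever applied to it.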
Different FDR control frameworks for different situations
An important and possibly troublesome feature of Fig. 1 is that some “unselected” black points are surrounded by “selected” red ones.
other words, some putative biomarkers may not be retained while
other ones with smaller effect size and larger raw p-value are. This is a
classical drawback of competition-based FDR control methods: whether each putative biomarker is retained depends not only on its own features, but also on those of its fictionalized counterpart, whose generation is subject to randomness. Although this weakness can be addressed too, it requires less straightforward tools10. Another still
open problem in KO theory lies in the KO/decoy generation, which can
be difficult depending on the dataset. In this respect, the approach of CurveCurator is worthwhile. More generally, no method is perfect:
KO filters, like p-value adjustment or permutation-based control, have pros and cons (see Table 1). Therefore, depending on the data analyst's need, the preferred method should change. Considering this need for multiple off-the-shelf tools, it is important to notice that KO filters have hardly spread beyond the theoretical community so far, and that their applications to enhance data analysis in biology-centered investigations are still scarce, unfortunately. In this context, the seminal proposal of Bayer et al. can be expected to foster the translation of these
fast-evolving theories into practical and efficient software with grow-
ing importance in biomarker discoveries, and they must be acknowl-
edged for this.
Thomas Burger
1Univ. Grenoble Alpes, INSERM, CEA, UA13 BGE, CNRS, CEA, FR2048 ProFI, 38000 Grenoble, France. e-mail: thomas.burger@cea.fr
Received: 21 December 2023; Accepted: 2 February 2024;
References
1. Bayer, F. P., Gander, M., Kuster, B. & The, M. CurveCurator: a recalibrated F-statistic to
assess, classify, and explore significance of dose–response curves. Nat. Commun. 14,
7902 (2023).
2. Tusher, V. G., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. 98, 5116–5121 (2001).
3. Giai Gianetto, Q., Couté, Y., Bruley, C. & Burger, T. Uses and misuses of the fudge factor in quantitative discovery proteomics. Proteomics 16, 1955–1960 (2016).
4. Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214 (2007).
5. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc.: Ser. B (Methodol.) 57, 289–300 (1995).
6. Barber, R. F. & Candès, E. J. Controlling the false discovery rate via knockoffs. Ann. Stat. 43, 2055–2085 (2015).
7. Candès, E., Fan, Y., Janson, L. & Lv, J. Panning for gold: ‘model-X’ knockoffs for high dimensional controlled variable selection. J. R. Stat. Soc. Ser. B: Stat. Methodol. 80, 551–577 (2018).
Table 1 | Pros and cons of the various approaches to FDR control with respect to selecting biomarkers on the outer spray of the volcano-plot

P-value adjustment / q-value
• Advantages: Standard, easy to apply and computationally efficient.
• Disadvantages: Requires well-calibrated p-values. Issues with FC filtering3,11,12, whether following hyperbolic contours or not.

Empirical Bayes / null
• Advantages: Can cope with most of the drawbacks of the above methods (p-value calibration and FC interaction).
• Disadvantages: Requires the capability to tune the priors. Does not have a frequentist interpretation, which may hinder objective significance assessment13.

Permutations
• Advantages: The multiple-test correction is non-parametric. No calibration issue. Related works based on FDP bounding authorize double-dipping14.
• Disadvantages: Strictly speaking, does not control the FDR; instead, it provides a probabilistic upper bound on the FDP15. The fudge factor should not be tuned in contradiction to the statistical guidelines2.

Knock-offs / decoys
• Advantages: Flexibility of the relevance score.
• Disadvantages: Unstable w.r.t. KO generation10. Difficulty of assessing the KO generation (which can lead to overly conservative FDR control).
8. Etourneau, L. & Burger, T. Challenging targets or describing mismatches? A comment on common decoy distribution by Madej et al. J. Proteome Res. 21, 2840–2845 (2022).
9. Rajchert, A. & Keich, U. Controlling the false discovery rate via competition: Is the +1 needed? Stat. Probab. Lett. 197, 109819 (2023).
10. Nguyen, T. B., Chevalier, J. A., Thirion, B. & Arlot, S. Aggregation of multiple knockoffs. In International Conference on Machine Learning 7283–7293 (PMLR, 2020).
11. McCarthy, D. J. & Smyth, G. K. Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics 25, 765–771 (2009).
12. Ebrahimpoor, M. & Goeman, J. J. Inflated false discovery rate due to volcano plots: problem and solutions. Brief. Bioinform. 22, bbab053 (2021).
13. Burger, T. Can Omics Biology Go Subjective because of Artificial Intelligence? A Comment on “Challenges and Opportunities for Bayesian Statistics in Proteomics” by Crook et al. J. Proteome Res. 21, 1783–1786 (2022).
14. Enjalbert-Courrech, N. & Neuvial, P. Powerful and interpretable control of false discoveries in two-group differential expression studies. Bioinformatics 38, 5214–5221 (2022).
15. Hemerik, J. & Goeman, J. J. False discovery proportion estimation by permutations: confidence for significance analysis of microarrays. J. R. Stat. Soc. Ser. B: Stat. Methodol. 80, 137–155 (2018).
Acknowledgements
This work was supported by grants from the French National Research Agency: ProFI project (ANR-10-INBS-08), GRAL CBH project (ANR-17-EURE-0003) and MIAI@Grenoble Alpes (ANR-19-P3IA-0003).
Author contributions
Conceptualization (TB), bibliography (TB), analysis (TB), manuscript writing (TB).
Competing interests
The author declares no competing interests.
Additional information
Correspondence and requests for materials should be addressed to Thomas Burger.
Peer review information Nature Communications thanks the anonymous reviewer(s) for their
contribution to the peer review of this work.
Reprints and permissions information is available at
http://www.nature.com/reprints
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2024