Article

Detecting and describing heterogeneity in meta-analysis


Abstract

The investigation of heterogeneity is a crucial part of any meta-analysis. While it has been stated that the test for heterogeneity has low power, this has not been well quantified. Moreover the assumptions of normality implicit in the standard methods of meta-analysis are often not scrutinized in practice. Here we simulate how the power of the test for heterogeneity depends on the number of studies included, the total information (that is total weight or inverse variance) available and the distribution of weights among the different studies. We show that the power increases with the total information available rather than simply the number of studies, and that it is substantially lowered if, as is quite common in practice, one study comprises a large proportion of the total information. We also describe normal plots that are useful in assessing whether the data conform to a fixed effect or random effects model, together with appropriate tests, and give an application to the analysis of a multi-centre trial of blood pressure reduction. We conclude that the test of heterogeneity should not be the sole determinant of model choice in meta-analysis, and inspection of relevant normal plots, as well as clinical insight, may be more relevant to both the investigation and modelling of heterogeneity. © 1998 John Wiley & Sons, Ltd.
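The dependence of power on the distribution of weights, as described in the abstract, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' simulation design: the number of studies (10), total information (100), between-study variance (0.1), the 70% share given to the dominant study, and the function names are all hypothetical choices. The critical value 16.919 is the standard upper 5% point of the chi-squared distribution with 9 degrees of freedom.

```python
import random

# Upper 5% critical value of chi-squared with 9 degrees of freedom
# (standard table value); it matches the k = 10 studies used below.
CRIT_CHI2_9DF_05 = 16.919


def cochran_q(effects, weights):
    """Cochran's Q: weighted sum of squared deviations from the pooled mean."""
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def simulate_power(weights, tau2, n_sim=4000, seed=1):
    """Share of simulated meta-analyses in which Q exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        effects = []
        for w in weights:
            theta = rng.gauss(0.0, tau2 ** 0.5)  # true effect of this study
            effects.append(rng.gauss(theta, (1.0 / w) ** 0.5))  # observed effect
        if cochran_q(effects, weights) > CRIT_CHI2_9DF_05:
            rejections += 1
    return rejections / n_sim


# Hypothetical scenario: same total information, two weight distributions.
k, total_info, tau2 = 10, 100.0, 0.1
equal = [total_info / k] * k
dominant = [0.7 * total_info] + [0.3 * total_info / (k - 1)] * (k - 1)

power_equal = simulate_power(equal, tau2)
power_dominant = simulate_power(dominant, tau2)
type_one = simulate_power(equal, 0.0)  # no heterogeneity: should be near 0.05

print(f"equal weights:           power ~ {power_equal:.2f}")
print(f"one study holds 70%:     power ~ {power_dominant:.2f}")
print(f"type I error (tau2 = 0): {type_one:.2f}")
```

Holding the total information fixed while concentrating it in one study lowers the rejection rate in this setup, which is the qualitative pattern the abstract reports.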


... All studies were weighted based on the generic inverse-variance method. 20 A random-effects model was developed and built on the assumption that between-study variance results from factors other than measured treatment differences. 21 The random-effects model assumes a normal distribution of between-study variance, which is facilitated by the generally large sample sizes (>100) of the included studies. ...
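Inverse-variance weighting and a random-effects pooled estimate can be sketched as follows. The snippet does not say which between-study variance estimator was used, so the DerSimonian-Laird moment estimator here is an assumption, and the effect sizes, variances and function names are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, (1.0 / total) ** 0.5


def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of the between-study variance (assumed estimator)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(effects) - 1)) / c)  # truncated at zero


def random_effects(effects, variances):
    """Pooled estimate with tau^2 added to every within-study variance."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    pooled, se = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, se, tau2


# Hypothetical effect sizes and within-study variances for five studies.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.03, 0.02, 0.05, 0.01, 0.04]

print("fixed effect:  ", fixed_effect(effects, variances))
print("random effects:", random_effects(effects, variances))
```

When the estimated between-study variance is zero the two models coincide; otherwise the random-effects weights are more even across studies and the standard error is larger.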
... The I² statistic developed by Higgins et al describes the percent of total variation across studies attributable to heterogeneity beyond random chance. 20 Generally, values of 0% indicate no heterogeneity, and 25%, 50%, and 75% indicate low, moderate, or high heterogeneity, respectively. 20 We assumed an acceptable I² value of 50% or less. ...
... 20 Generally, values of 0% indicate no heterogeneity, and 25%, 50%, and 75% indicate low, moderate, or high heterogeneity, respectively. 20 We assumed an acceptable I² value of 50% or less. 23 In the event of an I² value > 50%, we specified a priori methods on outlier assessment and removal. ...
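For reference, the statistic is usually written in terms of Cochran's Q and the number of studies k. The worked numbers below (Q = 20, k = 11) are hypothetical and chosen only to land on the 50% threshold mentioned in the snippet.

```latex
I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%
\qquad\text{e.g.}\qquad
Q = 20,\ k = 11 \ \Rightarrow\ I^2 = \frac{20 - 10}{20} \times 100\% = 50\%
```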
Article
Full-text available
Background Rural residents have a higher prevalence of colorectal cancer (CRC) mortality compared to urban individuals. Policies have been aimed at improving access to CRC screening to reduce these outcomes. However, little attention has been paid to other determinants of CRC-related outcomes, such as stage at diagnosis, treatment, or survivorship care. The main objective of this analysis was to evaluate literature describing differences in CRC screening, stage at diagnosis, treatment, and survivorship care between rural and urban individuals. Materials and Methods We conducted a systematic review of electronic databases using a combination of MeSH and free-text search terms related to CRC screening, stage at diagnosis, treatment, survivorship care, and rurality. We identified 921 studies, of which 39 were included. We assessed methodological quality using the ROBINS-E tool and summarized findings descriptively. A meta-analysis was performed of studies evaluating CRC screening using a random-effects model. Results Seventeen studies reported disparities between urban and rural populations in CRC screening, 12 on treatment disparities, and 8 on staging disparities. We found that rural individuals were significantly less likely to report any type of screening at any time period (pooled odds ratio = 0.81, 95% CI, 0.76-0.86). Results were inconclusive for disparities in staging at diagnosis and treatment. One study reported a lower likelihood of use of CRC survivorship care for rural individuals compared to urban individuals. Conclusion There remains an urgent need to evaluate and address CRC disparities in rural areas. Investigators should focus future work on assessing the quality of staging at diagnosis, treatment, and survivorship care in rural areas.
... Categorical estimates of primary and secondary outcomes were reported as proportions and 95% confidence intervals (CI) using weighted random effects models. Continuous variables were reported as means and standard deviations; medians were used if means were not available, and standard deviations (SDs) were calculated or imputed when possible [15]. For comparative studies, effect size was calculated with weighted mean differences (WMDs) for continuous variables. ...
... WMD were handled as continuous variables using the inverse variance approach. Presence of heterogeneity across studies was defined using a Chi-square test of homogeneity with a 0.10 significance level [15]. ...
Article
Full-text available
Background: Several risk scores have attempted to risk stratify patients with acute upper gastrointestinal bleeding (UGIB) who are at a lower risk of requiring hospital-based interventions or negative outcomes including death. This systematic review and meta-analysis aimed to compare predictive abilities of pre-endoscopic scores in prognosticating the absence of adverse events in patients with UGIB. Methods: We searched MEDLINE, EMBASE, Central, and ISI Web of knowledge from inception to February 2023. All fully published studies assessing a pre-endoscopic score in patients with UGIB were included. The primary outcome was a composite score for the need of a hospital-based intervention (endoscopic therapy, surgery, angiography, or blood transfusion). Secondary outcomes included: mortality, rebleeding, or the individual endpoints of the composite outcome. Both proportional and comparative analyses were performed. Results: Thirty-eight studies were included from 2153 citations, (n = 36,215 patients). Few patients with a low Glasgow-Blatchford score (GBS) cutoff (0, ≤1 and ≤2) required hospital-based interventions (0.02 (0.01, 0.05), 0.04 (0.02, 0.09) and 0.03 (0.02, 0.07), respectively). The proportions of patients with clinical Rockall (CRS = 0) and ABC (≤3) scores requiring hospital-based intervention were 0.19 (0.15, 0.24) and 0.69 (0.62, 0.75), respectively. GBS (cutoffs 0, ≤1 and ≤2), CRS (cutoffs 0, ≤1 and ≤2), AIMS65 (cutoffs 0 and ≤1) and ABC (cutoffs ≤1 and ≤3) scores all were associated with few patients (0.01-0.04) dying. The proportion of patients suffering other secondary outcomes varied between scoring systems but, in general, was lowest for the GBS. GBS (using cutoffs 0, ≤1 and ≤2) showed excellent discriminative ability in predicting the need for hospital-based interventions (OR 0.02, (0.00, 0.16), 0.00 (0.00, 0.02) and 0.01 (0.00, 0.01), respectively). A CRS cutoff of 0 was less discriminative. 
For the other secondary outcomes, discriminative abilities varied between scores but, in general, the GBS (using cutoffs up to 2) was clinically useful for most outcomes. Conclusions: A GBS cut-off of one or less prognosticated low-risk patients the best. Expanding the GBS cut-off to 2 maintains prognostic accuracy while allowing more patients to be managed safely as outpatients. The evidence is limited by the number, homogeneity, quality, and generalizability of available data and subjectivity of deciding on clinical impact. Additional, comparative and, ideally, interventional studies are needed.
... In meta-analyses, the collected studies often exhibit heterogeneity, characterized by greater variation among studies than can be explained by the variation within each study (Beath, 2014), which could result in misleading conclusions about the overall treatment effect (Lin et al., 2017;Noma et al., 2022). The random effects model is a popular tool for handling heterogeneity (Hardy and Thompson, 1998;Wang et al., 2022). However, the standard model assumes normal distributions for both random effects and within-study errors (nMeta), making it susceptible to outlying studies. ...
Preprint
The random-effects meta-analysis model is an important tool for integrating results from multiple independent studies. However, the standard model is based on the assumption of normal distributions for both random effects and within-study errors, making it susceptible to outlying studies. Although robust modeling using the $t$ distribution is an appealing idea, the existing work, which explores the use of the $t$ distribution only for random effects, involves complicated numerical integration and numerical optimization. In this paper, a novel robust meta-analysis model using the $t$ distribution is proposed ($t$Meta). The novelty is that the marginal distribution of the effect size in $t$Meta follows the $t$ distribution, so that $t$Meta can simultaneously accommodate and detect outlying studies in a simple and adaptive manner. A simple and fast EM-type algorithm is developed for maximum likelihood estimation. Due to the mathematical tractability of the $t$ distribution, $t$Meta is free of numerical integration and allows for efficient optimization. Experiments on real data demonstrate that $t$Meta compares favorably with related competitors in situations involving mild outliers. Moreover, in the presence of gross outliers, while related competitors may fail, $t$Meta continues to perform consistently and robustly.
... The p-values were obtained by comparing the statistic with a χ² distribution with k − 1 degrees of freedom (where k is the number of trials). A p-value of < 0.10 was adopted since the Q statistic tends to suffer from low differential power [40]. The formal Q statistic was used in conjunction with the methods for assessing heterogeneity. ...
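The comparison of Q with a χ² distribution on k − 1 degrees of freedom can be sketched without external libraries, using the standard closed-form upper-tail expressions for integer degrees of freedom. The study data and function names are hypothetical; the 0.10 threshold is the one quoted in the snippet.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail term plus a finite series in odd powers of sqrt(x).
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(half))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def heterogeneity_test(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and the p-value."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical effect sizes and within-study variances for five trials.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.03, 0.02, 0.05, 0.01, 0.04]

q, df, p = heterogeneity_test(effects, variances)
print(f"Q = {q:.2f} on {df} df, p = {p:.3f}, flagged at 0.10: {p < 0.10}")
```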
Article
Full-text available
Background and aim A plateau in oxygen uptake (V̇O₂) during an incremental cardiopulmonary exercise test (CPET) to volitional exhaustion appears less likely to occur in special and clinical populations. Secondary maximal oxygen uptake (V̇O₂max) criteria have been shown to commonly underestimate the actual V̇O₂max. The verification phase protocol might determine the occurrence of ‘true’ V̇O₂max in these populations. The primary aim of the current study was to systematically review and provide a meta-analysis on the suitability of the verification phase for confirming ‘true’ V̇O₂max in special and clinical groups. Secondary aims were to explore the applicability of the verification phase according to specific participant characteristics and investigate which test protocols and procedures minimise the differences between the highest V̇O₂ values attained in the CPET and verification phase. Methods Electronic databases (PubMed, Web of Science, SPORTDiscus, Scopus, and EMBASE) were searched using specific search strategies and relevant data were extracted from primary studies. Studies meeting inclusion criteria were systematically reviewed. Meta-analysis techniques were applied to quantify weighted mean differences (standard deviations) in peak V̇O₂ from a CPET and a verification phase within study groups using random-effects models. Subgroup analyses investigated the differences in V̇O₂max according to individual characteristics and test protocols. The methodological quality of the included primary studies was assessed using a modified Downs and Black checklist to obtain a level of evidence. Participant-level V̇O₂ data were analysed according to the threshold criteria reported by the studies or the inherent measurement error of the metabolic analysers and displayed as Bland-Altman plots. Results Forty-three studies were included in the systematic review, whilst 30 presented quantitative information for meta-analysis.
Within the 30 studies, the highest mean V̇O₂ values attained in the CPET and verification phase protocols were similar (mean difference = -0.00 [95% confidence intervals, CI = -0.03 to 0.03] L·min⁻¹, p = 0.87; level of evidence, LoE: strong). The specific clinical groups with sufficient primary studies to be meta-analysed showed a similar V̇O₂max between the CPET and verification phase (p > 0.05, LoE: limited to strong). Across all 30 studies, V̇O₂max was not affected by differences in test protocols (p > 0.05; LoE: moderate to strong). Only 23 (53.5%) of the 43 reviewed studies reported how many participants achieved a lower, equal, or higher V̇O₂ value in the verification phase versus the CPET or reported or supplied participant-level V̇O₂ data for this information to be obtained. The percentage of participants that achieved a lower, equal, or higher V̇O₂ value in the verification phase was highly variable across studies (e.g. the percentage that achieved a higher V̇O₂ in the verification phase ranged from 0% to 88.9%). Conclusion Group-level verification phase data appear useful for confirming a specific CPET protocol likely elicited V̇O₂max, or a reproducible V̇O₂peak, for a given special or clinical group. Participant-level data might be useful for confirming whether specific participants have likely elicited V̇O₂max, or a reproducible V̇O₂peak; however, more research reporting participant-level data is required before evidence-based guidelines can be given. Trial registration PROSPERO (CRD42021247658) https://www.crd.york.ac.uk/prospero.
... When patients were stratified into subgroups based on the severity of acne vulgaris, we computed the pooled mean and standard deviation (SD) following the Cochrane Handbook [26]. Heterogeneity was assessed using the Cochran's Q and I² statistics [27]. If P < 0.1 for Cochran's Q statistic or if I² > 50%, we employed a random-effects model; otherwise, we applied a fixed-effects model [28]. ...
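The selection rule described in this snippet reduces to a two-condition check. The sketch below only restates that rule with hypothetical function and parameter names; the abstract at the top of this page argues that a heterogeneity test should not be the sole determinant of model choice, so this is an illustration of the quoted practice rather than a recommendation.

```python
def choose_model(q_p_value, i2_percent, p_threshold=0.10, i2_threshold=50.0):
    """Apply the quoted rule: random effects if P < 0.1 or I^2 > 50%."""
    if q_p_value < p_threshold or i2_percent > i2_threshold:
        return "random-effects"
    return "fixed-effects"


# Hypothetical inputs illustrating each branch of the rule.
print(choose_model(0.04, 30.0))  # significant Q test
print(choose_model(0.25, 62.0))  # high I^2 despite non-significant Q
print(choose_model(0.25, 30.0))  # neither condition met
```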
Article
Full-text available
The relationship between acne vulgaris and oxidative stress biomarkers lacks a clear consensus. This study aimed to explore the potential correlation between acne vulgaris and circulating oxidative stress biomarkers (superoxide dismutase [SOD], malondialdehyde [MDA], and total antioxidant capacity [TAC]). We searched the PubMed, Embase, and Cochrane Library databases for articles published before June 26, 2023. The literature search combined free words and the medical subject headings terms related to acne vulgaris, SOD, MDA, and TAC. Data were analyzed using Stata 15 software. Additionally, we conducted a subgroup analysis stratified by the severity of acne vulgaris. A total of 14 trials involving 1191 participants were included. Overall results revealed that acne vulgaris was associated with MDA concentrations (SMD = 1.73; 95% CI 1.05, 2.4; P < 0.001). Subgroup analyses indicated that the severity of acne vulgaris was correlated with levels of circulating biomarkers of oxidative stress. TAC concentrations were significantly lower in patients with moderate acne vulgaris compared to controls (SMD = − 1.37; 95% CI = − 2.15, − 0.58, P = 0.001). SOD concentrations were significantly lower (SMD = − 2.92; 95% CI = − 5.39, − 0.46, P = 0.02) and MDA concentrations were significantly higher (SMD = 2.26; 95% CI = 0.95, 3.57, P = 0.001) in patients with severe acne vulgaris compared to controls. Our results implied that oxidative stress may exist in acne vulgaris. Furthermore, the severity of acne vulgaris was also correlated with oxidative stress.
... The statistical power of the Q-test heavily relies on the number of studies included in a meta-analysis, and as a result, it may fail to detect heterogeneity due to limited power when the number of included studies is small (less than 10) or when the included studies are of small size (Huedo-Medina, Sánchez-Meca, Marín-Martínez, & Botella, 2006). Therefore, a non-significant result should not be taken as showing empirical evidence for homogeneity (Hardy & Thompson, 1998). This issue warrants serious attention, considering that a significant proportion of meta-analyses in Cochrane reviews involve only five or fewer studies (Davey, Turner, Clarke, & Higgins, 2011). ...
Article
Full-text available
Meta-analysis of proportions has been widely adopted across various scientific disciplines as a means to estimate the prevalence of phenomena of interest. However, there is a lack of comprehensive tutorials demonstrating the proper execution of such analyses using the R programming language. The objective of this study is to bridge this gap and provide an extensive guide to conducting a meta-analysis of proportions using R. Furthermore, we offer a thorough critical review of the methods and tests involved in conducting a meta-analysis of proportions, highlighting several common practices that may yield biased estimations and misleading inferences. We illustrate the meta-analytic process in five stages: (1) preparation of the R environment; (2) computation of effect sizes; (3) quantification of heterogeneity; (4) visualization of heterogeneity with the forest plot and the Baujat plot; and (5) explanation of heterogeneity with moderator analyses. In the last section of the tutorial, we address the misconception of assessing publication bias in the context of meta-analysis of proportions. The provided code offers readers three options to transform proportional data (e.g., the double arcsine method). The tutorial presentation is conceptually oriented and formula usage is minimal. We will use a published meta-analysis of proportions as an example to illustrate the implementation of the R code and the interpretation of the results.
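The double arcsine method mentioned in the abstract is commonly written as the Freeman-Tukey transformation. The tutorial's own code is in R and is not reproduced here; the following is a Python sketch of the usual formula, with a hypothetical function name and example counts, and with the one-half scaling convention assumed (some presentations omit the factor and use a variance of 1/(n + 0.5) instead).

```python
import math


def double_arcsine(events, n):
    """Freeman-Tukey double arcsine transform and its approximate variance."""
    t = 0.5 * (
        math.asin(math.sqrt(events / (n + 1)))
        + math.asin(math.sqrt((events + 1) / (n + 1)))
    )
    variance = 1.0 / (4 * n + 2)  # depends only on the sample size
    return t, variance


# Hypothetical studies: (events, sample size), including a zero-event study.
for events, n in [(0, 10), (3, 40), (500, 1000)]:
    t, var = double_arcsine(events, n)
    print(f"{events}/{n}: transformed = {t:.4f}, variance = {var:.5f}")
```

The transformed value stays finite and the variance stays well defined even for a study with zero events, which is the usual motivation for this transform over the raw proportion.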
... It is also important to note that Cochran's Q statistic should be interpreted with caution since the number of studies included in the analysis was small. 40 The funnel plot was unsymmetrical. However, given that a small number of studies were available, it was difficult to assess accurately whether any small-study bias was present or if the appearance was due to chance. ...
Article
Full-text available
Objective Summarise longitudinal observational studies to determine whether diabetes (types 1 and 2) is a risk factor for frozen shoulder. Design Systematic review and meta-analysis. Data sources MEDLINE, Embase, AMED, PsycINFO, Web of Science Core Collection, CINAHL, Epistemonikos, Trip, PEDro, OpenGrey and The Grey Literature Report were searched in January 2019 and updated in June 2021. Reference screening and emailing professional contacts were also used. Eligibility criteria Longitudinal observational studies that estimated the association between diabetes and developing frozen shoulder. Data extraction and synthesis Data extraction was completed by one reviewer and independently checked by another using a predefined extraction sheet. Risk of bias was judged using the Quality In Prognosis Studies tool. For studies providing sufficient data, random-effects meta-analysis was used to derive summary estimates of the association between diabetes and the onset of frozen shoulder. Results A meta-analysis of six case–control studies including 5388 people estimated the odds of developing frozen shoulder for people with diabetes to be 3.69 (95% CI 2.99 to 4.56) times the odds for people without diabetes. Two cohort studies were identified, both suggesting diabetes was associated with frozen shoulder, with HRs of 1.32 (95% CI 1.22 to 1.42) and 1.67 (95% CI 1.46 to 1.91). Risk of bias was judged as high in seven studies and moderate in one study. Conclusion People with diabetes are more likely to develop frozen shoulder. Risk of unmeasured confounding was the main limitation of this systematic review. High-quality studies are needed to confirm the strength of, and understand reasons for, the association. PROSPERO registration number CRD42019122963.
... This is commonly assessed via a Cochran's Q test, in which a significant result indicates that there is heterogeneity, and thus a random-effects model should be used (Kanters 2022). However, the power of the Cochran's Q test is low when the number of studies or experiments included in the analyses is low (Hardy and Thompson 1998). Consequently, some researchers advocate for the use of a less conservative criterion to assess the significance of the Cochran Q test (α = .10; ...
Article
Full-text available
Judgments of learning are most accurate when made at a delay from the initial encoding of the assessed material. A wealth of evidence suggests that this is because a delay encourages participants to base their predictions on cues retrieved from long-term memory, which are generally the most diagnostic of later memory performance. We investigated the hypothesis that different types of study techniques affect delayed JOL accuracy by influencing the accessibility of cues stored in long-term memory. In two experiments, we measured the delayed-JOL accuracy of participants who encoded semantically unrelated and weakly related word pairs through one of three study techniques: reading the pairs twice (study practice), generating keywords (elaborative encoding), or taking a cued-recall test with feedback (retrieval practice). We also measured the accessibility, utilization, and diagnostic quality of two long-term memory cues at the time of the delayed JOL: (a) retrieval of the target, and (b) noncriterial cues (retrieval of contextual details pertaining to the encoding of the target). We found that the accessibility of targets was positively associated with delayed-JOL accuracy. Further, we provide evidence that when study techniques enhance the accessibility of targets, they likewise enhance delayed-JOL accuracy.
Article
Full-text available
There is debate on the role of glial fibrillary acidic protein (GFAP) as a reliable biomarker in multiple sclerosis (MS) and neuromyelitis optica spectrum disorder (NMOSD), and its potential to reflect disease progression. This review aimed to investigate the role of GFAP in MS and NMOSD. A systematic search of electronic databases, including PubMed, Embase, Scopus, and Web of Science, was conducted up to 20 December 2023 to identify studies that measured GFAP levels in people with MS (PwMS) and people with NMOSD (PwNMOSD). R software version 4.3.3 with the random-effects model was used to pool the effect size with its 95% confidence interval (CI). Of 4109 studies, 49 studies met our inclusion criteria encompassing 3491 PwMS, 849 PwNMOSD, and 1046 healthy controls (HCs). The analyses indicated that the cerebrospinal fluid level of GFAP (cGFAP) and serum level of GFAP (sGFAP) were significantly higher in PwMS than HCs (SMD = 0.7, 95% CI: 0.54 to 0.86, p < 0.001, I² = 29%, and SMD = 0.54, 95% CI: 0.1 to 0.99, p = 0.02, I² = 90%, respectively). The sGFAP was significantly higher in PwNMOSD than in HCs (SMD = 0.9, 95% CI: 0.73 to 1.07, p < 0.001, I² = 10%). Among PwMS, the Expanded Disability Status Scale (EDSS) exhibited significant correlations with cGFAP (r = 0.43, 95% CI: 0.26 to 0.59, p < 0.001, I² = 91%) and sGFAP (r = 0.36, 95% CI: 0.23 to 0.49, p < 0.001, I² = 78%). Given that GFAP is increased in MS and NMOSD and correlates with disease features, it can be a potential biomarker in MS and NMOSD and may indicate disease progression and disability in these disorders.
Article
Full-text available
Purpose This study aimed to consolidate the evidence regarding the prognostic influence of sarcopenia in degenerative lumbar spine surgeries. Methods A literature search of public databases was conducted up to Nov 15, 2023 using combinations of the key words “sarcopenia” and “lumbar spine surgery”. Eligible studies were those that focused on adults undergoing decompression or fusion surgery for degenerative lumbar spine diseases, and compared the outcomes between patients with and without preoperative sarcopenia. Primary outcomes were change in ODI and back and leg pain VAS scores. Secondary outcomes were changes in EQ-5D, JOA, SFHS-p scores, and LOS. Results Ultimately, nine retrospective studies with a total of 993 patients were included. Sarcopenic patients exhibited significantly worse functional improvement as assessed by ODI compared to non-sarcopenic patients (pooled standardized mean difference [pSMD] = 0.53, 95% confidence interval [CI]: 0.17–0.90). Back pain (pSMD = 0.31, 95% CI: 0.15–0.47) and leg pain (pSMD = 0.21, 95% CI: 0.02–0.39) improvement were also less in sarcopenic patients. Non-sarcopenic patients had greater improvements in EQ-5D (pSMD = 0.25) and SFHS-p (pSMD = 0.39), and shorter LOS (pSMD = 0.62). Conclusions Compared to patients without sarcopenia, those with sarcopenia undergoing lumbar spine surgery for degenerative diseases have smaller improvements in functional ability, quality of life, physical health, and pain relief, and longer hospitalization.
Article
Full-text available
Background: Anterior cervical discectomy and fusion (ACDF) and cervical disc arthroplasty (CDA) are both considered to be efficacious surgical procedures for treating cervical spondylosis in patients with or without compression myelopathy. This updated systematic review and meta-analysis aimed to compare the outcomes of these procedures for the treatment of cervical degenerative disc disease (DDD) at two contiguous levels. Methods: The PubMed, EMBASE, and Cochrane CENTRAL databases were searched up to 1 May 2023. Studies comparing the outcomes between CDA and ACDF in patients with two-level cervical DDD were eligible for inclusion. Primary outcomes were surgical success rates and secondary surgery rates. Secondary outcomes were scores on the Neck Disability Index (NDI) and Visual Analogue Scale (VAS) for neck and arm pain, as well as the Japanese Orthopaedic Association (JOA) score for the severity of cervical compression myelopathy and complication rates. Results: In total, eight studies (two RCTs, four retrospective studies, and two prospective studies) with a total of 1155 patients (CDA: 598; ACDF: 557) were included. Pooled results revealed that CDA was associated with a significantly higher overall success rate (OR, 2.710, 95% CI: 1.949–3.770) and lower secondary surgery rate (OR, 0.254, 95% CI: 0.169–0.382) compared to ACDF. In addition, complication rates were significantly lower in the CDA group than in the ACDF group (OR, 0.548, 95% CI: 0.326 to 0.919). CDA was also associated with significantly greater improvements in neck pain VAS than ACDF. No significant differences were found in improvements in the arm VAS, NDI, and JOA scores between the two procedures. Conclusions: CDA may provide better postoperative outcomes for surgical success, secondary surgery, pain reduction, and postoperative complications than ACDF for treating patients with two-level cervical DDD.
Article
Statement of Problem Computer‐aided design and manufacturing (CAD/CAM) have been increasingly used to enhance the patient and clinician experiences with removable complete dentures (CDs). Yet, evidence from systematic reviews is lacking to validate the clinical significance of these digital prostheses. Purpose The purpose of this systematic review was to compare CAD/CAM CDs with the traditional ones in terms of patient and clinician‐reported outcomes, post‐insertion adjustment visits and costs. Materials and Methods An electronic search of four databases [Medline (Ovid), Embase, Scopus and Cochrane CENTRAL; last update: May 2022] was performed to retrieve clinical studies comparing CAD/CAM and traditional CDs. Two independent reviewers screened the articles, extracted data (methods and outcomes) and assessed risk of bias of the included studies. The following outcomes underwent meta‐analysis (random‐effects model): overall patient and clinician satisfaction, oral health‐related quality of life (OHRQoL), number of post‐insertion adjustment visits, as well as laboratory and total costs. Results This review included 11 studies. Meta‐analysis revealed that CAD/CAM CDs are comparable to the traditional CDs in terms of overall patient satisfaction and OHRQoL. Clinician‐reported data depended on the manufacturing technique: whereas milled CDs performed better than traditional CDs in terms of clinician satisfaction and number of adjustments, 3D printed and traditional CDs were similar. Fabrication of CAD/CAM CDs required significantly less laboratory and overall costs than the traditional CDs. Conclusions There is some evidence showing that CAD/CAM CDs are at least comparable to traditional CDs. Further well‐designed randomized clinical trials are needed to evaluate the performance of specific CAD/CAM approaches for manufacturing CDs, however.
Article
Full-text available
Geographical indication (GI) products serve as one of the significant instruments for increasing farmers’ income. While most studies affirmatively indicate that GI products contribute to boosting farmers’ income growth, it is noteworthy that their relationship does not consistently demonstrate a positive correlation. The academic discourse on this issue remains inconclusive. This study employs a meta-analysis method to reanalyze 140 effect sizes from 32 independent research samples across diverse global contexts. The findings reveal that the development of GI products significantly promotes farmer income growth, showing a high positive correlation (r = 0.348, CI = [0.104, 0.540]). Specifically, there exists a high positive correlation between GI products and per capita disposable income (r = 0.389) and a moderate positive correlation between GI products and agricultural product prices (r = 0.255). Further analysis indicates that factors at the sample level, literature level, and methodological level all exert moderating effects on the relationship between GI products and farmers’ income. This study not only provides a scientific response to the debate surrounding the relationship between GI products and farmers’ income but also delves into the underlying mechanisms. It holds significant importance for advancing the rational optimization of agricultural resources and enhancing agricultural competitiveness.
Article
Full-text available
Background Healthy lifestyle behaviors (LBs) have been widely recommended for the prevention and management of cardiovascular disease (CVD). Despite a large number of studies exploring the association between combined LBs and CVD, a notable gap exists in integration of relevant literatures. We conducted a systematic review and meta-analysis of prospective cohort studies to analyze the correlation between combined LBs and the occurrence of CVD, as well as to estimate the risk of various health complications in individuals already diagnosed with CVD. Methods Articles published up to February 10, 2023 were sourced through PubMed, EMBASE and Web of Science. Eligible prospective cohort studies that reported the relations of combined LBs with pre-determined outcomes were included. Summary relative risks (RRs) and 95% confidence intervals (CIs) were estimated using either a fixed or random-effects model. Subgroup analysis, meta-regression, publication bias, and sensitivity analysis were as well performed. Results In the general population, individuals with the healthiest combination of LBs exhibited a significant risk reduction of 58% for CVD and 55% for CVD mortality. For individuals diagnosed with CVD, adherence to the healthiest combination of LBs corresponded to a significant risk reduction of 62% for CVD recurrence and 67% for all-cause mortality, when compared to those with the least-healthy combination of LBs. In the analysis of dose-response relationship, for each increment of 1 healthy LB, there was a corresponding decrease in risk of 17% for CVD and 19% for CVD mortality within the general population. Similarly, among individuals diagnosed with CVD, each additional healthy LB was associated with a risk reduction of 27% for CVD recurrence and 27% for all-cause mortality. Conclusions Adopting healthy LBs is associated with substantial risk reduction in CVD, CVD mortality, and adverse outcomes among individuals diagnosed with CVD. 
Rather than focusing solely on individual healthy LB, it is advisable to advocate for the adoption of multiple LBs for the prevention and management of CVD. Trial registration PROSPERO: CRD42023431731.
Article
Background There is no consensus on the effect of tumor necrosis factor-alpha (TNF-alpha) inhibitors on lipid profiles in patients with psoriasis. This study aimed to investigate the effects of TNF-alpha inhibitors on lipid profiles (triglycerides, total cholesterol, low-density lipoprotein, or high-density lipoprotein) in patients with psoriasis. Methods We searched PubMed, Embase, and Cochrane Library databases for articles published before October 17, 2023. Four TNF-alpha inhibitors (infliximab, etanercept, adalimumab, and certolizumab) were included in our study. (PROSPERO ID: CRD42023469703). Results A total of twenty trials were included. Overall results revealed that TNF-alpha inhibitors elevated high-density lipoprotein levels in patients with psoriasis (WMD = 2.31; 95% CI: 0.96, 3.67; P = 0.001), which was supported by the results of sensitivity analyses excluding the effect of lipid-lowering drugs. Subgroup analyses indicated that high-density lipoprotein levels were significantly increased in the less than or equal to 3 months group (WMD = 2.88; 95% CI: 1.37, 4.4; P < 0.001), the etanercept group (WMD = 3.4; 95% CI = 1.71, 5.09, P < 0.001), and the psoriasis group (WMD = 2.52; 95% CI = 0.57, 4.48, P = 0.011). Triglyceride levels were significantly increased in the 3 to 6-month group (WMD = 4.98; 95% CI = 1.97, 7.99, P = 0.001) and significantly decreased in the 6-month and older group (WMD = -19.84; 95% CI = -23.97, -15.7, P < 0.001). Additionally, Triglyceride levels were significantly increased in the psoriasis group (WMD = 5.22; 95% CI = 2.23, 8.21, P = 0.001). Conclusion Our results revealed that TNF-alpha inhibitors might temporarily increase high-density lipoprotein levels in patients with psoriasis. However, changes in triglycerides were not consistent among the different durations of treatment, with significant increases after 3 to 6 months of treatment. 
Future prospective trials with long-term follow-up would help confirm and extend our findings. Systematic Review Registration https://www.crd.york.ac.uk/PROSPERO/, identifier CRD42023469703.
Article
The relationship between academic performance and procrastination has been well documented over the last twenty years. The current research aggregates existing research on this topic. Most of the studies either find no result or a small negative result. However, recent studies suggest that procrastination can have a positive influence on academic performance if the procrastination is active instead of passive. To analyse the effect of active procrastination on academic performance, a meta-analysis was conducted. The analysis includes 96 articles with 176 coefficients including a combined average of 55,477 participants related to the correlation between academic performance and procrastination. The analysis uncovered a modest negative correlation between academic performance and procrastination overall. Importantly, the type of procrastination exerted a substantial impact on the strength of this correlation: active procrastination demonstrated a small positive effect size, whereas passive procrastination registered a small negative effect size. Additionally, participant-specific characteristics and indicators further modulated the magnitude of the correlation. The implications of this research extend to underscoring a potential beneficial aspect of procrastination, specifically elucidating how certain types of procrastination can positively influence academic performance.
Article
Occupational exposure to carcinogens has been extensively suggested to increase cancer risk. A robust assessment of this evidence is needed to guide public policy and health care. We aimed to classify the strength of evidence for associations of 13 occupational carcinogens (OCs) and risk of cancers. We searched PubMed and Web of Science up to November 2022 to identify potentially relevant studies. We graded the evidence into convincing, highly suggestive, suggestive, weak, or not significant according to a standardized classification based on: random-effects p value, number of cancer cases, 95% confidence interval of largest study, heterogeneity between studies, 95% prediction interval, small study effect, excess significance bias and sensitivity analyses with credibility ceilings. The quality of meta-analysis was evaluated by AMSTAR 2. Forty-eight articles yielding 79 meta-analyses were included in the current umbrella review. Evidence of association was convincing (class I) or highly suggestive (class II) for asbestos exposure and increasing risk of lung cancer among smokers (RR = 8.79, 95%CI: 5.81-13.25 for cohort studies and OR = 8.68, 95%CI: 5.68-13.24 for case-control studies), asbestos exposure and increasing risk of mesothelioma (RR = 4.61, 95%CI: 2.57-8.26), and formaldehyde exposure and increasing risk of sinonasal cancer (RR = 1.68, 95%CI: 1.38-2.05). Fifteen associations were supported by suggestive evidence (class III). In summary, the current umbrella review found strong associations between: asbestos exposure and increasing risk of lung cancer among smokers; asbestos exposure and increasing risk of mesothelioma; and formaldehyde exposure and higher risk of sinonasal cancer. Other associations might be genuine, but substantial uncertainty remains.
Article
Aims To explore whether music intervention improves the quality of life (QOL) of patients undergoing hematopoietic stem cell transplantation (HSCT) and to evaluate its impact on patients' symptoms of depression/anxiety and fatigue. Methods This systematic review and meta‐analysis was conducted in accordance with the Preferred Reporting Items of Systematic reviews and Meta‐Analyses (PRISMA) guidelines. The databases PubMed, Cochrane CENTRAL, and EMBASE were searched from inception to September 30, 2022. The search strategy used a combination of the keywords “music” and “hematopoietic stem cell transplantation” or “HSCT.” The outcomes assessed were QOL, depression and anxiety, and fatigue. Pooled standardized mean differences with 95% confidence intervals were calculated to compare the outcomes between the music intervention and control groups. Heterogeneity across the studies was assessed using a chi‐square‐based test, and the I² and Q statistics. Results Meta‐analysis of the included study population showed that music intervention for patients undergoing HSCT was associated with patients' improved QOL, and resulted in reduced depression/anxiety and fatigue compared to patients without music intervention. Conclusion Music intervention benefits HSCT outcomes, including better QOL, less depression/anxiety, and less fatigue postoperatively. Future trials with larger samples are still warranted to strengthen the evidence supporting the benefits of music intervention in this patient population.
Article
Objective: to estimate the risk of thyroid cancer incidence in the population of Ukraine in connection with its exposure to radioactive iodine fallout of Chornobyl origin and the use of pesticides in agricultural production in the country. Object of study. Incidence rates of thyroid cancer in the population of Ukraine in 2001-2019, average regional radiation doses absorbed by the thyroid because of the Chornobyl accident, the volume of use of various groups of pesticides in the regions of Ukraine. Research methods: statistical, mathematical and cartographic. Results. The study, covering 2001–2019, revealed significant temporal and regional differences in thyroid cancer incidence across the regions of Ukraine. A significant correlation was established between thyroid cancer incidence and the radiation dose to the thyroid associated with the Chornobyl accident. A significant correlation was also established between thyroid cancer incidence and the intensity of pesticide use in agriculture in the regions of Ukraine. A significant multiple correlation, r = 0.5866 (p < 0.05), was found between the thyroid cancer incidence in Ukraine and the average regional radiation doses and the pesticide use intensity in agricultural production in the country. Conclusions. A statistically reliable multiple correlation was determined between thyroid cancer incidence in the population and both the average regional radiation doses to the thyroid associated with the Chornobyl accident and the intensity of pesticide use in the national economy of Ukraine. Key words: ionizing radiation, pesticides, thyroid gland, morbidity, cancer.
Article
Objective Our objective was to perform a systematic review and meta‐analysis comparing the clinical outcomes after endoscopic and microscopic type I tympanoplasty. Study Design Randomized controlled trials, two‐arm prospective studies, and retrospective studies were included. Setting Medline, Cochrane, EMBASE, and Google Scholar databases were searched until March 1, 2022 using the combinations of search terms: “endoscopic,” “microscopic,” and “tympanoplasty.” Methods Two independent reviewers utilized the abovementioned search strategy to identify eligible studies. If any uncertainty existed regarding eligibility, a third reviewer was consulted. Primary outcome measures were graft success rate, air‐bone gap (ABG) improvement, and operative time. Secondary outcomes were the rate of need for canalplasty, the proportion of self‐rated excellent cosmetic results, and pain visual analog scale (VAS). Results Forty‐three studies enrolled a total of 3712 patients who were undergoing type I tympanoplasty and were finally included. The pooled result showed endoscopic approach was significantly associated with shorter operative time (difference in means: −20.021, 95% confidence interval [CI]: −31.431 to −8.611), less need for canalplasty (odds ratio [OR]: 0.065, 95% CI: 0.026‐0.164), more self‐rated excellent cosmetic results (OR: 87.323, 95% CI: 26.750‐285.063), and lower pain VAS (difference in means: −2.513, 95% CI: −4.737 to −0.228). No significant differences in graft success rate or ABG were observed between the two procedures. Conclusion Endoscopic type I tympanoplasty provides a similar graft success rate, improvement in ABG, and reperforation rate to microscopic tympanoplasty with a shorter operative time, better self‐rated cosmetic results, and less pain. Unless contraindicated, the endoscopic approach should be the procedure of choice in type I tympanoplasty.
Article
Background Existing studies have investigated the relationship between the levels of serum inhibin B (INHB), anti-müllerian hormone (AMH) and precocious puberty in girls, but the results are inconsistent. Objective The aim of this meta-analysis was to assess whether the INHB and AMH levels changed in girls with precocious puberty relative to healthy controls. Methods PubMed, Embase, Cochrane Library and Web of Science were searched through June 2022. We included observational clinical studies reporting the serum levels of INHB and AMH in girls with precocious puberty. Conference articles and observational study abstracts were included if they contained enough information regarding study design and outcome data. Case series and reports were excluded. An overall standardized mean difference (SMD) between precocious puberty and healthy controls was estimated using a DerSimonian-Laird random-effects model. Results A total of 11 studies featuring 552 girls with precocious puberty and 405 healthy girls were selected for analysis. The meta-analysis showed that the INHB level in girls with precocious puberty [including central precocious puberty (CPP) and premature thelarche (PT)] was significantly increased, while there was no significant association between precocious puberty [including CPP, PT, premature pubarche (PP) and premature adrenarche (PA)] and the level of serum AMH. Conclusion Scientific evidence suggested that the INHB level, but not the AMH level, was altered in girls with precocious puberty compared with healthy controls. Our results suggest that the INHB level might be a marker for the auxiliary diagnosis of precocious puberty (especially CPP and PT). Therefore, it is important to evaluate and thoroughly investigate clinical indicators (e.g., INHB) to ensure early diagnosis and medical intervention and to minimize the risk of physical, psychological and social disorders in immature girls with precocious puberty.
Article
The growing research interest in the relationship between health insurance and pharmaceutical innovation is driven by their significant impact on healthcare optimization and pharmaceutical development. The existing literature, however, lacks consensus on this relationship and provides no evidence of the magnitude of a correlation. In this context, this study employs meta-analysis to explore the extent to which health insurance affects pharmaceutical innovation. It analyzes 202 observations from 14 independent research samples, using the regression coefficient of health insurance on pharmaceutical innovation as the effect size. The results reveal that there is a strong positive correlation between health insurance and pharmaceutical innovation (r = 0.367, 95% CI = [0.294, 0.436]). Public health insurance exhibits a stronger promoting effect on pharmaceutical innovation than commercial health insurance. The relationship between health insurance and pharmaceutical innovation is moderated by the country of sample origin, data range, journal type, journal impact factor, type of health insurance, and research perspective. Our research findings further elucidate the relationship mechanism between health insurance and pharmaceutical innovation, providing a valuable reference for future explorations in pharmaceutical fields.
Article
Background Although Ethiopia is working towards measles elimination, recurrent measles outbreaks have occurred. Many fragmented and inconsistent outbreak investigations have previously been carried out to guide appropriate measures, but there is no consolidated evidence on attack rate, case fatality rate, and determinants of measles infection during measles outbreaks. This systematic review and meta-analysis aimed to identify cumulative evidence on attack rate, case fatality rate, and determinants of measles infection during the outbreak. Methods A systematic literature review and meta-analysis were conducted. We searched Google Scholar, Medline/PubMed, Cochrane/Wiley Library, EMBASE, Science Direct, and African Journals Online databases using different terms. Investigations that applied any study design and any data collection and analysis methods related to measles outbreak investigation were included. Data were extracted in an Excel spreadsheet and imported into STATA version 17 software for meta-analysis. The I² statistic was used to test heterogeneity, and Begg’s and Egger’s tests were used to assess publication bias. The odds ratio (OR) with a 95% confidence interval (CI) was presented using forest plots. Results Eight measles outbreak investigations with 3004 measles cases and 33 deaths were included in this study. The pooled attack rate (A.R.) and case fatality rate were 34.51/10,000 [95% CI; 21.33–47.70/10,000] population and 2.21% [95% CI; 0.07-2.08%], respectively. Subgroup analysis revealed the highest attack rate of outbreaks in the Oromia region (63.05 per 10,000 population) and the lowest in the Amhara region (17.77 per 10,000 population). Associated factors with the measles outbreak were being unvaccinated (OR = 5.96; 95% CI: 3.28–10.82) and contact history (OR = 3.90; 95% CI: 2.47–6.15). Conclusion Our analysis revealed compelling evidence within the outbreak descriptions, highlighting elevated attack and case fatality rates.
Measles infection was notably linked to being unvaccinated and having a contact history. Strengthening routine vaccination practices and enhancing contact tracing measures are vital strategies moving forward.
Article
A positive relationship between assessments of procedural justice and attitudes toward the political system has been identified in many studies of various countries. To quantify this relationship, a meta‐analysis was conducted on 69 samples from 50,814 respondents, reported in 37 manuscripts between 1981 and 2021. We found positive correlations between assessments of procedural justice and attitudes toward politicians, political institutions, and the political system in people of different ages and in countries with different political regimes. These positive correlations exist in real and hypothetical situations with various levels of authority. However, two factors moderated the association between the assessment of procedural justice and political attitudes. First, procedural justice as a set of norms is more strongly related to attitudes toward the system than procedural justice as a generalized assessment is. Second, the assessment of procedural justice is more strongly associated with attitudes toward political institutions and the system than attitudes toward the procedures and decisions. Moreover, the percentage of heterogeneity in the obtained models is fairly high; categorical moderators explain 43% of the variance of the effects obtained. The results should therefore be interpreted with consideration of this substantial heterogeneity in the correlations' sizes.
Article
Objective Many studies have investigated the impact of precocious puberty on cardiovascular disease (CVD) outcomes and the association between lipid profile levels and precocious puberty. However, the results have been inconsistent. The aim of this meta-analysis was to evaluate whether triglyceride (TG), total cholesterol (TC), high density lipoprotein (HDL) and low density lipoprotein (LDL) levels were altered in girls with precocious puberty compared with healthy controls. Methods References published before June 2022 in the EMBASE, Cochrane Library, PubMed and Web of Science databases were searched to identify eligible studies. A DerSimonian-Laird random-effects model was used to evaluate the overall standardized mean difference (SMD) between precocious puberty and healthy controls. Subgroup analyses and sensitivity analyses were performed, and publication bias was assessed. Results A total of 14 studies featuring 1023 girls with precocious puberty and 806 healthy girls were selected for analysis. The meta-analysis showed that TG (SMD: 0.28; 95% CI: 0.01 to 0.55; P = 0.04), TC (SMD: 0.30; 95% CI: 0.01 to 0.59; P = 0.04), and LDL (SMD: 0.45; 95% CI: 0.07 to 0.84; P = 0.02) levels were significantly elevated in girls with precocious puberty. HDL levels did not change significantly (SMD: -0.06; 95% CI: -0.12 to 0.61; P = 0.62). Subgroup analyses revealed that the heterogeneity in the association between lipid profile and precocious puberty in this meta-analysis may arise from disease type, region, sample size, chronological age, body mass index difference and drug usage. Conclusion Lipid profile levels were altered in girls with precocious puberty compared with healthy controls. To minimize the risk of CVD morbidity and mortality, early interventions are needed to prevent obesity in children and adolescents, especially those with precocious puberty.
Article
Background Neonatal deaths remain a serious public health concern in Ethiopia; being one of the top five countries contributing to half of the neonatal deaths worldwide. Although antenatal care (ANC) is assumed as one of the viable options that contribute to neonatal survival, findings from original studies indicated disparities in the effect of ANC on neonatal mortality. Thus, this review aimed to determine the pooled effect of ANC on neonatal mortality in Ethiopia. Methods Databases such as PubMed, EMBASE, CINAHL, HINARI, and Cochrane Central Library were searched for articles using keywords. Selection of eligible articles and data extraction were conducted by an independent author. The risk of a bias assessment tool for non-randomized studies was used to assess the quality of the articles. Comprehensive meta-analysis version 2 software was used for meta-analysis. Heterogeneity and publication bias of included studies were assessed using I² test statistic and Egger test, respectively. The random-effect model was employed; an outcome is reported using a risk ratio with a 95% confidence interval. Results Of 28 included studies, 20 showed receiving at least one ANC visit had a significant association with neonatal mortality. Accordingly, the estimated pooled risk ratio for neonatal mortality was 0.59 (95% CI 0.45, 0.77) among infants born to women who had at least one ANC visit compared to infants born to women who had no ANC visits. Conclusion This finding indicated that neonatal mortality was decreased among infants born to women who had at least one ANC visit compared to infants born to women who had no ANC visit. Therefore, promoting and strengthening ANC service utilization during pregnancy would accelerate the reduction of neonatal mortality in Ethiopia.
Article
Do geographical indication products help facilitate the development of the agricultural economy? This problem is a point of controversy in the field of global agricultural intellectual property. For a long time, there have been different viewpoints on this problem; that is, there is a positive correlation, negative correlation, U-shape correlation, or no correlation between the geographical indication products and the development of the agricultural economy in the context of different studies. To clarify the influence mechanism between the two and explain why there are these disputes, this study used the meta-analysis method to statistically reanalyze 405 observation values provided in 64 independent research samples from the context of different regions around the world. The study results show that geographical indications not only generate more economic benefits than ordinary products but also contribute to the growth of the agricultural economy by effectively promoting the development of agricultural product trade and the enhancement of agricultural product price. There exists a low positive correlation between the geographical indication products and the agricultural economy (r = 0.176, 95% CI = [0.126, 0.225]). In addition, the promotion effect of geographical indication products on the agricultural economy is regulated by the country of origin of the samples, sample level, publication journal, data type, data acquisition approach, and research method. Our research findings further revealed the internal relationship mechanism between the geographical indication products and the agricultural economy and lay a foundation for better protecting and developing geographical indication products.
Article
Basal insulin treatment for type 2 diabetes is usually initiated on a background of oral glucose-lowering medications (OGLM). We wanted to examine the influence of various OGLMs on fasting plasma glucose (FPG) and hemoglobin A1c (HbA1c) values achieved after titration. A PubMed literature search retrieved 42 publications (clinical trials introducing basal insulin in 17 433 insulin-naïve patients with type 2 diabetes on a defined background of OGLM) reporting FPG, HbA1c, target achievement, hypoglycemic events, and insulin doses. 60 individual study arms were grouped by OGLM (combinations) allowed during the titration process: (a) metformin only; (b) sulfonylureas only; (c) metformin and sulfonylureas; or (d) metformin and dipeptidyl peptidase-4 (DPP-4) inhibitors. For all OGLM categories, weighted means and SD were calculated for baseline and end-of-treatment FPG, HbA1c, target achievement, incidence of hypoglycemic events, and insulin doses. Primary end point was a difference in FPG after titration between OGLM categories. Statistics: analysis of variance and post hoc comparisons. Sulfonylureas, alone or in combination with metformin, impair the titration of basal insulin (insulin doses 30%–40% lower, more hypoglycemic episodes), thus leading to poorer final glycemic control (p<0.05 for FPG and HbA1c after titration). Conversely, adding a DPP-4 inhibitor to metformin is superior to metformin alone (p<0.05 for FPG and HbA1c achieved) in patients with type 2 diabetes initiating basal insulin therapy. In conclusion, OGLM are a major determinant of the success of basal insulin therapy. Sulfonylureas impair, while DPP-4 inhibitors (added to metformin) may facilitate the achievement of ambitious fasting glucose targets. PROSPERO registration number CRD42019134821.
Article
Data heterogeneity determines whether sufficiently strong conclusions can be derived from synthesizing and aggregating the available literature. Multiple tools are available to calculate data heterogeneity, but each tool has pros and cons. Providing a prediction interval may be the most beneficial since it allows readers to quantify heterogeneity in a clear and clinically relevant form. However, the ultimate decision on which tool to use is left to the discretion of the researcher. This decision should be made at study inception.
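The heterogeneity tools mentioned in this abstract can be compared side by side on a small example. The sketch below is illustrative only and is not taken from any of the listed papers: the function name `heterogeneity_summary`, its `crit` parameter, and the input data are hypothetical. It computes Cochran's Q, I², the DerSimonian-Laird estimate of the between-study variance, and an approximate 95% prediction interval. By default it uses a normal quantile for the interval, which is an assumption made to stay within the standard library; the usual formulation uses a t quantile with k - 2 degrees of freedom, which gives a wider interval and can be passed in through `crit`.

```python
import math
from statistics import NormalDist


def heterogeneity_summary(effects, variances, crit=None):
    """Cochran's Q, I^2 (%), DerSimonian-Laird tau^2, pooled random-effects
    mean, and an approximate 95% prediction interval.

    effects   -- study effect estimates (hypothetical example input)
    variances -- within-study variances of those estimates
    crit      -- critical value for the prediction interval; if None, a
                 normal quantile is used as an approximation (assumption)
    """
    k = len(effects)
    if k < 3 or k != len(variances):
        raise ValueError("need at least 3 studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q and the I^2 statistic (truncated at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # Random-effects pooled mean and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    # Prediction interval for the true effect in a new study.
    if crit is None:
        crit = NormalDist().inv_cdf(0.975)
    half = crit * math.sqrt(tau2 + se_mu ** 2)

    return {"q": q, "df": df, "i2": i2, "tau2": tau2, "mu": mu,
            "se_mu": se_mu, "pi_low": mu - half, "pi_high": mu + half}


# Three made-up studies with equal precision.
result = heterogeneity_summary([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(result)
```

The prediction interval is on the same scale as the effect estimates, which is why it is often easier to interpret clinically than Q or I² alone.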
Article
The diagnostic accuracy of a screening tool is often characterized by its sensitivity and specificity. An analysis of these measures must consider their intrinsic correlation. In the context of an individual participant data meta-analysis, heterogeneity is one of the main components of the analysis. When using a random-effects meta-analytic model, prediction regions provide deeper insight into the effect of heterogeneity on the variability of estimated accuracy measures across the entire studied population, not just the average. This study aimed to investigate heterogeneity via prediction regions in an individual participant data meta-analysis of the sensitivity and specificity of the Patient Health Questionnaire-9 for screening to detect major depression. From the total number of studies in the pool, four dates were selected containing roughly 25%, 50%, 75% and 100% of the total number of participants. A bivariate random-effects model was fitted to studies up to and including each of these dates to jointly estimate sensitivity and specificity. Two-dimensional prediction regions were plotted in ROC-space. Subgroup analyses were carried out on sex and age, regardless of the date of the study. The dataset comprised 17,436 participants from 58 primary studies of which 2322 (13.3%) presented cases of major depression. Point estimates of sensitivity and specificity did not differ importantly as more studies were added to the model. However, correlation of the measures increased. As expected, standard errors of the logit pooled TPR and FPR consistently decreased as more studies were used, while standard deviations of the random-effects did not decrease monotonically. Subgroup analysis by sex did not reveal important contributions for observed heterogeneity; however, the shape of the prediction regions differed. Subgroup analysis by age did not reveal meaningful contributions to the heterogeneity and the prediction regions were similar in shape. 
Prediction intervals and regions reveal previously unseen trends in a dataset. In the context of a meta-analysis of diagnostic test accuracy, prediction regions can display the range of accuracy measures in different populations and settings.
Article
Payments for Environmental or Ecosystem Services (PES) schemes have become a popular tool to address environmental degradation and to promote sustainable management of ecosystem services. We use meta-regression analysis on a sample of 110 individual studies to investigate the determinants of the environmental effectiveness, defined as the probability to increase environmental services (ES) provision, of about 149 PES schemes implemented worldwide. We find that increased effectiveness of PES schemes is strongly associated with periodical third-party monitoring, generic reference design and to a lesser extent results-based payments. We further study the determinants of PES additionality, defined as direct changes in ES provision induced by the PES scheme, compared to a baseline without PES, on a smaller sample of 41 studies from which we could obtain the necessary data. The results confirm the role of certain design variables, such as monitoring type, and raise a potential trade-off between enrolment and additionality in the assessment of PES effectiveness.
Article
Purpose Transarterial chemoembolization (TACE) with tyrosine kinase inhibitors (TKIs) has been increasingly used to treat unresectable hepatocellular carcinoma (uHCC). However, the superiority of combination therapy to TACE monotherapy remains controversial. Therefore, here we performed a meta-analysis to evaluate the efficacy and safety of TACE plus TKIs in patients with uHCC. Methods We searched four databases for eligible studies. The primary outcome was time to progression (TTP), while the secondary outcomes were overall survival (OS), tumor response rates, and adverse events (AEs). Pooled hazard ratios (HRs) with 95% confidence intervals (95% CIs) were collected for TTP and OS, and the data were analyzed using random-effects meta-analysis models in STATA software. OR and 95% CIs were used to estimate dichotomous variables (complete remission[CR], partial remission[PR], stable disease[SD], progressive disease[PD], objective response rate[ORR], disease control rate[DCR], and AEs) using RStudio’s random-effects model. Quality assessments were performed using the Newcastle–Ottawa scale (NOS) for observational studies and the Cochrane risk of bias tool for randomized controlled trials (RCTs). Results The meta-analysis included 30 studies (9 RCTs, 21 observational studies) with 8246 patients. We judged the risk of bias as low in 44.4% (4/9) of the RCTs and high in 55.6% (5/9) of the RCTs. All observational studies were considered of high quality, with a NOS score of at least 6. Compared with TACE alone or TACE plus placebo, TACE combined with TKIs was superior in prolonging TTP (combined HR 0.72, 95% CI 0.65–0.80), OS (combined HR 0.57, 95% CI 0.49–0.67), and objective response rate (OR 2.13, 95% CI 1.23–3.67) in patients with uHCC. However, TACE plus TKIs caused a higher incidence of AEs, especially hand-foot skin reactions (OR 87.17%, 95%CI 42.88–177.23), diarrhea (OR 18.13%, 95%CI 9.32–35.27), and hypertension (OR 12.24%, 95%CI 5.89–25.42). 
Conclusions Our meta-analysis found that TACE plus TKIs may be beneficial for patients with uHCC in terms of TTP, OS, and tumor response rates. However, combination therapy is also associated with a significantly increased risk of adverse reactions. Therefore, we must evaluate the clinical benefits and risks of combination therapy. Further well-designed RCTs are needed to confirm our findings. Trial registration PROSPERO registration number: CRD42022298003.
Article
Background Studies included in a meta-analysis are often heterogeneous. The traditional random-effects models assume their true effects to follow a normal distribution, while it is unclear if this critical assumption is practical. Violations of this between-study normality assumption could lead to problematic meta-analytical conclusions. We aimed to empirically examine if this assumption is valid in published meta-analyses. Methods In this cross-sectional study, we collected meta-analyses available in the Cochrane Library with at least 10 studies and with between-study variance estimates > 0. For each extracted meta-analysis, we performed the Shapiro–Wilk (SW) test to quantitatively assess the between-study normality assumption. For binary outcomes, we assessed between-study normality for odds ratios (ORs), relative risks (RRs), and risk differences (RDs). Subgroup analyses based on sample sizes and event rates were used to rule out the potential confounders. In addition, we obtained the quantile–quantile (Q–Q) plot of study-specific standardized residuals for visually assessing between-study normality. Results Based on 4234 eligible meta-analyses with binary outcomes and 3433 with non-binary outcomes, the proportion of meta-analyses that had statistically significant non-normality varied from 15.1 to 26.2%. RDs and non-binary outcomes led to more frequent non-normality issues than ORs and RRs. For binary outcomes, the between-study non-normality was more frequently found in meta-analyses with larger sample sizes and event rates away from 0 and 100%. The agreements of assessing the normality between two independent researchers based on Q–Q plots were fair or moderate. Conclusions The between-study normality assumption is commonly violated in Cochrane meta-analyses. This assumption should be routinely assessed when performing a meta-analysis. When it may not hold, alternative meta-analysis methods that do not make this assumption should be considered.
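A Q-Q plot of standardized residuals, as described in this abstract, can be prepared with a few lines of code. The sketch below is a hedged illustration, not the procedure used in the paper: the function names `qq_points` and `qq_correlation` are hypothetical, the residual definition (y_i - mu) / sqrt(v_i + tau²) and the plotting positions (i + 0.5) / k are assumptions, and the Q-Q correlation coefficient is used as a simple stand-in for the Shapiro–Wilk test, which is not implemented here.

```python
import math
from statistics import NormalDist


def qq_points(effects, variances, mu, tau2):
    """Return (theoretical quantile, sorted standardized residual) pairs.

    mu and tau2 are the pooled mean and between-study variance, assumed to
    have been estimated beforehand (for example by DerSimonian-Laird).
    """
    z = sorted((y - mu) / math.sqrt(v + tau2)
               for y, v in zip(effects, variances))
    k = len(z)
    nd = NormalDist()
    # Plotting positions (i + 0.5) / k are one common choice (assumption).
    theo = [nd.inv_cdf((i + 0.5) / k) for i in range(k)]
    return list(zip(theo, z))


def qq_correlation(points):
    """Pearson correlation of the Q-Q points; values near 1 are consistent
    with normality, lower values suggest departures from it."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: four similar studies and one outlying study.
pts = qq_points([0.0, 0.0, 0.0, 0.0, 10.0], [1.0] * 5, mu=0.0, tau2=0.0)
print(round(qq_correlation(pts), 3))
```

The pairs returned by `qq_points` can be passed to any plotting tool; points falling far from a straight line correspond to the visual departures from normality that the abstract describes.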
Article
In recent years, meta-analysis has evolved into a critically important field of statistics, with significant applications in medicine and the health sciences. In this work we briefly present existing methodologies for conducting meta-analysis, along with the discussion and recent developments accompanying them. Undoubtedly, studies brought together in a systematic review will differ in one way or another. This yields a considerable amount of variability, any kind of which may be termed heterogeneity. For this reason, reports of meta-analyses commonly present a statistical test of heterogeneity when attempting to establish whether the included studies are similar in terms of the reported output. We intend to provide an overview of the topic, discuss the potential sources of heterogeneity commonly met in the literature, and provide useful guidelines on how to detect and address heterogeneity. Moreover, we review the recent developments in the Bayesian approach along with the various graphical tools and statistical software currently available to the analyst. In addition, we discuss sensitivity analysis issues and other approaches to understanding the causes of heterogeneity. Finally, we briefly explore heterogeneity in meta-analysis for time-to-event data, pointing out its unique characteristics.
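As a rough illustration of the statistical test of heterogeneity this abstract refers to, the sketch below computes Cochran's Q under inverse-variance weighting and the derived I² index. The function names are hypothetical, and the within-study variances are assumed known, which is the usual simplification.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q, k):
    """I^2: share of total variation attributed to heterogeneity (0 to 1)."""
    if q <= 0:
        return 0.0
    # Q has k - 1 degrees of freedom under homogeneity; truncate at zero.
    return max(0.0, (q - (k - 1)) / q)
```

Under homogeneity, Q is referred to a chi-squared distribution with k − 1 degrees of freedom, where k is the number of studies.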
Chapter
It is well acknowledged that the information acquired from a large body of evidence, based on combining multiple studies, is more useful and less biased than what can be acquired from a single work. However, like any other clinical study, the quality of the meta-analysis results depends on the selected material and the adopted methods. The analysis relies on choosing a computable effect size that can answer the research question and combining the effects of different studies in a fixed or more realistic random-effects model that considers the variability between studies. Whether related to subgroups or risk factors, true variability can be further illuminated by moderator analysis. A major concern will be whether there was a publication bias or other inclinations related to the literature being full of small studies producing clinical effects disproportionate to their small size. The heterogeneity of the study material and the multiplicity of the methods call for conducting a sensitivity analysis throughout the study to ensure that the results are robust and can be safely applied to the population.
Article
The global tea (Camellia sinensis L.) output has been in decline despite continued increases in global tea consumption and cultivation area. Application of potassium (K) fertilizer is a common way to enhance tea yield and quality, given the widespread deficiency of soil-available K in tea plantations. However, the specific effects of K fertilizer application on tea yield and quality, and how these effects differ with K application rate, soil-available K content and tea type, remain unknown. In this study, 518 pairwise comparisons between tea productivity after non-application and application of K fertilizers were obtained through a literature review, and then analyzed. The results showed that K application had a significant positive effect on tea yield and overall quality, enhancing tea yield and the content of amino acid, tea polyphenol, water extract, and catechin by 7.83%, 7.84%, 26.79%, 1.57% and 6.47%, respectively. Furthermore, a K application rate of 90–120 kg ha⁻¹ maximized the tea yield and quality. However, K application showed varied effects on tea yield and quality in soils with different available K levels. In terms of tea types, the yield and quality of black and green tea were enhanced under K application. The results of a mixed-effects model indicated that the main factors affecting tea quality were the amount of K fertilizer applied and the soil-available K content, whereas the variation in tea yield resulted mainly from the tea types. Therefore, to develop high-quality tea plantations and address the contradiction between low tea yield and high tea consumption, efforts should be made to enhance the soil-available K content by optimizing the amount of K applied on tea plantations.
Article
Background There is increasing evidence that regulatory problems (RPs), such as excessive crying, sleeping or feeding problems in infancy, could be associated with the development of behavioral problems in childhood. In this meta-analysis we aimed to investigate the strength and characteristics of this association. Methods A systematic literature search (PubMed/PsycInfo, until 15/08/2021) for longitudinal prospective studies of infants with RPs and at least one follow-up assessment reporting incidence and/or severity of behavioral problems was conducted. The primary outcomes were (i) the cumulative incidence of behavioral problems in children (2–14 years) with previous RPs and (ii) the difference between children with/without previous RPs with regard to the incidence and severity of externalizing, internalizing and/or attention-deficit/hyperactivity disorder (ADHD) symptoms. Additionally, we analyzed behavioral problems of children with previous single, multiple or no RPs and with respect to age at follow-up. Subgroup and meta-regression analyses were added. Results 30 meta-analyzed studies reported on 34,582 participants (n RP = 5091, n control = 29,491; age: baseline = 6.5 ± 4.5 months, follow-up = 5.5 ± 2.8 years) with excessive crying (studies = 13, n = 1577), sleeping problems (studies = 9, n = 2014), eating problems (studies = 3, n = 105), any single (studies = 2, n = 201) or multiple RPs (studies = 9, n = 1194). The cumulative incidence for behavioral problems during childhood was 23.3% in children with RPs. Behavioral problems were significantly more pronounced in infants with RPs compared to healthy controls (SMD = 0.381, 95% CI = 0.296–0.466, p < .001), particularly with multiple RPs (SMD = 0.291, p = 0.018). Conclusions Findings suggest that RPs in infancy are associated with overall behavioral problems (externalizing or internalizing behavior and ADHD symptoms) in childhood. 
Our data cannot explain linked developmental trajectories and underlying factors. However, detection of affected infants may help to adapt supportive measures to the individual familial needs to promote the parent-child-relationship and prevent the development of child behavioral problems from early on.
Article
Background Countries with high TB burden have expanded access to molecular diagnostic tests. However, their impact on reducing delays in TB diagnosis and treatment has not been assessed. Our primary aim was to summarize the quantitative evidence on the impact of nucleic acid amplification tests (NAAT) on diagnostic and treatment delays compared to that of the standard of care for drug-sensitive and drug-resistant tuberculosis (DS-TB and DR-TB). Methods We searched MEDLINE, EMBASE, Web of Science, and the Global Health databases (from their inception to October 12, 2020) and extracted time delay data for each test. We then analysed the diagnostic and treatment initiation delay separately for DS-TB and DR-TB by comparing smear vs Xpert for DS-TB and culture drug sensitivity testing (DST) vs line probe assay (LPA) for DR-TB. We conducted random effects meta-analyses of differences of the medians to quantify the difference in diagnostic and treatment initiation delay, and we investigated heterogeneity in effect estimates based on the period the test was used in, empiric treatment rate, HIV prevalence, healthcare level, and study design. We also evaluated methodological differences in assessing time delays. Results A total of 45 studies were included in this review (DS = 26; DR = 20). We found considerable heterogeneity in the definition and reporting of time delays across the studies. For DS-TB, the use of Xpert reduced diagnostic delay by 1.79 days (95% CI −0.27 to 3.85) and treatment initiation delay by 2.55 days (95% CI 0.54–4.56) in comparison to sputum microscopy. For DR-TB, use of LPAs reduced diagnostic delay by 40.09 days (95% CI 26.82–53.37) and treatment initiation delay by 45.32 days (95% CI 30.27–60.37) in comparison to any culture DST methods. Conclusions Our findings indicate that the use of World Health Organization recommended diagnostics for TB reduced delays in diagnosing and initiating TB treatment. 
Future studies evaluating performance and impact of diagnostics should consider reporting time delay estimates based on the standardized reporting framework.
Chapter
A systematic review and meta-analysis (SRMA) of results from randomized clinical trials (RCTs) is considered the highest level of evidence in determining the comparative effect of a health intervention for a given disease or condition. This exercise involves pooling results of relevant published RCTs to obtain the totality of evidence on specified outcomes of interest. This chapter mainly focuses on meta-analysis of intervention studies. The chapter aims at introducing the following topics in meta-analysis: 1. Statistical methods behind meta-analysis including measures of disease occurrence (e.g., odds ratio [OR], relative risk [RR], and mean difference [MD]); methods for pooling results (e.g., Peto OR, Mantel-Haenszel [MH] statistic and inverse variance [IV]); some study designs in clinical trials; measures of heterogeneity; and subgroup analysis. 2. Steps involved in meta-analysis of interventions using open-source software (R statistical software for construction of forest plots and funnel plots, and computation of heterogeneity indices). 3. Meta-regression, including illustrative examples using R code to demonstrate how meta-analysis is conducted for continuous and dichotomous outcomes. Keywords: Computer software; Fixed effect; Meta-analysis; Meta-regression; Random effects
Chapter
In the big-data era, the meta-data collected to address the same or similar scientific question usually come from diverse sources (such as multi-regional clinical trials or multiple intervention studies). Meta-analysis (MA) is then a statistical methodology for combining information from these diverse sources to reach a more reliable conclusion. In this chapter, an overview of MA is given with emphasis on classical fixed-effects and random-effects MA models to synthesize summary statistics from all studies as well as meta-regression to explain the between-study heterogeneity. A Monte-Carlo simulation study is designed to illustrate the relative efficiency of the MA using summary statistics to the MA using the original individual participant-level data. Real meta-data from 13 clinical trials to assess Bacillus Calmette-Guerin vaccine in the prevention of tuberculosis are used to demonstrate the implementation of these meta-analysis models in open source R software. Keywords: Meta-analysis; Fixed-effects model; Random-effects model; Weighted-mean estimator; DerSimonian-Laird estimator; Heterogeneity; Meta-regression; Individual patient-level data; Monte-Carlo simulations
Article
Background Several studies report mixed associations between the retinal nerve fiber layer (RNFL) thickness and cognitive and physical disability in persons with multiple sclerosis (PwMS). Systematic synthesis of these findings is crucial in deriving credible conclusions. Methods Five databases were searched from their inception to March 2022. The inclusion criteria for studies were MS-specific and required RNFL and cognitive performance data in order to be analyzed. The selection processes followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Results The systematic review yielded 31 studies that investigated the association between RNFL thickness and cognitive performance. Twenty-two studies reported positive associations, and nine did not. The meta-analysis included 11 studies with a total of 782 PwMS with mean age of 40.5 years, mean Expanded Disability Status Scale (EDSS) of 2.7, and disease duration of 11.3 years. RNFL thickness was significantly associated with the Symbol Digit Modalities Test (SDMT, pooled r = 0.306, p < 0.001), Paced Auditory Serial Addition Test (pooled r = 0.374, p < 0.001) and Word List Generation (WLG, pooled r = 0.177, p < 0.001). RNFL was also significantly correlated with visuospatial learning and memory tests (pooled r = 0.148, p = 0.042) and verbal learning and memory tests (pooled r = 0.245, p = 0.005). Within three eligible studies, no significant association between ganglion cell inner-plexiform layer and SDMT (0.083; 95% CI −0.186, 0.352) was noted. The heterogeneity was high in all correlation studies (I² > 63% and p < 0.008) except for the WLG and visuospatial memory findings. Conclusion RNFL thickness is associated with cognitive processing speed, verbal learning and memory, visual learning and memory, as well as verbal fluency in PwMS. The number of studies included in the meta-analyses was limited due to non-standardized reporting.
Chapter
Suicide rates continue to increase globally. The volume of research in this field has also expanded rapidly. In A Concise Guide to Understanding Suicide, leading researchers and clinicians provide a concise review of recent literature, report solutions achieved and give practical guidance for patient care to aid understanding and help prevent suicide. Each chapter is highly focused to provide pertinent information covering all major aspects of the field, from epidemiology and theories of causation through to treatment and prevention. This text will educate practising clinicians (psychologists, psychiatrists, nurses, counsellors, and emergency room personnel) and other health care workers and researchers, as well as providing a pathway for undergraduate and graduate students interested in furthering their understanding of the complexities surrounding suicide. Further, mental health professionals and those in the social sciences will be extremely interested in this monograph, as will the University community, armed forces and interested lay public.
Thesis
Meta-analysis is now commonly used in medical research. However, there are statistical issues relating to the subject that require investigation, and some are considered here, from both a methodological and a practical perspective. Both the fixed effect and the random effects models for meta-analysis are based on certain assumptions, and the validity of these is investigated. A formal test of the homogeneity assumption made in the fixed effect model may be performed. Since the test has low power, simulation was used to investigate the power under various conditions. The random effects model incorporates a between-study component of variance into the model. A likelihood based method was used to obtain a confidence interval for this variance and also to provide an interval for the overall treatment effect which takes into account the fact that the between-study variance is estimated, rather than assuming it to be known. In order to obtain confidence intervals for the treatment effect for both the fixed effect and the random effects models, distributional assumptions of normality are usually made. Such assumptions may be checked using q-q plots of the residuals obtained for each trial in the meta-analysis. In both meta-analysis models it is assumed that the weight allocated to each study is known, when in fact it must be estimated from the data. The effect of estimating the weights on the overall treatment effect estimate, its confidence intervals, the between-study variance estimate and the test statistic for homogeneity is investigated by both analytic and simulation methods. It is shown how meta-analysis methods may be used to analyse multicentre trials of a paired cluster randomised design. Meta-analysis techniques are found to be preferable to previously published methods specifically developed for the analysis of such designs, which produce biased and potentially misleading results when a large treatment effect is present.
Article
SUMMARY We develop several methods for estimating the treatment effect difference defined as the overall log-odds ratio of favourable response in a multicentre clinical trial comparing two treatments with binary response. A simulation study compares the bias and mean squared error of the point estimates and the exact coverage probabilities of confidence intervals obtained distributions.
Article
The Shapiro-Wilk W and Shapiro-Francia W' statistics are convenient and powerful tests of departure from normality. Modifications are described to allow the use of W and W' with grouped and with singly censored data, and of W with log-normally distributed data. Methods are given to enable the P value of each test to be calculated under these different circumstances. It is hoped that software developers will thereby be encouraged to incorporate one or other of the tests in their statistical products.
Article
Weighted normal plots are proposed as graphical checks on the normality of random effects in Gaussian linear models. The technique is illustrated using the one-way comparisons model Yᵢ = μᵢ + ϵᵢ, where the (μᵢ, ϵᵢ) are independent pairs with μᵢ and ϵᵢ independent N(0, σ²) and N(0, σᵢ²), respectively, for i = 1, …, n. When the variance components σ² and σᵢ² are known, an unweighted normal plot of the standardized Zᵢ = Yᵢ(σ² + σᵢ²)^(−1/2) provides a check of the overall adequacy of the model. Weighted normal plots involve a modification that gives the ith observation a sample weight of Wᵢ = (σ² + σᵢ²)⁻¹. Under the null hypothesis, the sample size must be larger by a factor of (1 + v/m²), where m and v are the mean and variance of the weights, to produce a weighted plot with approximately the same sampling variance as an unweighted normal plot. Despite this higher variability, we show that weighted plots are more sensitive than unweighted plots to several departures from the assumed distribution on the random effects, μᵢ. Several numerical examples are included and the effects of substituting maximum likelihood estimates for the parameters σ² and σᵢ² are considered briefly.
Article
We conducted a simulation study to determine the performance of nine procedures for testing the homogeneity of odds ratios in K 2 × 2 contingency tables. We recommend Tarone's approximate score test, based on the Mantel-Haenszel estimator of the common odds ratio, for use in practice. We also recommend a non-iterative statistic developed by Gart and based on the modified Woolf estimator of the common odds ratio for very large samples in balanced or mildly unbalanced designs. We base our recommendation of a statistic on its performance in terms of size and power in comparison with the other statistics considered.
Article
This paper concerns the efficiency of the conditional likelihood method for inference in models which include nuisance parameters. A new concept of ancillarity, asymptotic weak ancillarity, is introduced. It is shown that the conditional maximum likelihood estimator and the conditional score test of θ, the parameter of interest, are asymptotically equivalent to their unconditional counterparts, and hence are asymptotically efficient, provided that the conditioning statistic is asymptotically weakly ancillary. The key assumption that the conditioning statistic is asymptotically weakly ancillary is verified when the underlying distribution is from exponential families. Some illustrative examples are given.
Article
Three tests for homogeneity of odds ratio for a series of 2 × 2 tables when the data are sparse are studied by means of Monte Carlo experiments. A score test, based on the assumption that the log odds ratios are generated from some unknown distribution, is shown to be more powerful than the other two. The original test of Zelen (1971), later corrected by Halperin, Ware, Byar et al. (1977), is also examined. It tends to be too liberal under the sparse data situation; however, it has approximately the nominal size when there are as many as ten tables with as few as ten observations per table.
Article
This article offers a practical guide to goodness-of-fit tests using statistics based on the empirical distribution function (EDF). Five of the leading statistics are examined (those often labelled D, W², V, U², A²) and three important situations: where the hypothesized distribution F(x) is completely specified and where F(x) represents the normal or exponential distribution with one or more parameters to be estimated from the data. EDF statistics are easily calculated, and the tests require only one line of significance points for each situation. They are also shown to be competitive in terms of power.
Article
Several large epidemiological studies in the Nordic countries have failed to confirm an association between age at first birth and breast cancer independent of parity. To assess whether lack of power or heterogeneity between the countries could explain this, a meta-analysis was performed of 8 population-based studies (3 cohort and 5 case-control) of breast cancer and reproductive variables in the Nordic countries, including a total of 5,568 cases. It confirmed that low parity and late age at first birth are significant and independent determinants of breast-cancer risk. Nulliparity was associated with a 30% increase in risk compared with parous women, and for every 2 births, the risk was reduced by about 16%. There was a significant trend of increasing risk with increasing age at first birth, women giving first birth after the age of 35 years having a 40% increased risk compared to those with a first birth before the age of 20 years. Tests for heterogeneity between studies were not significant for any of the examined variables. In the absence of bias, this suggests that several individual Nordic studies may have had too little power to detect the weak effect of age at first birth observed in the meta-analysis.
Article
This article presents a modification of the Shapiro-Wilk W statistic for testing normality which can be used with large samples. Shapiro and Wilk gave coefficients and percentage points for sample sizes up to 50. These coefficients required obtaining an approximation to the covariance matrix of the normal order statistics. The proposed test uses coefficients which depend only on the expected values of the normal order statistics which are generally available. Results of an empirical sampling study to compare the sensitivity of the test statistic to the W test statistic are briefly discussed.
Article
In this note we study, by simulation, small sample performance in terms of size of ten procedures for testing the homogeneity of odds ratios in K 2 x 2 contingency tables. These ten statistics are derived for 'large-stratum' settings. Our study concerns the behaviour of these statistics for 'small-stratum' settings.
Article
The enthusiasm for meta-analyses (or overviews) expressed by their proponents is not always shared by the broader medical community. To encourage constructive debate, we adopt a critical perspective on the conduct and interpretation of meta-analysis. We focus particularly on some of the statistical issues, especially heterogeneity between studies, and also on the extrapolation of meta-analysis findings to clinical practice. We conclude that meta-analysis is not an exact statistical science that provides definitive simple answers to complex clinical problems. It is more appropriately viewed as a valuable objective descriptive technique, which often furnishes clear qualitative conclusions about broad treatment policies, but whose quantitative results have to be interpreted cautiously.
Article
A three-stage hierarchical model is proposed for two treatment, binary response studies conducted in a number of centres. The approach adopted is Bayesian. Marginal densities for second stage parameters are shown to provide useful summaries both of comparative efficacy and of the heterogeneity of treatment effects across centres. Sensitivity studies of model assumptions are illustrated.
Article
An epidemiologically impeccable study does not bring answers to all the important questions. A structured and systematic integration of information from different studies of a given problem with a view to answering the original question or bringing additional information is the essence and objective of the meta-analytic approach to health problem solving. Original studies in medicine, being very heterogeneous in nature and structure require not only a quantitative approach (as in classical meta-analysis) but also an additional "qualitative meta-analysis" as well. The latter represents not only a systematic accumulation of both information and the characteristics of different studies, but also an assessment of quality, uncertainty, missing data, random error and bias across studies of interest. The greatest challenge of meta-analysis in medicine lies in the integration of the qualitative and quantitative assessment of given information (scoring of quality, weighing of the effect size by quality score, etc.). Meta-analysis in medicine must go beyond a simple pooling of data. It should become the "epidemiology of results of independent studies of a common topic of interest". Further development of meta-analysis in such an expanded way may have an important impact on decision-making in clinical medicine, and in health policies.
Article
We compare two statistical methods for combining event rates from several studies. Both methods treat each study as a separate stratum. The Peto-modified Mantel-Haenszel (Peto) method estimates a combined odds ratio assuming homogeneity across strata and provides a test for heterogeneity. The DerSimonian and Laird modified Cochran method (D&L) produces a weighted average of rate differences, where the weights allow for among-study variability. We analyse 22 meta-analyses from ten reports by both methods. The pooled estimates are divided by their standard errors to produce a Z-statistic. A t-test comparing Z-statistics from all 22 studies suggests that the D&L method tends to be more conservative [d(Peto - D&L) = 0.29, t = 2.53, p = 0.02]. For a subset of 14 non-heterogeneous studies, the difference is smaller and non-significant (d = 0.09, t = 0.72, p = 0.49). The results from the methods correlate well (r = 0.66 for all 22 studies, r = 0.95 for 14 non-heterogeneous studies). Thus, the presence of heterogeneity influences our conclusion. We discuss the statistical and scientific implications of these findings.
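For orientation, the Peto one-step estimate of the combined log odds ratio can be sketched as below. This is a minimal illustration under stated assumptions rather than the paper's code: the function name and the tuple layout `(events_treated, n_treated, events_control, n_control)` are hypothetical, and the variance is the standard hypergeometric variance of the observed count.

```python
import math


def peto_log_odds_ratio(tables):
    """Peto one-step pooled log odds ratio and its standard error.

    tables -- iterable of (events_treated, n_treated, events_control, n_control)
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for a, n1, c, n2 in tables:
        n = n1 + n2
        m1 = a + c                # total events in the stratum
        m2 = n - m1               # total non-events in the stratum
        expected = n1 * m1 / n    # expected treated events under no effect
        variance = n1 * n2 * m1 * m2 / (n ** 2 * (n - 1))  # hypergeometric
        sum_o_minus_e += a - expected
        sum_v += variance
    return sum_o_minus_e / sum_v, 1.0 / math.sqrt(sum_v)
```

Dividing the pooled log odds ratio by its standard error gives a Z-statistic of the kind the abstract compares across methods.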
Article
Epidemiologic data for case-control studies are often summarized into K 2 x 2 tables. Given a fixed number of cases and controls, the degree of sparseness in the data depends on the number of strata, K. The effect of increasing stratification on size and power of seven tests of homogeneity of the odds ratio is studied using Monte Carlo methods. In all the designs considered here, the numbers of cases and controls per stratum are the same. Considering both size and power in non-sparse-data settings, we recommend the Breslow-Day statistic (1980, Statistical Methods in Cancer Research, 1. The Analysis of Case-Control Studies, p. 142; Lyon: International Agency for Research on Cancer) for general use. In sparse-data settings the T4 statistic of Liang and Self (1985, Biometrika 72, 353-358) performs the best when all tables, regardless of sample size, have odds ratios generated from the same distribution. In sparse-data settings characterized by a large table with an odds ratio of 1 and many small tables with odds ratios greater than 1, the T5 statistic of Liang and Self (1985) performs the best. One of the most important results of this study is the generally low power for all homogeneity tests especially when the data are sparse.
Article
To display a number of estimates of a parameter obtained from different studies it is common practice to plot a sequence of confidence intervals. This can be useful but is often unsatisfactory. An alternative display is suggested which represents intervals as points on a bivariate graph, and which has advantages. When the data are estimates of odds ratios from studies with a binary response, it is argued that for either type of plot, a log scale should be used rather than a linear scale.
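One bivariate display of this kind is the radial plot, in which each study becomes a single point: the standardized estimate plotted against the precision. The sketch below only computes the coordinates; the function name is hypothetical, and it is an assumption that this is the display the abstract has in mind. For odds ratios, the estimates would be supplied on the log scale, in line with the abstract's recommendation.

```python
def radial_points(estimates, std_errors):
    """Return (precision, standardized estimate) pairs, one per study.

    x = 1 / se        -- precision of the study
    y = estimate / se -- standardized estimate (a z-score)
    """
    return [(1.0 / se, est / se) for est, se in zip(estimates, std_errors)]
```

In such a plot the slope of the line from the origin to a point equals that study's estimate, so precise studies sit far from the origin and unusual studies stand out vertically.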
Article
This paper examines eight published reviews each reporting results from several related trials. Each review pools the results from the relevant trials in order to evaluate the efficacy of a certain treatment for a specified medical condition. These reviews lack consistent assessment of homogeneity of treatment effect before pooling. We discuss a random effects approach to combining evidence from a series of experiments comparing two treatments. This approach incorporates the heterogeneity of effects in the analysis of the overall treatment efficacy. The model can be extended to include relevant covariates which would reduce the heterogeneity and allow for more specific therapeutic recommendations. We suggest a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.
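A noniterative random effects procedure of the kind described above can be sketched with a method-of-moments estimate of the between-study variance. The version below is the moment estimator commonly attributed to DerSimonian and Laird; that it matches the paper's procedure in every detail is an assumption, and the function name `moment_random_effects` and its inputs are hypothetical.

```python
import math

def moment_random_effects(estimates, variances):
    """Noniterative random effects pooling via a moment estimator (sketch).

    estimates: per-study effect estimates (e.g. log odds ratios)
    variances: their within-study variances
    """
    w = [1.0 / v for v in variances]              # fixed effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Heterogeneity statistic: weighted squared deviations from the fixed effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    # Moment estimate of the between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # Random effects weights add tau2 to every within-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = 1.0 / math.sqrt(sum(w_star))
    return {"q": q, "df": df, "tau2": tau2, "pooled": pooled, "se": se}
```

When `q` does not exceed its degrees of freedom, `tau2` is set to zero and the result coincides with the fixed effect analysis; otherwise the weights become more nearly equal across studies and the standard error grows.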
Article
Although meta-analysis is now well established as a method of reviewing evidence, an uncritical use of the technique can be very misleading. One common problem is the failure to investigate appropriately the sources of heterogeneity, in particular the clinical differences between the studies included. This paper distinguishes between the concepts of clinical and statistical heterogeneity and exemplifies the importance of investigating heterogeneity by using published meta-analyses of epidemiological studies of serum cholesterol concentration and clinical trials of its reduction. Although not without some dangers of speculative conclusions, prompted by overzealous inspection of the data to hand, a sensible investigation of sources of heterogeneity should increase both the scientific and the clinical relevance of the results of meta-analyses.
Article
The usual meta-analysis of a sequence of randomized clinical trials only considers the difference between two treatments and produces a point estimate and a confidence interval for a parameter that measures this difference. The usual parameter is the log(odds ratio) linked to Mantel-Haenszel methodology. Inference is made either under the assumption of homogeneity or in a random effects model that takes account of heterogeneity between trials. This paper has two goals. The first is to present a likelihood based method for the estimation of the parameters in the random effects model, which avoids the use of approximating Normal distributions. The second goal is to extend this method to a bivariate random effects model, in which the effects in both groups are supposed random. In this way inference can be made about the relationship between improvement and baseline effect. The method is demonstrated by a meta-analysis dataset of Collins and Langman.
Article
There has recently been disagreement in the literature on the results and interpretation of meta-analyses of the trials of serum cholesterol reduction, both in terms of the quantification of the effect on ischaemic heart disease and as regards the evidence of any adverse effect on other causes of death. This paper describes statistical aspects of a recent meta-analysis of these trials, and draws some more general conclusions about the methods used in meta-analysis. Tests of an overall null hypothesis are shown to have a basis clearly distinct from the more extensive assumptions needed to provide an overall estimate of effect. The fixed effect approach to estimation relies on the implausible assumption of homogeneity of treatment effects across the trials, and is therefore likely to yield confidence intervals which are too narrow and conclusions which are too dogmatic. However, the conventional random effects method relies on its own set of unrealistic assumptions, and cannot be regarded as a robust solution to the problem of statistical heterogeneity. The random effects method is more usefully regarded as a type of sensitivity analysis in which the weights allocated to each study in estimating the overall effect are modified. However, rather than using a statistical model for the 'unexplained' heterogeneity, greater insight and scientific understanding of the results of a set of trials may be obtained by a careful exploration of potential sources of heterogeneity. In the context of the cholesterol trials, heterogeneity according to the extent and duration of cholesterol reduction is of prime concern and is investigated using logistic regression. It is concluded that the long-term benefits of serum cholesterol reduction on the risk of heart disease have been seriously underestimated in some previous meta-analyses, while the evidence for adverse effects on other causes of death has been misleadingly exaggerated.
Article
Current methods for meta-analysis still leave a number of unresolved issues, such as the choice between fixed- and random-effects models, the choice of population distribution in a random-effects analysis, the treatment of small studies and extreme results, and incorporation of study-specific covariates. We describe how a full Bayesian analysis can deal with these and other issues in a natural way, illustrated by a recent published example that displays a number of problems. Such analyses are now generally available using the BUGS implementation of Markov chain Monte Carlo numerical integration techniques. Appropriate proper prior distributions are derived, and sensitivity analysis to a variety of prior assumptions carried out. Current methods are briefly summarized and compared to the full Bayes analysis.
Article
In a meta-analysis of a set of clinical trials, a crucial but problematic component is providing an estimate and confidence interval for the overall treatment effect $\theta$. Since in the presence of heterogeneity a fixed effect approach yields an artificially narrow confidence interval for $\theta$, the random effects method of DerSimonian and Laird, which incorporates a moment estimator of the between-trial components of variance $\sigma_B^2$, has been advocated. With the additional distributional assumptions of normality, a confidence interval for $\theta$ may be obtained. However, this method does not provide a confidence interval for $\sigma_B^2$, nor a confidence interval for $\theta$ which takes account of the fact that $\sigma_B^2$ has to be estimated from the data. We show how a likelihood based method can be used to overcome these problems, and use profile likelihoods to construct likelihood based confidence intervals. This approach yields an appropriately widened confidence interval compared with the standard random effects method. Examples of application to a published meta-analysis and a multicentre clinical trial are discussed. It is concluded that likelihood based methods are preferred to the standard method in undertaking random effects meta-analysis when the value of $\sigma_B^2$ has an important effect on the overall estimated treatment effect.
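The idea of a profile likelihood interval for the overall treatment effect can be illustrated with a deliberately crude grid search. This is a sketch under stated assumptions, not the authors' implementation: it assumes each study estimate is normally distributed around the overall effect with variance equal to its within-study variance plus the between-trial variance (`tau2` in the code), it uses the 95% point of the chi-squared distribution with one degree of freedom (3.841) as the cutoff, and the function names and grids are hypothetical.

```python
import math

def loglik(theta, tau2, y, v):
    """Normal random effects log-likelihood, up to an additive constant."""
    return -0.5 * sum(
        math.log(vi + tau2) + (yi - theta) ** 2 / (vi + tau2)
        for yi, vi in zip(y, v)
    )

def profile_loglik(theta, y, v, tau2_grid):
    """Maximise over the between-trial variance for a fixed overall effect."""
    return max(loglik(theta, t, y, v) for t in tau2_grid)

def profile_ci(y, v, theta_grid, tau2_grid, crit=3.841):
    """Grid-based profile likelihood interval for the overall effect.

    Keeps every theta whose profile log-likelihood lies within crit / 2
    of the maximum found on the grid, and returns the extremes of that set.
    """
    prof = [profile_loglik(th, y, v, tau2_grid) for th in theta_grid]
    lmax = max(prof)
    inside = [th for th, lp in zip(theta_grid, prof) if 2 * (lmax - lp) <= crit]
    return min(inside), max(inside)

if __name__ == "__main__":
    # Made-up estimates and within-study variances, for illustration only
    y = [0.10, 0.50, -0.20, 0.80, 0.30]
    v = [0.04, 0.09, 0.05, 0.10, 0.06]
    theta_grid = [i * 0.01 for i in range(-200, 201)]
    tau2_grid = [j * 0.01 for j in range(0, 201)]
    print(profile_ci(y, v, theta_grid, tau2_grid))
```

Because the between-trial variance is re-maximised at every candidate value of the overall effect, the interval reflects uncertainty in that variance, which is the widening the abstract describes. A real analysis would replace the grids with a numerical optimiser and check that the grid bounds do not truncate the interval.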