Figure - uploaded by Muhittin Serdar
Sample size calculation formulas for some research methods (according to references 17-23)

Source publication
Article
Calculating the sample size in scientific studies is one of the critical issues as regards the scientific contribution of the study. The sample size critically affects the hypothesis and the study design, and there is no straightforward way of calculating the effective sample size for reaching an accurate conclusion. Use of a statistically incorrec...

Contexts in source publication

Context 1
... manual calculation is preferred by the experts of the subject, it is a bit complicated and difficult for researchers who are not statistics experts. In addition, considering the variety of research types and characteristics, it should be noted that a great number of calculations will be required with too many variables (Table 1) (16, 24-30). ...
Context 2
... addition to sample size estimations that may be computed according to Table 4, formulas stated in Table 1 and the websites mentioned in Table 2 may also be utilized to estimate sample size in animal studies. Relying on previous studies poses certain limitations since it may not always be possible to acquire reliable "pooled standard deviation" and "group mean" values. ...
Context 3
... Similarity and equivalence: The sample size required to demonstrate similarity and equivalence is very low. Sample size estimation can be performed manually using the formulas in Table 1 as well as the software and websites in Table 2 (especially G*Power). However, all of these calculations require preliminary results or previous study outputs regarding the hypothesis of interest. ...
Context 4
... verification studies, the "sample size" and the "minimum proportion of the observed samples required to lie within the CI limits" are proportional. For instance, for a 50-sample study, 90% of the samples are required to lie within the CI limits for approval of the verification, while for a 200-sample study, 93% is required (Table 10). In an example study whose total allowable error (TAE) is specified as 15%, 50 samples were measured. ...
Context 5
... the actual size estimations, it is a prerequisite for the researcher to calculate potential area under the curve (AUC) using data from previous or preliminary studies. In addition, size estimation may also be calculated manually according to Table 1, or using sensitivity (or TPF) and 1-specificity (FPF) values according to Table 11 which is adapted from CLSI EP24-A2 (83,84). ...
Context 6
... TPF represents sensitivity, FPF represents 1-specificity. Utilizing Table 11, for a 0.85 sensitivity, 0.90 specificity and a maximum allowable error of 5% (L = 0.05), 196 positive and 139 negative samples are required. For scenarios not included in this table, the reader should refer to the formulas given under the "diagnostic prognostic studies" subsection of Table 1. ...
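The two figures quoted from Table 11 are consistent with the common normal-approximation formula for estimating a proportion, n = z² · p(1 − p) / L², applied once to sensitivity (positive samples) and once to specificity (negative samples). The sketch below assumes that formula and a 95% confidence level; it is not taken from Table 1 or from CLSI EP24-A2, and the function name `samples_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def samples_for_proportion(p, max_error, confidence=0.95):
    """Samples needed to estimate a proportion p within +/- max_error.

    Assumption: normal approximation, n = z^2 * p * (1 - p) / L^2,
    rounded up to the next whole sample.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / max_error ** 2)


# Sensitivity 0.85 drives the number of positive samples,
# specificity 0.90 drives the number of negative samples.
positives = samples_for_proportion(0.85, 0.05)
negatives = samples_for_proportion(0.90, 0.05)
print(positives, negatives)  # 196 139, matching the values quoted above
```

Under these assumptions, other sensitivity, specificity or allowable-error values can be explored by changing the arguments, but Table 1 of the source remains the reference for the authors' own formulas.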
Context 7
... Table 11, for a 0.85 sensitivity, 0.90 specificity and a maximum allowable error of 5% (L = 0.05), 196 positive and 139 negative samples are required. For scenarios not included in this table, the reader should refer to the formulas given under the "diagnostic prognostic studies" subsection of Table 1. ...
Context 8
... can be seen here, the critical parameters for sample size estimation are AUC, specificity and sensitivity, and their 95% CI values. Table 12 demonstrates the relationship of sample size with sensitivity, specificity, negative predictive value (NPV) and positive predictive value (PPV); the lower the sample size, the wider the 95% CIs, leading to an increase in type II errors (87). As can be seen here, the confidence interval narrows as the sample size increases, ...
Context 9
... formulas given in Table 1 and the websites mentioned in Table 2 will be particularly useful for sample size estimations in survey studies, which depend primarily on the population size (101). ...
Context 10
... margin of error would suggest that the poll results are less likely to reflect the survey results of an entire population. Table 13 may provide a practical solution for size estimation. A 5% ME means that the actual population value is expected to lie within the survey result ± 5%. ...
Context 11
... the contrary, for a ... determined using the formula suggested in Table 14, which is based on the prevalence value (103). It is unlikely that sufficient power will be reached for revealing uncommon problems (prevalence 0.02) at small sample sizes. ...
Context 12
... 0.30) were discovered with higher power (0.83) even when the sample size was as low as 5. For situations where power and prevalence are known, effective sample size can easily be estimated using the formula in Table 1. ...
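The quoted numbers (prevalence 0.30, sample size 5, power 0.83) agree with the standard "at least one occurrence" model, in which the power to detect a problem is 1 − (1 − p)ⁿ. The sketch below assumes that model; the source does not reproduce the formula from Table 1 or Table 14 here, and both function names are hypothetical.

```python
import math


def detection_power(prevalence, n):
    """Probability that a problem with the given prevalence appears
    at least once in n interviews (assumed model: 1 - (1 - p)^n)."""
    return 1 - (1 - prevalence) ** n


def interviews_needed(prevalence, power):
    """Smallest n whose detection power reaches the requested power,
    obtained by solving 1 - (1 - p)^n >= power for n."""
    return math.ceil(math.log(1 - power) / math.log(1 - prevalence))


# A common problem (prevalence 0.30) is detected with power of about 0.83
# after only 5 interviews, as stated in the excerpt above.
print(round(detection_power(0.30, 5), 2))  # 0.83

# An uncommon problem (prevalence 0.02) has low power at the same sample size.
print(round(detection_power(0.02, 5), 2))  # 0.1
```

Calling `interviews_needed` with a known prevalence and target power inverts the same relation, which is the situation the excerpt describes.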
Context 13
... when ME is changed from 5% to 10% or 1%, the sample size, which was initially 370, would change to 96 or 4900, respectively. For other ME and CI levels, the researcher should refer to the equations and software provided in Table 1 and Table 2 (102). Table 13. ...
Context 14
... other ME and CI levels, the researcher should refer to the equations and software provided in Table 1 and Table 2 (102). Table 13. Sample size estimation according to the population size (merely as rough estimates), margin of error (ME) and confidence interval (CI). ... fixed ME, CI and sample size are directly proportional; in order to obtain a higher CI, the sample size should be increased. ...
Context 15
... variation in ME causes a more drastic change in sample size than a variation in CI. As exemplified in Table 13, for a population of 10,000 people, a survey with a 95% CI and 5% ME would require at least 370 samples. When CI is changed from 95% to 90% or 99%, the sample size which was 370 initially would change into 264 or 623 respectively. ...
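The Table 13 values quoted in these excerpts can be reproduced with Cochran's formula for a proportion combined with a finite population correction. This is an assumption about how such tables are commonly built, not a formula confirmed by the source; the sketch also assumes the most conservative proportion (0.5), and `survey_sample_size` is a hypothetical name.

```python
import math
from statistics import NormalDist


def survey_sample_size(population, margin_of_error, confidence=0.95, proportion=0.5):
    """Survey sample size for a finite population.

    Assumption: Cochran's formula n0 = z^2 * p * (1 - p) / ME^2,
    followed by the finite population correction
    n = n0 / (1 + (n0 - 1) / N), rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Population of 10,000 with a 5% margin of error at three confidence levels.
for confidence in (0.90, 0.95, 0.99):
    print(confidence, survey_sample_size(10_000, 0.05, confidence))
# 0.9 264
# 0.95 370
# 0.99 623
```

With the confidence level held at 95%, the same function returns 96 for a 10% margin of error and 4900 for a 1% margin of error, which illustrates why a change in ME moves the sample size far more than a change in CI.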
Context 16
... when ME is changed from 5% to 10% or 1%, the sample size, which was initially 370, would change to 96 or 4900, respectively. For other ME and CI levels, the researcher should refer to the equations and software provided in Table 1 and Table 2. The situation is slightly different for survey studies conducted for problem detection. ...
Context 17
... If a similar previous study is available, or preliminary results of the current study are present, their results may be used for sample size estimations via the websites and software mentioned in Table 1 and Table 2. Some of this software may also be used to calculate effect size and power. ...
Context 18
... to reference 103. Table 14. The relation among prevalence, sample size and power of a study that will detect a problem after "N" number of interviews ...

Citations

... According to our pilot study, the a priori sample size estimation yielded a size of 22 participants for a power of 0.85, when alpha was 0.05. With 104 participants, the post hoc power calculation resulted in 1-beta values > 0.995, when alpha was 0.05, indicating an efficient statistical power [23]. ...
Article
    Background To assess the improvement of image quality and diagnostic acceptance of thinner slice iodine maps enabled by deep learning image reconstruction (DLIR) in abdominal dual-energy CT (DECT). Methods This study prospectively included 104 participants with 136 lesions. Four series of iodine maps were generated based on portal-venous scans of contrast-enhanced abdominal DECT: 5-mm and 1.25-mm using adaptive statistical iterative reconstruction-V (Asir-V) with 50% blending (AV-50), and 1.25-mm using DLIR with medium (DLIR-M), and high strength (DLIR-H). The iodine concentrations (IC) and their standard deviations of nine anatomical sites were measured, and the corresponding coefficient of variations (CV) were calculated. Noise-power-spectrum (NPS) and edge-rise-slope (ERS) were measured. Five radiologists rated image quality in terms of image noise, contrast, sharpness, texture, and small structure visibility, and evaluated overall diagnostic acceptability of images and lesion conspicuity. Results The four reconstructions maintained the IC values unchanged in nine anatomical sites (all p > 0.999). Compared to 1.25-mm AV-50, 1.25-mm DLIR-M and DLIR-H significantly reduced CV values (all p < 0.001) and presented lower noise and noise peak (both p < 0.001). Compared to 5-mm AV-50, 1.25-mm images had higher ERS (all p < 0.001). The difference of the peak and average spatial frequency among the four reconstructions was relatively small but statistically significant (both p < 0.001). The 1.25-mm DLIR-M images were rated higher than the 5-mm and 1.25-mm AV-50 images for diagnostic acceptability and lesion conspicuity (all P < 0.001). Conclusions DLIR may facilitate the thinner slice thickness iodine maps in abdominal DECT for improvement of image quality, diagnostic acceptability, and lesion conspicuity.
    ... Sample size calculation was performed using G*Power version 3.1.9.4 (Heinrich Heine Universität Düsseldorf, Germany). A sample size of 80 patients achieved a small effect size f (0.37), 80% power (1 − β error probability), and a significance level of 0.05 (Serdar et al. 2021). Statistical analysis was performed using the IBM SPSS software version 26 (IBM Corp., Armonk, NY, USA). ...
    Article
    Background Myofascial pain syndrome (MPS) is a particular type of temporomandibular joint disorder. Research findings comparing various treatment approaches are scarce and controversial. Therefore, this study aimed to compare the effectiveness of ultrasound therapy, stabilization splint, TheraBite device, and masticatory muscle exercises in reducing pain intensity and improving mandibular mobility in patients with MPS. Methods It was a single‐blind, randomized, parallel‐group, active‐controlled trial that took place between April 2023 and October 2023 at the Department of Fixed Prosthodontics, Damascus University. Patients older than 18 years old with myofascial pain accompanied by limited jaw opening and pain lasting for at least 6 months were included. Eighty patients were randomly assigned into four groups using online randomization software: ultrasound therapy, stabilization splint, TheraBite device, and masticatory muscle exercises. Only outcome assessors were masked to treatment allocation. The exercise regimen was the exercise program for patients with TMD. The following primary outcome measures were considered at the baseline (t0), at the first (t1), second (t2), and fourth (t3) week of treatment, and at the second (t4) and fifth (t5) month of follow‐up: pain intensity using the visual analogue scale, maximum interincisal opening, right lateral movement, and left lateral movement measured in millimeters. Results The pain level changed from severe to mild at t3 in ultrasound therapy, stabilization splint, and TheraBite device groups. In the masticatory muscle exercises group, it changed to moderate, with a significant difference between ultrasound therapy (p = 0.012) and stabilization splint (p = 0.013) groups. In addition, the mandibular mobility continued to improve at the subsequent follow‐up periods (t4 and t5). Conclusions All therapies are equally effective after 5‐month follow‐up. 
However, ultrasound therapy and stabilization splints have the benefit of achieving rapid improvement. Trial Registration ISRCTN20833186.
    ... This extension aims to increase the statistical power and reduce the risk of erroneous data interpretation (Table 6). In the main text of the paper, in addition to tables with formulas for SS and ES, there are tables with free calculators for these values available online (Table 6) as a form of supplement to the paper Serdar et al. [238] or Charan and Biswas [16]. However, there were no academic calculators or government websites to help with the analyses in the above studies, so Table 7 was created. ...
    Article
    Background/Objectives: Temporomandibular disorder (TMD) is the term used to describe a pathology (dysfunction and pain) in the masticatory muscles and temporomandibular joint (TMJ). There is an apparent upward trend in the publication of dental research and a need to continually improve the quality of research. Therefore, this study was conducted to analyse the use of sample size and effect size calculations in a TMD randomised controlled trial. Methods: The period was restricted to the full 5 years, i.e., papers published in 2019, 2020, 2021, 2022, and 2023. The filter article type—“Randomized Controlled Trial” was used. The studies were graded on a two-level scale: 0–1. In the case of 1, sample size (SS) and effect size (ES) were calculated. Results: In the entire study sample, SS was used in 58% of studies, while ES was used in 15% of studies. Conclusions: Quality should improve as research increases. One factor that influences quality is the level of statistics. SS and ES calculations provide a basis for understanding the results obtained by the authors. Access to formulas, online calculators and software facilitates these analyses. High-quality trials provide a solid foundation for medical progress, fostering the development of personalized therapies that provide more precise and effective treatment and increase patients’ chances of recovery. Improving the quality of TMD research, and medical research in general, helps to increase public confidence in medical advances and raises the standard of patient care.
    ... In this study, the B&A limits of agreement were compared with the acceptable clinical limits for parameters suggested by Ricos et al. and Westgard QC [24]. A sample size of 25 participants for ICC analyses, based on degrees of freedom, and a sample size of 35 participants for Passing and Bablok regression analysis were deemed sufficient for this study [19,25]. All analyses were performed using statistical software SPSS Version 23 (IBM) and MedCalc for Windows, version 19.4 (MedCalc Software, Ostend, Belgium). ...
    Article
    The purpose of this work was to investigate the degree of agreement between two distinct approaches for measuring a set of blood values and to compare comfort levels reported by participants when utilizing these two disparate measurement methods. Radial arterial blood was collected for the comparator analysis using the Abbott i-STAT® POCT device. In contrast, the non-invasive proprietary DBC methodology is used to calculate sodium, potassium, chloride, ionized calcium, total carbon dioxide, pH, bicarbonate, and oxygen saturation using four input parameters (temperature, hemoglobin, pO2, and pCO2). Agreement between the measurement for a set of blood values obtained using i-STAT and DBC methodology was compared using intraclass correlation coefficients, Passing and Bablok regression analyses, and Bland Altman plots. A p-value of <0.05 was considered statistically significant. A total of 37 participants were included in this study. The mean age of the participants was 42.4 ± 13 years, most were male (65%), predominantly Caucasian/White (75%), and of Hispanic ethnicity (40%). The Intraclass Correlation Coefficients (ICC) analyses indicated agreement levels ranging from poor to moderate between i-STAT and the DBC’s algorithm for Hb, pCO2, HCO3, TCO2, and Na, and weak agreement for pO2, HSO2, pH, K, Ca, and Cl. The Passing and Bablok regression analyses demonstrated that values for Hb, pO2, pCO2, TCO2, Cl, and Na obtained from the i-STAT did not differ significantly from that of the DBC’s algorithm suggesting good agreement. The values for Hb, K, and Na measured by the DBC algorithm were slightly higher than those obtained by the i-STAT, indicating some systematic differences between these two methods on Bland Altman Plots. The non-invasive DBC methodology was found to be reliable and robust for most of the measured blood values compared to invasive POCT i-STAT device in healthy participants. 
These findings need further validation in larger samples and among individuals afflicted with various medical conditions.
    ... An additional feature of this ascertainment problem is that, as it has been demonstrated that there is a male bias in the test materials and the clinical thresholds, those females who do acquire an autism diagnosis may, de facto, be more similar to males on the spectrum. Where female-male comparisons are part of a study rationale, the study design should reflect appropriate sample sizes to ensure adequate statistical power [63]. ...
    Article
    Autism is a neurodevelopmental condition, behaviourally identified, which is generally characterised by social communication differences, and restrictive and repetitive patterns of behaviour and interests. It has long been claimed that it is more common in males. This observed preponderance of males in autistic populations has served as a focussing framework in all spheres of autism-related issues, from recognition and diagnosis through to theoretical models and research agendas. One related issue is the near total absence of females in key research areas. For example, this paper reports a review of over 120 brain-imaging studies of social brain processes in autism that reveals that nearly 70% only included male participants or minimal numbers (just one or two) of females. Authors of such studies very rarely report that their cohorts are virtually female-free and discuss their findings as though applicable to all autistic individuals. The absence of females can be linked to exclusionary consequences of autism diagnostic procedures, which have mainly been developed on male-only cohorts. There is clear evidence that disproportionately large numbers of females do not meet diagnostic criteria and are then excluded from ongoing autism research. Another issue is a long-standing assumption that the female autism phenotype is broadly equivalent to that of the male autism phenotype. Thus, models derived from male-based studies could be applicable to females. However, it is now emerging that certain patterns of social behaviour may be very different in females. This includes a specific type of social behaviour called camouflaging or masking, linked to attempts to disguise autistic characteristics. With respect to research in the field of sex/gender cognitive neuroscience, there is emerging evidence of female differences in patterns of connectivity and/or activation in the social brain that are at odds with those reported in previous, male-only studies. 
Decades of research have excluded or overlooked females on the autistic spectrum, resulting in the construction of inaccurate and misleading cognitive neuroscience models, and missed opportunities to explore the brain bases of this highly complex condition. A note of warning needs to be sounded about inferences drawn from past research, but if future research addresses this problem of male bias, then a deeper understanding of autism as a whole, as well as in previously overlooked females, will start to emerge.
    ... Currently, there are approximately 25,000 amateur triathletes in Brazil (). Therefore, considering a 95% confidence interval for the sample proportion and a margin of error of 10%, the sample size required for the present study is 96 triathletes (Serdar et al., 2021). The sample consisted of 151 trained amateur triathletes: 108 men and 43 women. ...
    Article
    Sleep quality is a determining factor in the performance and health of athletes in general. The aim of the present study was to characterize sleep quality in trained amateur triathletes, male and female, aged 20 to 59 years. A survey was conducted with 151 trained amateur triathletes: 108 men (38.6 ± 8.1 years, triathlon training experience 5.8 ± 4.3 years, training frequency 6.3 ± 0.9 days per week) and 43 women (39.3 ± 7.6 years, triathlon training experience 4.8 ± 3.3 years, training frequency 6.5 ± 0.6 days per week). Sleep quality was measured using the Pittsburgh Sleep Quality Index-Br (PSQI-Br); total scores below 5 points indicate good sleep, and scores of 5 points or above indicate poor sleep. Subscale data were analyzed using absolute and relative frequencies. The remaining data characterizing sleep quality were analyzed using the median, mean, standard deviation, standard error, and 95% confidence interval of the mean. Male and female triathletes have poor sleep quality (scores of 5 points or above), which may have negative effects on health and performance. In conclusion, all triathletes, regardless of gender and age group, have poor sleep quality.
    ... The sample size was calculated through an appropriate calculator (https://www.surveymonkey.com/mp/sample-size-calculator/) based on similar previous studies [8,15,17]. Assuming a confidence level of 95%, a margin of error of 5%, and a population proportion of 50%, the minimum sample size required was determined to be 92 participants. ...
    Article
    Purpose This study aimed to investigate the relationship between temporomandibular joint (TMJ) effusion and TMJ pain, as well as jaw function limitation in patients via two-dimensional (2D) and three-dimensional (3D) magnetic resonance imaging (MRI) evaluation. Patients and Methods 121 patients diagnosed with temporomandibular disorder (TMD) were included. TMJ effusion was assessed qualitatively using MRI and quantified with 3D Slicer software, then graded accordingly. In addition, a visual analogue scale (VAS) was employed for pain reporting and an 8-item Jaw Functional Limitations Scale (JFLS-8) was utilized to evaluate jaw function limitation. Statistical analyses were performed appropriately for group comparisons and association determination. A probability of p<0.05 was considered statistically significant. Results 2D qualitative and 3D quantitative strategies were in high agreement for TMJ effusion grades (κ = 0.766). No significant associations were found between joint effusion and TMJ pain, nor with disc displacement and JFLS-8 scores. Moreover, the binary logistic regression analysis showed significant association between sex and the presence of TMJ effusion, exhibiting an Odds Ratio of 5.168 for females (p = 0.008). Conclusion 2D qualitative evaluation was as effective as 3D quantitative assessment for TMJ effusion diagnosis. No significant associations were found between TMJ effusion and TMJ pain, disc displacement or jaw function limitation. However, it was suggested that female patients suffering from TMD may be at a risk for TMJ effusion. Further prospective research is needed for validation.
    ... The number of participants needed for this study was based on a priori power calculation using methods outlined by Serdar et al. [20]. Given the lack of literature on this topic, a conservative estimate of medium effect size was utilized. ...
    Article
    Background Direct anterior total hip arthroplasty (DA-THA) has increased in popularity over recent decades. However, DA-THA has been reported to have a higher incidence of superficial wound complications, including infection and incisional dehiscence, compared to other surgical approaches to hip arthroplasty. While this indicates a need for optimal wound closure, little research exists on the preferred method of skin closure following DA-THA. This study aimed to determine if there was any difference in rates of superficial infection, wound dehiscence, or overall wound complications with skin closure using a running subcuticular 3-0 Monocryl® suture compared to surgical staples following DA-THA. Methods Records of patients who underwent DA-THA at our institution between July 2017 to July 2022 were retrospectively reviewed. Data were abstracted on patient demographics, comorbidities, skin closure method, and wound complications from the electronic medical record. Superficial infection and wound dehiscence were classified based on explicit diagnosis in post-operative records and incision photographs taken during follow-up visits. Overall wound complications were classified in patients who experienced either superficial infection, incisional dehiscence, or both complications following surgery. Descriptive statistics and chi-squared measures were obtained from post-operative patient data, and significance was set at p ≤ 0.05. Results A total of 365 DA-THAs were completed in 349 patients. A running subcuticular 3-0 Monocryl® suture closed 207 surgeries (56.7%), while surgical staples closed 158 surgeries (43.3%). There was no significant difference in independent rates of superficial infection (p = 0.076) or wound dehiscence (p = 0.118) between suture and staple cohorts; however, suture closure (10, 2.7%) was associated with a significantly higher rate of overall wound complications compared to staple closure (1, 0.3%) (p = 0.020). 
Conclusion DA-THA carries the risk of overall wound complications, including superficial infection and wound dehiscence. Our findings suggest superficial skin closure with staples may be preferred over sutures due to lower rates of overall wound complications. Further studies are needed to determine the optimal method of skin closure following DA-THA.
    ... We also did post hoc analyses following the insignificant results, estimating model power for the models with the smallest and largest number of observations (for NM-OMI, n = 84 and n = 62; for BGS, n = 301 and n = 284). We used α = 0.1 (see Serdar et al., 2021), R² ≤ 0.1 [a minimum acceptable R² to demonstrate model fit], and two tested covariates for each of these tests. For NM-OMI, n = 84, and two tested covariates, the power is 0.857. ...
    Article
    Objective: Deciduous dental crowns primarily develop during gestation and early infancy and embody early life stress exposures. Composite measures of dental fluctuating asymmetry (DFA) generated from the deciduous teeth may therefore indicate cumulative gestational stress in developmental origins of health and disease (DOHaD) studies. This study examines whether higher composite measures of deciduous DFA are associated with low birthweight and prematurity, two aspects of birth phenotype consistently associated with increased morbidity and mortality risks in adulthood. Subjects and Methods: We evaluated associations between composite deciduous DFA, birthweight, and birth term in two contemporary North American samples: an autopsy sample from New Mexico (n = 94), and sample from a growth cohort study in Burlington, Ontario (n = 304). Dental metric data for each sample was collected from postmortem CT scans and dental casts, respectively. Composite DFA was estimated using buccolingual (BL) and mesiodistal (MD) crown diameters from paired deciduous teeth. Results: Contrary to expectations, the results of linear regression indicated no significant relationship between birthweight and DFA, or birth term and DFA, in either sample. Conclusions: Deciduous DFA does not predict aspects of birth phenotype associated with gestational stress. Birthweight and birth term are plastic relative to the more developmentally stable deciduous dentition, which may only subtly embody early life stress. We suggest that deciduous DFA should be utilized with caution in DOHaD studies until its relationship with gestational stress is clarified.
    ... We used 80% power and an alpha = 0.05 in our calculations, which are widely accepted parameters for power calculations in clinical trials [28]. Based on this, we plan to recruit 72 veterans over the 3-year study period for a final sample of 58 veterans (estimating ~ 20% attrition based on RCTs for similar interventions with SMI population) [29]. ...
    Article
    Background Patient participation in treatment decision making is a pillar of recovery-oriented care and is associated with improvements in empowerment and well-being. Although demand for increased involvement in treatment decision-making is high among veterans with serious mental illness, rates of involvement are low. Collaborative decision skills training (CDST) is a recovery-oriented, skills-based intervention designed to support meaningful patient participation in treatment decision making. An open trial among veterans with psychosis supported CDST’s feasibility and demonstrated preliminary indications of effectiveness. A randomized control trial (RCT) is needed to test CDST’s effectiveness in comparison with an active control and further evaluate implementation feasibility. Methods The planned RCT is a hybrid type 1 trial, which will use mixed methods to systematically evaluate the effectiveness and implementation feasibility of CDST among veterans participating in a VA Psychosocial Rehabilitation and Recovery Center (PRRC) in Southern California. The first aim is to assess the effectiveness of CDST in comparison with the active control via the primary outcome, collaborative decision-making behavior during usual care appointments between veterans and their VA mental health clinicians, and secondary outcomes (i.e., treatment engagement, satisfaction, and outcome). The second aim is to characterize the implementation feasibility of CDST within the VA PRRC using the Practical Robust Implementation and Sustainability Model framework, including barriers and facilitators within the PRRC context to support future implementation. Discussion If CDST is found to be effective and feasible, implementation determinants gathered throughout the study can be used to ensure sustained and successful implementation at this PRRC and other PRRCs and similar settings nationally. Trial registration ClinicalTrials.gov NCT04324944. Registered on March 27, 2020. 
Trial registration data can be found in Appendix 1.