Article

Nonparametric Estimation From Incomplete Observations

Taylor & Francis
Journal of the American Statistical Association

Abstract

In lifetesting, medical follow-up, and other fields the observation of the time of occurrence of the event of interest (called a death) may be prevented for some of the items of the sample by the previous occurrence of some other event (called a loss). Losses may be either accidental or controlled, the latter resulting from a decision to terminate certain observations. In either case it is usually assumed in this paper that the lifetime (age at death) is independent of the potential loss time; in practice this assumption deserves careful scrutiny. Despite the resulting incompleteness of the data, it is desired to estimate the proportion P(t) of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t). The observation for each item of a suitable initial event, marking the beginning of its lifetime, is presupposed. For random samples of size N the product-limit (PL) estimate can be defined as follows: List and label the N observed lifetimes (whether to death or loss) in order of increasing magnitude, so that one has \(0 \leqslant t_1^\prime \leqslant t_2^\prime \leqslant \cdots \leqslant t_N^\prime\). Then \(\hat P(t) = \prod_r \left[ (N - r)/(N - r + 1) \right]\), where r assumes those values for which \(t_r^\prime \leqslant t\) and for which \(t_r^\prime\) measures the time to death. This estimate is the distribution, unrestricted as to form, which maximizes the likelihood of the observations. Other estimates that are discussed are the actuarial estimates (which are also products, but with the number of factors usually reduced by grouping); and reduced-sample (RS) estimates, which require that losses not be accidental, so that the limits of observation (potential loss times) are known even for those items whose deaths are observed.
When no losses occur at ages less than t the estimate of P(t) in all cases reduces to the usual binomial estimate, namely, the observed proportion of survivors.
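
As an illustration of the product-limit definition in the abstract, here is a minimal Python sketch. It is not code from the paper. The function name `product_limit`, its arguments, and the sample data are hypothetical, and placing deaths before losses at tied times is an assumed convention that the abstract does not state. Exact fractions are used so the factors \((N - r)/(N - r + 1)\) can be checked by hand.

```python
from fractions import Fraction


def product_limit(times, is_death, t):
    """Product-limit estimate of P(t), the proportion of lifetimes exceeding t.

    times    -- observed lifetimes (to death or to loss), in any order
    is_death -- parallel flags: True if the lifetime ended in a death,
                False if it ended in a loss
    t        -- age at which the survival proportion is estimated
    """
    n = len(times)
    # Order the N observations by increasing lifetime. At tied times, deaths
    # are ranked before losses (an assumed convention, see the lead-in).
    ordered = sorted(zip(times, is_death), key=lambda obs: (obs[0], not obs[1]))

    estimate = Fraction(1)
    for r, (t_r, death) in enumerate(ordered, start=1):
        if t_r > t:
            break  # only ranks r with t'_r <= t contribute
        if death:
            # One factor (N - r) / (N - r + 1) per observed death.
            estimate *= Fraction(n - r, n - r + 1)
    return estimate


# Hypothetical sample: five items, with losses at ages 2 and 4.
times = [1, 2, 3, 4, 5]
is_death = [True, False, True, False, True]
print(product_limit(times, is_death, 3))  # (4/5) * (2/3) = 8/15
```

With no losses before t, the same function returns the observed proportion of survivors, in line with the last sentence of the abstract: for four deaths at ages 1, 2, 3, 4 the estimate at t = 2 is (3/4)(2/3) = 1/2.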


... The observed cumulative mortality was estimated using 1 - Kaplan-Meier [25]. The relative risk (RR) and risk difference (RD) of observed cumulative mortality were estimated for the comparison between patients with FPC + SPC, and those with an FPC only. ...
... CI confidence interval, FPC first primary cancer, RD risk difference, RR relative risk, SPC second primary cancer. (a) Calculated using 1 - Kaplan-Meier [25]. (b) Calculated as observed mortality in FPC + SPC / observed mortality in FPC only. (c) Calculated as observed mortality in FPC + SPC - observed mortality in FPC only. Fig. 2: Observed cumulative mortality (a) of breast first primary cancer patients with and without a synchronous or metachronous second primary cancer diagnosis. ...
... FPC first primary cancer, SPC second primary cancer. (a) Calculated using 1 - Kaplan-Meier [25] ...
Article
Purpose Second primary cancers (SPCs) are estimated to affect nearly 5% of patients with breast cancer within 10 years of their diagnosis. This study aimed to estimate the contribution of SPCs to the mortality of patients with a breast first primary cancer (FPC). Methods A population-based cohort of 17,210 patients with a breast FPC diagnosed between 2000 and 2010 was followed for SPCs (31/12/2015) and vital status (30/06/2021). Patients diagnosed with an SPC (265 synchronous and 897 metachronous, ≤ 1 and > 1 year after the FPC, respectively) were matched (1:3, by five-year age group and year of breast FPC diagnosis) to those without an SPC and alive when the corresponding SPC was diagnosed. Results Significantly higher hazards of death were found among patients with an SPC [hazard ratio of 1.56, 95% confidence interval (CI) 1.29–1.89 for synchronous SPCs; and 2.85, 95%CI 2.56–3.17 for metachronous SPCs] compared to patients with a breast FPC only. Estimates were higher for synchronous lung, stomach, non-Hodgkin lymphoma and breast SPCs, and metachronous liver, stomach, ovary, lung, rectum, corpus uteri, colon, breast, and non-Hodgkin lymphoma SPCs. The 15-year cumulative mortality was 59.5% for synchronous SPCs and 68.7% for metachronous SPCs, which was higher than in patients with a breast FPC only (43.6% and 44.8%, respectively). Conclusions In Northern Portugal, patients with an SPC following a breast FPC have a higher mortality compared with patients with a breast FPC only.
... Informally, it is used to analyze the relationship between the survival function and the variate; the former represents the survival results varying from the time. The traditional method, such as Kaplan-Meier survival estimate [3], is a better solution to survival analysis when the variates are vectors. But if any of the following two situations appeared, it would be more difficult for the traditional methods to estimate the survival analysis: the time is not according to normal distribution and the homogeneity of variance, or there is a special situation for the involved data: censored data. ...
... Meanwhile, ours supports the plural vectors in message space while [24,25] are only integers. The size of our authenticators is same as the one in [25] while [24] is much larger, where the symbol N + D D is the combinatorial number chosen D numbers from N + D numbers and |p| is y 0 = (m(X)) and a i = i and (g a i−1 i ) − (m(X)) = i , 3 Any two authenticators ̂ 1 ,̂ 2 can be changed into same space ℝ p × i × i : by multiplying the authenticator with underlying message 1, i.e. U . ...
Article
While it is well known that privacy-preserving cox regression generally consists of a semi-honest cloud service provider (CSP) who performs curious-but-honest computations on ciphertexts to train the cox model. No one can verify the behaviors of CSP when he performs computations dishonestly in reality. Focusing on this problem, we propose a verifiable privacy-preserving cox regression algorithm tailored with the semi-malicious CSP, where all his behaviors are recorded on a witness tape fulfilling the requirement of transparency. To be specific, a multi-key fully homomorphic encryption (FHE) is used to protect the information of different data owners. The verifiability of our proposed multi-key homomorphic message authenticator (HMAC) ensures CSP sends correct results back to data owners. Furthermore, the compactness of FHE and succinctness of HMAC both under multi keys make the cox regression scheme more feasible. The efficiency of our proposed cox regression scheme is also proved by both theoretical analyses and experimental evaluations. After 21 iterations, it costs no more than 10 min to evaluate our cox regression scheme.
... equations will be provided in the current section, more information regarding the derivation of the estimator can be found in ref. [16,17]. Time to failure, or satellite lifetime, is defined as the time between satellite launch and failure in the current analysis. ...
... The variance of the Kaplan-Meier estimator is calculated by Greenwood's formula, shown in Equation (4) [16,19]. var ...
Article
Flight data for deep space satellites launched and operated between 1991 and 2020 is analyzed to generate various reliability metrics. Satellite reliability is first estimated by the Kaplan‐Meier estimator, then parameterized through the Weibull distribution. This general process is applied to a general satellite data set that included all deep space satellites launched between 1991 and 2020, as well as two data subsets. One subset focuses on deployable satellites, while the other introduces a methodology of normalizing satellite lifetimes by satellite design life. Results from the general data set prove deep space satellites suffer from infant mortality while the results from the deployable data subset show deployable deep space satellites are only reliable over short periods of time. Results from the design life normalized data set give promising results, with satellites having a relatively high chance of reaching their design life. Available information regarding specific modes of failure is also leveraged to generate a percent contribution to overall satellite failure for eight distinct failure modes. Satellite failure due to crashing, in‐space propulsion failure, and telemetry system failure are proven to drive both early in life failure and later in life failure, making them the main causes of decreased reliability.
... This scenario is more pronounced in food crops as evidenced by the dearth of literature involving survival analysis of events in crop plants. Non-parametric analysis for this study was done using the Kaplan-Meier (KM) estimate (Kaplan and Meier, 1958) while parametric analyses used single failure-time probability distribution models (Meeker and Escobar, 1998). The objectives of this study were to: (i) compare the efficiencies of different models in describing time to symptom expression and time to death among infected sweet potato genotypes and, (ii) develop predictive models. ...
... The methods used for nonparametric estimation of survival and hazard functions were the Kaplan-Meier (KM) method and actuarial life tables (Cox, 1972). The Kaplan-Meier method, also called the product-limit estimator, was developed by Kaplan and Meier (Kaplan and Meier, 1958) as a nonparametric maximum likelihood estimator and it estimates the survival function . Parametric modelling used five models, namely the Exponential, Weibull, Lognormal, Log-logistic and Generalized gamma models. ...
... Overall survival was measured from date of diagnosis to date that event occurred or censored at time of data cut-off. Survival probability was estimated using the Kaplan-Meier method 30 . The confidence interval of median survival time was constructed by the method of Brookmeyer-Crowley 31 . ...
Preprint
H3K27M-mutant diffuse midline gliomas (DMGs) express high levels of the GD2 disialoganglioside and chimeric antigen receptor modified T-cells targeting GD2 (GD2-CART) eradicate DMGs in preclinical models. Arm A of the Phase I trial NCT04196413 administered one IV dose of autologous GD2-CART to patients with H3K27M-mutant pontine (DIPG) or spinal (sDMG) diffuse midline glioma at two dose levels (DL1=1e6/kg; DL2=3e6/kg) following lymphodepleting (LD) chemotherapy. Patients with clinical or imaging benefit were eligible for subsequent intracerebroventricular (ICV) GD2-CART infusions (10-30e6 GD2-CART). Primary objectives were manufacturing feasibility, tolerability, and identification of a maximally tolerated dose of IV GD2-CART. Secondary objectives included preliminary assessments of benefit. Thirteen patients enrolled and 11 received IV GD2-CART on study [n=3 DL1 (3 DIPG); n=8 DL2 (6 DIPG/2 sDMG)]. GD2-CART manufacturing was successful for all patients. No dose-limiting toxicities (DLTs) occurred on DL1, but three patients experienced DLT on DL2 due to grade 4 cytokine release syndrome (CRS). Nine patients received ICV infusions, which were not associated with DLTs. All patients exhibited tumor inflammation-associated neurotoxicity (TIAN). Four patients demonstrated major volumetric tumor reductions (52%, 54%, 91% and 100%). One patient exhibited a complete response ongoing for >30 months since enrollment. Eight patients demonstrated neurological benefit based upon a protocol-directed Clinical Improvement Score. Sequential IV followed by ICV GD2-CART induced tumor regressions and neurological improvements in patients with DIPG and sDMG. DL1 was established as the maximally tolerated IV GD2-CART dose. Neurotoxicity was safely managed with intensive monitoring and close adherence to a management algorithm.
... In this section, we analyze three real datasets that can be better fitted using our proposed semiparametric approach, which are made available in Appendix 5. For each data set, we present several key metrics: the p-value derived from the Shapiro-Wilk normality test [66], the AIC and BIC criteria values (utilized for determining the optimal number of changepoints for the PE model), the Kaplan-Meier estimate of the reliability function [67], the Cox-Snell residuals [57], and the p-value derived from the Cramér-von Mises test [68,69], which evaluates the goodness-of-fit of our model. Additionally, we provide the estimated PCIs of the PE model, offering a comprehensive assessment of our approach's efficacy across diverse data sets. ...
Article
Piecewise models have gained popularity as a useful tool in reliability and quality control/monitoring, particularly when the process data deviates from a normal distribution. In this study, we develop maximum likelihood estimators (MLEs) for the process capability indices, denoted as \(C_{pk}\), \(C_{pm}\), \(C^{*}_{pm}\) and \(C_{pmk}\), using a semiparametric model. To remove the bias in the MLEs with small sample sizes, we propose a bias-correction approach to obtain improved estimates. Furthermore, we extend the proposed method to situations where the change-points in the density function are unknown. To estimate the model parameters efficiently, we employ the profiled maximum likelihood approach. Our simulation study reveals that the suggested method yields accurate estimates with low bias and mean squared error. Finally, we provide real-world data applications to demonstrate the superiority of the proposed procedure over existing ones.
... The Kaplan-Meier estimator of the survivorship function [25], also called the product limit. ...
... Survival analysis was performed using the Kaplan-Meier method 11 and survival curves produced at five years (with 30 patients remaining at risk, more than the required minimum of 10%), as described by Lettin et al, 12 with censoring at the time of last clinic visit or death. The primary endpoint for survival analysis for treatment failure was the time of further revision surgery. ...
Article
Aims In metal-on-metal (MoM) hip arthroplasties and resurfacings, mechanically induced corrosion can lead to elevated serum metal ions, a local inflammatory response, and formation of pseudotumours, ultimately requiring revision. The size and diametral clearance of anatomical (ADM) and modular (MDM) dual-mobility polyethylene bearings match those of Birmingham hip MoM components. If the acetabular component is satisfactorily positioned, well integrated into the bone, and has no surface damage, this presents the opportunity for revision with exchange of the metal head for ADM/MDM polyethylene bearings without removal of the acetabular component. Methods Between 2012 and 2020, across two centres, 94 patients underwent revision of Birmingham MoM hip arthroplasties or resurfacings. Mean age was 65.5 years (33 to 87). In 53 patients (56.4%), the acetabular component was retained and dual-mobility bearings were used (DM); in 41 (43.6%) the acetabulum was revised (AR). Patients underwent follow-up of minimum two-years (mean 4.6 (2.1 to 8.5) years). Results In the DM group, two (3.8%) patients underwent further surgery: one (1.9%) for dislocation and one (1.9%) for infection. In the AR group, four (9.8%) underwent further procedures: two (4.9%) for loosening of the acetabular component and two (4.9%) following dislocations. There were no other dislocations in either group. In the DM group, operating time (68.4 vs 101.5 mins, p < 0.001), postoperative drop in haemoglobin (16.6 vs 27.8 g/L, p < 0.001), and length of stay (1.8 vs 2.4 days, p < 0.001) were significantly lower. There was a significant reduction in serum metal ions postoperatively in both groups (p < 0.001), although there was no difference between groups for this reduction (p = 0.674 (cobalt); p = 0.186 (chromium)). 
Conclusion In selected patients with Birmingham MoM hips, where the acetabular component is well-fixed and in a satisfactory position with no surface damage, the metal head can be exchanged for polyethylene ADM/MDM bearings with retention of the acetabular prosthesis. This presents significant benefits, with a shorter procedure and a lower risk of complications. Cite this article: Bone Jt Open 2024;5(6):514–523.
... Kaplan-Meier (K-M) curves and log-rank tests were used to assess the probability of survival for different marital statuses. 9 Next, prespecified subgroups include stratification by baseline sex (male or female) and age (65-79 or ≥80 years), which were performed using a stratified log-rank test, whereas OR with 90% CI was estimated using a stratified logistic regression model. 10,11 Statistical analysis, excluding stratified analysis, was performed using R. Two-sided p-values were calculated, and p-values < 0.05 were considered statistically significant. ...
Article
Background and Aims Marital status has been shown to be associated with mortality, but evidence in critically ill elderly intensive care unit (ICU) patients with cerebrovascular diseases (CeVD) is limited. This study aimed to explore the correlation between marital status and the prognosis of patients with CeVD aged 65 years and over in the ICU. Methods In the present study, 3564 patients were enrolled from the Medical Information Mart for Intensive Care IV database (version 2.2). Patients were divided into four groups based on marital status: married, single, divorced, and widowed. The primary outcome was all‐cause mortality, with patients followed up for 3, 6, 9, and 12 months. All‐cause mortality risk for patients with different marital status was compared. Univariate and multivariable logistic regression analyses, survival curves and stratified analyses were performed to determine the correlation between marital status and mortality in critically ill patients with CeVD aged ≥65 years. Results Of the patients, 51.2% (1825/3564) were married, 23.8% (847/3564) were widowed, 18.2% (647/3564) were single, and 6.9% (245/3564) were divorced. Compared with the married, the unmarried had a higher proportion of females (p < 0.001), were older (p < 0.001), and had a lower proportion of mechanical ventilation (p = 0.045). Multivariate analyses showed that no differences were observed for mortality risk among different marital statuses (p > 0.05), while at late follow‐up, the widowed had a significantly higher mortality risk than the married (9‐month: odds ratio [OR]: 1.30, 95% confidence interval [CI]: 1.05–1.61, p = 0.02; 12‐month: OR: 1.38, 95% CI: 1.12–1.71, p = 0.003). Stratified analyses indicated a stable correlation between marital status and 12‐month mortality rate in sub‐analysis for gender (p = 0.46) and age (p = 0.35). Conclusion Marital status is associated with long‐term prognosis in older patients with CeVD admitted to ICU.
Widowed people should receive more societal attention irrespective of sex or age.
... Statistical analysis was performed using the IBM SPSS statistics program, version 29.0. We used the Kaplan-Meier method to estimate OS in our study cohort [15]. Initially, we conducted bivariate analyses using either the chi-square or Fisher tests to determine statistical significance, based on sample size and data characteristics. ...
... Therefore, we only consider 43 major-merger paired galaxies that have CO data from JCMT observations, IRAM 30 m observations, and the MASCOT survey. For the tentative detections with S/N <5, we treat these measurements as upper limits in the statistical analysis, adopting the Kaplan-Meier estimator (Kaplan & Meier 1958;Feigelson & Nelson 1985) to calculate the mean values and errors for the corresponding physical properties. ...
Article
We present a study of the molecular gas in early-mid stage major mergers, with a sample of 43 major-merger galaxy pairs selected from the Mapping Nearby Galaxies at Apache Point Observatory survey and a control sample of 195 isolated galaxies selected from the xCOLD GASS survey. Adopting kinematic asymmetry as a new effective indicator to describe the merger stage, we aim to study the role of molecular gas in the merger-induced star formation enhancement along the merger sequence of galaxy pairs. We obtain the molecular gas properties from CO observations with the James Clerk Maxwell Telescope, Institut de Radioastronomie Millimétrique 30 m telescope, and the MaNGA-ARO Survey of CO Targets survey. Using these data, we investigate the differences in molecular gas fraction (\(f_{\mathrm{H_2}}\)), star formation rate (SFR), star formation efficiency (SFE), molecular-to-atomic gas ratio (\(M_{\mathrm{H_2}}/M_{\mathrm{H\,I}}\)), total gas fraction (\(f_{\mathrm{gas}}\)), and the SFE of total gas (\(\mathrm{SFE}_{\mathrm{gas}}\)) between the pair and control samples. In the full pair sample, our results suggest the \(f_{\mathrm{H_2}}\) of paired galaxies is significantly enhanced, while the SFE is comparable to that of isolated galaxies. We detect significantly increased \(f_{\mathrm{H_2}}\) and \(M_{\mathrm{H_2}}/M_{\mathrm{H\,I}}\) in paired galaxies at the pericenter stage, indicating an accelerated transition from atomic gas to molecular gas due to interactions. Our results indicate that the elevation of \(f_{\mathrm{H_2}}\) plays a major role in the enhancement of global SFR in paired galaxies at the pericenter stage, while the contribution of enhanced SFE in specific regions requires further explorations through spatially resolved observations of a larger sample spanning a wide range of merger stages.
... The ROC curve on test sets corresponds to the mean (line) and standard deviation (shade) of the ROC curves on the 5 test sets. For each survival task, the model's ability to stratify patients into clinically significant risk groups is illustrated by Kaplan-Meier curves [53] of test sets (top left) and holdout set (top right) using stratification based on predicted risk (see Methods, section Risk groups, for a description of the risk threshold computation method). The 95% confidence interval (95% CI, shade) of the Kaplan-Meier curve (line) is estimated using log hazard ...
Preprint
We propose a fully automatic multi-task Bayesian model, named Bayesian Sequential Network (BSN), for predicting high-grade (Gleason ≥ 8) prostate cancer (PCa) prognosis using pre-prostatectomy FDG-PET/CT images and clinical data. BSN performs one classification task and five survival tasks: predicting lymph node invasion (LNI), biochemical recurrence-free survival (BCR-FS), metastasis-free survival, definitive androgen deprivation therapy-free survival, castration-resistant PCa-free survival, and PCa-specific survival (PCSS). Experiments are conducted using a dataset of 295 patients. BSN outperforms widely used nomograms on all tasks except PCSS, leveraging multi-task learning and imaging data. BSN also provides automated prostate segmentation, uncertainty quantification, personalized feature-based explanations, and introduces dynamic predictions, a novel approach that relies on short-term outcomes to refine long-term prognosis. Overall, BSN shows great promise in its ability to exploit imaging and clinico-pathological data to predict poor outcome patients that need treatment intensification with loco-regional or systemic adjuvant therapy for high-risk PCa.
... Robots that do not encounter the target result in censored data points. Then, we estimate the cumulative distribution function (CDF) of the first passage times by using the Kaplan-Meier (K-M) estimator [36]. By examining our results in section III, we concluded that the empirical CDF best fits a Weibull distribution for all the performed simulations. ...
Preprint
Effective exploration abilities are fundamental for robot swarms, especially when small, inexpensive robots are employed (e.g., micro- or nano-robots). Random walks are often the only viable choice if robots are too constrained regarding sensors and computation to implement state-of-the-art solutions. However, identifying the best random walk parameterisation may not be trivial. Additionally, variability among robots in terms of motion abilities (a very common condition when precise calibration is not possible) introduces the need for flexible solutions. This study explores how random walks that present chaotic or edge-of-chaos dynamics can be generated. We also evaluate their effectiveness for a simple exploration task performed by a swarm of simulated Kilobots. First, we show how Random Boolean Networks can be used as controllers for the Kilobots, achieving a significant performance improvement compared to the best parameterisation of a Lévy-modulated Correlated Random Walk. Second, we demonstrate how chaotic dynamics are beneficial to maximise exploration effectiveness. Finally, we demonstrate how the exploration behavior produced by Boolean Networks can be optimized through an Evolutionary Robotics approach while maintaining the chaotic dynamics of the networks.
... It blends information from all the observations available both censored and uncensored cases. The Kaplan-Meier estimator of survival function can be written as (Kaplan and Meier (1958)): ...
Thesis
The main objective of this paper is to explore a suitable statistical model for endometrial cancer survival data. Several parametric lifetime distributions have been considered for this analysis. The non-parametric Kaplan-Meier estimator is also used to validate our chosen model. Further, the log-rank test has been used to compare the survival experience between age groups (< 60 and ≥ 60 years) and between different grades of endometrial cancer patients. The log-logistic survival distribution is chosen as the most suitable model based on its lowest AIC and BIC values. There is no significant difference in survival experience between age groups (< 60 and ≥ 60 years) or between different grades of the tumor. The overall five-year survival of the endometrial cancer patients is found to be 45.6% with 95% confidence interval (25.8%, 80.6%). This knowledge would be quite helpful in predicting survival of endometrial cancer patients.
... Descriptive statistics were calculated using an Excel spreadsheet (Version 16.83, Microsoft, Redmond, WA, USA) and expressed as a data range, median, and standard deviation. Progression-free and overall survival were evaluated using the methods described by Kaplan and Meier [31]. A log-rank test was used to compare the survival curves [32]. ...
Article
Immune-mediated diarrhea represents a serious complication of checkpoint inhibitor therapy, especially following ipilimumab-based treatment. Efficient diagnosis and control of diarrhea remains an ongoing challenge. We developed an accelerated management paradigm for patients with ipilimumab-induced diarrhea. Patients who developed significant diarrhea (>five loose stools/day) were presumed to be developing immune colitis. Therapy was interrupted and patients were treated with a methylprednisolone dose pack. If diarrhea was not completely resolved, high-dose steroids and infliximab were promptly added. Only non-responding patients underwent further evaluation for infection or other causes of diarrhea. A total of 242 patients were treated with ipilimumab-based regimens. Forty-six developed significant diarrhea (19%) and thirty-four (74.4%) had a rapid resolution of diarrhea following glucocorticosteroid and infliximab treatment. The median time to resolution of diarrhea was only 8.5 ± 16.4 days. Accelerated treatment for presumed immune-mediated diarrhea resulted in the rapid control of symptoms in the majority of patients. There were no intestinal complications or deaths. Immunosuppressive therapy for diarrhea did not appear to decrease the remission rate or survival. After the control of diarrhea, most patients were able to continue their planned immunotherapy. Further testing in 11/46 patients with unresponsive diarrhea revealed additional diagnoses, allowing their treatment to be adjusted.
... Annual stage-specific survival rates (\(S_j\): juvenile survival, \(S_y\): yearling female survival, \(S_a\): adult female survival) were determined using the staggered entry Kaplan and Meier (1958) technique. Pregnancy rates of yearling females (\(P_y\)) and adult females (\(P_a\)) were determined via blood serum analysis or ultrasound at time of capture. ...
Article
Large terrestrial mammals increasingly rely on human‐modified landscapes as anthropogenic footprints expand. Land management activities such as timber harvest, agriculture, and roads can influence prey population dynamics by altering forage resources and predation risk via changes in habitat, but these effects are not well understood in regions with diverse and changing predator guilds. In northeastern Washington state, USA, white‐tailed deer (Odocoileus virginianus) are vulnerable to multiple carnivores, including recently returned gray wolves (Canis lupus), within a highly human‐modified landscape. To understand the factors governing predator–prey dynamics in a human context, we radio‐collared 280 white‐tailed deer, 33 bobcats (Lynx rufus), 50 cougars (Puma concolor), 28 coyotes (C. latrans), and 14 wolves between 2016 and 2021. We first estimated deer vital rates and used a stage‐structured matrix model to estimate their population growth rate. During the study, we observed a stable to declining deer population (lambda = 0.97, 95% confidence interval: 0.88, 1.05), with 74% of Monte Carlo simulations indicating population decrease and 26% of simulations indicating population increase. We then fit Cox proportional hazard models to evaluate how predator exposure, use of human‐modified landscapes, and winter severity influenced deer survival and used these relationships to evaluate impacts on overall population growth. We found that the population growth rate was dually influenced by a negative direct effect of apex predators and a positive effect of timber harvest and agricultural areas. Cougars had a stronger effect on deer population dynamics than wolves, and mesopredators had little influence on the deer population growth rate. Areas of recent timber harvest had 55% more forage biomass than older forests, but horizontal visibility did not differ, suggesting that timber harvest did not influence predation risk. 
Although proximity to roads did not affect the overall population growth rate, vehicle collisions caused a substantial proportion of deer mortalities, and reducing these collisions could be a win–win for deer and humans. The influence of apex predators and forage indicates a dual limitation by top‐down and bottom‐up factors in this highly human‐modified system, suggesting that a reduction in apex predators would intensify density‐dependent regulation of the deer population owing to limited forage availability.
... Subsequently, the risk stratification capabilities of each model were assessed by computing risk scores for every patient using the trained models. Patients in the test set were classified into high-, intermediate-, and low-risk score groups [18] and evaluated through Kaplan-Meier survival curves [19] and log-rank tests [20]. ...
Article
Introduction Ischemic heart disease is a leading cause of death worldwide, and its importance is increasing with the aging population. The aim of this study was to evaluate the accuracy of SurvTrace, a survival analysis model using the Transformer—a state-of-the-art deep learning method—for predicting recurrent cardiovascular events and stratifying high-risk patients. The model’s performance was compared to that of a conventional scoring system utilizing real-world data from cardiovascular patients. Methods This study consecutively enrolled patients who underwent percutaneous coronary intervention (PCI) at the Department of Cardiovascular Medicine, University of Tokyo Hospital, between 2005 and 2019. Each patient’s initial PCI at our hospital was designated as the index procedure, and a composite of major adverse cardiovascular events (MACE) was monitored for up to two years post-index event. Data regarding patient background, clinical presentation, medical history, medications, and perioperative complications were collected to predict MACE. The performance of two models—a conventional scoring system proposed by Wilson et al. and the Transformer-based model SurvTrace—was evaluated using Harrell’s c-index, Kaplan–Meier curves, and log-rank tests. Results A total of 3938 cases were included in the study, with 394 used as the test dataset and the remaining 3544 used for model training. SurvTrace exhibited a mean c-index of 0.72 (95% confidence intervals (CI): 0.69–0.76), which indicated higher prognostic accuracy compared with the conventional scoring system’s 0.64 (95% CI: 0.64–0.64). Moreover, SurvTrace demonstrated superior risk stratification ability, effectively distinguishing between the high-risk group and other risk categories in terms of event occurrence. In contrast, the conventional system only showed a significant difference between the low-risk and high-risk groups. 
Conclusion This study based on real-world cardiovascular patient data underscores the potential of the Transformer-based survival analysis model, SurvTrace, for predicting recurrent cardiovascular events and stratifying high-risk patients.
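The abstract above reports Harrell's c-index as its accuracy measure. As a minimal sketch of how that index is commonly computed for right-censored data (the function name `harrell_c_index` and the convention that a higher score means higher predicted risk are assumptions made for illustration, not details taken from the cited study):

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (assumed convention)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:  # pairs with tied times are skipped in this sketch
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.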
... The risk score midpoint was used to categorise the specimens. The training set used Kaplan-Meier plots to assess the overall survival (OS) of the high- and low-risk groups (19). ...
Article
Background Stomach adenocarcinoma (STAD), a frequently occurring gastrointestinal tumour, is often detected late and has a poor prognosis. Long non-coding RNAs (lncRNAs) significantly affect tumour development. Recent studies have identified disulfidptosis as a previously unexplained form of cell death. Herein, we aimed to examine the predictive value of disulfidptosis-related lncRNA models for the clinical prognosis and immunotherapy of STAD. Methods STAD-related transcriptomic data were obtained from The Cancer Genome Atlas (TCGA), whereas genes associated with disulfidptosis were identified from previously published papers. A risk prediction model for disulfidptosis-related lncRNAs was developed using the Cox regression and least absolute shrinkage selection algorithm methods. The accuracy of the model was confirmed using calibration curves, and the biological functions were analysed using Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA). Finally, the tumour mutation burden (TMB) and tumour immune dysfunction and exclusion (TIDE) algorithms were used to screen drugs that are sensitive to STAD. Results The risk prediction models were constructed using seven disulfidptosis-related lncRNAs. The validated results were consistent with the predicted ones, with significant survival differences. When combined with clinical data, the risk scores were used as independent prognostic markers. Based on the tumour mutation load, the high-risk patient group had a poorer survival rate as compared with the low-risk patient group. Further studies were conducted to understand the different groups’ inconsistent responses to immune status; subsequently, relatively sensitive drugs were identified. Conclusions Overall, seven markers of disulfidptosis-related lncRNAs associated with STAD were found to facilitate prognostic prediction, suggesting new ideas for immunotherapy and clinical applications.
... These parameters have been subjected to different statistical analyses applied using the R statistical program. The Kaplan-Meier estimator is used to compare the two groups of individuals analyzed: surviving or nonsurviving companies (Kaplan & Meier, 1958). This estimator indicates the estimated probability of surviving a specific period conditioned by current age. ...
Article
Business survival has been a widely studied phenomenon in management literature given its impact on economic and social reality. There is some consensus that crises have a catalytic effect by driving the least efficient, productive, and profitable firms out of the market, while those best prepared with better technology, products, or business structure survive; this process is known as creative destruction. This study analyzed a sample of 5,000 firms using statistical survival models in order to identify the main factors affecting the life and disappearance of firms in periods of crisis (2008–2012) versus periods of recovery (2016–2020) to detect differentiated patterns of behavior. It was found that those firms that were driven out of the market during the crisis, unlike in the subsequent recovery period, were neither the less profitable nor the less efficient. Instead, more indebted firms or those that did not get the support of credit institutions were withdrawn from the market. This result casts doubt on the effectiveness of the theorized process of creative destruction in times of economic crisis.
... The distribution of PFS and OS (quantified using the median) were estimated using the Kaplan-Meier method [17]. The reverse Kaplan-Meier method was used to estimate the median follow-up durations. ...
Article
Dual human epidermal growth factor receptor 2 (HER2) blockade with trastuzumab and pertuzumab combined with taxane-based chemotherapy (Cht) has been the standard first-line treatment for HER2-positive metastatic breast cancer (mBC) for years, due to the impressive results of the CLEOPATRA study. Real-world (RW) studies have become critical for assessing treatment effectiveness and safety in real-life circumstances. The aim of this study was to analyze the treatment outcomes of first-line therapy for HER2-positive mBC in RW clinical practice, specifically focusing on the use of maintenance endocrine therapy (ET) in hormone receptor positive (HR-positive) patients. This retrospective analysis included 106 HER2-positive mBC patients treated with trastuzumab and pertuzumab combined with taxane-based Cht from October 2015 to December 2020 at the University Hospital Centre Zagreb. At a median follow-up of 30 months, median progression-free survival (PFS) was 25 months for the total population (95% confidence interval [CI] 16 - not analyzed). Patients with de novo mBC had longer median PFS than patients with recurrent disease (not reached vs. 18 months; hazard ratio 1.99; 95% CI 0.69–3.64, p<0.022). Age, hormone receptor positivity, visceral involvement, number of Cht cycles and previous adjuvant trastuzumab did not impact PFS. Most HR-positive patients (N=55, 88.7%) received maintenance ET after induction Cht. This retrospective study provides additional data on patient characteristics, treatment and outcomes of RW HER2-positive mBC patients treated with pertuzumab and trastuzumab as first-line therapy. In our institution, maintenance ET after induction Cht has become standard clinical practice.
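The citing passage for this entry mentions the reverse Kaplan-Meier method for median follow-up. A minimal sketch of the usual idea, assuming the standard definition in which the event indicator is flipped so that censoring becomes the "event" (the function names `km_curve`, `median_time`, and `median_follow_up` are hypothetical):

```python
def km_curve(times, events):
    """Product-limit estimate as a list of (time, survival) steps."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everything observed at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            steps.append((t, surv))
        at_risk -= removed
    return steps


def median_time(steps):
    """First time at which the curve is at or below 0.5, or None if not reached."""
    for t, s in steps:
        if s <= 0.5:
            return t
    return None


def median_follow_up(times, events):
    """Reverse Kaplan-Meier: treat censoring as the event of interest."""
    flipped = [1 - e for e in events]
    return median_time(km_curve(times, flipped))
```

With no censored observations the flipped curve never drops, so the sketch returns `None`, which mirrors a "not reached" median.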
... IC50 values were determined for each cell line at each time point using a nonlinear fit [Inhibitor] vs. response-variable slope (four parameter). Survival Kaplan-Meier curves were analyzed using a Logrank Mantel-Cox test [26,27]. P-values of < 0.05 were considered significant, and the asterisks indicate increasing levels of significance with * < 0.05, ** < 0.01, *** < 0.001, **** < 0.0001. ...
Article
Purpose The purpose of this study was to analyze potential differences in antitumor efficacy and pharmacokinetics between intravenous (IV) bendamustine and a novel orally administered (PO) bendamustine agent that utilizes the beneficial properties of supersaturated solid dispersions formulated in nanoparticles. Methods Pharmacokinetics of IV versus PO bendamustine were determined by analysis of plasma samples collected from NSG mice treated with either IV or PO bendamustine. Plasma samples were analyzed using liquid chromatography–mass spectrometry following a liquid–liquid extraction to determine peak bendamustine concentration, area under the concentration–time curve, and the half-life in-vivo. In-vitro cytotoxicity of bendamustine against human non-Hodgkin Burkitt’s Lymphoma (Raji), multiple myeloma (MM.1s), and B-cell acute lymphoblastic leukemia (RS4;11) cell lines was determined over time using MTS assays. Luciferase-tagged versions of the aforementioned cell lines were used to determine in-vivo bendamustine cytotoxicity of IV versus PO bendamustine at two different doses. Results Bendamustine at a high dose in-vitro causes cell death. There was no significant difference in antitumor activity between IV and novel PO bendamustine at a physiologically relevant concentration in all three xenograft models. In-vivo pharmacokinetics showed the oral bioavailability of bendamustine in mice to be 51.4%. Conclusions The novel oral bendamustine agent tested exhibits good oral bioavailability and systemic exposure for in-vivo antitumor efficacy comparable to IV bendamustine. An oral bendamustine formulation offers exciting clinical potential as an additional method of administration for bendamustine and warrants further evaluation in clinical studies.
... In the context of right censoring with independent data, the nonparametric maximum likelihood approach is the Kaplan-Meier estimator, originally introduced in Kaplan and Meier (1958). Notably, the Kaplan-Meier estimator is consistent (Wang et al., 1987), and its asymptotic properties were studied in Cai (1998). ...
Article
We study the problem of comparing distribution equality between two random samples under a random censoring scheme. We design a series of tests based on energy distance and kernel mean embeddings to address this problem. We calibrate our tests using permutation methods and prove that they are consistent against all fixed continuous alternatives. To evaluate our proposed tests in real-world clinical scenarios, we simulate survival curves from immunotherapy clinical trials published in major medical journals. Additionally, we provide practitioners with recommendations on selecting parameters and distances for the crossing survival curves problem observed in the analyzed real data. Based on the parameter tuning method we propose, we demonstrate that our tests show a considerable gain in statistical power compared to classical survival tests. Furthermore, as our test depends on the chosen semi-metric or kernel, it can be adapted to other clinical settings or survival analysis problems.
... Firstly, calculation of \(Z_\pi\) can be executed with existing standard software, whereas \(\hat S_2\) needs additional implementation effort to be calculated. Secondly, for computation of \(\hat S_2\) the full dataset of the historical patients is needed, whereas \(Z_\pi\) is based only on the sample size \(n_A\) and the estimated reference curve \(\hat L_A\) [3] (or equivalently the Kaplan-Meier estimator \(\hat S_A\) [4]). This enables the usage of the \(Z_\pi\)-test (5) even in absence of full historical survival time data. ...
Article
The one-sample log-rank test is the preferred method for analysing the outcome of single-arm survival trials. It compares the survival distribution of patients with a prefixed reference survival curve that usually represents the expected outcome under standard of care. However, classical one-sample log-rank tests assume that the reference curve is known, ignoring that it is frequently estimated from historical data and therefore susceptible to sampling error. Neglecting the variability of the reference curve can lead to an inflated type I error rate, as shown in a previous paper. Here, we propose a new survival test that accounts for the sampling error of the reference curve without knowledge of the full underlying historical survival time data. Our new test permits a valid historical comparison of patient survival times when only a historical survival curve rather than the full historic data is available. It thus applies in settings where the two-sample log-rank test is not applicable as method of choice due to non-availability of historic individual patient survival time data. We develop sample size calculation formulas, give an example application and study the performance of the new test in a simulation study.
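For orientation, the classical one-sample log-rank test that this abstract takes as its starting point can be sketched as follows. This is the textbook version that treats the reference curve as known, not the new test proposed by the authors. The function name `one_sample_logrank` and the use of a callable `cum_hazard` for the reference cumulative hazard are assumptions made for illustration.

```python
import math


def one_sample_logrank(times, events, cum_hazard):
    """Classical one-sample log-rank test against a fixed reference curve.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    cum_hazard: reference cumulative hazard as a function of time (assumed known)
    """
    observed = sum(events)
    # Expected number of events under the reference curve.
    expected = sum(cum_hazard(t) for t in times)
    z = (observed - expected) / math.sqrt(expected)
    # Two-sided p-value from the standard normal approximation.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return observed, expected, z, p_two_sided
```

For example, an exponential reference with a hypothetical rate of 0.2 per time unit would be passed as `lambda t: 0.2 * t`.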
... Patient survival rates were calculated using the Kaplan-Meier method. 15 The OS of patients were compared with those of age-, sex-, and calendar-period-matched Japanese GP. The expected survival curve was created using data on mortality in the GP obtained from Vital Statistics supplied by the Japanese Ministry of Health, Labour, and Welfare. ...
Article
Information regarding follow-up duration after treatment for newly diagnosed diffuse large B-cell lymphoma (DLBCL) is important. However, a clear endpoint has yet to be established. We enrolled a total of 2182 patients newly diagnosed with DLBCL between 2008 and 2018. The median age of the patients was 71 years. All patients were treated with rituximab- and anthracycline-based chemotherapies. Each overall survival (OS) was compared with the age- and sex-matched Japanese general population (GP) data. At a median follow-up of 3.4 years, 985 patients experienced an event and 657 patients died. Patients who achieved an event-free survival (EFS) at 36 months (EFS36) had an OS equivalent to that of the matched GP (standard mortality ratio [SMR], 1.17; P=0.1324), whereas those who achieved an EFS24 did not have an OS comparable to that of the matched GP (SMR, 1.26; P=0.0095). Subgroup analysis revealed that relatively old patients (>60 years), male patients, those with limited-stage disease, those with a good performance status, and those with low levels of soluble interleukin 2 receptor already had a comparable life expectancy to the matched GP at an EFS24. In contrast, relatively young patients had a shorter life expectancy than matched GP, even with an EFS36. In conclusion, an EFS36 was shown to be a more suitable endpoint for newly diagnosed DLBCL patients than an EFS24. Of note, younger patients require a longer EFS period than older patients in order to obtain an equivalent life expectancy to the matched GP.
... As reported by Fedrizzi et al. (15), survival analysis was also conducted to analyze the number of goals scored in the UEFA EURO 2020 final phase and the time interval between goals. The authors used a Poisson distribution for modeling the number of goals and used the Kaplan-Meier Model (16) to compute the survival curves and the time between goals. Their model focused solely on the overall number of goals scored, without considering which team was responsible for each goal. ...
Article
Introduction: This study investigated the influence of team formation on goal-scoring efficiency through analysing the time required for a goal to be scored in elite football matches. Method: The analysis was conducted using a comprehensive open access dataset encompassing eight major football competitions, including prestigious events such as the World Cup and the UEFA Champions League. It notably focused on the competing risks framework and employed the Fine and Gray model to account for the interplay between two competing events: team A scoring and team B scoring. Results: Through analysis of Team A’s goal occurrences, we assessed the offensive capabilities of its formation and the defensive effectiveness of Team B’s composition in relation to the time it took for Team A to score a goal. Findings revealed that teams employing the 4-3-3 and 4-2-3-1 formations outperformed other formations (3-4-3, 3-5-2, 4-4-2, 4-5-1, 5-3-2, 5-4-1) regarding goal-scoring efficiency. Discussion: By shedding light on the impact of team formation on goal scoring, this research contributes to a deeper understanding of some of the successful strategic aspects of elite football.
... In this article, we showed how the variance of the Greenwood estimator can be derived to quantify its uncertainty and to calculate corresponding Wald-type confidence intervals. Our starting point was the Kaplan-Meier estimator [Kaplan and Meier, 1958]. ...
Preprint
In this article, we introduce an estimator for the asymptotic variance of the Greenwood variance estimator, where the latter is crucial for assessing the accuracy of the Kaplan-Meier survival estimator. The result indicates that the asymptotic variance of the Greenwood variance estimator is considerably smaller than that of the Kaplan-Meier variance estimator. This finding emphasizes the robustness of the Greenwood estimator.
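As background for this entry, the Greenwood variance estimator and the Wald-type confidence interval mentioned in the citing passage can be sketched as below. The sketch uses the standard Greenwood formula, \(\widehat{\mathrm{Var}}[\hat S(t)] = \hat S(t)^2 \sum_{t_i \le t} d_i / (n_i (n_i - d_i))\), and does not implement the article's estimator for the asymptotic variance of the Greenwood estimator. The function name `km_with_greenwood` and the clipping of the interval to [0, 1] are illustrative choices.

```python
import math


def km_with_greenwood(times, events, z=1.96):
    """Return rows (time, survival, variance, ci_low, ci_high) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            # The Greenwood term is undefined when everyone at risk dies;
            # the survival estimate is then 0 and the variance is reported as 0.
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            variance = surv ** 2 * greenwood_sum
            half_width = z * math.sqrt(variance)
            rows.append((t, surv, variance,
                         max(0.0, surv - half_width),
                         min(1.0, surv + half_width)))
        at_risk -= removed
    return rows
```

Before any censoring occurs, the Greenwood variance reduces to the binomial variance \(\hat S(1 - \hat S)/N\), which gives a quick sanity check on small data sets.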
... Additionally, we estimate the survival functions of the risk groups using the Kaplan-Meier estimator from the package Lifelines (Davidson-Pilon, 2023). The Kaplan-Meier estimator is a non-parametric estimator of the survival function in the face of censoring (Kaplan & Meier, 1958). The resulting survival function is a step function that decreases with every loan default and expresses the empirical probability of survival beyond each time step. ...
Article
Recent research using explainable machine learning survival analysis demonstrated its ability to identify new risk factors in the medical field. In this study, we adapted this methodology to credit risk assessment. We used a comprehensive dataset from the Estonian P2P lending platform Bondora, consisting of over 350,000 loans and 112 features with a loan volume of 915 million euros. First, we applied classical (linear) and machine learning (extreme gradient-boosted) Cox models to estimate the risk of these loans and then risk-rated them using risk stratification. For each rating category we calculated default rates, rates of return, and plotted Kaplan–Meier curves. These performance criteria revealed that the boosted Cox model outperformed both the classical Cox model and the platform’s rating. For instance, the boosted model’s highest rating category had an annual excess return of 18% and a lower default rate compared to the platform’s best rating. Second, we explained the machine learning model’s output using Shapley Additive Explanations. This analysis revealed novel nonlinear relationships (e.g., higher risk for borrowers over age 55) and interaction effects (e.g., between age and housing situation) that provide promising avenues for future research. The machine-learning model also found feature contributions aligning with existing research, such as lower default risk associated with older borrowers, females, individuals with mortgages, or those with higher education. Overall, our results reveal that explainable machine learning survival analysis excels at risk rating, profit scoring, and risk factor analysis, facilitating more precise and transparent credit risk assessments.
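The citing passage for this entry describes the Kaplan-Meier estimate as a step function that drops at every default. A minimal sketch of that behaviour, written directly from the rank formula \(\hat P(t) = \prod_r (N - r)/(N - r + 1)\) in the abstract at the top of this page, is given below. It does not use the Lifelines package named in the passage. The function name `product_limit` and the tie rule (deaths ordered before losses at equal times) are assumptions, the latter being the usual convention.

```python
def product_limit(observations, t):
    """Product-limit estimate P_hat(t) from (time, is_death) pairs.

    A "death" is the event of interest (for loans, a default);
    a "loss" is a censored observation.
    """
    # Order by time; at tied times place deaths before losses (assumed convention).
    ordered = sorted(observations, key=lambda obs: (obs[0], not obs[1]))
    n = len(ordered)
    estimate = 1.0
    for r, (time_r, is_death) in enumerate(ordered, start=1):
        if time_r > t:
            break
        if is_death:
            # One factor (N - r) / (N - r + 1) per observed death with t_r <= t.
            estimate *= (n - r) / (n - r + 1)
    return estimate
```

Evaluating the function on a grid of times reproduces the step shape: the value stays constant between deaths and is unchanged by losses, which only reduce the number still under observation.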
... To handle this problem, we employ an idea from [6], which analyses estimators derived as integrals with respect to the product-limit estimator (cf. [14]) at normalized upper order statistics; however, we simultaneously incorporate the idea of [20], where the inclusion of random covariates is handled elegantly in a censoring context. The extension is relevant on two fronts. ...
Preprint
The task of analyzing extreme events with censoring effects is considered under a framework allowing for random covariate information. A wide class of estimators that can be cast as product-limit integrals is considered, for when the conditional distributions belong to the Frechet max-domain of attraction. The main mathematical contribution is establishing uniform conditions on the families of the regularly varying tails for which the asymptotic behaviour of the resulting estimators is tractable. In particular, a decomposition of the integral estimators in terms of exchangeable sums is provided, which leads to a law of large numbers and several central limit theorems. Subsequently, the finite-sample behaviour of the estimators is explored through a simulation study, and through the analysis of two real-life datasets. In particular, the inclusion of covariates makes the model significantly versatile and, as a consequence, practically relevant.
... The most widely used one is \((i - 1/2)/n\) (also referred to as the "median rank method") for sample sizes \(n \geq 11\), and \((i - 3/8)/(n + 1/4)\) for \(n \leq 10\), as introduced by Blom [26]. However, when the dataset includes censored observations, estimating the empirical CDF \(\hat F(t)\) becomes more intricate. In this scenario, the renowned product-limit estimator of \(S(t) = 1 - F(t)\), originally credited to Kaplan and Meier [28], can be employed. This estimator is defined as follows: ...
Article
A system consisting of interconnected components in series is under consideration. This research focuses on estimating the parameters of this system for incomplete lifetime data within the framework of competing risks, employing an underlying inverse Weibull distribution. While one popular method for parameter estimation involves the Newton–Raphson (NR) technique, its sensitivity to initial value selection poses a significant drawback, often resulting in convergence failures. Therefore, this paper opts for the expectation–maximization (EM) algorithm. In competing risks scenarios, the precise cause of failure is frequently unidentified, and these issues can be further complicated by potential censoring. Thus, incompleteness may arise due to both censoring and masking. In this study, we present the EM‐type parameter estimation and demonstrate its superiority over parameter estimation based on the NR method. Two illustrative examples are provided. The proposed method is compared with the existing Weibull competing risks model, revealing the superiority of our approach. Through Monte Carlo simulations, we also examine the sensitivity of the initial value selection for both the NR‐type method and our proposed method.
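The citing passage for this entry describes plotting positions for the empirical CDF of a complete (uncensored) sample. A minimal sketch, assuming the usual notation in which \(i\) is the rank of an observation and \(n\) the sample size, so that the two rules read \((i - 1/2)/n\) and \((i - 3/8)/(n + 1/4)\); the function name `plotting_positions` is hypothetical:

```python
def plotting_positions(n):
    """Empirical CDF plotting positions for ranks i = 1..n of a complete sample."""
    if n >= 11:
        # (i - 1/2) / n for larger samples.
        return [(i - 0.5) / n for i in range(1, n + 1)]
    # (i - 3/8) / (n + 1/4) for samples of size 10 or fewer.
    return [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
```

Both rules are symmetric about 0.5, so the positions of the smallest and largest observations always sum to one.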
... The RFS and iDFS were defined as the duration from the surgery to the occurrence of the relevant event. The association between detection of ctDNA and patient outcomes were analyzed using the Kaplan-Meier method, and differences were examined using the log-rank test (34). ...
Article
Background: Circulating tumor DNA (ctDNA) is a potential biomarker not only capable of monitoring the treatment response during neoadjuvant therapy (NAT) or rescue therapy, but also identifying minimal residual disease (MRD) and detecting early relapses after primary treatment. However, it remains uncertain whether the detection of ctDNA at diagnosis, before any treatment, can predict the prognosis for patients with early breast cancer. The objective of our study was to evaluate the predictive value of baseline ctDNA for prognosis in patients with early breast cancer. Methods: A total of 90 patients with early breast cancer and 24 healthy women were recruited between August 2016 and October 2016. Peripheral blood samples were collected from patients at diagnosis, before any treatment. Blood samples were processed and subjected to targeted deep sequencing with a next-generation sequencing (NGS) panel of 1,021 cancer-related genes. The recurrence-free survival (RFS) and invasive disease-free survival (iDFS) were reported. Results: The 90 patients with breast cancer included 6 patients with ductal carcinoma in situ (DCIS) and 84 patients with invasive breast cancer. Within the cohort of patients with invasive breast cancer, ctDNA was detected in 57 patients, with a ctDNA detection rate of 67.9%. Meanwhile, no ctDNA was detected in DCIS patients. Among 84 patients with invasive breast cancer, patients with high-level ctDNA had a significantly lower RFS compared to patients with low-level ctDNA (log-rank P=0.0036). Conclusions: Our study suggested that ctDNA at diagnosis, before any treatment, could potentially serve as a biomarker to predict the prognosis for patients with early breast cancer. However, further follow-up and more studies with large sample sizes are required to confirm these findings.
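The group comparison reported above relies on the log-rank test. A minimal sketch of the standard two-sample log-rank statistic (chi-square with one degree of freedom) is shown below. It is not the cited study's code, and the function name `logrank_test` and its argument layout are assumptions made for illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi_square, p_value) with 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})
    obs_a = exp_a = var = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in pooled if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e)           # events at t, both groups
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        obs_a += d_a
        exp_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if var == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi2 = (obs_a - exp_a) ** 2 / var
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

The statistic compares the observed number of events in one group with the number expected if both groups shared a single survival curve.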
Preprint
Background Ovarian cancer is a significant health concern, necessitating the identification of potential diagnostic markers and novel therapeutic targets. This study presents, to the best of our knowledge, the first comparative immunohistochemical analysis of five tumor markers, namely the extra-domains A and B of fibronectin, fibroblast activation protein, carcinoembryonic antigen, and MUC16 in human epithelial ovarian cancer tissue samples. Methods Formalin-fixed paraffin-embedded human ovarian tissue sections were stained using previously validated antibodies to assess the percentage and intensity of marker expressions. Results Our results indicate a similar stromal pattern of expression for fibroblast activation protein, extra-domains A, and extra-domains B, with extra-domains A exhibiting the most intense staining. MUC16 was abundantly expressed on tumor cells of high-grade serous carcinoma samples, while carcinoembryonic antigen was not detected in this indication. Subsequent staining revealed that carcinoembryonic antigen was highly expressed on mucinous ovarian cancer specimens. With respect to clinical features, MUC16 and extra-domains A were found to be highly expressed in the most challenging scenarios namely platinum-resistant (100% and 50% respectively) and BRCA WT (75% and 45% respectively) patients. Conclusions The findings of this study highlight that MUC16, extra-domains B, and extra-domains A are attractive targets for the treatment of serous ovarian carcinoma, while carcinoembryonic antigen could be exploited for mucinous ovarian cancer. Clinical investigations are warranted to validate the potential of antibody-based therapies targeting these antigens in the context of ovarian cancer.
Preprint
Breeding animals to produce more robust and disease-resistant pig populations becomes a complementary strategy to the more conventional methods of biosecurity and vaccination. The objective of this study was to explore the ability of a panel of genetic markers and immunity parameters to predict the survival rates during a natural PRRSV outbreak. Ten-week-old female Duroc pigs (n = 129), obtained from 61 sows and 20 boars, were naturally infected with a highly pathogenic PRRSV genotype 1 strain. Prior to infection, piglets were screened for immunity parameters (IgG levels in plasma and SOX13 mRNA expression in blood) and genetic markers previously associated with PRRSV immune response and immunity traits. Additionally, the 20 boars were genotyped with a panel of 132 single nucleotide polymorphisms (SNPs). Survival analysis showed that mortality was significantly higher for animals with low basal IgG levels in plasma and/or high SOX13 mRNA expression in blood. The genotypes of sires for SNPs associated with IgG plasma levels, CRP in serum, percentage of γδ T cells, lymphocyte phagocytic capacity, total number of lymphocytes and leukocytes, and MCV and MCH were significantly associated with the number of surviving offspring. Furthermore, CD163 and GBP5 markers were also associated with piglet survival. The effects of these SNPs were polygenic and cumulative; survival decreased from 94% to 21% as more susceptible alleles were accumulated for the different markers. Our results confirmed the existence of genetic variability in survival after PRRSV infection and provided a set of genetic markers and immunity traits associated with PRRS resistance.
Chapter
Between 20 and 25% of individuals undergoing total knee arthroplasty (TKA) acquire hypersensitivity to metals, but fewer than 1% display symptoms. For individuals presenting with painful TKA, metal hypersensitivity is a diagnosis of exclusion where skin patch test, lymphocyte transformation test, and synovial biopsies are useful adjuncts before revision surgery to hypoallergenic implants is undertaken. In individuals in whom a primary conventional TKA fails due to metal hypersensitivity, a “hypoallergenic” revision TKA should be recommended. However, before performing it, it is advisable to carry out knee arthroscopy to get tissue for microbiological and histopathological studies to exclude infection.
Article
Importance Pancreatic ductal adenocarcinoma (PDAC) is an aggressive malignant tumor, and durable disease control is rare with the current standard of care, even for patients who undergo surgical resection. Objective To assess whether neoadjuvant modified 5-fluorouracil, leucovorin, oxaliplatin, and irinotecan (mFOLFIRINOX) leads to early control of micrometastasis and improves survival. Design, Setting, and Participants This open-label, single-arm, phase 2 nonrandomized controlled trial for resectable PDAC was conducted at the Yale Smilow Cancer Hospital from April 3, 2014, to August 16, 2021. Pancreatic protocol computed tomography was performed at diagnosis to assess surgical candidacy. Data were analyzed from January to July 2023. Interventions Patients received 6 cycles of neoadjuvant mFOLFIRINOX before surgery and 6 cycles of adjuvant mFOLFIRINOX. Whole blood was collected and processed to stored plasma for analysis of circulating tumor DNA (ctDNA) levels. Tumors were evaluated for treatment response and keratin 17 (K17) expression. Main Outcomes and Measures The primary end point was 12-month progression-free survival (PFS) rate. Additional end points included overall survival (OS), ctDNA level, tumor molecular features, and K17 tumor levels. Survival curves were summarized using Kaplan-Meier estimator. Results Of 46 patients who received mFOLFIRINOX, 31 (67%) were male, and the median (range) age was 65 (46-80) years. A total of 37 (80%) completed 6 preoperative cycles and 33 (72%) underwent surgery. A total of 27 patients (59%) underwent resection per protocol (25 with R0 disease and 2 with R1 disease); metastatic or unresectable disease was identified in 6 patients during exploration. Ten patients underwent surgery off protocol. The 12-month PFS was 67% (90% CI, 56.9-100); the median PFS and OS were 16.6 months (95% CI, 13.3-40.6) and 37.2 months (95% CI, 17.5-not reached), respectively. 
Baseline ctDNA levels were detected in 16 of 22 patients (73%) and in 3 of 17 (18%) after 6 cycles of mFOLFIRINOX. Those with detectable ctDNA levels 4 weeks postresection had worse PFS (hazard ratio [HR], 34.0; 95% CI, 2.6-4758.6; P = .006) and OS (HR, 11.7; 95% CI, 1.5-129.9; P = .02) compared with those with undetectable levels. Patients with high K17 expression had nonsignificantly worse PFS (HR, 2.7; 95% CI, 0.7-10.9; P = .09) and OS (HR, 3.2; 95% CI, 0.8-13.6; P = .07). Conclusions and Relevance This nonrandomized controlled trial met its primary end point, and perioperative mFOLFIRINOX warrants further evaluation in randomized clinical trials. Postoperative ctDNA positivity was strongly associated with recurrence. K17 and ctDNA are promising biomarkers that require additional validation in future prospective studies. Trial Registration ClinicalTrials.gov Identifier: NCT02047474
Article
PURPOSE Prostate cancer (PCa) represents a highly heterogeneous disease that requires tools to assess oncologic risk and guide patient management and treatment planning. Current models are based on various clinical and pathologic parameters including Gleason grading, which suffers from a high interobserver variability. In this study, we determine whether objective machine learning (ML)–driven histopathology image analysis would aid us in better risk stratification of PCa. MATERIALS AND METHODS We propose a deep learning, histopathology image–based risk stratification model that combines clinicopathologic data along with hematoxylin and eosin– and Ki-67–stained histopathology images. We train and test our model, using a five-fold cross-validation strategy, on a data set from 502 treatment-naïve PCa patients who underwent radical prostatectomy (RP) between 2000 and 2012. RESULTS We used the concordance index as a measure to evaluate the performance of various risk stratification models. Our risk stratification model on the basis of convolutional neural networks demonstrated superior performance compared with Gleason grading and the Cancer of the Prostate Risk Assessment Post-Surgical risk stratification models. Using our model, 3.9% of the low-risk patients were correctly reclassified to be high-risk and 21.3% of the high-risk patients were correctly reclassified as low-risk. CONCLUSION These findings highlight the importance of ML as an objective tool for histopathology image assessment and patient risk stratification. With further validation on large cohorts, the digital pathology risk classification we propose may be helpful in guiding administration of adjuvant therapy including radiotherapy after RP.
Article
Three hybrid populations (F1) of Coffea arabica were evaluated under field and laboratory conditions, derived from sources carrying the SH1 coffee leaf rust (CLR) resistance gene and the CX.2385 line, obtained from the Caturra × Timor Hybrid CIFC-1343. The results obtained under controlled conditions and analyzed using survival curves made it possible to estimate the probable times (p < 0.05) for the development of symptoms associated with CLR in the plants of the populations evaluated. Phenotypic variation was observed as a defense response to Hemileia vastatrix infection, and plants with incomplete resistance to CLR were identified via an evaluation using the increasing lesions scale. The plants with incomplete resistance exhibited a delay in the development of the incubation period and an absence of the development of the dormancy period. Data suggest that when resistance genes in the sources are defeated by compatible strains, their recombination can give rise to new levels of resistance in the progeny. Additionally, the detached leaf methodology is recommended as an alternative to preselect genotypes with resistance to CLR, thus reducing the number of plants that are finally planted for field evaluations.
Article
We present a 400–800 MHz polarimetric analysis of 128 nonrepeating fast radio bursts (FRBs) from the first CHIME/FRB baseband catalog, increasing the total number of FRB sources with polarization properties by a factor of ∼3. A total of 89 FRBs have >6σ linearly polarized detections, 29 FRBs fall below this significance threshold and are deemed linearly unpolarized, and for 10 FRBs, the polarization data are contaminated by instrumental polarization. For the 89 polarized FRBs, we find Faraday rotation measure (RM) amplitudes, after subtracting approximate Milky Way contributions, in the range 0.5–1160 rad m⁻² with a median of 53.8 rad m⁻². Most nonrepeating FRBs in our sample have RMs consistent with Milky Way–like host galaxies, and their linear polarization fractions range from ≤10% to 100% with a median of 63%. We see marginal evidence that nonrepeating FRBs have more constraining lower limits than repeating FRBs for the host electron-density-weighted line of sight magnetic field strength. We classify the nonrepeating FRB polarization position angle (PA) profiles into four archetypes: (i) single component with constant PA (57% of the sample), (ii) single component with variable PA (10%), (iii) multiple components with a single-constant PA (22%), and (iv) multiple components with different or variable PAs (11%). We see no evidence for population-wide frequency-dependent depolarization, and, therefore, the spread in the distribution of fractional linear polarization is likely intrinsic to the FRB emission mechanism. Finally, we present a novel method to derive redshift lower limits for polarized FRBs without host galaxy identification and test this method on 20 FRBs with independently measured redshifts.
Article
Background Existing criteria for predicting patient survival from immunotherapy are primarily centered on the PD-L1 status of patients. We tested the hypothesis that noninvasively captured baseline whole-lung radiomics features from CT images and baseline clinical parameters, combined with advanced machine learning approaches, can help to build models of patient survival that compare favorably with PD-L1 status for predicting ‘less-than-median-survival risk’ in the metastatic NSCLC setting for patients on durvalumab. With a total of 1062 patients, inclusive of model training and validation, this is the largest such study yet. Methods To ensure a sufficient sample size, we combined data from treatment arms of three metastatic NSCLC studies. About 80% of these data were used for model training, and the remainder was held out for validation. We first trained two independent models: Model-C, trained to predict survival using clinical data, and Model-R, trained to predict survival using whole-lung radiomics features. Finally, we created Model-C+R, which leveraged both clinical and radiomics features. Results The classification accuracy (for median survival) of Model-C, Model-R, and Model-C+R was 63%, 55%, and 68%, respectively. Sensitivity analysis of survival prediction across different training and validation cohorts showed concordance indices ([95 percentile]) of 0.64 ([0.63, 0.65]), 0.60 ([0.59, 0.60]), and 0.66 ([0.65, 0.67]), respectively. We additionally evaluated generalization of these models on a comparable cohort of 144 patients from an independent study, demonstrating classification accuracies of 65%, 62%, and 72%, respectively. Conclusion Machine learning models combining baseline whole-lung CT radiomic and clinical features may be a useful tool for patient selection in immunotherapy. Further validation through prospective studies is needed.
Article
Survival regression models can achieve longer warning times at similar receiver operating characteristic performance than previously investigated models. Survival regression models are also shown to predict the time until a disruption will occur with lower error than other predictors. Time-to-event predictions from time-series data can be obtained with a survival analysis statistical framework, and there have been many tools developed for this task which we aim to apply to disruption prediction. Using the open-source Auton-Survival package we have implemented disruption predictors with the survival regression models Cox Proportional Hazards, Deep Cox Proportional Hazards, and Deep Survival Machines. To compare with previous work, we also include predictors using a Random Forest binary classifier, and a conditional Kaplan-Meier formalism. We benchmarked the performance of these five predictors using experimental data from the Alcator C-Mod and DIII-D tokamaks by simulating alarms on each individual shot. We find that developing machine-relevant metrics to evaluate models is an important area for future work. While this study finds cases where disruptive conditions are not predicted, there are instances where the desired outcome is produced. Giving the plasma control system the expected time-to-disruption will allow it to determine the optimal actuator response in real time to minimize risk of damage to the device.
Article
Health technology assessments of interventions impacting survival often require extrapolating current data to gain a better understanding of the interventions’ long-term benefits. Extrapolation requires both a comprehensive examination of the trial data up to the maximum follow-up period and the fitting of parametric models. It is standard practice to compare the parametric curves visually with the Kaplan-Meier survival estimate (or to compare hazard estimates) and to assess the parametric models using likelihood-based information criteria. In place of these two steps, this work demonstrates how to minimize the squared distance of parametric estimators to the Kaplan-Meier estimate. This is in line with selecting the model by Mean Squared Error, with the modification that the unknown true survival is replaced by the Kaplan-Meier estimate. Adhering to this procedure helps ensure the internal validity of the extrapolated model and its appropriate representation of the data. We use both simulation and real-world data, including a scenario where no model that properly fits the data could be found, to illustrate how this process can aid in model selection.
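The idea of choosing a parametric model by its squared distance to the Kaplan-Meier estimate can be sketched as follows. This is a rough illustration only: the distance is evaluated at the distinct death times, the candidate models (exponential and Weibull) are fitted by a plain grid search, and the data are hypothetical. None of these choices is confirmed by the abstract.

```python
import numpy as np

def kaplan_meier(times, events):
    """Distinct death times and the product-limit estimate just after each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    death_times = np.unique(times[events])
    surv, s = [], 1.0
    for t in death_times:
        at_risk = np.sum(times >= t)             # items still under observation
        deaths = np.sum((times == t) & events)   # deaths observed at time t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return death_times, np.array(surv)

def squared_distance(model_surv, death_times, km_surv):
    """Mean squared gap between a parametric curve and the KM estimate
    (assumption: evaluated at the distinct death times only)."""
    return float(np.mean((model_surv(death_times) - km_surv) ** 2))

def fit_exponential(death_times, km_surv, rates):
    """Grid search over the rate of S(t) = exp(-rate * t)."""
    losses = [squared_distance(lambda t, r=r: np.exp(-r * t), death_times, km_surv)
              for r in rates]
    best = int(np.argmin(losses))
    return rates[best], losses[best]

def fit_weibull(death_times, km_surv, rates, shapes):
    """Grid search over S(t) = exp(-(rate * t) ** shape)."""
    best = (None, None, np.inf)
    for r in rates:
        for k in shapes:
            loss = squared_distance(lambda t, r=r, k=k: np.exp(-(r * t) ** k),
                                    death_times, km_surv)
            if loss < best[2]:
                best = (r, k, loss)
    return best

# Hypothetical follow-up data: event indicator 0 marks a loss (censoring).
times = [1, 2, 2, 3, 5, 6, 8, 9, 12, 15]
events = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
t, km = kaplan_meier(times, events)
rates = np.linspace(0.01, 1.0, 100)
exp_rate, exp_loss = fit_exponential(t, km, rates)
wb_rate, wb_shape, wb_loss = fit_weibull(t, km, rates, np.array([0.5, 1.0, 1.5, 2.0]))
print("exponential:", exp_rate, exp_loss)
print("Weibull:", wb_rate, wb_shape, wb_loss)
```

The candidate with the smaller distance would be preferred under this criterion. Because the Weibull grid contains shape 1.0, its best distance can never exceed that of the exponential fit on the same rate grid.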
Article
On the basis of experience with calculated survivorships of patients following treatment for cancer, a simple function, in terms of two physically meaningful parameters, has been evolved, which fits such survivorship data very well. These two parameters can be used to compare succinctly the mortality of two groups, different in respect of treatment, type of cancer, or other characteristics. The parameters are c (“cured”), which represents the proportion of the population which is subject only to “normal” death rates, and β, which is the death rate from the cancer, to which the rest of the population [not “cured,” (1 – c)] is subject. Thus if one treatment is characterized by \(c_1 = 0.30\), \(\beta_1 = 0.25\), another by \(c_2 = 0.20\), \(\beta_2 = 0.15\), this could be interpreted as meaning that while the first treatment “cured” a larger proportion of the population than did the second treatment, it did not ameliorate the deaths attributable to cancer in the patients not cured as much as did the second treatment. If \(l_T\) is the proportion of the total population surviving to time t, then the function is \(l_T = c\,l_0 + (1 - c)\,l_0 e^{-\beta t}\), where \(l_0\) is the net survivorship corresponding to “normal” deaths, obtained from standard life tables. A graphic method and also a “least squares” method of estimating c and β are presented with an example, and the evaluated parameters are given for several series of treated cancer patients. Expectation of life and other functions of the life table also have been calculated from the evaluated parameters, for the same series.
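The two-parameter comparison described above can be made concrete by evaluating the survivorship function, total survivorship = c·l0 + (1 − c)·l0·exp(−βt), for both parameter pairs. In this small sketch the normal survivorship l0 is set to 1 at every time purely for illustration (an assumption; in the paper it comes from standard life tables), and the function name is hypothetical.

```python
import math

def survivorship(t, c, beta, l0=1.0):
    """Proportion surviving to time t: c*l0 + (1 - c)*l0*exp(-beta*t).

    l0 is the 'normal' survivorship at time t; it defaults to 1.0 here,
    which ignores normal mortality (illustrative assumption only).
    """
    return c * l0 + (1.0 - c) * l0 * math.exp(-beta * t)

# Parameter pairs quoted in the abstract.
for t in (1, 5, 20):
    first = survivorship(t, c=0.30, beta=0.25)
    second = survivorship(t, c=0.20, beta=0.15)
    print(t, round(first, 3), round(second, 3))
```

With these values the second treatment shows the higher survivorship at short durations because of its lower β, while the first treatment ends up higher at long durations, where the curves approach the cured fractions c.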
Article
1. Like all sciences, actuarial science has been developed piecemeal. From time to time theory has been constructed attempting to systematize and unify practices which had already been adopted by the working actuary. In the case of the general treatment of the derivation of rates of decrement from data, little has been published in the Journal to unify the various methods and techniques since W. J. H. Whittall's well-known paper written in 1893 just prior to the commencement of the investigation into the British Offices experience.
Article
The standard actuarial methods of estimating the age-specific one-year probabilities of death in a given community were developed, for the most part many years ago, with large bodies of observations in mind. Although the familiar “exposed to risk” procedure is known to provide unbiased estimates only when a rather dubious assumption is made about the progression of the instantaneous death-rate (the force of mortality) over the year of age (Cantelli, 1914), it is still the most widely used method of estimation. This is partly because the age-to-age increment in human mortality is relatively small, so that assumptions about its mathematical form are unimportant, and partly because suggested methods of estimation based on more “realistic” assumptions are usually laborious to apply to thousands of observations.
Article
Statistical methods of estimation of the mean life-times of unstable particles from cloud chamber photographs are discussed (i) for decaying particles only, (ii) when non-decaying particles can be recognized.
Article
The problem discussed is that of comparing the longevities of two or more types of equipment under operational conditions where it is not convenient to identify or keep records of individual items. Such a comparison can be made by adopting certain replacement policies and observing their effect on the composition of the population. Methods of estimating relative and absolute longevities are given for the case where k types of equipment are being compared and various logistical requirements are placed upon the replacement policies. Methods of making decisions and testing hypotheses concerning the relative and absolute longevities are also given. Replacement policies are given which, under certain conditions, are optimum for purposes of studying longevity.* Based on research supported by the Office of Naval Research at the Statistical Research Center, University of Chicago. I wish to thank Merrill M. Flood, Columbia University, for bringing this problem to my attention.
Article
* The work described here has been carried out under an Office of Naval Research Contract. Some of the results were obtained at Stanford University in the summer of 1951. This paper is essentially a lecture given at the Stanford Inspection Conference, August 20–22, 1951.
Article
This paper summarizes the rationale and statistical techniques employed in the analysis of some failure data obtained from operations performed by machines and people. These data are compared to frequency distributions arising from either an exponential or a normal theory of failure. The agreement between theory and data is evaluated.
Article
This bibliography contains 999 references on nonparametric statistics and related topics, classified as follows: (A) Surveys and Discussions (39), (B) Theory (31), (C) Tchebycheff Inequalities (94), (D) Tolerance Sets (21), (E) Goodness of Fit (122), (F) Multisample Problems (53), (G) Parameter Problems (135), (H) Contingency Tables (75), (I) Randomness (109), (J) Correlation and Curve Fitting (96), (K) Comparative Studies (49), (L) Systematic Statistics (127), (M) Scaling (37), (N) Distribution Theory (383), (O) Applications (89), (P) Tables (228), (X) Miscellaneous (28).
Article
A life test on $N$ items is considered in which the common underlying distribution of the length of life of a single item is given by the density \begin{equation*}\tag{1} p(x; \theta, A) = \begin{cases}\frac{1}{\theta} e^{-(x-A)/\theta},\quad\text{for } x \geqq A \\ 0,\quad\text{otherwise}\end{cases}\end{equation*} where $\theta > 0$ is unknown but is the same for all items and $A \geqq 0$. Several lemmas are given concerning the first $r$ out of $n$ observations when the underlying p.d.f. is given by (1). These results are then used to estimate $\theta$ when the $N$ items are divided into $k$ sets $S_j$ (each containing $n_j > 0$ items, $\sum^k_{j=1} n_j = N$) and each set $S_j$ is observed only until the first $r_j$ failures occur $(0 < r_j \leqq n_j)$. The constants $r_j$ and $n_j$ are fixed and preassigned. Three different cases are considered: 1. The $n_j$ items in each set $S_j$ have a common known $A_j$ $(j = 1, 2, \cdots, k)$. 2. All $N$ items have a common unknown $A$. 3. The $n_j$ items in each set $S_j$ have a common unknown $A_j$ $(j = 1, 2, \cdots, k)$. The results for these three cases are such that the results for any intermediate situation (i.e., some $A_j$ values known, the others unknown) can be written down at will. The particular case $k = 1$ and $A = 0$ is treated in [2]. The constant $A$ in (1) can be interpreted in two different ways: (i) $A$ is the minimum life, that is, life is measured from the beginning of time, which is taken as zero. (ii) $A$ is the "time of birth", that is, life is measured from time $A$. Under interpretation (ii) the parameter $\theta$, which we are trying to estimate, represents the expected length of life.
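For case 1 (each set has a known A_j), the estimation of θ can be illustrated with the standard maximum-likelihood estimate for exponential lifetimes observed until the r_j-th failure: the total observed life measured from A_j, divided by the total number of failures. This is a hedged sketch; the abstract does not state the estimator the paper derives, and the function names and numbers are hypothetical.

```python
def total_time_on_test(failures, n, a=0.0):
    """Observed life in one set: the r failure times plus the (n - r)
    survivors, whose observation stops at the r-th failure. All times
    are measured from the known location a."""
    failures = sorted(failures)
    r = len(failures)
    if not 0 < r <= n:
        raise ValueError("need 0 < r <= n")
    if failures[0] < a:
        raise ValueError("a failure time lies below the minimum life a")
    return sum(x - a for x in failures) + (n - r) * (failures[-1] - a)

def estimate_theta(sets):
    """sets: list of (failures, n, a) triples, one per set S_j with known A_j.
    Returns total time on test divided by the total number of failures."""
    total_time = sum(total_time_on_test(f, n, a) for f, n, a in sets)
    total_failures = sum(len(f) for f, _, _ in sets)
    return total_time / total_failures

# Hypothetical data: two sets, observed until the 2nd and 1st failure.
sets = [
    ([2.0, 3.0], 4, 1.0),  # n_1 = 4, r_1 = 2, A_1 = 1
    ([5.0], 3, 0.0),       # n_2 = 3, r_2 = 1, A_2 = 0
]
print(estimate_theta(sets))
```

Here the first set contributes (1 + 2) + 2·2 = 7 units of observed life and the second 5 + 2·5 = 15, so the estimate is 22 divided by 3 failures.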