Kai Wang

Pfizer · Precision Medicine, Oncology Research

Ph.D.

About

135
Publications
15,898
Reads
5,688
Citations
Education
September 2003 - February 2008
Columbia University
Field of study
  • Biomedical Informatics

Publications (135)
Article
Full-text available
Background: The clinical success of immune checkpoint inhibitors demonstrates that reactivation of the human immune system delivers durable responses for some patients and represents an exciting approach for cancer treatment. An important class of preclinical in vivo models for immuno-oncology is immunocompetent mice bearing mouse syngeneic tumors...
Preprint
The clinical success of immune checkpoint inhibitors that target cytotoxic T-lymphocyte associated protein 4 (CTLA4) and programmed cell death protein 1 (PD-1) or programmed death ligand-1 (PD-L1) demonstrates that reactivation of the human immune system delivers durable responses for some patients and represents an exciting approach for cancer tre...
Article
Full-text available
Background: Ultra-deep next-generation sequencing of circulating tumor DNA (ctDNA) holds great promise as a tool for the early detection of cancer and for monitoring disease progression and therapeutic responses. However, the low abundance of ctDNA in the bloodstream coupled with technical errors introduced during library construction and sequenci...
Article
Background: The Hedgehog (Hh) signaling pathway is key to development, differentiation, and stem cell maintenance. In cancer, dysregulation of Hh signaling is associated with solid tumors and hematological malignancies. Hh signaling in AML and MDS is thought to promote the renewal and maintenance of leukemic stem cells, which may lead to chemothera...
Preprint
Full-text available
The use of ultra-deep next-generation sequencing of circulating tumor DNA (ctDNA) holds great promise for the early detection of cancer and as a tool for monitoring disease progression and therapeutic responses. However, the low abundance of ctDNA in the bloodstream coupled with technical errors introduced during library construction and sequencin...
Article
Liquid biopsies have the potential to revolutionize the way physicians select personalized anti-cancer therapies, monitor patient responses to treatment, and characterize acquired resistance to cancer drugs. New tests that use a simple peripheral blood draw offer snapshots of a patient's total tumor DNA mutation profile and are attractive because o...
Article
One of the main challenges in immuno-oncology is to quantify different types of immune cells in the tumor microenvironment, as such data can greatly facilitate our understanding of the mechanism of action of, and response to, cancer immunotherapies. Traditionally, immune profiling is performed by immunohistochemistry or flow cytometry experiments usi...
Article
While the Notch pathway is reportedly activated in breast cancer, the molecular mechanisms leading to its aberrant activation remain elusive, hampering the optimal development of Notch inhibitors in the clinic. In an effort to identify predictive biomarkers of response to Notch-targeted therapies in breast cancer, we used several computational app...
Article
This abstract is being presented as a short talk in the scientific program. A full abstract is printed in the Proffered Abstracts section (PR06) of the Conference Proceedings. Citation Format: Kai Wang, Suet Yi Leung. Genomic characterization of immune escape pathways in gastric cancer. [abstract]. In: Proceedings of the AACR Special Conference: Tu...
Article
We have developed and validated a 109 AML associated gene panel NGS assay based on sequence capture technologies. The overall assay sensitivity was 99.%, and overall assay specificity was 100% when sequenced with reference “gold” standard NA12878. Assay analytical accuracy was evaluated with diluted cell line samples harboring mutations in this gen...
Article
Activation and mutation of the NOTCH signaling pathway is oncogenic in many tissue types and the target of multiple anti-cancer therapies currently in clinical development. Initial therapeutic strategies designed to target the NOTCH pathway have focused on inhibition of aberrant signaling, but can have undesirable side-effects or insufficient anti-...
Article
The Cell Index Database (CELLX) (http://cellx.sourceforge.net) provides a computational framework for integrating expression, copy number variation, mutation, compound activity, and metadata from cancer cells. CELLX provides the computational biologist a quick way to perform routine analyses as well as the means to rapidly integrate data for offl...
Article
Purpose: To identify and characterize novel, activating mutations in Notch receptors in breast cancer and to determine response to the gamma secretase inhibitor (GSI) PF-03084014. Experimental Design: We used several computational approaches, including novel algorithms, to analyze next generation sequencing data and related omic data sets from The...
Article
Colorectal cancer (CRC) patients have poor prognosis after formation of distant metastasis. Understanding the molecular mechanisms by which genetic changes facilitate metastasis is critical for the development of targeted therapeutic strategies aimed at controlling disease progression while minimizing toxic side effects. A comprehensive portrait of...
Article
Full-text available
Using expression profiles from postmortem prefrontal cortex samples of 624 dementia patients and non-demented controls, we investigated global disruptions in the co-regulation of genes in two neurodegenerative diseases, late-onset Alzheimer's disease (AD) and Huntington's disease (HD). We identified networks of differentially co-expressed (DC) gene...
Article
Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and epigenetic perturbations and unique mutational sign...
Article
Full-text available
Gastric carcinoma is one of the major causes of cancer-related mortality worldwide. Early detection and treatment lead to an excellent prognosis in patients with early gastric cancer (EGC), whereas the prognosis of patients with advanced gastric cancer (AGC) remains poor. It is unclear whether EGCs and AGCs are distinct entities or whether EGCs ar...
Article
c-N-Methyl-N'-nitro-N-nitroso-guanidine HOS transforming gene (c-MET) is a new potential drug target for treatment of patients with hepatocellular carcinoma (HCC), and a recent study of a c-MET inhibitor in such patients has shown promising results. In the present study, we investigated the incidence of c-MET overexpression and its prognostic impac...
Article
Cancer is a genetic disease with frequent somatic DNA alterations. Studying recurrent copy number aberrations (CNAs) in human cancers would enable the elucidation of disease mechanisms and the prioritization of candidate oncogenic drivers with causal roles in oncogenesis. We have comprehensively and systematically characterized CNAs an...
Article
Progression of hepatocellular carcinoma (HCC) often leads to vascular invasion and intrahepatic metastasis, which correlate with recurrence after surgical treatment and poor prognosis. A molecular prognostic model that can be applied to the general HCC patient population is needed to predict disease-free survival (DFS) effectively. A coh...
Article
Full-text available
Hepatocellular carcinoma (HCC) is one of the most deadly cancers worldwide and has no effective treatment, yet the molecular basis of hepatocarcinogenesis remains largely unknown. Here we report findings from a whole genome sequencing (WGS) study of 88 matched HCC tumor/normal pairs, 81 of which are HBV positive, seeking to identify genetically alt...
Article
Proceedings: AACR 104th Annual Meeting 2013; Apr 6-10, 2013; Washington, DC. Hepatocellular carcinoma (HCC) is one of the most deadly cancers worldwide and has no effective treatment, yet the molecular basis of hepatocarcinogenesis remains largely unknown. Here we report findings from a whole genome sequencing (WGS) study of 88 matched HCC tumour/n...
Article
Full-text available
The MYC protooncogene is associated with the pathogenesis of most human neoplasia. Conversely, its experimental inactivation elicits oncogene addiction. Besides constituting a formidable therapeutic target, MYC also has an essential function in normal physiology, thus creating the need for context-specific targeting strategies. The analysis of post...
Article
Full-text available
To develop a comprehensive overview of copy number aberrations (CNAs) in stage-II/III colorectal cancer (CRC), we characterized 302 tumors from the PETACC-3 clinical trial. Microsatellite-stable (MSS) samples (n = 269) had 66 minimal common CNA regions, with frequent gains on 20q (72.5%), 7 (41.8%), 8q (33.1%) and 13q (51.0%) and losses on 18 (5...
Data
Affected genes in selected GISTIC amplicons. (XLS)
Data
Boxplots for EGFR, ERBB2 and MYC’s mRNA expression grouped by CNA status. (PPT)
Data
Kaplan-Meier curves demonstrate CNAs showing significant association with overall survival. (PPT)
Data
Minimal common regions identified in 269 MSS stage-II/III colon cancer samples. (XLS)
Data
GISTIC peaks identified in 269 MSS stage-II/III colon cancer samples. (XLS)
Data
Characteristics of patients included in the study. (XLS)
Article
Proceedings: AACR 103rd Annual Meeting 2012; Mar 31-Apr 4, 2012; Chicago, IL. Centromere-associated protein E (CENP-E) is expressed during mitosis and plays an essential role in establishing and maintaining stable connections between mitotic chromosomes and the microtubules of the spindle. Previous preclinical studies have shown that inhibition of...
Article
Full-text available
Gastric cancer is a heterogeneous disease with multiple environmental etiologies and alternative pathways of carcinogenesis. Beyond mutations in TP53, alterations in other genes or pathways account for only small subsets of the disease. We performed exome sequencing of 22 gastric cancer samples and identified previously unreported mutated genes and...
Article
Full-text available
Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide. A number of molecular profiling studies have investigated the changes in gene and protein expression that are associated with various clinicopathological characteristics of HCC and generated a wealth of scattered information, usually in the form of gene signature tables. A da...
Data
Supplementary Figures This document contains all supplementary figures (Figures S1 through S6).
Data
Full-text available
Supplementary Methods This document contains methods for co-occurrence network construction and module identification.
Data
Supplementary Excel File This Excel file contains the list of modules in the co-expression network and their enriched pathways.
Data
Full-text available
Supplementary Tables This document contains all supplementary tables (Tables S1 through S4).
Data
Full-text available
Supplementary Table 3B
Data
Full-text available
Supplementary Table 3D
Data
Full-text available
Supplementary Table 2
Data
We integrated the Hong Kong gene list with published HCC gene signatures and created a list of HCC prognosis-associated genes. The first part of the list contains genes that reached the genome-wide significance level in our Hong Kong data. Here, we did not adjust for clinicopathologic parameters in order to remain comparable to published signatures....
Data
Full-text available
Figure S2. Association between HCC Prognosis and Gene Expression Profiles in Strata Defined by Clinicopathologic Parameters. Histogram of p-values of the univariate search for genes associated with HCC outcome, conducted within good and poor prognosis strata.
Data
Figure S3. Improving prediction in the good-survival stratum using normal tissue and tumor tissue gene expression profiles together. As shown in Figures 2 and 4, both normal tissue and tumor tissue had predictive value in the good-survival group. Restricting the analysis to patients with both normal and tumor tissues available (N = 110), we followed LOO procedur...
Data
Full-text available
Supplementary Table 3C
Article
Full-text available
The prognosis of hepatocellular carcinoma (HCC) varies following surgical resection and the large variation remains largely unexplained. Studies have revealed the ability of clinicopathologic parameters and gene expression to predict HCC prognosis. However, there has been little systematic effort to compare the performance of these two types of pre...
Data
Enrichments of predictive and differentially correlated genes in the AN and TU co-expression modules. (A) The tissue, name, and number of genes for each co-expression module (columns 1–3) are indicated. The number of GO terms enriched, and the top GO terms enriched for each module with enrichment p value and fold enrichment are also shown (columns...
Data
Summary by gene. Shown are a number of measures (columns) derived from this analysis for each gene (rows). The columns are as follows: “idx” is a unique row identifier; “Reporter Id” indicates the probe on the array; “chr”, “pos” and “cytoband” refer to the gene location by chromosome, nucleotide position, and cytoband, respectively; “Symbol” indicates...
Data
Co-expression module-to-module overlaps between tissues. Shown are the overlaps found between co-expression modules derived from AN and TU tissues. “Set1” and “Set2” indicate the AN and TU modules tested, respectively. “Pval” and “Fold” indicate the Fisher's exact test p value for enrichment of overlap and the fold increase in the overlap versus wha...
Data
Overlaps between sCNV hotspot correlated genes and co-expression modules. Shown are the overlaps found between genes correlated to the top sCNV hotspots (“Hotspot”) and co-expression modules from AN and TU tissue (“Module”). Also indicated is the fold enrichment of the overlap versus the random expectation in each case (Fold). Only enrichments with...
Data
Demographic and clinical characteristics of HCC patients. (XLS)
Data
Summary of linear regression analysis for AN expression versus TU sCNV markers. The distribution of variance explained by sCNV markers in the regression models for all genes is shown (Methods) using AN expression and TU derived sCNV markers. Counts for number of genes with various cut-offs for variance explained (“R2>cutoff” column 1) for all AN ge...
Data
GO term enrichments for AN and TU co-expression modules. Shown are the top GO term enrichments as judged by the relative fold enrichment for each AN and TU co-expression module. “Similar Set” indicates the GO term, “Module” indicates the co-expression module. “Fold” indicates the fold enrichment of the overlap versus expected, and “Diff” indicates...
Data
Comparison of CGHseq and smoothed logR ratio methods of estimating sCNV. Shown is a comparison of methods for estimating copy number aberrations as described in the Methods. The ability of each method to detect associations between DNA variation and gene variation in cis for a given FDR was assessed. Shown are the FDR cut-offs applied (“FDR”, colum...
Data
Overlaps between genes correlated to sCNV hotspots. Overlaps between genes correlated to the top 7 sCNV hotspots (see Results and Methods) were tested. Hotspots are identified by the chromosome on which they reside (top row and first column). The Fisher's exact test p value for the overlap is shown for each comparison below and to the left of the gr...
Data
Relationship of sCNV markers to survival. Listed are the relationships found between sCNV markers genome wide and survival using Cox regression (Methods). “Idx” indicates a unique identifier for each sCNV marker, “SNPid” is the SNP identifier for the SNP at the center of the smoothed window of adjacent sCNV markers (see Methods for derivation of sm...
Article
Full-text available
In hepatocellular carcinoma (HCC), genes predictive of survival have been found in both adjacent normal (AN) and tumor (TU) tissues. The relationships between these two sets of predictive genes and the general process of tumorigenesis and disease progression remain unclear. Here we have investigated HCC tumorigenesis by comparing gene expression, D...
Article
Full-text available
A critical task in systems biology is the identification of genes that interact to control cellular processes by transcriptional activation of a set of target genes. Many methods have been developed that use statistical correlations in high-throughput data sets to infer such interactions. However, cellular pathways are highly cooperative, often req...
Data
Supplementary text, Supplementary figures S1–15, Supplementary tables SI–X
Article
Full-text available
Assembly of a transcriptional and post-translational molecular interaction network in B cells, the human B-cell interactome (HBCI), reveals a hierarchical, transcriptional control module, where MYB and FOXM1 act as synergistic master regulators of proliferation in the germinal center (GC). Eighty percent of genes jointly regulated by these transcri...
Article
Hepatocellular carcinoma (HCC) is the sixth most common cancer worldwide and the third most common cause of cancer related death. HIF-1 alpha (HIF1α) is overexpressed in many human cancers as a result of intratumoral hypoxia as well as genetic alterations, such as gain-of-function mutations in oncogenes and loss-of-function mutations in tumour-supp...
Data
A Montage display of independently simulated time series for X→Y based on Equation 1. Each time series consists of 240 time points (only the first 50 points are shown here). Blue lines are for X, and red lines are for Y. (0.06 MB EPS)
Data
Inferred causal links in the fast blood Granger causal network. (0.03 MB TXT)
Data
The distributions of bootstrap confidence values of links inferred in both fast and fed Granger causality networks. (A) 80% of links in the fast network have confidence values above 0.5; (B) 90% of links in the fed network have confidence values above 0.5. (0.12 MB TIF)
Data
Inferred inter-slice causal links in the fast blood Dynamic Bayesian network. (0.02 MB TXT)
Data
Prediction accuracies of Granger causality X→Y using the simulated time series shown in Figure S1. Each full series consists of 240 time points and each short series consists of 6 time points. (0.03 MB TIF)
Data
Full-text available
The out-degree distributions of both fasted and fed Granger causality networks exhibit scale-free properties. (A) The out-degree distribution for the fasted network; (B) the out-degree distribution for the fed network. (0.03 MB PDF)
Data
Inferred inter-slice causal links in the fed blood Dynamic Bayesian network. (0.02 MB TXT)
Data
Inferred causal links in the fed blood Granger causal network. (0.01 MB TXT)
Article
Full-text available
Gene expression data generated systematically in a given system over multiple time points provides a source of perturbation that can be leveraged to infer causal relationships among genes explaining network changes. Previously, we showed that food intake has a large impact on blood gene expression patterns and that these responses, either in terms...
Article
Full-text available
Co-expression networks are routinely used to study human diseases like obesity and diabetes. Systematic comparison of these networks between species has the potential to elucidate common mechanisms that are conserved between human and rodent species, as well as those that are species-specific characterizing evolutionary plasticity. We developed a s...
Data
Full-text available
Normality check of the distributions of all pair-wise Pearson correlation coefficients by Kolmogorov-Smirnov (KS) test in (A) human, (B) mouse and (C) rat data. The red dotted lines represent the statistical significance cutoff for rejecting the normality assumption. (0.02 MB PDF)
Data
Full-text available
Visualization of modules identified by spectral clustering on the connectivity matrix of conserved interactions among human, mouse and rat. Orthologous genes among the three species are on both rows and columns. A black dot represents a conserved interaction between the corresponding gene pairs. Colored squares along the diagonal indicate identified...
Data
6455 orthologous genes among human, mouse and rat used in the analysis. (1.42 MB XLS)
Data
Full-text available
The modular structure used in gene expression simulation. The network consists of 10 functional modules and 1 null module. Genes in each functional module are regulated by a latent regulator. (0.02 MB PDF)
Data
Permutation test results for enrichment analyses. (A) Null distribution (blue bars) of the number of lipid-associating genes obtained by randomly selecting 1000 sets of 6455 genes, and the statistic calculated based on the 6455 orthologous genes (red line). (B) Null distribution (blue bars) of the number of validated lipid-associating genes by randomly select...
Data
Full-text available
The quality of the top predicted pairs based on existing meta-analysis methods and the proposed method. 20,230 was chosen based on the proposed semi-parametric method at a false positive rate of 0.05. ‘%GO’ indicates the percent of gene pairs sharing a common specific Gene Ontology biological process category. ‘%KEGG’ indicates the percent of gene pai...
Data
Full-text available
Comparison of annotations for GWAS candidate genes based on the conserved modules or human modules. The annotation based on the conserved modules agrees better with the annotation based on each gene's Gene Ontology annotation. (0.01 MB PDF)
Data
Full-text available
Enrichment of lipid-associating genes among orthologous genes between human and rodents. Lipid-associating genes are selected using different window sizes around lipid-associating loci in the Framingham and Broad studies. (0.03 MB PDF)
Data
Full-text available
Top 20 genes with the most human-specific co-expression interactions. The numbers of interactions among themselves are also shown. (0.02 MB PDF)
Data
Full-text available
2-D Hierarchical clustering results of 3 liver data sets. 6455 orthologous genes are on the horizontal axis, experiments are on the vertical axis. (A) for the human liver data; (B) for the mouse liver data; (C) for the rat liver data. The ordered sample annotations in the vertical axis and the ordered gene symbols in the horizontal axis for each fi...
Data
Full-text available
Conserved and differential interactions between human and rodent species. (A) Human vs. mouse comparison. (B) Human vs. rat comparison. Numbers in parentheses are the numbers of genes in each category. P-values were computed using the Kruskal-Wallis non-parametric test of equal medians (median Ka/Ks for the “Conserved Only” and “greater or equal to 1 Di...
Data
Full-text available
In comparison with genes involved in only conserved interactions between human and rodents (box on the left), genes having human-specific interactions with human-rodent orthologs (box on the right) display a higher ratio of interactions to human-specific genes vs. human-rodent orthologs in the human liver co-expression network. Top 161 genes from e...
Data
Ordered annotations of human liver samples shown in Figure S3A. (0.01 MB TXT)
