Hai-Hui Huang
Shaoguan University | SGU · Computer Science

Doctor of Science

Reads: 5,338

Citations: 640

Email: tomyhwang@163.com

Skills and Expertise

Data Mining and Knowledge Discovery

Advanced Machine Learning

Unsupervised Learning

Feature Extraction

Data Science

Pattern Classification

Publications

Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2 +2 Regularization

Article

Full-text available

May 2016

Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we presented a new hybrid L1/2 +2 regularization (HLR) function, a linear combination of L1/2 and L2 p...

A method to identify trace sulfated IgG N-glycans as biomarkers for rheumatoid arthritis

Article

Full-text available

Sep 2017

N-linked glycans on immunoglobulin G (IgG) have been associated with pathogenesis of diseases and the therapeutic functions of antibody-based drugs; however, low-abundance species are difficult to detect. Here we show a glycomic approach to detect these species on human IgGs using a specialized microfluidic chip. We discover 20 sulfated and 4 acety...

Hybrid L 1/2 + 2 Method for Gene Selection in the Cox Proportional Hazards Model

Article

Jun 2018

Background and objective: An important issue in genomic research is to identify the significant genes that are related to survival from among tens of thousands of genes. Although the Cox proportional hazards model is a conventional survival analysis method, it does not induce gene selection. Methods: In this paper, we extend the hybrid L1/2 + 2 regulariza...

An integrative analysis system of gene expression using self-paced learning and SCAD-Net

Article

Jun 2019

Background: Few proposed gene biomarkers have been satisfactory in clinical applications, mainly because of the small sample sizes of the studies. Because of the batch effect, different gene-expression studies cannot be merged directly. Many integrative methods have attempted to integrate various datasets to eliminate the batch effect while keeping biol...

A Novel Cox Proportional Hazards Model for High-Dimensional Genomic Data in Cancer Prognosis

Article

Dec 2019

The Cox proportional hazards model is a popular method to study the connection between features and survival time. Because of the high dimensionality of genomic data, existing Cox models trained on any specific dataset often generalize poorly to other independent datasets. In this paper, we suggest a novel strategy for the Cox model. This strategy i...

MUMA: A Multi-Omics Meta-Learning Algorithm for Data Interpretation and Classification

Article

Feb 2024

Multi-omics data integration is a promising field combining various types of omics data, such as genomics, transcriptomics, and proteomics, to comprehensively understand the molecular mechanisms underlying life and disease. However, the inherent noise, heterogeneity, and high dimensionality of multi-omics data present challenges for existing method...

Editorial: Integrative analysis for complex disease biomarker discovery

Article

Full-text available

Aug 2023

Fuzzy dynamic MCDM method based on PRSRV for financial risk evaluation of new energy vehicle industry

Article

Feb 2023

A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression

Article

Full-text available

Aug 2022

Background: Gene expression analysis can provide useful information for analyzing complex biological mechanisms. However, many reported findings are unrepeatable due to small sample sizes relative to a large number of genes and the low signal-to-noise ratios of most gene expression datasets. Results: Meta-analysis of multiple datasets is an efficient...

Sparse principal component analysis based on genome network for correcting cell type heterogeneity in epigenome-wide association studies

Article

Jul 2022

In epigenome-wide association studies (EWAS), the mixed methylation expression caused by the combination of different cell types may lead researchers to identify false methylation sites related to the phenotype of interest. To correct the EWAS false discovery, some non-reference models based on sparse principal component analysis (sparse PCA) ha...

When CCN meets MCGDM: optimal cache replacement policy achieved by PRSRV with Pythagorean fuzzy set pair analysis

Article

Full-text available

Feb 2022

A cache replacement policy (CRP) in a content-centric network (CCN) can reduce cache redundancy, optimize cache utility, and improve network performance. Assessing CRPs in CCN often involves great uncertainty. Set pair analysis (SPA) is a pioneering uncertainty theory, which consists of three components of the connection number (CN), and...

SLNL: A novel method for gene selection and phenotype classification

Article

Feb 2022

One of the central tasks of genome research is to predict phenotypes and discover important gene biomarkers. However, three main problems arise in analyzing genomic data for phenotype prediction and gene marker selection: large p and small n, low reproducibility of the selected biomarkers, and high noise. To provide a unified solutio...

Integrating molecular interactions and gene expression to identify biomarkers to predict response to tumor necrosis factor inhibitor therapies in rheumatoid arthritis patients

Article

Full-text available

Jan 2022

Background: Targeted therapy using anti-TNF is the first option for patients with RA. Anti-TNF therapy, however, does not lead to meaningful clinical improvement in many RA patients. To predict which patients will not benefit from anti-TNF therapy, clinical tests should be performed before treatment begins. Objective: Although various effor...

Integrating Molecular Interactions and Gene Expression to Identify Biomarkers and Network Modules of Chronic Obstructive Pulmonary Disease

Article

Full-text available

Jul 2021

Background: Chronic obstructive pulmonary disease (COPD) causes chronic obstructive conditions, chronic bronchitis, and emphysema, and is a major cause of death worldwide. Although several efforts have been made to identify biomarkers and pathways, the specific causal mechanism of COPD remains unknown. Objective: This study combined biological inte...

SPLSN: An efficient tool for survival analysis and biomarker selection

Article

Jun 2021

In genome research, a fundamental issue is to identify a few important survival‐related biomarkers. The Cox model is a widely used survival analysis technique for studying the relationship between characteristics and survival response. However, limitations of the existing Cox methods for genomic data are as follows: (1) a typical ge...

Cancer classification and biomarker selection via a penalized logsum network-based logistic regression model

Article

Full-text available

Mar 2021

Background: In genome research, it is particularly important to identify molecular biomarkers or signaling pathways related to phenotypes. The logistic regression model is a powerful discrimination method that offers a clear statistical explanation and yields classification probabilities for class labels. However, it is unable...

q‐Rung orthopair fuzzy decision‐making framework for integrating mobile edge caching scheme preferences

Article

Feb 2021

A mobile edge caching scheme (MECS) can determine where, how, and what to cache on user equipment by employing its own storage. Evaluating the performance of MECS often involves uncertainty. The q‐rung orthopair fuzzy set (q‐ROFS), characterized by membership and nonmembership degrees with adjustable parameter q, is quite a high‐efficienc...

Edge sparse PCA based on gene network for correcting cell type heterogeneity in epigenome-wide association studies

Preprint

Full-text available

Aug 2020

Background: In epigenome-wide association studies (EWAS), the mixed methylation expression caused by the combination of different cell types may lead researchers to identify false methylation sites related to the phenotype of interest. To fix this problem, researchers have proposed some non-reference methods based on sparse principal co...

A Genotype-Based Ensemble Classifier System for Non-Small-Cell Lung Cancer

Article

Full-text available

Jul 2020

The heterogeneity of cancer reflects the complexity of genetic mutations. Dissecting this heterogeneity plays an important role in biomarker discovery, targeted therapy, and drug design. As it is time-consuming to identify new biomarkers in biological experiments, various machine learning methods have been developed. However, the curr...

Fuzzy decision making method based on CoCoSo with critic for financial risk evaluation

Article

Full-text available

Mar 2020

Financial risk evaluation is vital for enterprises to identify potential financial risks, provide a decision basis for financial risk management, and prevent and reduce risk losses. The basic problems that arise in financial risk assessment are related to strong fuzziness, ambiguity, and inaccuracy. q-rung o...

Improved Classification of Blood-Brain-Barrier Drugs Using Deep Learning

Article

Full-text available

Jun 2019

The blood-brain barrier (BBB) is a strict permeability barrier for maintaining central nervous system (CNS) homeostasis. One of the most important criteria for judging a CNS drug is whether it has BBB permeability. Over the past 20 years, existing prediction approaches have usually been based on the data of the physical characterist...

SM_10.1159000495826.pdf

Data

Dec 2018

Clinical Drug Response Prediction by Using a Lq Penalized Network-Constrained Logistic Regression Method

Article

Full-text available

Dec 2018

Background/Aims: One of the most important impacts of personalized medicine is the connection between patients’ genotypes and their drug responses. Despite a series of studies exploring this relationship, the predictive ability of such analyses still needs to be strengthened. Methods: Here we present the Lq penalized network-constrained logistic re...

Reply to ‘Trace N-glycans including sulphated species may originate from various plasma glycoproteins and not necessarily IgG’

Article

Full-text available

Dec 2018

Robust sparse accelerated failure time model for survival analysis

Article

Full-text available

Apr 2018

To identify disease-related biomarker genes from high-dimension, low-sample-size gene expression data, various regression approaches with different regularization methods have been proposed. Nevertheless, high noise in biological data significantly reduces the performance of these methods. The accelerated failure time (AFT...

Supplementary Material 6

Data

Sep 2017

Supplementary Material 2

Data

Sep 2017

Supplementary Material 1

Data

Sep 2017

Supplementary Material 3

Data

Sep 2017

Supplementary Material 4

Data

Sep 2017

Supplementary Material 5

Data

Sep 2017

Molecular pathway identification using a new L1/2 solver and biological network-constrained mode

Article

Jul 2017

Low-rank and sparse matrix decomposition based on S1/2 and L1/2 regularizations in dynamic MRI

Conference Paper

Dec 2016

Image Super-Resolution Reconstruction via L1/2 and S1/2 Regularizations

Conference Paper

Nov 2016

S2 File

Data

May 2016

The most frequently selected 10 genes information. Top-10 ranked genes selected by all the methods for prostate and lymphoma datasets. (PDF)

S1 File

Data

May 2016

The proof of theorem 1. (PDF)

Identification of 13 blood-based gene expression signatures to accurately distinguish tuberculosis from other pulmonary diseases and healthy controls

Article

Full-text available

Sep 2015

Tuberculosis (TB), caused by infection with Mycobacterium tuberculosis, is still a major threat to human health worldwide. Current diagnostic methods have limitations, such as sample collection problems or unsatisfactory sensitivity and specificity. Moreover, it is hard to distinguish TB from some other lung diseases without invasive bi...

Network-Based Logistic Classification with an Enhanced L 1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer

Article

Full-text available

Jun 2015

Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which the regularization method is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not sparse or interpretable enough and are asymptotically biased, especially in genomic researc...

Supplementary Material

Data

Jun 2015

Sub-networks identified by the L1 net and the Elastic net for lung cancer datasets (only those genes that are linked on the PPI network are plotted). Nodes are colored from higher (red) to lower (green) coefficients in the model.