Yukuo Cen

Tsinghua University · Department of Computer Science and Technology

Focusing on CogDL (https://github.com/THUDM/cogdl).

About

27 Publications
5,553 Reads
1,186 Citations
Introduction
I'm currently working on graph neural networks and recommender systems.
Skills and Expertise
Additional affiliations
September 2018 - present
Tsinghua University
Position
  • PhD Student
Description
  • Working on graph neural networks and recommender systems
Education
September 2014 - July 2018
Tsinghua University
Field of study
  • Computer Science and Technology

Publications

Publications (27)
Preprint
Full-text available
Graph self-supervised learning (SSL) holds considerable promise for mining and learning with graph-structured data. Yet, a significant challenge in graph SSL lies in the feature discrepancy among graphs across different domains. In this work, we aim to pretrain one graph neural network (GNN) on a varied collection of graphs endowed with rich node f...
Article
Full-text available
Negative sampling has swiftly risen to prominence as a focal point of research, with wide-ranging applications spanning machine learning, computer vision, natural language processing, data mining, and recommender systems. This growing interest raises several critical questions: Does negative sampling really matter? Is there a general framework that...
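
To make the role of negative sampling concrete, here is a minimal sketch of uniform negative sampling for implicit-feedback recommendation. It is illustrative only; the function name and the toy interaction set are assumptions, not code from the paper.

import random

def sample_negatives(user_pos, num_items, k):
    """For one user, draw k item ids the user has not interacted with."""
    negatives = []
    while len(negatives) < k:
        j = random.randrange(num_items)
        if j not in user_pos:          # reject observed (positive) items
            negatives.append(j)
    return negatives

# Example: the user interacted with items {1, 5, 7} out of 100 items.
print(sample_negatives({1, 5, 7}, num_items=100, k=4))

More sophisticated strategies replace the uniform draw with popularity-based or hard-negative distributions, which is where much of the surveyed work differs.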
Article
Automatic diagnosis systems aim to probe for symptoms (i.e., symptom checking) and diagnose disease through multi-turn conversations with patients. Most previous works formulate it as a sequential decision process and use reinforcement learning (RL) to decide whether to inquire about symptoms or make a diagnosis. However, these RL-based methods hea...
Preprint
In-Batch contrastive learning is a state-of-the-art self-supervised method that brings semantically-similar instances close while pushing dissimilar instances apart within a mini-batch. Its key to success is the negative sharing strategy, in which every instance serves as a negative for the others within the mini-batch. Recent studies aim to improv...
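
A minimal PyTorch sketch of the in-batch negative-sharing idea: every other instance in the mini-batch serves as a negative, and the positives lie on the diagonal of the similarity matrix. Tensor names and the temperature value are illustrative assumptions.

import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(z1, z2, temperature=0.1):
    """z1, z2: [batch, dim] embeddings of two views of the same instances."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # [batch, batch] similarity matrix
    labels = torch.arange(z1.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

loss = in_batch_contrastive_loss(torch.randn(32, 64), torch.randn(32, 64))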
Preprint
Full-text available
Graph self-supervised learning (SSL), including contrastive and generative approaches, offers great potential to address the fundamental challenge of label scarcity in real-world graph data. Among both sets of graph SSL techniques, the masked graph autoencoders (e.g., GraphMAE)--one type of generative method--have recently produced promising result...
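
A minimal sketch of the masked-feature-reconstruction idea behind masked graph autoencoders: hide the features of a subset of nodes, encode the graph, and reconstruct only the masked features. The one-layer encoder, toy adjacency, mask choice, and zero mask token are illustrative assumptions, not the GraphMAE implementation.

import torch
import torch.nn as nn

n, d, h = 6, 8, 16
x = torch.randn(n, d)                       # node features
adj = torch.eye(n)                          # toy normalized adjacency (self-loops only)

mask = torch.tensor([True, True, True, False, False, False])
x_masked = x.clone()
x_masked[mask] = 0.0                        # replace masked node features with a zero token

encoder = nn.Linear(d, h)
decoder = nn.Linear(h, d)

z = torch.relu(adj @ encoder(x_masked))     # one propagation + transform step
x_rec = decoder(adj @ z)

loss = ((x_rec[mask] - x[mask]) ** 2).mean()   # reconstruct only the masked nodes
loss.backward()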
Preprint
Full-text available
Knowledge graph (KG) embeddings have been a mainstream approach for reasoning over incomplete KGs. However, limited by their inherently shallow and static architectures, they can hardly deal with the rising focus on complex logical queries, which comprise logical operators, imputed edges, multiple source entities, and unknown intermediate entities....
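
For context on the "shallow and static" embedding approach the abstract contrasts with complex logical queries, here is a TransE-style scorer: a triple (head, relation, tail) is plausible when head + relation lands close to tail. All sizes are illustrative assumptions.

import torch
import torch.nn as nn

num_entities, num_relations, dim = 100, 10, 32
entity_emb = nn.Embedding(num_entities, dim)
relation_emb = nn.Embedding(num_relations, dim)

def score(h, r, t):
    """Higher (less negative) score means a more plausible triple."""
    return -(entity_emb(h) + relation_emb(r) - entity_emb(t)).norm(p=1, dim=-1)

print(score(torch.tensor([0]), torch.tensor([1]), torch.tensor([2])))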
Conference Paper
We argue that the present setting of semi-supervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve the performance. To ex...
Preprint
Full-text available
Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint. This paper presents GACT, an ACT framework to support a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a...
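
A minimal sketch of the idea behind activation-compressed training: quantize the activation that is stashed for the backward pass and dequantize it when the gradient is computed. The 8-bit per-tensor quantization below is an illustrative assumption, not the GACT algorithm itself.

import torch

class CompressedLinear(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight):
        scale = (x.abs().max() / 127 + 1e-8).item()
        x_int8 = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
        ctx.save_for_backward(x_int8, weight)       # store the compressed activation
        ctx.scale = scale
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x_int8, weight = ctx.saved_tensors
        x = x_int8.float() * ctx.scale               # decompress for the gradient
        grad_x = grad_out @ weight
        grad_w = grad_out.t() @ x
        return grad_x, grad_w

x = torch.randn(4, 8)
w = torch.randn(16, 8, requires_grad=True)
CompressedLinear.apply(x, w).sum().backward()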
Preprint
We argue that the present setting of semi-supervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve the performance. To ex...
Preprint
Graph neural networks (GNNs) have achieved notable success in the semi-supervised learning scenario. The message passing mechanism in graph neural networks helps unlabeled nodes gather supervision signals from their labeled neighbors. In this work, we investigate how consistency regularization, one of widely adopted semi-supervised learning methods...
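
A minimal sketch of consistency regularization on unlabeled nodes: run the same model twice under dropout and penalize disagreement between the two predictive distributions. The two-layer MLP stands in for a GNN and, together with the loss weighting hinted at in the final comment, is an illustrative assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Dropout(0.5), nn.Linear(32, 7))

x_unlabeled = torch.randn(64, 16)
p1 = F.softmax(model(x_unlabeled), dim=1)   # two stochastic forward passes
p2 = F.softmax(model(x_unlabeled), dim=1)

consistency_loss = ((p1 - p2) ** 2).sum(dim=1).mean()
# total_loss = supervised_loss_on_labeled_nodes + lam * consistency_loss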
Preprint
Adversarial attacks on graphs have posed a major threat to the robustness of graph machine learning (GML) models. Naturally, there is an ever-escalating arms race between attackers and defenders. However, the strategies behind both sides are often not fairly compared under the same and realistic conditions. To bridge this gap, we present the Graph...
Article
Graph data mining has largely benefited from the recent developments of graph representation learning. Most attempts to improve graph representations have thus far focused on designing new network embedding or graph neural network (GNN) architectures. Inspired by the SGC and ProNE models, we instead focus on enhancing any existing or learned graph...
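
A minimal sketch of enhancing pre-computed node embeddings by propagating them over the normalized adjacency matrix (an SGC-style smoothing step). The number of propagation steps and the toy graph are illustrative assumptions, not the paper's exact procedure.

import numpy as np

def smooth_embeddings(adj, emb, steps=2):
    """adj: dense [n, n] adjacency; emb: [n, d] pre-trained embeddings."""
    adj = adj + np.eye(adj.shape[0])                             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    adj_norm = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 A D^-1/2
    for _ in range(steps):
        emb = adj_norm @ emb                                     # propagate / smooth
    return emb

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
print(smooth_embeddings(adj, np.random.rand(3, 4)).shape)        # (3, 4)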
Article
Using large-scale training data to build a pre-trained language model (PLM) with a larger volume of parameters can significantly improve downstream tasks. For example, OpenAI trained the GPT-3 model with 175 billion parameters on 570GB of English training data, enabling downstream applications to be built with only a small number of samples. However, ther...
Preprint
Full-text available
Graph representation learning aims to learn low-dimensional node embeddings for graphs. It is used in several real-world applications such as social network analysis and large-scale recommender systems. In this paper, we introduce CogDL, an extensive research toolkit for deep learning on graphs that allows researchers and developers to easily condu...
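
A sketch of running a node-classification experiment with CogDL. The exact entry point and argument names may differ across CogDL versions, so treat this as an assumption and consult https://github.com/THUDM/cogdl for the current API.

from cogdl import experiment

# Train a GCN on the Cora citation graph with default hyper-parameters.
experiment(dataset="cora", model="gcn")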
Preprint
Full-text available
Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, aiming to predict the next items that the user might interact with. Recent works usually give an overall embedding from a user's behav...
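
A minimal sketch of next-item prediction from a user's behavior sequence: encode the sequence of item embeddings with a GRU and score all candidate items against the final hidden state. The sizes and toy interaction history are illustrative assumptions.

import torch
import torch.nn as nn

num_items, dim = 1000, 32
item_emb = nn.Embedding(num_items, dim)
gru = nn.GRU(dim, dim, batch_first=True)

seq = torch.tensor([[3, 17, 256, 42]])          # one user's interaction history
_, h = gru(item_emb(seq))                       # h: [1, batch, dim]
scores = h.squeeze(0) @ item_emb.weight.t()     # score every candidate item
next_item = scores.argmax(dim=1)                # predicted next interaction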
Preprint
In this paper, we propose a novel end-to-end framework called KBRD, which stands for Knowledge-Based Recommender Dialog System. It integrates the recommender system and the dialog generation system. The dialog system can enhance the performance of the recommendation system by introducing knowledge-grounded information about users' preferences, and...
Conference Paper
Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different at...
Preprint
Full-text available
Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types and each node is associated with different att...
Article
Full-text available
This paper introduces how to infer trust relationships from billion-scale networked data to benefit Alibaba E-Commerce business. To effectively leverage the network correlations between labeled and unlabeled relationships to predict trust, we formalize trust into multiple types and propose a graphical model to incorporate type-based dyadic and tria...
