Donghua Yang's research works | Harbin Institute of Technology and other places

TodyNet: Temporal Dynamic Graph Neural Network for Multivariate Time Series Classification

Article

June 2024

4 Reads

1 Citation

Information Sciences

[...]

Schema Integration on Massive Data Sources

Chapter

March 2024

15 Reads

[...]

As the fundamental phase of collecting and analyzing data, data integration is used in many applications, such as data cleaning, bioinformatics, and pattern recognition. In the big data era, one of the major problems of data integration is obtaining the global schema of the data sources, since the global schema can hardly be derived from massive data sources directly. In this paper, we attempt to solve this schema integration problem. For different scenarios, we develop batch and incremental schema integration algorithms. We consider the differences in how attribute names are represented across data sources and propose the ED Join and Semantic Join algorithms to integrate attributes with different representations. Extensive experimental results demonstrate that the proposed algorithms integrate schemas efficiently and effectively.
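
To make the idea of matching attribute names by edit distance concrete, here is a minimal Python sketch. It is not the chapter's ED Join algorithm, whose details are not given in this abstract. The function names, the `max_dist` threshold, and the name normalization are all assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalize(name: str) -> str:
    """Hypothetical normalization: lowercase and drop common separators."""
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")


def integrate_schema(global_schema, source_attrs, max_dist=2):
    """Merge attribute names from one source into the global schema.

    global_schema maps a canonical attribute name to the variants seen so far.
    An incoming name joins the closest canonical name within max_dist edits;
    otherwise it becomes a new canonical attribute.
    """
    for attr in source_attrs:
        key = normalize(attr)
        best, best_d = None, max_dist + 1
        for canon in global_schema:
            d = edit_distance(key, normalize(canon))
            if d < best_d:
                best, best_d = canon, d
        if best is None:
            global_schema[attr] = [attr]
        else:
            global_schema[best].append(attr)
    return global_schema


schema = integrate_schema(
    {}, ["phone_number", "PhoneNumbr", "address", "adress", "salary"]
)
```

Calling `integrate_schema` once per arriving source mimics an incremental setting; a batch variant would compare all attribute names at once. The pairwise comparison here is quadratic and is only meant to show the matching criterion, not an efficient join.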

Approximate Query Processing Based on Approximate Materialized View

Chapter

March 2024

7 Reads

[...]

In the context of big data, an interactive analysis database system needs to answer aggregate queries within a reasonable response time. The AQP++ framework integrates data preprocessing with approximate query processing (AQP): it connects an existing AQP engine with a data preprocessing method so that the two work together during interactive analysis. Our study of how materialized views are applied in the AQP++ framework finds that the materialized views used in both parts of the framework come from exact precomputed results, so a time bottleneck remains on large-scale data. To address this limitation, we propose using approximate materialized views for subsequent result reuse. Taking the method of identifying approximate intervals as an example, we compare the improvement to AQP++ gained from approximate materialized views and try different sampling methods to find better time and accuracy results. By constructing larger samples, we compare the time, space, and accuracy of approximate and general materialized views in AQP++, and analyze why our methods perform poorly in some cases. The experimental results show that approximate materialized views improve the AQP++ framework: they save time and storage space in the preprocessing stage and achieve accuracy similar to or better than general AQP results.
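
As a rough illustration of what an approximate materialized view could store, the Python sketch below estimates a SUM from a uniform sample and keeps the estimate together with a confidence interval for later reuse. It is not the chapter's method or the AQP++ implementation. The `ApproxView` class, the function name, and the normal-approximation interval (without a finite-population correction) are assumptions.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ApproxView:
    """A precomputed approximate aggregate: a point estimate plus an error bound."""
    estimate: float
    half_width: float

    @property
    def interval(self):
        return (self.estimate - self.half_width, self.estimate + self.half_width)


def build_approx_sum_view(values, sample_size, z=1.96, seed=0):
    """Estimate SUM(values) from a uniform sample drawn without replacement.

    The interval uses the central limit theorem with z = 1.96 (about 95%
    confidence). sample_size must be between 2 and len(values).
    """
    rng = random.Random(seed)
    n = len(values)
    sample = rng.sample(values, sample_size)
    mean = sum(sample) / sample_size
    var = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    estimate = n * mean                                  # scale the sample mean up
    half_width = z * n * math.sqrt(var / sample_size)    # scaled standard error
    return ApproxView(estimate, half_width)


# A column of identical values is estimated exactly, with a zero-width interval.
flat_view = build_approx_sum_view([5.0] * 1000, sample_size=100)

# Sampling every row reproduces the true sum of 0..999.
full_view = build_approx_sum_view(list(range(1000)), sample_size=1000)
```

Building the view touches only `sample_size` rows instead of the full column, which is where the preprocessing time and space savings described in the abstract would come from in a sketch like this one.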

Bandwidth Exploration for Spatial-Temporal Kernel Density Visualizations

Preprint

January 2024

4 Reads

[...]

Reinforcement Learning-Based Adaptive Stateless Routing for Ambient Backscatter Wireless Sensor Networks

Article

January 2024

1 Citation

IEEE Transactions on Communications

Huanyu Guo

Donghua Yang

Hong Gao

This paper explores the routing problem in ambient backscatter wireless sensor networks (AB-WSNs) using reinforcement learning approaches. Ambient RF signals serve as the only power source for battery-less sensor nodes and are also leveraged to enable backscatter communication among these nodes. This results in intermittent connections and a dynamic topology within AB-WSNs, making it difficult to route data to the sink; for example, data may not reach the sink in a timely manner. We first introduce a multi-agent network model with two mechanisms to address this issue. We then model the routing problem as a Markov decision process, allowing each node to make informed routing decisions based on the current state of its neighbors. To enable each node to learn the optimal routing policy and perform adaptive stateless routing, we propose two learning algorithms. The first, a value-based learning algorithm, is designed for sparse AB-WSNs. The second, a policy-based learning algorithm, is intended to tackle the curse of dimensionality in dense AB-WSNs. We analyze the convergence of both learning algorithms and evaluate their performance through extensive experiments. The experimental results validate the convergence and efficiency of the proposed learning algorithms under various conditions.
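
The value-based idea in this abstract can be sketched with generic tabular Q-learning, where each node keeps one value per neighbor and forwards to the neighbor with the highest value. This is a textbook illustration, not the paper's algorithm. It assumes a fixed topology (no intermittent links), a reward of -1 per hop, and a hypothetical four-node network.

```python
import random


def train_q_routing(neighbors, sink, episodes=500, alpha=0.5, gamma=0.9,
                    epsilon=0.2, max_hops=50, seed=0):
    """Learn per-node next-hop values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    # q[node][nbr] is the learned value of forwarding from node to nbr.
    q = {n: {m: 0.0 for m in nbrs} for n, nbrs in neighbors.items()}
    sources = [n for n in neighbors if n != sink]
    for _ in range(episodes):
        node = rng.choice(sources)
        for _ in range(max_hops):
            if node == sink:
                break
            if rng.random() < epsilon:
                nxt = rng.choice(neighbors[node])      # explore
            else:
                nxt = max(q[node], key=q[node].get)    # exploit
            # Each hop costs 1; the sink is terminal and has no future value.
            future = 0.0 if nxt == sink else max(q[nxt].values())
            q[node][nxt] += alpha * (-1.0 + gamma * future - q[node][nxt])
            node = nxt
    return q


def greedy_path(q, source, sink, max_hops=50):
    """Follow the highest-valued neighbor from source until the sink is reached."""
    path = [source]
    while path[-1] != sink and len(path) <= max_hops:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


# Hypothetical topology: only node C can reach the sink directly.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "SINK"], "SINK": []}
q_table = train_q_routing(neighbors, sink="SINK")
route_from_a = greedy_path(q_table, "A", "SINK")
```

Each node consults only its own row of the table when forwarding, which is the sense in which this sketch is stateless per packet. A table like this grows with the number of neighbors, which hints at why the abstract turns to a policy-based method for dense networks.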

Database-Integrated Machine Learning for Enhanced Performance

Conference Paper

December 2023

1 Read

[...]

LSTM-based Flow Prediction

Conference Paper

December 2023

3 Reads

[...]

Automated Feature Interaction and Feature Representation Learning of Multi-field Categorical Data

Conference Paper

December 2023

3 Reads

[...]

Testing Higher-Order Clusterability on Graphs

Chapter

December 2023

1 Read

Yifei Li

Donghua Yang

Jianzhong Li

Analysis of higher-order organizations, usually small connected subgraphs called motifs, is a fundamental task on complex networks. This paper studies a new problem of testing higher-order clusterability: given query access to an undirected graph, can we judge whether this graph can be partitioned into a few clusters of highly connected motifs? This problem extends the earlier work of Czumaj et al. (STOC '15), who recognized cluster structure on graphs using the framework of property testing. In this paper, a good graph cluster on high dimensions is first defined for higher-order clustering. Then, a query lower bound is given for testing whether this kind of good cluster exists. Finally, an optimal sublinear-time algorithm is developed for testing clusterability based on triangles.

Ensemble feature selection with adaptive weights

Conference Paper

October 2023

7 Reads

[...]

... Note that distinguishing drifts and faults by their duration has been done in similar problems before, see for example J. Liu et al. (2023). ...
Reference:
Fault detection in propulsion motors in the presence of concept drift

Anomaly and change point detection for time series with concept drift

Citing Article
Full-text available
July 2023

World Wide Web

[...]

... KGs need to be accessible to support a variety of different tasks, beyond the mere integration of different knowledge sources, and thus KG storage management [164,142,177] is an active area of research. Current KG storage mechanisms are divided into relation-based stores (e.g., [1]) and native graph stores (e.g., [197]). ...
Reference:
Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities

PreKar: A learned performance predictor for knowledge graph stores

Citing Article
Publisher preview available
March 2022

World Wide Web

Publications (21)

Citations (2)