Qing Huang
Jiangxi Normal University · School of Computer

PhD

57 Publications
8,057 Reads
178 Citations
Introduction
Qing Huang graduated from the State Key Laboratory of Software Engineering, School of Computer, Wuhan University, and now works at Jiangxi Normal University. His research covers programming languages, software engineering, and artificial intelligence.

Publications (57)
Article
Data flow graphs (DFGs) capture definitions (defs) and uses across program blocks, which is a fundamental program representation for program analysis, testing and maintenance. However, dynamically-typed programming languages like Python present implicit data flow issues that make it challenging to determine def-use flow information at compile time....
Article
Full-text available
The program construction process is based on rigorous mathematical reasoning, which leads to a fully correct algorithmic program via step-by-step refinement of the program specifications. The existing program construction methods' refinement process is partly based on individual subjective speculation and analysis, which lacks a precise guidance me...
Article
Smart contracts with excessive gas consumption can cause economic losses, such as black hole contracts. Actual gas consumption depends on runtime information and has a probability distribution under different runtime situations. However, existing static analysis tools (e.g., Solc) cannot define runtime information and only provide an approximate up...
Conference Paper
Full-text available
Exploratory testing leverages the tester's knowledge and creativity to design test cases for effectively uncovering system-level bugs from the end user's perspective. Researchers have worked on test scenario generation to support exploratory testing based on a system knowledge graph, enriched with scenario and oracle knowledge from bug reports. Nev...
Article
The emergence of foundation models, such as large language models (LLMs) GPT-4 and text-to-image models DALL-E, has opened up numerous possibilities across various domains. People can now use natural language (i.e. prompts) to communicate with AI to perform tasks. While people can use foundation models through chatbots (e.g., ChatGPT), chat, regard...
Conference Paper
Full-text available
API documentation, technical blogs and programming Q&A sites contain a large amount of partial code that can be reused in programming tasks. However, due to unresolved simple names and last-mile syntax errors, such partial code is frequently not compilable. To facilitate partial code reuse, we develop PCR-Chain for resolving FQNs and fixing last-mi...
Article
Full-text available
The hybrid automatic readability assessment (ARA) models that combine deep and linguistic features have recently received rising attention due to their impressive performance. However, the utilization of linguistic features is not fully realized, as ARA models frequently concentrate excessively on numerical values of these features, neglecting val...
Article
Full-text available
Not limited by state or space, formal verification based on mechanized theorem proving is an important method for ensuring software correctness and avoiding serious losses from potential software bugs. The left-leaning red-black tree (LLRB) is a variant of the binary search tree, and its structure has an additional left-leaning constraint over th...
Article
Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is the prerequisite to effective search and reuse of partial code. Existing dictionary-lookup based methods build a symbolic knowledge ba...
Preprint
We conduct the first empirical study on using knowledge transfer to improve the generalization ability of large language models (LLMs) in software engineering tasks, which often require LLMs to generalize beyond their training data. Our proposed general knowledge transfer approach guides the LLM towards a similar and familiar API or code snippet it...
Article
Full-text available
The traditional program refinement strategy cannot be refined to an executable program, and it suffers from issues such as low verification reliability and insufficient automation. To solve the above problems, this paper proposes a nonlinear program construction and verification method based on partition recursion and Morgan's refinement rules. First, we use recursiv...
Article
Full-text available
Extraction of Application Programming Interfaces (APIs) and their semantic relations from unstructured text (e.g., Stack Overflow) is fundamental to software engineering tasks (e.g., API recommendation). However, existing approaches are either rule-based or sequence-labeling based. They must manually enumerate the rules or label data for a wide r...
Preprint
API documentation, technical blogs and programming Q&A sites contain a large amount of partial code that can be reused in programming tasks, but this code is often uncompilable due to unresolved names and syntax errors. To facilitate partial code reuse, we propose the Partial Code Reuse Chain (PCR-Chain) for resolving fully-qualified names (FQNs) and fixi...
Preprint
Full-text available
Foundation models, such as GPT-4, DALL-E have brought unprecedented AI "operating system" effect and new forms of human-AI interaction, sparking a wave of innovation in AI-native services, where natural language prompts serve as executable "code" directly (prompt as executable code), eliminating the need for programming language as an intermediar...
Preprint
Full-text available
Foundation models, such as GPT-4, DALL-E have brought unprecedented AI "operating system" effect and new forms of human-AI interaction, sparking a wave of innovation in AI-native services, where natural language prompts serve as executable "code" directly (prompt as executable code), eliminating the need for programming language as an intermediary...
Preprint
Control Flow Graphs (CFGs) are essential for visualizing, understanding and analyzing program behavior. For statically-typed programming languages like Java, developers obtain CFGs by using bytecode-based methods for compilable code and Abstract Syntax Tree (AST)-based methods for partially uncompilable code. However, explicit syntax errors during A...
Article
Full-text available
The smart contract, a self-executing program on the blockchain, is key to programmable finance. However, the rise of smart contract use has also led to an increase in vulnerabilities that attract illegal activity from hackers. Traditional manual approaches for vulnerability detection, relying on domain experts, have limitations such as low automati...
Preprint
Developers' API needs should be more pragmatic, such as seeking suggestive, explainable, and extensible APIs rather than the so-called best result. Existing API search research cannot meet these pragmatic needs because they are solely concerned with query-API relevance. This necessitates a focus on enhancing the entire query process, from query def...
Article
Full-text available
Programmers who work with smart contract development often encounter challenges in reusing code from repositories. This is due to the presence of two unknowns that can lead to non-functional and functional failures. These unknowns are implicit collaborations between functions and subtle differences among similar functions. Current code mining metho...
Preprint
Full-text available
In this paper, we present the development of an explainable Gas Estimator model EGE, utilizing big data to mine potential distribution of runtime information. Our approach overcomes the limitations of static gas analysis tools by labeling functions based on interval probability distribution and building code representations containing program sem...
Article
Full-text available
Developers’ API needs should be more pragmatic, such as seeking suggestive, explainable, and extensible APIs rather than the so-called best result. Existing API search research cannot meet these pragmatic needs because they are solely concerned with query-API relevance. This necessitates a focus on enhancing the entire query process, from query def...
Preprint
Pre-trained giant code models (PCMs) are starting to enter developers' daily practice. Understanding what types of and how much software knowledge is packed into PCMs is the foundation for incorporating PCMs into software engineering (SE) tasks and fully releasing their potential. In this work, we conduct the first systematic study on the SE factu...
Preprint
The automatic generation of Chinese fonts is an important problem involved in many applications. The predominant methods for Chinese font generation are based on deep generative models, especially generative adversarial networks (GANs). However, existing GAN-based methods (say, CycleGAN) for Chinese font generation usually suffer f...
Article
Full-text available
The automatic algorithm programming model can increase the dependability and efficiency of algorithm program development, including specification generation, program refinement, and formal verification. However, the existing model has two flaws: incompleteness of program refinement and inadequate automation of formal verification. This paper propos...
Article
Full-text available
In the formal derivation and proof of binary tree algorithms, Dijkstra's weakest precondition method is commonly used. However, the method has some drawbacks, including a time-consuming derivation process, complicated loop invariants, and the inability to generate executable programs from the specification. This paper proposes a unified strategy for t...
Article
Full-text available
The development of artificial intelligence in education promotes the reform of teaching methods in the direction of intelligence and individualization. In this paper, the programming course is taken as an example to propose a curriculum intelligent brain model for open source swarm intelligence based on knowledge graph, and the bootstrapping framework...
Article
Full-text available
Software programming requires both API reference (know-what) knowledge and programming task (know-how) knowledge. Lots of programming know-what and know-how knowledge is documented in text, for example, API reference documentation and programming tutorials. To improve knowledge accessibility and usage, several recent studies use Natural Language Pr...
Article
Traditional program refinement strategies could not be refined to an executable program, and suffered from low verification reliability and insufficient automation. To solve the above problems, a more complete program refinement strategy and an automatic verification method were proposed. The recursive definition function technology was used...
Article
Full-text available
We propose a systematic method to deduce and synthesize Dafny programs. First, the specification of the problem is described in strict mathematical language. Then, the derivation process uses program specification transformation technology to perform equivalent transformation. Furthermore, the Dafny program is synthesized through the obtained recursive...
Article
The development of algebraic and numerical algorithms is a kind of complicated creative work and it is difficult to guarantee the correctness of the algorithms. This paper introduces a systematic and unified formal development method of algebraic and numerical algorithms. The method implements the complete refinement process from abstract specifica...
Article
Full-text available
Third-party libraries always evolve and produce multiple versions. Lucene, for example, released ten new versions (from version 7.7.0 to 8.4.0) in 2019. These versions cause existing code search methods to retrieve source code that is not compatible with the local programming language. To solve this issue, we propose DCSE, a deep code search...
Article
Full-text available
To improve code search, many query expansion (QE) approaches use APIs or crowd knowledge for expanding a query. However, these approaches may sometimes negatively impact the retrieval performance. This is because they can’t distinguish the relevant terms from the irrelevant ones among a large set of candidate expansion terms and expand a query with...
Article
Full-text available
The overexpansion problem negatively affects the quality of query expansion. To improve the quality of queries for searching code, this paper proposed a DBN‐based algorithm for effective query expansion. The deep belief network (DBN) model is trained on the code sequences and their change sequences, which aims to capture the meaningful terms during...
Article
Full-text available
The latest query expansion (QE) methods use the software development features for expanding queries. However, these methods allow only one feature to be considered at a time. To consider additional features simultaneously, we propose a QE method based on GitHub knowledge; this is a new comprehensive feature that covers both the existing features (i...
Article
Full-text available
Benefiting from the open source software movement, many code search tools have been proposed to retrieve source code over the internet. However, the retrieved source code rarely meets user needs perfectly, so it has to be changed manually. This is because the retrieved source code is overly specific to some particular context. To solve this pro...
Article
Full-text available
Thesaurus-based, code-related, and software-specific query expansion techniques are the main contributions in free-form query search. However, these techniques still could not put the most relevant query result in the first position because they lack the ability to infer the expansion words that represent the user needs based on a given query. In t...
Article
Full-text available
To make code search (CS) more effective, a novel query expansion with intents (QEI) is proposed, in which the intent refers to the common subsequent modifications of the search results. The intent is extracted from the modification history. Within the intent scope, CS is sped up based on the semantic and structural matches. The pr...
