Qing Huang
Jiangxi Normal University · School of Computer
PhD
About
57
Publications
8,057
Reads
178
Citations
Introduction
Qing Huang graduated from the State Key Laboratory of Software Engineering, School of Computer, Wuhan University. Now Qing works at Jiangxi Normal University. He does research in Programming Languages, Software Engineering and Artificial Intelligence.
Publications (57)
Data flow graphs (DFGs) capture definitions (defs) and uses across program blocks, which is a fundamental program representation for program analysis, testing and maintenance. However, dynamically-typed programming languages like Python present implicit data flow issues that make it challenging to determine def-use flow information at compile time....
The program construction process is based on rigorous mathematical reasoning, which leads to a fully correct algorithmic program via step-by-step refinement of the program specifications. The existing program construction methods' refinement process is partly based on individual subjective speculation and analysis, which lacks a precise guidance me...
Smart contracts with excessive gas consumption can cause economic losses, such as black hole contracts. Actual gas consumption depends on runtime information and has a probability distribution under different runtime situations. However, existing static analysis tools (e.g., Solc) cannot define runtime information and only provide an approximate up...
Exploratory testing leverages the tester's knowledge and creativity to design test cases for effectively uncovering system-level bugs from the end user's perspective. Researchers have worked on test scenario generation to support exploratory testing based on a system knowledge graph, enriched with scenario and oracle knowledge from bug reports. Nev...
The emergence of foundation models, such as large language models (LLMs) GPT-4 and text-to-image models DALL-E, has opened up numerous possibilities across various domains. People can now use natural language (i.e. prompts) to communicate with AI to perform tasks. While people can use foundation models through chatbots (e.g., ChatGPT), chat, regard...
API documentation, technical blogs and programming Q&A sites contain a large amount of partial code that can be reused in programming tasks. However, due to unresolved simple names and last-mile syntax errors, such partial code is frequently not compilable. To facilitate partial code reuse, we develop PCR-Chain for resolving FQNs and fixing last-mi...
The hybrid automatic readability assessment (ARA) models that combine deep and linguistic features have recently received rising attention due to their impressive performance. However, the utilization of linguistic features is not fully realized, as ARA models frequently concentrate excessively on numerical values of these features, neglecting val...
Unlimited by the state and space, the formal verification technology based on mechanized theorem proof is an important method to ensure software correctness and avoid serious loss from potential software bugs. LLRB (left-leaning red-black trees) is a variant of binary search trees, and its structure has an additional left-leaning constraint over th...
Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is the prerequisite to effective search and reuse of partial code. Existing dictionary-lookup based methods build a symbolic knowledge ba...
We conduct the first empirical study on using knowledge transfer to improve the generalization ability of large language models (LLMs) in software engineering tasks, which often require LLMs to generalize beyond their training data. Our proposed general knowledge transfer approach guides the LLM towards a similar and familiar API or code snippet it...
The traditional program refinement strategy cannot be refined to an executable program, and there are issues such as low verification reliability and automation. To solve the above problems, this paper proposes a nonlinear program construction and verification method based on partition recursion and Morgan's refinement rules. First, we use recursiv...
Extraction of Application Programming Interfaces (APIs) and their semantic relations from unstructured text (e.g., Stack Overflow) is a fundamental work for software engineering tasks (e.g., API recommendation). However, existing approaches are rule-based and sequence-labeling based. They must manually enumerate the rules or label data for a wide r...
API documentation, technical blogs and programming Q&A sites contain numerous partial code snippets that can be reused in programming tasks, but these snippets are often uncompilable due to unresolved names and syntax errors. To facilitate partial code reuse, we propose the Partial Code Reuse Chain (PCR-Chain) for resolving fully-qualified names (FQNs) and fixi...
Foundation models, such as GPT-4, DALL-E have brought unprecedented AI "operating system" effect and new forms of human-AI interaction, sparking a wave of innovation in AI-native services, where natural language prompts serve as executable "code" directly (prompt as executable code), eliminating the need for programming language as an intermediary...
Control Flow Graphs (CFGs) are essential for visualizing, understanding and analyzing program behavior. For statically-typed programming language like Java, developers obtain CFGs by using bytecode-based methods for compilable code and Abstract Syntax Tree (AST)-based methods for partially uncompilable code. However, explicit syntax errors during A...
The smart contract, a self-executing program on the blockchain, is key to programmable finance. However, the rise of smart contract use has also led to an increase in vulnerabilities that attract illegal activity from hackers. Traditional manual approaches for vulnerability detection, relying on domain experts, have limitations such as low automati...
Developers' API needs should be more pragmatic, such as seeking suggestive, explainable, and extensible APIs rather than the so-called best result. Existing API search research cannot meet these pragmatic needs because they are solely concerned with query-API relevance. This necessitates a focus on enhancing the entire query process, from query def...
Programmers who work with smart contract development often encounter challenges in reusing code from repositories. This is due to the presence of two unknowns that can lead to non-functional and functional failures. These unknowns are implicit collaborations between functions and subtle differences among similar functions. Current code mining metho...
In this paper, we present the development of an explainable Gas Estimator model EGE, utilizing big data to mine potential distribution of runtime information. Our approach overcomes the limitations of static gas analysis tools by labeling functions based on interval probability distribution and building code representations containing program sem...
Pre-trained giant code models (PCMs) start coming into the developers' daily practices. Understanding what types of and how much software knowledge is packed into PCMs is the foundation for incorporating PCMs into software engineering (SE) tasks and fully releasing their potential. In this work, we conduct the first systematic study on the SE factu...
The automatic generation of Chinese fonts is an important problem involved in many applications. The predominant methods for Chinese font generation are based on deep generative models, especially generative adversarial networks (GANs). However, existing GAN-based methods (say, CycleGAN) for Chinese font generation usually suffer f...
The automatic algorithm programming model can increase the dependability and efficiency of algorithm program development, including specification generation, program refinement, and formal verification. However, the existing model has two flaws: incompleteness of program refinement and inadequate automation of formal verification. This paper propos...
In the formal derivation and proof of binary tree algorithms, Dijkstra's weakest predicate method is commonly used. However, the method has some drawbacks, including a time-consuming derivation process, complicated loop invariants, and the inability to generate executable programs from the specification. This paper proposes a unified strategy for t...
The development of artificial intelligence in education promotes the reform of teaching methods in the direction of intelligence and individuation. In this paper, the programming course is taken as an example to propose a curriculum intelligent brain model for open source swarm intelligence based on knowledge graph, and the bootstrapping framework...
Software programming requires both API reference (know-what) knowledge and programming task (know-how) knowledge. Lots of programming know-what and know-how knowledge is documented in text, for example, API reference documentation and programming tutorials. To improve knowledge accessibility and usage, several recent studies use Natural Language Pr...
Traditional program refinement strategies could not be refined to an executable program, and they suffered from low verification reliability and insufficient automation. To solve the above problems, a more complete program refinement strategy and an automatic verification method were proposed. The recursive definition function technology was used...
We propose a systematic method to deduce and synthesize Dafny programs. First, the specification of the problem is described in strict mathematical language. Then, the derivation process uses program specification transformation technology to perform equivalent transformation. Furthermore, the Dafny program is synthesized through the obtained recursive...
The development of algebraic and numerical algorithms is a kind of complicated creative work and it is difficult to guarantee the correctness of the algorithms. This paper introduces a systematic and unified formal development method of algebraic and numerical algorithms. The method implements the complete refinement process from abstract specifica...
Third-party libraries always evolve and produce multiple versions. Lucene, for example, released ten new versions (from version 7.7.0 to 8.4.0) in 2019. These versions confuse the existing code search methods to retrieve the source code that is not compatible with local programming language. To solve this issue, we propose DCSE, a deep code search...
To improve code search, many query expansion (QE) approaches use APIs or crowd knowledge for expanding a query. However, these approaches may sometimes negatively impact the retrieval performance. This is because they can’t distinguish the relevant terms from the irrelevant ones among a large set of candidate expansion terms and expand a query with...
The overexpansion problem negatively affects the quality of query expansion. To improve the quality of queries for searching code, this paper proposed a DBN‐based algorithm for effective query expansion. The deep belief network (DBN) model is trained on the code sequences and their change sequences, which aims to capture the meaningful terms during...
The latest query expansion (QE) methods use the software development features for expanding queries. However, these methods allow only one feature to be considered at a time. To consider additional features simultaneously, we propose a QE method based on Github knowledge; this is a new comprehensive feature that covers both the existing features (i...
Benefiting from the open source software movement, many code search tools have been proposed to retrieve source code over the internet. However, the retrieved source code rarely meets user needs perfectly, so it has to be changed manually. This is because the retrieved source code is over-specific to some particular context. To solve this pro...
Thesaurus-based, code-related, and software-specific query expansion techniques are the main contributions in free-form query search. However, these techniques still could not put the most relevant query result in the first position because they lack the ability to infer the expansion words that represent the user needs based on a given query. In t...
To make code search (CS) more effective, a novel query expansion with intents (QEI) is proposed, in which the intent refers to the common subsequent modifications of the search results. The intent is extracted from the modification history. Within the intent scope, the CS is sped up based on the semantic and structural matches. The pr...