Two-stage Text-to-BIMQL Semantic Parsing for Building Information Model Extraction Using Graph Neural Networks

Mengtian Yin1, Llewellyn Tang1, Chris Webster2, Jinyang Li3, Haotian Li1, Zhuoquan Wu1, Reynold C.K. Cheng3
1 Department of Real Estate and Construction, The University of Hong Kong, Hong Kong SAR
2 Faculty of Architecture, The University of Hong Kong, Hong Kong SAR
3 Department of Computer Science, The University of Hong Kong, Hong Kong SAR
u3006144@hku.hk, lcmtang@hku.hk, cwebster@hku.hk, jl0725@connect.hku.hk, hlidh@connect.hku.hk, u3006157@hku.hk, ckcheng@cs.hku.hk
Abstract
With the increasing complexity of the building process, it is difficult for project stakeholders to retrieve large and multi-disciplinary building information models (BIMs). A natural language interface (NLI) is beneficial for users to query BIM models using natural language. However, parsing natural language queries (NLQs) is challenging due to ambiguous name descriptions and intricate relationships between entities. To address these issues, this study proposes a graph neural network (GNN)-based semantic parsing method that automatically maps NLQs into executable queries. Firstly, ambiguous mentions are collectively linked to referent ontological entities via a GNN-based entity linking model. Secondly, the logical forms of NLQs are interpreted through a GNN-based relation extraction model, which predicts links between mentioned entities in a heterogeneous graph fusing ontology and NLQ texts. The experiment based on 786 queries shows its outstanding performance. Moreover, a real-world case verifies the practicability of the proposed method for BIM model retrieval.
1. Introduction
Building information modelling (BIM) provides a digital representation of the building product and process with semantic descriptions of different types of information [1]. BIM models can be applied in a range of engineering applications that add value to construction projects, such as clash detection [2] and maintenance management [3]. To facilitate lifecycle information exchange and management, the vendor-neutral Industry Foundation Classes (IFC) specifications [4] are widely adopted to represent BIM models, and they are constantly updated with the expanding body of concepts in the construction domain.

With the increasing complexity of the modern building process, more parties are involved in projects and increasing amounts of information are generated in BIM models. This has caused the IFC data schema to become structurally more complex. In fact, the number of entities has risen from 653 in the early version IFC2x3 to 1,880 in the latest version, IFC4.3 RC2 [5]. Moreover, model instances are also large, often spanning multiple disciplines. With multi-domain BIM models, being able to acquire the desired information in time is a key success factor in realizing the practical value of BIM [6]. Owing to their disparate disciplines, roles, and project contexts, the information demands of individual stakeholders differ noticeably. This calls for efficient retrieval techniques to extract BIM model subsets.
Current means of BIM model extraction can be grouped into two categories. First, there are frameworks (e.g., Model View Definition (MVD) [7]) to define domain-specific model views, and built-in/add-in tools (e.g., the Autodesk COBie Extension [8]) in BIM software to extract partial models for particular use cases (e.g., building energy modelling (BEM) [9]). They are oriented toward schema-level model extraction, which involves a large number of entities and a long development period [10]. However, these schema-level extraction approaches cannot address the ad hoc retrieval requirements of project practitioners enquiring about facility information in BIM models, which usually have changing conditions (e.g., question forms, attribute restrictions) [11]. For example, an equipment manager might make a query like "search sensors embedded in air ducts whose reading of supply air temperature is not in the interval of 13-15 degrees" to diagnose broken sensors in air conditioning (AC) systems. In practice, these project-level retrieval tasks mainly rely on the second stream of model extraction solutions: professional programming and query languages (e.g., the BIM Query Language (BIMQL) [6]). While this approach permits the successful extraction of model subsets, it requires users to have sufficient programming skills and to become acquainted with the complex IFC data schema. It is therefore difficult to use for the many practitioners in the construction industry who are non-experts in information technology (IT) [12].
Recently, a natural language interface (NLI) has been proposed by several studies [5–8] to efficiently query BIM models in construction projects. Inspired by artificial intelligence (AI)-driven voice assistants (e.g., Apple Siri [17]), it was envisioned that BIM users could directly input natural language (NL) texts to retrieve model information, which would hide all the formalities of BIM-oriented programming languages and data schemas. The key to NL-based BIM model retrieval is semantic parsing (SP), which is framed as converting NL to logical forms such as structured queries or programs [18,19]. Typical tasks include Text-to-SQL (Structured Query Language) [20–22] and Text-to-AMR (Abstract Meaning Representation) [23,24]. Following these, we define Text-to-BIMQL as the task of transforming NL texts into standard query languages used to retrieve IFC schema-compliant BIM models. However, current methods for Text-to-BIMQL are limited. The existing studies [11,15,25,26] tailored to BIM model retrieval rely heavily on hand-crafted rules to parse natural language queries (NLQs). Despite their high performance, the input must conform to rigid patterns (e.g., returning an attribute of an object) or strict requirements, which makes them incapable of handling queries with varying user-specified conditions.
There are several challenges in Text-to-BIMQL semantic parsing. First, there is a name ambiguity problem in aligning natural language texts with different levels of IFC model information (e.g., object class, instance, properties, property values, etc.) [27]. In other words, the same mention in NLQs can refer to different IFC concepts [11]. Consider the example: "find walls on the roof whose base offset is less than the base offset of 184944 (tag number of a wall instance)". Here, the word "roof" can be recognized as (a) IfcRoof, which represents roof elements; (b) IfcBuildingStorey, which refers to a building story called "roof"; (c) IfcSlab, which implies slab elements with a predefined type of ROOF; or (d) a literal property value. This requires an entity linking (EL) system [28] that links name mentions in NL texts to referent entities in BIM models, but such a system does not exist in the BIM field. Second, the logical forms of NLQs are hard to predict because they consist of complex relational paths connecting different IFC entities. A relational path between entities can be multi-hop, meaning that the two entities are separated by intermediate entities in BIM models or ontologies. As shown in Fig. 1, the relational path between the property "base offset" and tag "184944" is a 3-hop move in an IFC ontology [29]. Unfortunately, the existing methods [11,15,30] can only handle 1-hop relationships between entities.

Fig. 1. Example of a multi-hop relational path. The blue and red entities denote head and tail entities, respectively; the gray entities denote extra entities along the relational path.
To resolve the above problems, the contexts of queries and ontological knowledge models must be integrated for joint inference. Recently, ontologies have been widely used to represent BIM knowledge and information [31–34]. For name disambiguation, the neighboring concepts in NLQs and their relationships encoded in the ontology provide key evidence for inferring which entity the user refers to. For logical form prediction, an IFC ontology and NLQ texts together determine the entire relational path between entities. Many studies have proposed integrating natural language processing (NLP) and ontologies to parse NL texts for BIM model checking [35–38] and retrieval [11,30]. However, the existing rule-based methods cannot adapt to the different ontologies and text patterns [39] that are necessary to solve the identified problems. Furthermore, it is intractable to manually formalize rules that extract features of the ontology structure for text parsing.
Inspired by recent advances in graph-based machine learning (ML), this study proposes a novel graph neural network (GNN)-based approach that incorporates ontologies for Text-to-BIMQL semantic parsing. GNNs perform learning and computation directly over structured graph data [40], and thus have advantages in incorporating ontological BIM knowledge for enhanced query parsing. This is realized by using GNNs to fuse the contextual information of queries and an IFC ontology into heterogeneous graphs for unified representation learning and inference. The proposed method consists of two stages. In the first stage, the name mentions in NLQs are linked to the most relevant entities in the ontology by scoring the subgraphs of each candidate. In the second stage, the dependencies, logical connections, and semantic relationships between entities are extracted in one go by a GNN edge prediction layer. Finally, the results of the two stages are turned into structured queries to retrieve IFC-based BIM models. Compared with existing methods, our approach avoids developing enormous rule sets to parse NL texts via an ontology, which suffer from varying sentence patterns and conditions. This study therefore improves the robustness and practical value of NL-based BIM model retrieval in construction projects.

The remainder of this paper is structured as follows. Section 2 introduces the background of the study. Section 3 illustrates the proposed approach. Section 4 provides a performance evaluation. Section 5 discusses the advantages and limitations of the proposed method. Section 6 concludes by outlining the significance of this research.
2. Background
2.1 Traditional methods for IFC BIM model retrieval and extraction
Since IFC has become a widely used open data standard for the explanation, exchange, and sharing of BIM data [41], most existing BIM query systems have been implemented based on the IFC schema. For example, there are application programming interfaces (APIs) and toolkits, such as IFC Engine [42] and Xbim Essentials [43], that allow developers to use programming languages (e.g., C#) to augment the model repository. Furthermore, different professional query languages have been proposed for end-users to retrieve BIM models, with distinct functions, scopes, and purposes. The Building Environment Rule and Analysis (BERA) language [44] is a popular BIM query language for rule analysis and checking. The QL4BIM (Query Language for Building Information Models) framework [45] enables spatial reasoning in BIM models. Mazairac and Beetz [6] present BIMQL, an open query language that lets end users create, read, update, and delete (CRUD) IFC models.
Recently, there has been a trend toward using ontologies to represent BIM schemas and models [31,32,46,47], taking advantage of the strengths of semantic web technologies, such as query and reasoning engines. This has led to a group of studies exploring how to use ontologies to filter BIM models in the form of the Resource Description Framework (RDF) [48]. The most commonly used format of semantic BIM is the ifcOWL ontology [31], which is an equivalent transformation of the EXPRESS-based IFC data schema. Following this, Zhang et al. [47] proposed the BIMSPARQL framework to retrieve ifcOWL instances. Query functions, such as geometric information retrieval and spatial reasoning, are extended by SPIN (SPARQL Inference Notation) rules [49]. Similarly, de Farias et al. [46] utilize the Semantic Web Rule Language (SWRL) to extract partial views of ifcOWL instances, in which logic rules are pre-encoded according to data requirements. Our approach chooses SPARQL (SPARQL Protocol and RDF Query Language) as the output query language due to its well-structured syntax, functionality, and popularity. The resulting queries can be executed to retrieve IFC BIM models in RDF format.
2.2 Existing semantic parsing methods for natural language-based BIM querying
Due to the ambiguity of human language, a crucial success factor in NL-based BIM model retrieval is semantic parsing. Currently, most methods are rule-based or pattern-based and can only process simple queries with few variables and constraints. Lin et al. [13] present a seminal work that uses NL to retrieve IFC BIM models stored in a cloud database. The International Framework for Dictionaries (IFD) [50] is leveraged for keyword extraction before the relationships are discovered using syntactic parsing and IFC graphs. Shin and Issa [30] propose the BIMASR (building information modeling automatic speech recognition) framework, which uses voice input to collect and manipulate BIM models in a relational database management system (RDBMS). The open-source NLP2SQL library [51] was adopted for parsing NLQs, but the query level is limited to one-dimensional queries that only involve the wall entity and its properties. Elghaish et al. [26] propose an AI-based voice assistant for BIM data management. Their system remains at a "proof of concept" stage and can only translate one type of NLQ command ("create a room schedule") into a Dynamo script [52]. Wang et al. [15] propose an NLP-based question-answering (QA) system for BIM information extraction (IE). In their SP module, two fixed styles of NLQ can be interpreted by matching patterns in syntactic parse trees. Wang et al. [53,54] further propose a transfer learning-based text classification method to identify query types and apply a neural network to recognize named entities in NLQs. However, it only supports rigid query patterns that contain 1-3 variables.
If BIM users want to acquire objects in BIM models under customized constraints, pattern matching methods cannot cope with the changing conditions. Therefore, Yin et al. [11] propose an ontology-based SP pipeline for NL-based BIM model retrieval that allows users to compose queries with arbitrary combinations of (a) object class, type, and instance; (b) logical connections (e.g., disjunction); (c) relationships between contextual objects (e.g., placement); and (d) attribute constraints (e.g., property, material). However, as the extent of BIM model retrieval grows, it was found that a mention can be mapped to several places in IFC models, causing name ambiguity. Meanwhile, the mentioned entities are not always 1-hop related in the IFC model, which requires multi-hop relational reasoning to formulate executable queries.
2.3 Graph neural networks and their applications in the construction industry
A graph is an abstract data type in computer science, consisting of a set of vertices together with edges joining certain pairs of nodes [55]. Graphs are commonly used to model objects and their relationships in many real-world situations, such as social networks [56]. While current deep neural networks such as convolutional neural networks (CNNs) can effectively extract the hidden features of Euclidean data (e.g., images), they cannot handle non-Euclidean graph data structures [57]. Hence, GNNs were developed to learn representations from graphs by considering both the continuous node features and the graph structure itself. During training, a GNN propagates information across all nodes depending on the states of their neighborhoods and collectively aggregates the information in the graph. This makes GNNs effective ML models for processing graph data. Based on the message-passing mechanisms of GNNs, multi-hop relational reasoning can be performed with respect to graphs [58,59], and thus an inductive bias can be introduced to prioritize one solution over others for graph problems [34]. In practice, GNNs achieve high performance in many graph analytic tasks, including node classification, edge prediction, and graph classification [60]. Consequently, GNN models have been widely applied to different scenarios involving non-Euclidean data, such as knowledge graph completion [61] and molecule property prediction [62].
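The neighborhood aggregation described above can be illustrated with a toy example. The mean aggregation and the fixed 0.5/0.5 mixing weights below are stand-ins for the learned functions of a real GNN; they are our simplification, not part of any cited model:

```python
# Toy illustration of one round of message passing: each node keeps half of
# its own state and averages in its neighbours' states (a stand-in for the
# learned aggregation functions of a real GNN).
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}   # undirected toy graph
h = {"a": 1.0, "b": 3.0, "c": 5.0}                  # scalar node states

def propagate(graph, h):
    return {
        n: 0.5 * h[n] + 0.5 * sum(h[m] for m in graph[n]) / len(graph[n])
        for n in graph
    }

h1 = propagate(graph, h)
print(h1["a"])  # 0.5*1.0 + 0.5*(3+5)/2 = 2.5
```

Repeating `propagate` L times lets information travel L hops, which is the mechanism behind the multi-hop relational reasoning mentioned above.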
GNNs are being used in an increasing number of construction-industry applications to solve various graph-related problems. Wang et al. [63] propose SAGE-E, an improvement on GraphSAGE [64], for semantic enrichment of BIM models, where rooms and their connections in apartment layouts are represented as nodes and edges to automatically classify room types. Collins et al. [65] utilize Graph Convolutional Networks (GCNs) to classify semantic categories of IFC objects. Hu et al. [66] use spatial-temporal GCNs to model the interdependency relationships between buildings for reducing large-scale building energy consumption. Kim and Chi [67] propose a semantic GNN approach to simulate the propagation effects of different resource-to-resource interactions.
Compared with traditional NLP models that work on sequential data (e.g., text), such as LSTMs (long short-term memory networks) [68], GNNs can better capture the features of BIM ontology structure through graph learning. Hence, GNNs are leveraged for Text-to-BIMQL SP in this study.
2.4 Gaps in knowledge
Semantic parsing for BIM model retrieval and extraction is still at an early stage. Existing methods rely on predefined rules to achieve either (a) accurate parsing of simple queries with fixed patterns, or (b) moderate parsing of complex queries with varied expressions and conditions. While the latter type of query can support more fine-grained retrieval of BIM models, the problems surrounding name ambiguity and relational path extraction have not yet been sufficiently addressed to successfully interpret NLQs into executable queries. In addition, the existing rule-based methods [11,30] that integrate ontology and NLP for interpreting NL texts lack the adaptivity and flexibility to disambiguate entities and extract multi-hop relational paths in NLQs under changeable contexts. No method has used graph ML to capture the features of ontological graph structure to effectively infer the logical forms of text-based BIM queries.
To fill the above gaps, this study aims to deliver a new semantic parsing method that exploits GNNs to bridge NL texts and ontology for intelligently retrieving BIM models. The objectives are (a) to link all mentions in NLQs to referent entities in the ontology based on a contextual graph structure; and (b) to extract the entire relational path between entities over IFC ontological graphs to construct standard BIM queries.
3. The proposed approach
3.1 Scope
Following the scope defined in [11] for multi-constraint BIM querying, the NLQs addressed in this study support the following conditions for NL-based BIM retrieval:
• Attribute constraints that allow objects to be filtered by their types (e.g., "search windows with a type of 250mm x 500mm"), properties (e.g., "find load bearing walls"), quantities (e.g., "beams that have a gross volume of 0.5 m³"), and materials (e.g., "slabs made of concrete").
• Abstracted semantic relationships between objects, such as containment and composition. This study considers a total of 11 object relationships for demonstration and testing. The details can be found in Appendix I.
The above constraints can be arbitrarily combined with logical operations, including conditional conjunction and disjunction. Only one sentence is supported in each NLQ.
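As an illustration only, one such multi-constraint NLQ can be decomposed into the supported condition types roughly as follows; the field names and structure are ours, not the paper's internal representation:

```python
# Hypothetical decomposition of the NLQ
# "find load bearing walls on level 2 made of concrete"
# into the condition types listed above (all names illustrative).
parsed = {
    "object_class": "IfcWall",
    "attribute_constraints": [
        {"kind": "property", "name": "LoadBearing", "op": "=", "value": True},
        {"kind": "material", "name": "concrete"},
    ],
    "relationships": [
        {"type": "isContainedIn", "target_class": "IfcBuildingStorey",
         "target_name": "level 2"},
    ],
    "logic": "conjunction",  # all constraints must hold
}

n_conditions = len(parsed["attribute_constraints"]) + len(parsed["relationships"])
print(n_conditions)  # 3
```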
3.2 Overview
Although the state-of-the-art (SOTA) semantic parsers are Sequence-to-Sequence (Seq2Seq) models that directly generate structured queries [18], building such a system to query BIM models is not possible due to the lack of sufficiently annotated datasets. Instead, our method decomposes the SP task into two primary parts: (a) linking all name mentions in NLQs to entities in a BIM ontology; and (b) extracting relational paths between entities to derive a logical form of the query. The BIM ontology refers to ontologies that structurally describe building entities (classes and individuals), the class hierarchy, properties, and relationships.

An overview of the proposed GNN-based semantic parsing method for BIM model extraction (GSP4BIM) is presented in Fig. 2. To begin with, there are several preprocessing jobs, including embedding all ontological entities into vectors and simplifying the original RDF graph of the ontology into an ontology graph (OG).
The first stage of semantic parsing acquires candidate entities by matching surface strings in NLQs against the ontology. Ambiguous mentions are automatically linked to referent entities via a GNN-based entity linking (GNN-EL) model. For each candidate entity, a subgraph gathering related entities is retrieved and scored by GNN-EL to estimate the probability of it being the referent entity.

Having obtained all the entities, the second stage extracts the relational paths between entities. The dependency parsing (DP) graph and the OG subgraph are concatenated into a heterogeneous graph. The dependency, logical, and semantic relations between the mentioned entities are simultaneously predicted by a GNN link prediction layer. After that, the existence of multi-hop relationships is detected, and supplementary relationships are extracted to find the entire relational path.

Through this two-stage semantic parsing, the results are organized into a graph-based logical form containing the entities and relational paths. They are then automatically transformed into standard SPARQL queries to retrieve IFC-based BIM models.
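To make that last transformation step concrete, a minimal sketch of turning a parsed logical form into SPARQL text is shown below. The `inle:` namespace and the triple-pattern property names are placeholders of our own, not the actual INLE/ifcOWL identifiers used by GSP4BIM:

```python
def to_sparql(object_class: str, storey_name: str) -> str:
    """Compose a minimal SPARQL query from a parsed logical form.

    The namespace and triple patterns are illustrative placeholders,
    not the exact identifiers used by the proposed method.
    """
    return (
        "PREFIX inle: <http://example.org/inle#>\n"
        "SELECT ?obj WHERE {\n"
        f"  ?obj a inle:{object_class} .\n"
        "  ?obj inle:isContainedIn ?storey .\n"
        f'  ?storey inle:name "{storey_name}" .\n'
        "}"
    )

query = to_sparql("Wall", "roof")
print(query)
```

A full composer would walk every entity and relational path in the logical form and emit one triple pattern (or FILTER clause) per constraint.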
Fig. 2. Overview of the proposed GNN-based two-stage semantic parsing approach. Contributions are highlighted.

3.3 Preprocessing of BIM ontology graphs
In our study, the IFC Natural Language Expression (INLE) ontology [11], an open modular ontology specialized for NL-based retrieval of IFC data models, was employed as the source BIM ontology. It provides wrapped classes representing NL expressions of IFC concepts (e.g., synonyms and hyponyms), a simplified class hierarchy, and abstracted semantic relationships (e.g., isContainedIn). The scope of the INLE ontology encompasses IfcElement (e.g., IfcWall, IfcBeam, IfcSpace, and IfcBuildingStorey), IfcProperty, IfcPhysicalQuantity, IfcMaterial (including material layer and list), IfcTypeObject, and object attributes like tags and long names. This study additionally models the class for IfcPropertySingleValue to identify literal data values in NLQs.
As discussed in [11], there are project-specific concepts in BIM models that are beyond the scope of IFC-related semantic models. For example, the property "HeadHeight" for doors in BIM models is not covered by the IFC predefined property sets. Hence, a model-based ontology population (MOP) method [11] was exploited to assimilate entity names from the target BIM model and populate the INLE ontology with instances and synonyms. This study additionally extracts literal property values from BIM models.

To input ontology data files into GNNs, we adopted OWL2Vec* [69], an ontology embedding method based on random walks and Word2Vec [70], to encode the semantics of ontological entities into vector representations. Consequently, the class and instance entities in the INLE ontology were embedded into 100-dimensional vectors.
Finally, considering that the original RDF graph of the INLE ontology is too large and complex for graph learning, our approach converts it into a simplified multi-relational directed graph, i.e., a directed graph that can have multiple typed edges between nodes. In addition to basic RDF and OWL relationships such as "rdfs:subClassOf" and "rdf:type" preserved as edges, we simplified the object properties used to encode semantic relationships into new typed directed edges. To this end, the ontology graph is denoted as OG = (V, E), where V is the set of entity nodes; E ⊆ V × R × V is the set of edges; and R represents the set of relation types.
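A minimal in-memory form of such a multi-relational graph might look like the sketch below; the entities and relation names are toy examples, not the full INLE vocabulary:

```python
from collections import defaultdict

class OntologyGraph:
    """Directed graph allowing multiple typed edges between nodes:
    OG = (V, E) with E a set of (head, relation, tail) triples."""

    def __init__(self):
        self._out = defaultdict(list)   # head -> [(relation, tail), ...]

    def add(self, head, relation, tail):
        self._out[head].append((relation, tail))

    def neighbors(self, node, relation=None):
        # Tails reachable from `node`, optionally filtered by edge type.
        return [t for r, t in self._out[node] if relation is None or r == relation]

og = OntologyGraph()
og.add("IfcWall", "subClassOf", "IfcElement")
og.add("IfcWall", "hasProperty", "LoadBearing")
og.add("wall_184944", "type", "IfcWall")

print(og.neighbors("IfcWall"))                 # ['IfcElement', 'LoadBearing']
print(og.neighbors("IfcWall", "subClassOf"))   # ['IfcElement']
```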
3.4 Entity linking for first-stage semantic parsing
The initial stage of semantic parsing identifies and disambiguates the mentioned IFC entities in NLQs. The NLQ text is first matched against the ontology (Section 3.4.1) to generate candidate entities. Ontology subgraphs are then extracted for the candidates (Section 3.4.2) and processed by GNNs for name disambiguation (Section 3.4.3).
3.4.1 Surface string matching and candidate generation
The named entity recognition (NER) method proposed in [11] was first applied to match the surface strings of NLQs against entity names in the ontology. The set of matched entities is denoted as the topic entities T.

If a name mention can be mapped to only one entity, there is no need to perform EL. Otherwise, all candidate entities are collected for that name mention. For each ambiguous name mention, the candidate entities are denoted as C = {c1, c2, ..., cn}, and the entities mapped to the other name mentions (including other ambiguous name mentions) are denoted as the question entities Q = {q1, q2, ..., qm}, where C, Q ⊆ T and C ∩ Q = ∅.
3.4.2 Ontology-guided subgraph extraction
For each candidate entity of a name mention, a subgraph linking the candidate entity and the question entities is retrieved from the ontology graph. To reduce the noisy nodes that cause overfitting and heavy computation, this study proposes an ontology-guided subgraph retrieval method that extracts informative nodes through specified edges. As shown in Fig. 3, the method consists of four steps.
(a) For every topic entity node, its neighbor nodes in the OG are retrieved, except for its instances.
(b) Up to 3 higher-order superclasses of the entities are retrieved via "subClassOf" edges.
(c) Nodes linked to the nodes newly extracted in (b) are retrieved via edges that are neither "subClassOf" nor "type".
(d) Up to 3 higher-order superclass nodes of the nodes newly extracted in (c) are extracted.
Fig. 3. Ontology-guided subgraph extraction for a single entity node.
Finally, all retrieved nodes for the topic entities are gathered, and the edges between these nodes are retrieved to generate an OG subgraph OGsub = (Vsub, Esub). The extra nodes in the OG subgraph that are not mentioned in the NLQ are denoted as Vextra.
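Under the assumption that the OG is held as an adjacency list of (relation, tail) pairs, steps (a)-(d) above can be sketched as follows. The entities are toy examples, and the real traversal also merges results across all topic entities and restores the edges between them:

```python
# Toy adjacency list: node -> [(relation, tail), ...]
EDGES = {
    "IfcWall": [("subClassOf", "IfcElement"), ("hasProperty", "LoadBearing")],
    "IfcElement": [("subClassOf", "IfcProduct"),
                   ("isContainedIn", "IfcSpatialElement")],
    "IfcSpatialElement": [("subClassOf", "IfcProduct")],
}

def out_edges(node):
    return EDGES.get(node, [])

def super_classes(node, max_hops=3):
    # Follow "subClassOf" edges upward for at most max_hops levels.
    chain, cur = [], node
    for _ in range(max_hops):
        nxt = [t for r, t in out_edges(cur) if r == "subClassOf"]
        if not nxt:
            break
        cur = nxt[0]
        chain.append(cur)
    return chain

def extract_subgraph(topic):
    nodes = {topic}
    # (a) direct neighbours, excluding instances ("type" edges)
    step_a = [t for r, t in out_edges(topic) if r != "type"]
    nodes.update(step_a)
    # (b) up to 3 superclasses via "subClassOf"
    step_b = super_classes(topic)
    nodes.update(step_b)
    # (c) nodes linked to step-(b) nodes via edges other than subClassOf/type
    step_c = [t for n in step_b for r, t in out_edges(n)
              if r not in ("subClassOf", "type")]
    nodes.update(step_c)
    # (d) up to 3 superclasses of the step-(c) nodes
    for n in step_c:
        nodes.update(super_classes(n))
    return nodes

print(sorted(extract_subgraph("IfcWall")))
```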
3.4.3 Graph neural network for entity linking
The GNN-EL model for linking mentions to IFC entities is based on the relational graph attention network (RGAT) [71,72], which uses self-attention mechanisms to iteratively compute node representations by attending over each node's relation-aware neighbors. Recently, RGATs have been widely applied to various language understanding problems, such as Text-to-SQL [73], sentiment analysis [74], and question-answering systems [58]. Inspired by these works, GNN-EL adapts the RGAT network structure to choose the most appropriate IFC entity from among all the candidate entities when resolving ambiguous name mentions.

An overview of the GNN-EL model is presented in Fig. 4. Given an NLQ and its ambiguous name mentions ("roof" in this example), the GNN module systematically disambiguates each name mention by predicting the probability of every candidate entity. The probability score of a candidate c is derived by joint reasoning over the NLQ context and the OG subgraph. Owing to the different modalities of the NLQ text and the OG, a working graph is created that unifies the representation of both sources of information. The representations of the nodes in the working graph are iteratively updated through several rounds of message passing. Finally, the features of the context node and the pooled working graph are concatenated and fed into a multilayer perceptron (MLP), which outputs the predicted probability score of the candidate entity.
Fig. 4. GNN-based approach to linking ambiguous name mentions to appropriate IFC entities. Gray edges in the graph denote the original edges in the OG, such as "subClassOf" and "isContainedIn".
3.4.3.1 Working graph construction for joint representation
The NLQ is first encoded into a vector representation using language models (LMs). In this study, RoBERTa (Robustly Optimized BERT Pretraining Approach) [75] was adopted due to its superior performance. The embedding of the [CLS] token in the BERT output was utilized as the vector representation of the NLQ context.

The NLQ is then connected to the OG subgraph by injecting a context node that represents the NLQ context (the orange node in Fig. 4). This context node is linked to the candidate nodes and the question nodes with new relationship types that specify the type of the connected node, and with position edges that represent the position of the mentioned entities in the NLQ sentence.
To encode the position information of entities in an NLQ context, the relative position ratios of entities are first calculated by dividing the start and end indexes of the entities by the total length of the NLQ. Next, a distance bin method [62] is applied to map each position ratio into one of ten bins within the interval [0, 1] (see "Position bins" in Fig. 5). Each i-th bin represents a distinct positional edge type used to connect the context node and entity nodes.
Analogous to position edges, our method also encodes the distance between topic entities in NLQs as typed edges. The absolute difference between the start indexes of each pair of topic nodes is computed and divided by the sentence length to obtain a distance ratio. A distance edge of type j is then added between the topic nodes, where j depends on which distance bin the ratio falls into. The numbers of intervals for position edges and distance edges are determined based on a fine-tuning experiment. Fig. 5 provides an example of both edge types for the nodes "base offset" and "BuildingStorey".
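The bin mapping described above can be sketched as follows. This is an illustrative implementation, not the paper's code: function names are assumptions, and a ratio of exactly 1.0 is clamped into the last bin.

```python
# Map an entity mention's position (and the distance between two mentions)
# in an NLQ into one of ten discrete bins over [0, 1].

def position_bin(start, end, sent_len, n_bins=10):
    """Return bin indices for the start/end position ratios of a mention."""
    start_ratio = start / sent_len
    end_ratio = end / sent_len
    # min(..., n_bins - 1) keeps a ratio of exactly 1.0 inside the last bin
    return (min(int(start_ratio * n_bins), n_bins - 1),
            min(int(end_ratio * n_bins), n_bins - 1))

def distance_bin(start_a, start_b, sent_len, n_bins=10):
    """Bin index for the distance between the start indexes of two topic entities."""
    ratio = abs(start_a - start_b) / sent_len
    return min(int(ratio * n_bins), n_bins - 1)
```

The returned bin index selects which typed position (or distance) edge is added to the working graph.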
Fig. 5. Examples of position edges and distance edges between the context node and the entity nodes "base offset" and "BuildingStorey".
As a result, a working graph G_w = (V, E) was obtained, which was then input into the GNNs to update the representations of the NLQ context and the OG subgraph.
3.4.3.2 Message passing in the GNN architecture
As shown in Fig. 6, the next step after constructing the working graph is to perform multiple rounds of message passing in the RGAT network to update the representations of the nodes. The input is a set of node features h = {h_1, h_2, ..., h_N}, where h_i denotes the initial node embedding (dimension D = 100), which comes from a linear transformation of the ontology embeddings in Section 3.3 and the sentence vector of the NLQ context in Section 3.4.3.1; N denotes the total number of nodes in the working graph G_w.
The hidden state of each node is updated through L layers of message passing in RGAT. Specifically, the hidden state h_t^(l+1) of a target node t at the (l+1)-th layer is computed as follows:

q_t = W_q [h_t^(l) || u_t],   k_s = W_k [h_s^(l) || u_s || r_st]   (1)

α_st = Softmax_{s ∈ N_t} ( q_t^T k_s / √(D/K) )   (2)

m_st = W_m (h_s^(l) || u_s || r_st)   (3)

a_t = ||_{k=1..K} Σ_{s ∈ N_t} α_st^k m_st^k   (4)

h_t^(l+1) = MLP( a_t + h_t^(l) W )   (5)

where m_st represents the message passed from a relation-aware neighboring node s (a source node whose hidden state is h_s^(l) at the l-th layer) to the target node t, and α_st is an attention weight that indicates the importance of the message from s to t; || denotes vector concatenation; the matrices W_q, W_k, W_m, and W are learnable parameters; K is the number of heads of the multi-head attention [76] used to stabilize the learning process; N_t denotes the receptive field of node t; Softmax represents the Softmax function operating over N_t; and MLP denotes a 2-layer multilayer perceptron [58] with batch normalization [77]. Furthermore, both the message m_st and the attention α_st incorporate the node type embeddings (u_s, u_t) and the edge embedding (r_st), where the node type embeddings come from a linear transformation of one-hot vectors over the |T| node types. Herein, |T| is 4, representing the four node types: question nodes, candidate nodes, extra nodes, and context nodes. The edge embedding, on the other hand, is computed as follows:

r_st = MLP( e_st || u_s || u_t )   (6)

where e_st ∈ {0,1}^|R| is a one-hot vector that symbolizes the type of the relationship between s and t.

Fig. 6. GNN-EL architecture for structured reasoning over working graph.
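A single-head, NumPy-only sketch of the relation-aware attention step in Eqs. (1)-(5) is given below. It is an illustration, not the paper's implementation: the output MLP, batch normalization, and multi-head concatenation are omitted, and all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # node feature dimension (100 in the paper; small here for illustration)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rgat_layer(h, u, r, edges, Wq, Wk, Wm):
    """One simplified single-head relation-aware attention step.
    h: (N, D) node hidden states; u: (N, D) node-type embeddings;
    r: dict mapping (s, t) -> (D,) edge embedding; edges: list of (s, t)."""
    N = h.shape[0]
    h_new = h.copy()                                # residual term h_t^{(l)}
    for t in range(N):
        nbrs = [s for (s, tt) in edges if tt == t]  # receptive field N_t
        if not nbrs:
            continue
        q = Wq @ np.concatenate([h[t], u[t]])
        ks = np.stack([Wk @ np.concatenate([h[s], u[s], r[(s, t)]]) for s in nbrs])
        ms = np.stack([Wm @ np.concatenate([h[s], u[s], r[(s, t)]]) for s in nbrs])
        alpha = softmax(ks @ q / np.sqrt(D))        # attention over N_t
        h_new[t] = h[t] + alpha @ ms                # aggregate weighted messages
    return h_new
```

Stacking L such layers (with the omitted MLP between them) corresponds to the L rounds of message passing described above.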
3.4.3.3 Prediction and training
Given a candidate entity e for an ambiguous name mention in an NLQ, the probability of e being the intended entity is estimated as follows:

p(e | c) = MLP( Dropout( z_LM || h_ctx^(L) || g ) )   (7)

where z_LM is the LM representation of the NLQ text; h_ctx^(L) denotes the hidden state of the context node in the final RGAT layer; and g is a multi-head attention pooling over the hidden states of all nodes in the OG subgraph. The concatenated features are passed through a dropout layer [78] and a 2-layer MLP with layer normalization [79] before the final probability scores are output. The activation function adopts the GELU (Gaussian Error Linear Units) [80] function.
Since a probability score can be estimated for each candidate entity, the name mention is linked to the entity with the highest score. During the training phase, the graph data are obtained by converting NLQs and ontology graphs into working graphs. The model parameters are optimized by backpropagation. The loss function is the cross-entropy loss, defined as:

L(y, p) = − Σ_i y(e_i) log p(e_i)   (8)

where y(e_i) and p(e_i) are the ground truth and the estimated probability of candidate entity e_i being the referent entity in the NLQ context.
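At inference and training time, Eqs. (7)-(8) reduce to normalizing per-candidate scores into probabilities and computing a cross-entropy loss. The sketch below illustrates this step only; the scores themselves would come from the MLP in Eq. (7), and all names are assumptions.

```python
import numpy as np

def candidate_probs(scores):
    """Normalize per-candidate scores into a probability distribution;
    the linked entity is the candidate with the highest probability."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

def cross_entropy(y_true, probs, eps=1e-12):
    """Eq. (8): -sum_i y(e_i) * log p(e_i) over the candidate entities."""
    return -np.sum(y_true * np.log(probs + eps))
```

For example, scores of [2.0, 0.5, -1.0] link the mention to the first candidate, and a one-hot ground truth on that candidate yields a small loss.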
3.5 Relational path extraction for second-stage semantic parsing
After the IFC entities in NLQs are resolved, they are passed on to the second-stage model for relation extraction (RE). Prior to RE, nearby entities that refer to the same object instance (e.g., the IfcSpace class entity and the space instance entity "S101" for "room S101") are merged into one entity, as their relationship is deterministic.
Based on the scope in Section 3.1, this study considers a total of 36 types of directed relationships in the following aspects:
(a) dependency: if two entities in an NLQ do not have any connection, they have a "No relation" relationship.
(b) logical relationship: a logical disjunction relationship (inclusive Logic-OR) between two entities. The entities on both sides can be data values (e.g., "list objects with phase created at construction phase or operation phase"), object instances (e.g., "find walls at level 1 or level 2"), or object attributes (e.g., "retrieve walls whose area is less than 8 and length less than 4 m").
(c) semantic relationships between contextual objects, such as containment and spatial adjacency.
(d) attribute value constraints that require relationships like "hasProperty" for object entities and "hasPropertyValue" for property entities.
Each relationship can be mapped to an object property in the INLE ontology. Given an NLQ and a pair of a head entity e_head and a tail entity e_tail, a GNN model first predicts the major relationship r_major for (e_head, e_tail) from among the 36 relationships in Section 3.5.1. Afterward, supplementary relationships are sought to trace an entire relational path between the entities in Section 3.5.2. The full list of relationships and definitions is presented in Appendix I.
3.5.1 Graph neural networks for link prediction
In contrast to the first-stage GNN-EL model, which uses a context node to bridge the NLQ context and the ontology graph, the second-stage GNN-based relation extraction (GNN-RE) model uses a DP graph to represent NLQ contextual information, which more accurately captures the dependencies between entities in an NLQ. A DP graph comes from dependency parsing, which analyzes the syntactic structure of a sentence and establishes grammatical relations between words [81]. The resulting DP graph, as shown in Fig. 7, consists of a set of nodes representing sentence tokens and edges denoting the dependency relations between head and dependent. Additionally, extra edges are appended between neighboring tokens to encode the sequential information of words into the DP graph. The resulting DP graph is concatenated with the OG subgraph to form a new working graph, which is then passed through GNNs to predict the edge types (relationships) between every pair of head and tail entities in the NLQ context.
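The DP-graph construction can be sketched as follows. The paper obtains dependency arcs from the SuPar parser and manipulates graphs with NetworkX; this minimal sketch takes the arcs as given and uses plain dictionaries, so all names are illustrative.

```python
def build_dp_graph(tokens, arcs):
    """Build a dependency-parse (DP) graph: nodes are token indices,
    directed edges are head -> dependent relations, plus sequential
    edges between neighboring tokens to keep word order.
    `arcs` is a list of (head_idx, dep_idx, label) with 0-based indices."""
    edges = {}
    for head, dep, label in arcs:
        edges[(head, dep)] = label
    for i in range(len(tokens) - 1):
        # sequential word-order edge; an existing dependency arc takes priority
        edges.setdefault((i, i + 1), "next")
    return {"tokens": list(tokens), "edges": edges}
```

For instance, the tokens ["show", "the", "walls"] with arcs (0, 2, "obj") and (2, 1, "det") yield a graph with two dependency edges and two sequential edges.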
3.5.1.1 Working graph construction
For each entity pair (e_head, e_tail), a working graph is constructed that combines the OG subgraph and the DP graph. Here, the disambiguated entities are used to extract a new OG subgraph using the same method as shown in Section 3.4.2.
The strategy for constructing the working graph is as follows. As shown in Fig. 8, the head node and tail node are linked to their corresponding tokens with dedicated relation types. Since the content between the name mentions of two entities is informative for inferring relationships, the tokens sandwiched between the two mentions are linked to the head and tail entities with their own relation types (see the red and blue dotted lines in Fig. 8). In addition, the other extracted entities, termed question nodes, are linked to their mapped tokens in the NLQs with a question-node relation type.

Fig. 7. The DP graph of an NLQ, representing the grammatical relations between tokens.
3.5.1.2 GNN architecture for relation classification
As shown in Fig. 8, the proposed GNN-RE model is a link prediction model that predicts the type of relationship between the head and tail nodes. In general, the architecture of GNN-RE is close to that of GNN-EL, but it differs in the working graph structure, the node types, and the final output layer.
(a) Initial embeddings of nodes in the working graph
In the GNN-RE model, the initial node embeddings that represent the NLQ context utilize the embeddings of each word in the NLQ from the RoBERTa outputs, noted as {w_1, w_2, ..., w_n}. All 768-dimensional word vectors are converted into 100-dimensional vectors through a linear transformation. The initial hidden states of entity nodes in the OG subgraph are the same as in the first-stage setup.
(b) Node types
In contrast to GNN-EL, the GNN-RE model has five node types: token node, head entity node, tail entity node, question entity node, and extra entity node.
(c) Output layer for multi-class classification
The hidden states of all nodes in the working graph are updated through an L-layer RGAT. After message passing, the hidden states of the head entity node and the tail entity node at the last RGAT layer, noted as h_head^(L) and h_tail^(L), are extracted. The two vectors are then concatenated and passed through a 2-layer MLP with layer normalization (see Fig. 8). The output size is the number of relation types (36), which gives the probability score of each relation type for the entity pair (e_head, e_tail). Finally, the multi-class classifier predicts the edge by choosing the relationship with the highest score.
In the training data, each entity pair is labelled with one ground-truth relationship. If multi-hop relationships exist between entities, the major relationship is adopted. Herein, the major relationship is defined as the mainly intended relationship in the NLQ, as opposed to the supplementary relationships that arise from the lack of a complete mention of variables. Consider the NLQ example in Fig. 8: the major relationship between the second "base offset" entity and "184944" is "isPropertyOf", which implies that the "base offset" property is possessed by an object instance. In contrast, "hasTag" is a supplementary relationship that arises from the missing mention of the wall entity. In this case, the GNN-based classifier is responsible for predicting the major relationships between entities. The model parameters are optimized using the cross-entropy loss.
3.5.2 Extra relation finding for multi-hop relational paths
Once the major relationship between entities has been found by the GNN-RE model, the connectivity between the relationship and the head/tail entity nodes in the ontology graph must be checked to identify whether supplementary relationships are needed. If so, the extra relationships are extracted based on the ontology.
3.5.2.1 Connectivity checking
The connectivity between a relationship and an entity node depends on whether the entity is an instance of, or is inherited from, the domain/range of the relationship. In the semantic web, domain and range assert that the subjects and objects of object property statements must belong to the class extension of the indicated class description [82]. Therefore, it is important to check whether the class restrictions are violated, to prevent invalid output queries.
Given the major relationship for an entity pair, the connectivity to the head and tail entities is checked separately. Specifically, the head entity is compared with the relationship's domain class, and the tail entity is compared with the range class. As shown in Fig. 9, the head entity "base offset" is connected to r_major, whereas the tail entity "184944" is disconnected because "Tag" is not inherited from the range class "Object" in the ontology.

Fig. 9. Process of supplementary relationship extraction.
3.5.2.2 Supplementary relationship extraction
Following the connectivity checking, the disconnected side(s) of r_major must acquire supplementary relationships. In this study, only one extra relationship is taken for each disconnected side, because the experimental results showed that this was enough to process the multi-hop relationships in the NLQs.
The extraction process starts by iterating over all relationships with their domain and range classes. The connectivity of each candidate supplementary relationship is evaluated as follows:
(a) For a disconnected head entity, the supplementary relationship should connect the head entity and the domain class of the major relationship.
(b) For a disconnected tail entity, the supplementary relationship should connect the tail entity and the range class of the major relationship.
Finally, the candidate relationship is selected if it passes connectivity checking on both sides. As shown in Fig. 9, "hasTag" is chosen as the supplementary relationship because it connects with "Object" (the range of r_major) and "184944" (an instance of the "Tag" entity). If more than one candidate relationship is returned, a graph distance score [83] is employed to prioritize relationships.
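The connectivity check and one-hop supplementary-relation search can be sketched over a toy ontology mirroring the Fig. 9 example. The class hierarchy and relation table below are assumptions for illustration, not the paper's data structures.

```python
# child class -> parent class (self-mapping marks a hierarchy root)
SUBCLASS = {"Wall": "Object", "Object": "Object", "Tag": "Tag",
            "PropertySingleValue": "PropertySingleValue"}

# relation -> (domain class, range class)
RELATIONS = {
    "isPropertyOf": ("PropertySingleValue", "Object"),
    "hasTag": ("Object", "Tag"),
}

def is_instance_of(entity_cls, target_cls):
    """True if entity_cls equals target_cls or is inherited from it."""
    seen, cls = set(), entity_cls
    while cls not in seen:
        if cls == target_cls:
            return True
        seen.add(cls)
        cls = SUBCLASS.get(cls, cls)
    return False

def find_supplementary(major_rel, head_cls, tail_cls):
    """For each disconnected side of the major relationship, search for one
    extra relationship that bridges the gap (one hop only)."""
    domain, rng_cls = RELATIONS[major_rel]
    result = {}
    if not is_instance_of(head_cls, domain):        # head side disconnected
        for rel, (d, r) in RELATIONS.items():
            if is_instance_of(head_cls, d) and is_instance_of(r, domain):
                result["head"] = rel
                break
    if not is_instance_of(tail_cls, rng_cls):       # tail side disconnected
        for rel, (d, r) in RELATIONS.items():
            if is_instance_of(d, rng_cls) and is_instance_of(tail_cls, r):
                result["tail"] = rel
                break
    return result
```

Running this on the Fig. 9 configuration ("isPropertyOf" between a PropertySingleValue head and a Tag tail) selects "hasTag" as the tail-side supplementary relationship.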
3.6 Automatic BIM query language generation
Once all the entities and relational paths are extracted from the NLQ, the graph-based logical form (see Fig. 10) can be derived and easily transformed into different types of query languages for retrieving IFC-based BIM models. This study employs the template-based method [11] to automatically generate SPARQL queries by filling the slots of prepared query templates with the identified variables, classes, instances, object properties, and data values. The resulting queries are executed by the BIMSPARQL framework [84] to retrieve ifcOWL-based BIM model instances. The ifcOWL format is used to represent IFC models because of its recent popularity [85], and BIMSPARQL provides extension functions to load and process ifcOWL instances.
Fig. 10 shows an example of the resulting SPARQL query. All recognized entities are regarded as variables with the asserted classes. The extracted semantic relationships are replaced with the corresponding SPIN (SPARQL Inference Notation) inference rules [49] from BIMSPARQL [84] and NLQ4BIM [11].
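A minimal illustration of the template-filling idea follows. The template, prefixes, and property name are simplified placeholders, not the actual BIMSPARQL or NLQ4BIM templates.

```python
# A one-slot-family SPARQL template; real templates would carry prefix
# declarations and SPIN-rule-backed predicates.
TEMPLATE = """SELECT ?{var} WHERE {{
  ?{var} a {cls} .
  ?{var} {prop} {value} .
}}"""

def fill_template(var, cls, prop, value):
    """Fill the slots of the template with the extracted variable, class,
    object property, and data value."""
    return TEMPLATE.format(var=var, cls=cls, prop=prop, value=value)

# hypothetical property/prefix names for illustration only
query = fill_template("wall", "ifc:IfcWall", "ex:hasBaseOffset", '"0.052"')
```

The filled string is then executable against an ifcOWL model once the appropriate prefixes and inference rules are in place.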
4. Approach evaluation and validation
This research assesses the efficiency and effectiveness of the proposed approach in two aspects. First, a laboratory experiment is conducted for performance evaluation in comparison with SOTA models. Second, a case study is carried out in a real-world project to validate the practicability of the approach. In the following parts, Section 4.1 introduces the implementation details, Sections 4.2 and 4.3 illustrate the experiment design and test results, respectively, and Section 4.4 demonstrates the real-world case.
4.1 Implementation
Our study implemented all algorithms in Python (ML model development) and Java (ontology and BIM model processing). The BIM models in the IFC2×3 TC1 specification [86] were converted into ifcOWL instances [87] in RDF format for ontology population and SPARQL query execution, based on the Apache Jena framework [88]. The INLE ontology was converted into graph data using the RDFlib toolkit [89] and was further manipulated using the NetworkX package [90] for graph simplification. For text processing, the SuPar toolkit [91] was used to carry out dependency parsing. Finally, all GNN models were built with the PyTorch library [92]. The performance evaluation of the EL and RE models was based on the Scikit-learn library [93].
Fig. 10. The automatically generated SPARQL query.
4.2 Experiment design
To evaluate the proposed method for BIM model retrieval, our study extends the BIM-NLQ dataset published in [11,29]. The scale of the dataset was enlarged from around 200 queries to 786 queries over five architectural and structural BIM models. The newly added NLQs were manually created by the first and sixth authors, with an emphasis on ambiguous entities and multi-hop relations. A comparison with the only large publicly available BIM-NLQ dataset, the iBISDS dataset [53], is outlined in Table 1.
Table 1. Description of the developed dataset.

| Aspect | iBISDS dataset | Our dataset |
| Number of mentioned variables | 1-3 variables per query, with an average of 2.57 | 1-8 variables per query, with an average of 3.02 |
| Data annotation | Entity type and question type | Mentioned entities in ontology, relationships, and SPARQL queries |
| Logical connections | Not addressed | Logical conjunction and disjunction |
| Semantic relations | Not addressed | 34 types of semantic relationships |
| Attribute value restrictions | Not addressed | Literal, quantitative, and Boolean value restrictions |
The dataset was split into a training set, a development set, and a test set at a ratio of 8:1:1. The training set was used to train the ML models; the development set was used to tune model parameters; and the test set was used to evaluate the final performance. As shown in Fig. 11, since the proposed method consists of two GNN models, the raw BIM-NLQ dataset must be further transformed into two datasets. Thereafter, the models trained on these two datasets were put together to form a holistic two-stage semantic parsing model.
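The 8:1:1 split can be sketched as follows (a minimal illustration; the function name and seed are assumptions, and the paper does not state its exact splitting procedure):

```python
import random

def split_dataset(examples, seed=42):
    """Shuffle and split examples into train/dev/test at an 8:1:1 ratio."""
    ex = list(examples)
    random.Random(seed).shuffle(ex)
    n = len(ex)
    n_dev = n // 10
    n_test = n // 10
    train = ex[: n - n_dev - n_test]
    dev = ex[n - n_dev - n_test : n - n_test]
    test = ex[n - n_test :]
    return train, dev, test
```

On the 786 queries this yields 630/78/78 examples for training, development, and testing.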
In the following sections, Sections 4.2.1 and 4.2.2 introduce the preprocessing for deriving the EL and RE datasets. Section 4.3.1 first reports the overall performance of the proposed two-stage SP method on the test set of the BIM-NLQ dataset. Afterward, Sections 4.3.2 and 4.3.3 demonstrate the performance of the two stages, respectively.
4.2.1 Dataset preprocessing for first-stage entity linking
In the EL dataset, each NLQ has an ambiguous name mention, a set of candidate entities, and a ground-truth entity. To obtain this, the NLQs were processed via the candidate generation programs introduced in Section 3.4.1. Although a total of 911 ambiguous name mentions were found in the 786 queries, this scale of dataset was too small to train GNN models. Therefore, additional training data were synthesized using the following strategy: for each NLQ without ambiguous name mentions, one grounded entity was randomly selected, and three other entities were randomly extracted from the INLE ontology; these four entities formed the candidate set of the corresponding name mention. As a result, the final EL dataset was expanded to 1116 data examples, and Table 2 shows their statistical properties.
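The synthesis strategy can be sketched as follows (illustrative names; the sampling seed and shuffle are assumptions):

```python
import random

def synthesize_candidates(grounded_entity, ontology_entities, k=3, seed=0):
    """Pair an NLQ's grounded entity with k entities drawn at random from
    the ontology to form a 4-way candidate set for a synthetic EL example."""
    pool = [e for e in ontology_entities if e != grounded_entity]
    rng = random.Random(seed)
    candidates = rng.sample(pool, k) + [grounded_entity]
    rng.shuffle(candidates)  # hide the position of the ground-truth entity
    return candidates
```

Each synthetic example keeps the grounded entity as the ground-truth label among the four candidates.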
Table 2. Statistical properties of the EL dataset. "Avg." denotes "average"; "Num." denotes "number".

| Dataset division | Size | Avg. word count | Avg. Num. of candidate nodes (excluding synthesized data) | Avg. Num. of question nodes | Avg. Num. of extra nodes |
| Training set | 893 | 11.59 | 2.34 | 3.19 | 30.99 |
| Development set | 111 | 13.14 | 2.22 | 3.92 | 30.44 |
| Test set | 112 | 12.12 | 2.48 | 3.67 | 31.39 |
4.2.2 Dataset preprocessing for second-stage relation extraction
In the RE dataset, each data example has a head entity, a tail entity, and an annotated relationship. By pairing all entities in the NLQ dataset, the RE dataset contained 2113 training examples, 252 development examples, and 271 test examples. Nevertheless, an imbalanced class distribution was observed in the dataset: among the 36 classes, 13 classes had fewer than 10 training examples, which would bias a multi-class classifier towards the major classes. To alleviate this problem, this study applies an oversampling technique [94] that randomly replicates examples of the minority classes until each has at least 100. A synonym dictionary was prepared to replace words in the duplicated data with alternative expressions. Consequently, the training set of the RE dataset was expanded with 2564 new examples. The statistical properties of the resulting RE dataset are shown in Table 3.

Fig. 11. Experiment design.
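The oversampling step can be sketched as follows; the function names and synonym dictionary are illustrative, not the paper's hand-prepared resources:

```python
import random
from collections import Counter

def oversample(examples, labels, synonyms, target=100, seed=0):
    """Randomly replicate minority-class examples until every class has at
    least `target` examples, swapping words for synonyms in the duplicates.
    `synonyms` maps a word to its alternative expressions."""
    rng = random.Random(seed)
    out = list(zip(examples, labels))
    counts = Counter(labels)
    by_label = {}
    for x, y in zip(examples, labels):
        by_label.setdefault(y, []).append(x)
    for y in list(counts):
        while counts[y] < target:
            text = rng.choice(by_label[y])
            # replace each word with a random synonym where one exists
            words = [rng.choice(synonyms.get(w, [w])) for w in text.split()]
            out.append((" ".join(words), y))
            counts[y] += 1
    return out
```

Majority classes are left untouched, while each minority class is padded with paraphrased duplicates up to the target count.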
Table 3. Statistical properties of the RE dataset. "Avg." denotes "average"; "Num." denotes "number".

| Dataset division | Size | Avg. word count | Avg. Num. of head-tail entity pairs | Avg. Num. of question nodes | Avg. Num. of extra nodes |
| Training set | 4667 | 11.59 | 1 | 1.49 | 31.03 |
| Development set | 252 | 13.14 | 1 | 1.95 | 30.06 |
| Test set | 271 | 12.12 | 1 | 2.24 | 29.87 |
4.2.3 Training details
This study sets the dimension of node embeddings to 100 and the number of RGAT layers (L) to 5 for both GNN models. Rectified Adam (RAdam) [95] was used as the optimizer, with a batch size of 32 and learning rates of 2e-5 and 1e-3 for the LM (RoBERTa) and GNN modules, respectively. The maximum number of epochs is 200, and training terminates when there has been no performance improvement over the past 50 epochs. All hyperparameters were determined based on tuning experiments.
4.3 Test results
4.3.1 The accuracy of the overall query results
This part evaluates the overall performance of the integrated two-stage SP method on the test set of the BIM-NLQ dataset. The trained GNN-EL and GNN-RE models are combined into one program to parse the input NLQs. For value restrictions in NLQs, the numerical and Boolean data values are extracted based on [11] due to its perfect performance.
The accuracy of the semantic parsing depends on whether the resulting SPARQL query is valid and the correct retrieval results can be obtained, which are manually checked against ground-truth answers. NLQ4BIM [11] is chosen as the comparison method owing to its suitable functional scope for the dataset and its high performance.
Table 4. Accuracy of the overall query results. "QResult" stands for query result.

| Model | Correct QResult | Accuracy |
| GSP4BIM (proposed method) | 64 | 81.01% |
| NLQ4BIM [11] | 49 | 62.03% |
The overall SP results are shown in Table 4. The total accuracy of the proposed SP method was 81.01% over 79 queries. By contrast, NLQ4BIM achieved only 62.03% accuracy, due to the frequent occurrence of ambiguous mentions and multi-hop relationships. Table 5 presents several qualitative results that compare our method with NLQ4BIM. In the first sample, GSP4BIM correctly identified "level 2" and "up to level:Roof" as property values of "base constraint" and "top constraint", whereas NLQ4BIM superficially treated them as IfcBuildingStorey and IfcRoof, which resulted in an invalid query. This shows the strength of GSP4BIM in value restriction extraction, owing to its disambiguation function. In the second sample, GSP4BIM could extract the two-hop relationship between "ceiling" and "TotalThickness", where "ceiling" is a predefined type of IfcCovering that has the property "TotalThickness". In contrast, NLQ4BIM cannot predict more than one relationship between entities because of its rigid rule-based principle.
Table 5. Sample predictions from our method and the comparison method. The recognized entities (colored) and relationships (directional connectors) between entities are presented.

| NLQ sentence | NLQ4BIM [11] | GSP4BIM (ours) | Ground truth |
| Return walls with a base constraint of Level 2 and a top constraint of up to level: Roof. | 1(a) | 1(b) | 1(c) |
| Get all ceilings with a TotalThickness equal to 0.052. | 2(a) | 2(b) | 2(c) |
| Get curtain walls composed of mullions with a span larger than the span of 1227576. | 3(a) | 3(b) | 3(c) |
| Show me the levels that have Floor:Hangover Shading_100mm. | 4(a) | 4(b) | 4(c) |
All errors in GSP4BIM stem from the EL and RE stages. In the first stage, it was found that the GNN-EL model makes mistakes in distinguishing between literal data values and class entities. For example, one error occurs in disambiguating the mention "roof" in the NLQ "the beams with reference level to be roof". While the ground-truth answer is a literal value of the property "reference level", the GNN-EL model wrongly classified it as an IfcRoof entity. In the second stage, it was observed that the GNN-RE model occasionally predicts supplementary relationships instead of major relationships, meaning that the full query path cannot be traced. As shown in Table 5.4(a), the GNN-RE model returns "hasObjectType" between "levels" and "Floor:Hangover Shading_100mm". In this example, "hasContainedProduct" is the major relationship, while "hasObjectType" is the supplementary relationship. However, the former cannot be extracted because the latter has already passed connectivity checking.
Provided that the major relationships were correctly classified by GNN-RE, there were no errors in extracting the remaining supplementary relationships. The performance of the two proposed GNN models is further elaborated in the next two sections.
4.3.2 Performance of first-stage entity linking
As illustrated in Fig. 11, the performance of the GNN-EL model was tested on 111 data examples. The standard metrics of micro accuracy and macro accuracy from [96] are adopted to evaluate EL models, which are defined as:

Micro accuracy = NumCorrect / NumMentions   (9)

Macro accuracy = (1 / NumEntities) Σ_e ( NumCorrect(e) / NumMentions(e) )   (10)

where NumMentions, NumEntities, and NumCorrect denote the total number of mentions, the total number of entities, and the total number of correct predictions, respectively, and NumCorrect(e) and NumMentions(e) are counted per ground-truth entity e. The second metric measures EL accuracy averaged over all entities.
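Eqs. (9)-(10) can be computed in a few lines of Python (an illustrative sketch; names are assumptions):

```python
from collections import defaultdict

def micro_macro_accuracy(preds, golds):
    """Micro accuracy over all mentions (Eq. 9) and macro accuracy
    averaged per ground-truth entity (Eq. 10)."""
    micro = sum(p == g for p, g in zip(preds, golds)) / len(golds)
    per_entity = defaultdict(lambda: [0, 0])   # entity -> [correct, total]
    for p, g in zip(preds, golds):
        per_entity[g][1] += 1
        per_entity[g][0] += int(p == g)
    macro = sum(c / t for c, t in per_entity.values()) / len(per_entity)
    return micro, macro
```

Macro accuracy weights rare entities equally with frequent ones, which is why it falls below micro accuracy on imbalanced data.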
Most SOTA EL models mainly work on document-level texts, or leverage Wikipedia's knowledge base [28] and knowledge graph [97] as supplementary data sources. Hence, there is a technical gap in applying these models to entity linking in Text-to-BIMQL. This study selects two baseline models that are compatible with our dataset:
(a) Baseline 1: a generative entity-mention model [98] that leverages entity popularity knowledge, name knowledge, and context knowledge to estimate the likelihood. The EL training set serves as the resource to train the probabilistic models.
(b) Baseline 2: an ontology-based EL model [99] that uses rich semantic information and ontology structures to rank entities. The INLE ontology was used as the source ontology for calculating the entropy scores of candidate entities.
The test results are presented in Table 6. Our GNN-EL model achieved a micro accuracy of 91.45% and a macro accuracy of 75.87%, significantly outperforming the baseline models by at least 29.06% in micro accuracy and at least 41.49% in macro accuracy. These results demonstrate the strength of the proposed GNN model in handling EL tasks in this domain-specific context. Moreover, because the expanded EL dataset is still small, 10-fold cross validation was used to evaluate the GNN-EL model's generalization ability. The entire EL dataset was re-divided into 10 parts, and each part was used as the test set in one iteration, while the remaining parts were used for model training and convergence. As shown in Table 7, the average micro and macro accuracy over the 10 iterations are 89.21% and 74.6%, respectively, with a small variance (standard deviation < 5%). This indicates that the model generalizes well to new data.
Table 6. Micro and macro accuracy of the GNN-EL model and the baseline models.

| Model | Micro accuracy | Macro accuracy |
| Baseline 1 | 62.39% | 34.38% |
| Baseline 2 | 51.28% | 25.43% |
| GNN-EL (ours) | 91.45% | 75.87% |
Table 7. Results of 10-fold cross validation for evaluating the GNN-EL model.

| Iteration | Micro accuracy | Macro accuracy |
| 1 | 89.29% | 74.35% |
| 2 | 91.96% | 80.45% |
| 3 | 91.07% | 76.38% |
| 4 | 88.39% | 70.64% |
| 5 | 86.61% | 75.5% |
| 6 | 90.18% | 77.92% |
| 7 | 85.71% | 69.04% |
| 8 | 89.29% | 75.02% |
| 9 | 89.29% | 73.69% |
| 10 | 90.35% | 72.79% |
Although the generative entity-mention model performs well (around 80% accuracy) on large-scale common-domain datasets [98], it does not work well in our domain-specific EL task, because in common EL tasks the semantics of different candidate entities vary widely. By contrast, the candidate entities of ambiguous mentions in the BIM environment are semantically close and usually differ only in their data representations. For example, "level 2" could represent an IfcBuildingStorey instance or the data value of the property "base offset", depending on whether it is associated with property entities. This makes it difficult for traditional EL models to distinguish candidate entities based purely on the NLQ context. By contrast, our model captures the relevance between entities over the BIM ontological structure through graph learning.
Although the second baseline model [99] also utilizes ontologies, its original focus was on biomedical ontologies, so the model lacks sufficient adaptivity to the BIM ontology, which contains many schema-level constructs. By comparison, our supervised learning-based model can better capture the features of ontology structures because the node representations are updated according to the EL objective.
Table 6 shows that the macro accuracy is lower than the micro accuracy for both the GNN-EL and baseline models, a situation that can be attributed to the imbalanced dataset, in which some types of entities are much more abundant than others. To investigate whether our model was trained to simply prioritize more frequently occurring entities, the entity-mention model [98] was modified to use only the popularity score, which rewards the most frequently mentioned entities in the training set. The resulting micro and macro accuracy values were 82.05% and 53.5%, respectively, showing that GNN-EL does learn features of the minor types of entities for entity linking. Apart from the dataset problem, it was observed that the current GNN architecture is weak in capturing complex logical assertions in the ontology. An error occurs in "The floor where the column called Pile - 011 locates", where "floor" indicates IfcBuildingStorey but the prediction is an enumerated type of IfcSlab. Here, the machine incorrectly interprets "floor" as describing the type of the column, whereas IfcSlab and IfcColumn are disjoint in the ontology, so their types cannot be shared.
4.3.2.1 Ablation study
An ablation study was conducted to investigate the contributions of the components of the GNN-EL model, including extra nodes, context nodes, position edges, and distance edges. The performance of the model after the removal of different components is shown in Table 8. Without extra entity nodes, the model's micro and macro accuracy decline by 6.84% and 23.86%, respectively, demonstrating the importance of the ontology-guided OG subgraph extraction.
The influence of the LM is studied by removing the context node from the working graph, with the final output obtained by applying a 2-layer MLP to the final hidden state of the candidate entity node. Consequently, the micro and macro accuracy are reduced by 3.54% and 11.17%, respectively. The model performance without position edges and without distance edges also declines.
Table 8. Micro and macro accuracy of the GNN-EL model under different ablation setups.

| Model setup | Micro accuracy | Macro accuracy |
| GNN-EL (original) | 91.45% | 75.87% |
| Without extra nodes | 84.61% | 52.11% |
| Without LM | 88.03% | 64.7% |
| Without position edges | 90.6% | 75.2% |
| Without distance edges | 87.18% | 68.94% |
4.3.3 Performance of second-stage relation extraction
The GNN-RE model for the second-stage semantic parsing predicts the major relationships between entities. Its performance was evaluated on 271 data examples that specified head entities and tail entities in queries. Following [100], the metrics used to measure RE models are accuracy and macro F1 score:

Accuracy = NumCorrect / NumRelations   (11)

Macro F1 score = ( Σ_{i=1}^{NumClasses} F1 score(i) ) / NumClasses   (12)

where NumCorrect is the number of correctly predicted entity pairs; NumRelations represents the total number of entity pairs to be predicted; NumClasses stands for the number of relationship types. F1 score(i) denotes the F1 score for the classification of the i-th type of relationship in the test data.
Two SOTA models for RE tasks were adopted as baselines in this study:
Baseline 1: R-BERT [101], which leverages the pretrained BERT model and incorporates the target entities' information for relation classification.
Baseline 2: GNNs with Generated Parameters (GP-GNNs) [100], which also rely on GNNs to conduct relational reasoning over unstructured texts.
The test results are shown in Table 9. The proposed GNN-RE model yielded an accuracy of 90.04% and a macro F1 score of 78.84%, outperforming all the baseline models and demonstrating the superior performance of our model in relationship extraction for NL-based BIM model retrieval. Although GP-GNNs also employ a GNN architecture, they only model name mentions as nodes in the working graph, which lacks sufficient background information about the associated entities. By contrast, our GNN-RE model combines name mentions and entity nodes in a unified graph representation for joint inference, which is more effective because it makes the RE model aware of the semantics of the entities and how they interact in the ontology.
Table 9. Performance of our GNN-RE model and the baseline models.

Model             Accuracy    Macro F1 score
Baseline 1        86.71%      68.19%
Baseline 2        65.35%      53.61%
GNN-RE (ours)     90.04%      78.84%
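The unified graph representation described above can be illustrated with a minimal sketch; the node labels, edge kinds, and example triples below are hypothetical, not the paper's actual schema. One adjacency map holds dependency-parse edges between tokens, ontology edges between entities, and linking edges from mentions to their entities, so message passing can travel from the text into the ontology and back:

```python
from collections import defaultdict

def build_working_graph(dp_edges, ontology_edges, mention_links):
    """Merge token-level and entity-level edges into one undirected,
    edge-typed graph: node -> set of (neighbor, edge_kind)."""
    graph = defaultdict(set)
    for kind, edges in [("dp", dp_edges), ("onto", ontology_edges),
                        ("link", mention_links)]:
        for u, v in edges:
            graph[u].add((v, kind))
            graph[v].add((u, kind))
    return graph

# Hypothetical example for "the columns on the second floor".
dp_edges = [("tok:columns", "tok:on"), ("tok:on", "tok:floor")]
ontology_edges = [("ent:IfcColumn", "ent:IfcBuildingStorey")]
mention_links = [("tok:columns", "ent:IfcColumn"),
                 ("tok:floor", "ent:IfcBuildingStorey")]

g = build_working_graph(dp_edges, ontology_edges, mention_links)
```

With both edge sets in one graph, a GNN layer updating "tok:columns" receives messages from its linked entity node as well as its syntactic neighbors, which is the background information GP-GNNs lack.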
Like the test results of the EL models, the macro F1 scores of the three models are generally lower than their accuracy, which results from the imbalanced class distribution in the dataset. Table 9 shows that this is a pervasive problem that hampers all RE models.
Fig. 12 presents a confusion matrix that details the results of classifying the major relationships of entity pairs in the test dataset. As can be seen, the false positives (FPs) and false negatives (FNs) relevant to "No relation" account for most errors. This arises because the number of entity pairs labeled "No relation" in the test dataset is large. Moreover, when the NLQ sentence is long and contains many variables, the GNN-RE model is more error-prone in identifying the dependency relationships between distant entities. For example, an error occurs in identifying the relationship between "column" and "width" in the sentence "the columns that satisfy its gross volume < 0.3 cubic meter, depth > 200 mm, width < 300 mm". The model wrongly outputs "No relation" because of the many intermediate words between the entities. Another source of error stems from DP: a problematic DP graph leads to incorrect message passing between the nodes. Apart from the misidentification of "No relation", there are seldom FP and FN errors when identifying semantic relationships, which shows that the model makes accurate classifications for entity pairs that have confident dependencies. Typical errors include predicting "lessThan" when the correct answer is "largerThan" or "equalTo". These three relationships are confused because they have similar semantics regarding quantitative comparison. More importantly, the scarce training data (fewer than 10 examples) for the "less than" and "equal to" relations led to incomplete model training.
Fig. 12. Confusion matrix of multi-class classification for relationship extraction. The vertical axis and horizontal axis represent the ground-truth relationships and the predicted relationships, respectively. The classes that do not appear in the test set are not shown.
4.3.3.1 Ablation study
The ablation analysis is presented in Table 10. Without extracting extra entity nodes, the accuracy and macro F1 score of the model decrease by 5.54% and 9.84%, respectively. This again demonstrates the effectiveness of GNN-based reasoning over the BIM ontology graph. On the other hand, if the DP graph is replaced with a single context node in the working graph, the accuracy and macro F1 score drop to 72.69% and 36.74%, respectively. This dramatic decline illustrates the importance of combining the DP graph and the OG to reveal the dependencies between entities. In addition, without connecting the token nodes of the middle text segment to the head entity node and the tail entity node, the accuracy and macro F1 score drop to 87.82% and 75.49%, respectively. The degradation is not obvious because the messages of the middle text segment can also be passed through nearby token nodes, but the head and tail entity nodes can pass more significant messages.
Table 10. Ablation results: accuracy and macro F1 score of the GNN-RE model after removing individual components.

Model setup                             Accuracy    Macro F1 score
GNN-RE (origin)                         90.04%      78.84%
Without extra nodes                     84.50%      69.00%
Without DP graph                        72.69%      36.74%
Without head/tail entity connections    87.82%      75.49%
4.3.4 Computation cost
The computation cost is evaluated in terms of training time, loading time, and model inference time. All the models were trained and tested on a server with an AMD EPYC 7252 8-core processor (3.09 GHz), Windows Server 2019, and an NVIDIA A30 GPU.
Table 11. Computation time. "s" and "h" denote seconds and hours.

Category          Item                                                   Duration
Training time     Ontology embedding model training                      0.83 h
                  GNN-EL model training                                  3 h
                  GNN-RE model training                                  20 h
Loading time      Data loading for text, graph, and ontology embedding   20 s
                  Loading language models and GNN models                 21 s
Inference time    First-stage subgraph extraction                        0.75 s/graph
                  First-stage inference for entity linking               0.017 s/graph
                  Second-stage subgraph extraction                       0.6 s/graph
                  Second-stage inference for relation extraction         0.091 s/graph
                  Extra-relationship finding                             0.06 s
                  Standard SPARQL query generation                       5 s
Table 11 reports the computation time of each phase. Training an ontology embedding model with the populated INLE ontology takes approximately 0.83 h. The GNN-EL model converges in 3 h, with its optimal test accuracy occurring at the 8th epoch. The GNN-RE model takes around 20 h to reach convergence, with its best test accuracy occurring at the 72nd epoch. The latter takes more time because of more training data and a more complex working graph structure.
The total loading time of the text data and graph ML models for processing a single NLQ is around 41 s. Batch processing a group of NLQs can substantially reduce the average loading time. By contrast, the proposed approach is efficient in model inference. The average times for the GNN-EL model to extract an OG subgraph and to process a graph for each candidate entity are 0.75 s and 0.017 s, respectively. The total disambiguation time for a query depends on the number of ambiguous names and candidate entities. In the second stage, the average computation times of subgraph extraction and model inference are around 0.6 s and 0.091 s per graph (entity pair). Finally, it takes around 5 s to automatically generate a SPARQL query from the SP results. In comparison, the average loading time and inference time of the NLQ4BIM [11] system are 30 s and 25 s, respectively. Overall, the total processing times of the two systems are close.
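The batching argument can be made concrete with a small back-of-the-envelope calculation from the Table 11 figures, assuming for simplicity one graph per stage and the hypothetical batch size below:

```python
def per_query_time(load_s, infer_s, batch_size):
    """Average end-to-end seconds per NLQ when one model load
    serves a whole batch of queries."""
    return load_s / batch_size + infer_s

LOAD_S = 41.0  # loading data, language models, and GNN models (Table 11)
# Per-query pipeline: EL subgraph + EL inference + RE subgraph
# + RE inference + extra-relationship finding + SPARQL generation.
INFER_S = 0.75 + 0.017 + 0.6 + 0.091 + 0.06 + 5.0

single = per_query_time(LOAD_S, INFER_S, 1)    # one query pays the full load
batched = per_query_time(LOAD_S, INFER_S, 50)  # load amortized over 50 NLQs
```

Under these assumptions a lone query costs about 47.5 s, while amortizing the load over 50 queries brings the average below 7.5 s, dominated by SPARQL generation rather than model loading.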
4.4 Case study
To demonstrate the practicability of the proposed method, a case study was conducted on a real-world construction project, a public library building [102] located in Hong Kong, as shown in Fig. 13. The three-story building has a gross building area of around 600 m². The Revit BIM model used in the project coordination was collected, which integrates the architecture, structure, and MEP parts. In all, the library BIM model comprises 1776 building and facility components. Two project engineers, with backgrounds in architectural engineering and mechanical engineering, were asked to create NLQs based on their respective information needs. Based on the open-source BIMSPARQL-GUI framework [84], a web-based NLI prototype was developed, with the trained SP models deployed on the server. As shown in Fig. 14(a), the participants use the NLI by simply inputting NLQs into the left-side textbox. The retrieval results are then returned in the form of a table and a graphical representation (see Fig. 14(b)).
As presented in Table 12, a total of 10 NLQs were created by the participants. Most questions search for elements under several conditions, which is useful in various management and engineering tasks. For example, the 2nd query identifies walls with poor thermal insulation, and the 9th query diagnoses leaking pipe segments. There are also questions that count elements (the 4th and 5th queries) and return attributes of physical or spatial elements (the 8th query), both of which improve situational awareness of the building.
Table 12. Natural language queries generated by participants.

Query — Result
1. Search the mullion with a thickness of 175 mm. — True
2. Search the exterior wall with a thermal transmittance higher than 10 W/(m2*K). — True
3. Search the walls that contain 150 mm aluminum on the second floor. — True
4. Count the number of the glazed panels with a thickness of 25 mm on the first floor. — False
5. Count the number of risers of the stairs on the second floor. — True
6. Search the walls with a height greater than 400 mm on the UR floor. — True
7. Search the double flush panel doors on the ground floor whose material is glass. — True
8. Return the gross area of floor slabs on the ground floor. — True
9. Pipes with the system type of Supply Air whose friction pressure is lower than 5 pa/m. — False
10. Select the windows that have a width > 1 m and have the minimum offset on the first floor. — True
As a result, 8 out of 10 queries were correctly parsed and executed within 1–1.5 minutes. The performance of the GNN-EL model and the GNN-RE model is evaluated with the metrics employed in Section 4.3.2 and Section 4.3.3, respectively. In terms of EL, a total of 15 ambiguous mentions were processed, with a micro and macro accuracy of 93.33% and 95%, respectively. The error occurs in the 4th query when distinguishing whether "panel" refers to a property or IfcPlate, probably because of the scarce training data relevant to the plate entity.

Fig. 13. The case study of the library building: (a) the real photograph [102]; (b) the rendered BIM model.
The evaluation of the GNN-RE model is based on 37 head–tail entity pairs, arising from the entities correctly identified in the first stage. The resulting accuracy and macro F1 score are 94.59% and 96.6%, respectively. The error occurs in the 9th query when predicting relationships for "pipes", because the training data used to develop the GNN models in Section 4.2 do not cover any NLQs related to MEP concepts. This reveals a limitation of the GNN-based approach in processing NLQs with unseen domain concepts, which will be tackled by zero-shot learning [103] in future studies.
Compared with manually searching for objects in a multi-domain BIM model with thousands of elements, the NLI helps the engineers retrieve model information much more quickly. Both participants stated that they prefer to use an NLI to search for building elements when there are constraints on attributes or relationships. By contrast, programming languages are considered by the participants to be impractical for use in the project because of the tedious process and their lack of IT skills. Moreover, the proposed SP method efficiently interprets questions submitted by users, which conform to general expression habits but are often not in line with IFC semantics. For example, in the first query, "mullion" is an enumerated type of IfcMember, but users unaware of this would not phrase NLQs like "search mullion members". In this scenario, the proposed method successfully extracts the multi-hop relational path between entities even though IfcMember is missing from the query.
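The mullion example can be illustrated with a hypothetical SPARQL sketch; the prefix, class, and property IRIs below are placeholders in the style of ifcOWL, not the system's actual generated output. The point is that the query reaches mullions through IfcMember even though the word "member" never appears in the NLQ:

```python
# Hypothetical SPARQL for "Search the mullion with a thickness of 175 mm".
# The implicit entity hop (IfcMember) inferred by the parser is made explicit.
MULLION_QUERY = """
PREFIX ifc: <http://example.org/ifcOWL#>
SELECT ?member WHERE {
  ?member a ifc:IfcMember ;                # entity absent from the NLQ text
          ifc:predefinedType ifc:MULLION . # enumerated type mentioned as "mullion"
}
"""

# The multi-hop logical form surfaces IfcMember explicitly:
assert "IfcMember" in MULLION_QUERY and "MULLION" in MULLION_QUERY
```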
In summary, the assessment results of the proposed SP method in this case study are promising, with the two GNN models achieving over 90% on the different metrics. The result verifies the practicability of the graph ML-based approach in real-world projects, implying that practitioners can effectively retrieve BIM models by using an NLI that deploys the developed models.

Fig. 14. Web-based NLI for BIM model retrieval: (a) interface with the uploaded BIM model; (b) the retrieval results after inputting the second query in Table 12.
5. Discussion
5.1 Effectiveness of GNN-based Text-to-BIMQL semantic parsing
The task of transforming NL texts into common BIM query languages encounters two critical problems: name ambiguity and relational reasoning. Both require jointly examining the BIM ontology and the NLQ context to determine the entities and relationships referred to in BIM queries. The proposed GNN-based approach addresses these problems by fusing both kinds of information into a single graph representation for joint reasoning. The test results in Section 4.3 show that our method accurately parses text-based queries with various ambiguous name descriptions and complex constraints. They also show the benefits of GNN-based models for automatically learning the BIM ontology and estimating the logical form of unstructured texts.
By deploying the GNN models in NLIs or voice assistant systems, the proposed SP method can be effectively used to translate NLQs and retrieve BIM models. The retrieval results can be further presented as graphics or NL responses to address the different information needs of BIM users in construction projects.
5.2 Limitations and future works
While the research's achievements are promising, several limitations should also be noted.
(a) The GNN-EL model still makes mistakes on confusable entities. For example, the EL model often fails to distinguish whether "area" refers to a property or a space in the test data. A potential reason is that position edges and distance edges are not sufficient for capturing the position information of entities in an NLQ and the dependencies between entities, respectively. Consequently, the working graph for GNN-EL cannot effectively pass contextual information to the entity nodes. In future work, a better strategy for constructing working graphs and scoring subgraphs for candidate entities will be explored.
(b) Affected by the features of the head entity or tail entity, the GNN-RE model sometimes predicts supplementary relationships rather than major relationships between entities in multi-hop relational paths. Even though such a prediction is reasonable, it is difficult to conversely distinguish major relationships from supplementary relationships, since the latter often only describe the identification of objects. Future work will attempt to solve this problem by investigating multi-label classification that directly returns multi-hop relationships.
(c) This study mainly focuses on extracting entities and their relationships from NLQs, and pays less attention to code generation. Although the obtained logical form of an NLQ contains entities and relational paths, the proposed method cannot extract and represent more complex logic and operations in NLQs, such as negation (e.g., "return spaces that are not on the first floor") and summation (e.g., "Does the sum of the floor areas of Bath RM and Kitchen exceed 50 m²?"). How to identify these implicit operations and turn them into the logical forms of a query will be explored in the future.
(d) Since the method involves two GNN models, the total computation cost is higher than that of Seq2Seq models, which use one holistic neural network. Several factors slow down the computation. First, the large LMs are loaded twice in the two stages. Second, the total number of parameters in the two GNN models is large. Third, the OG subgraphs are re-extracted during the second stage. Future studies will explore a GNN architecture that can accomplish both tasks simultaneously. Also, more annotated data needs to be created so that deep learning-based SP models can be better trained, which would also make it possible to develop Seq2Seq models that directly output code.
6. Conclusion
As the building process becomes more complex, project practitioners need to quickly and flexibly compose ad hoc views and extract partial subsets of BIM models. Emerging natural language-based query interface systems have the potential to allow BIM users to retrieve BIM models in a time- and cost-efficient manner. However, the existing methods cannot reliably predict the logical forms of natural language queries that contain various user-specified conditions: name ambiguity and multi-hop relational path extraction are two formidable problems. Therefore, this study proposes a novel graph neural network-based semantic parsing method for NL-based BIM model retrieval. The method consists of two stages. In the first stage, the candidate entities in NLQs are recognized in an encoded ontological context. The ambiguous name mentions are then processed by the proposed GNN-based entity linking model to match the correct entities. GNN-EL conducts joint reasoning over the NLQ context and the ontology graph for each candidate entity. Having linked all mentions to ontological entities, the second stage extracts the relational paths between the entities in the NLQ. GNN-based link prediction is exploited to extract the relationships between each entity pair based on a heterogeneous graph that concatenates a dependency parsing graph and an ontology graph. Finally, the logical forms of NLQs are derived and transformed into standard SPARQL queries for retrieving BIM models.
The proposed approach was developed and evaluated on a new BIM-NLQ dataset containing 786 queries over five BIM models. The overall accuracy of semantic parsing was 81.01%, which outperformed the existing NL-based BIM query systems. Furthermore, a case study was carried out on a real-world building project. It was found that a natural language interface that deploys the developed models can be used by project engineers to retrieve BIM models with different constraint conditions.
The main contributions of this research are threefold.
(a) A new GNN-based entity-linking model is proposed to automatically align ambiguous name mentions in natural language texts with the BIM ontology.
(b) A novel GNN link prediction approach is presented that integrates ontologies to parse NL-based BIM queries by extracting multi-categorical relationships.
(c) Multi-hop relational paths between IFC entities in NLQs can be fully extracted to generate executable queries for retrieving BIM models.
Based on the above innovations, complex NLQs that include different constraint conditions can be posed to perform fine-grained queries over BIM models. Finally, the proposed method has several limitations. First, it cannot recognize complex logic (e.g., negation) and engage the relevant operators in queries. Second, the use of two GNN models incurs heavy computation costs. In future studies, an end-to-end SP architecture that handles the various tasks simultaneously will be devised to improve performance.
Acknowledgements 920
We sincerely thank Architectural Technology and Innovation Services Limited for providing us 921
with the data used in the case study. 922
References 926
[1] C.C.M.C. Eastman, C.C.M.C. Eastman, P. Teicholz, R. Sacks, K. Liston, BIM handbook: 927
A guide to building information modeling for owners, managers, designers, engineers and 928
contractors, 2nd ed., John Wiley & Sons, Hoboken, NJ, USA, (2011). ISBN: 0470541377. 929
[2] Y. Hu, D. Castro-Lacouture, C.M. Eastman, Holistic clash detection improvement using a 930
component dependent network in BIM projects, Automation in Construction. 105 (2019) 931
pp.102832. https://doi.org/10.1016/j.autcon.2019.102832. 932
[3] I. Motawa, A. Almarshad, A knowledge-based BIM system for building maintenance, 933
Automation in Construction. 29 (2013) pp. 173–182.
https://doi.org/10.1016/j.autcon.2012.09.008. 935
[4] buildingSMART International Ltd., Industry Foundation Classes: Version 4.2 bSI Draft 936
Standard IFC Bridge proposed extension. 937
https://standards.buildingsmart.org/IFC/DEV/IFC4_2/FINAL/HTML/, 2017 (accessed 938
December 16, 2022). 939
[5] buildingSMART International Ltd., IFC4.3 RC2 - Release Candidate 2 [Draft]. 940
https://standards.buildingsmart.org/IFC/DEV/IFC4_3/RC2/HTML/, 2020 (accessed 941
December 16, 2022). 942
[6] W. Mazairac, J. Beetz, BIMQL - An open query language for building information 943
models, Advanced Engineering Informatics. 27 (2013) pp. 444–456.
https://doi.org/10.1016/j.aei.2013.06.001. 945
This is the manuscript version of the paper:
Mengtian Yin, Llewellyn Tang, Chris Webster, Jinyang Li, Haotian Li, Zhuoqian Wu, Reynold Cheng.
(2023) ‘Two-stage Text-to-BIMQL semantic parsing for building information model extraction using
graph neural networks’. Automation in Construction. Elsevier, 152, p. 104902. doi:
https://doi.org/10.1016/j.autcon.2023.104902.
The final version of this paper is available at: https://doi.org/10.1016/j.autcon.2023.104902
The use of this file must follow the Creative Commons Attribution Non-Commercial No Derivatives
License, as required by Elsevier’s policy.
[7] buildingSMART International Ltd., Model View Definitions (MVD). 946
https://www.buildingsmartusa.org/standards/bsi-standards/model-view-definitions-mvd/, 947
2021 (accessed December 16, 2022). 948
[8] E.W. East, S. O’Keeffe, R. Kenna, E. Hooper, Delivering COBie Using Autodesk Revit 949
(Perfect Bound), Lulu. com, (2017). ISBN: 1387200917. 950
[9] H. Ying, S. Lee, Generating second-level space boundaries from large-scale IFC-951
compliant building information models using multiple geometry representations, 952
Automation in Construction. 126 (2021) pp.103659. 953
https://doi.org/10.1016/j.autcon.2021.103659. 954
[10] M. Venugopal, C.M. Eastman, R. Sacks, J. Teizer, Semantics of model views for 955
information exchanges using the industry foundation class schema, Advanced Engineering 956
Informatics. 26 (2012) pp. 411–428. https://doi.org/10.1016/j.aei.2012.01.005.
[11] M. Yin, L. Tang, C. Webster, S. Xu, X. Li, H. Ying, An ontology-aided, natural language-958
based approach for multi-constraint BIM model querying, ArXiv Preprint 959
arXiv:2303.15116. (2023). https://doi.org/10.48550/arXiv.2303.15116. 960
[12] C. Preidel, S. Daum, A. Borrmann, Data retrieval from building information models based 961
on visual programming, Visualization in Engineering. 5 (2017) pp. 1–14.
https://doi.org/10.1186/s40327-017-0055-0. 963
[13] J.R. Lin, Z.Z. Hu, J.P. Zhang, F.Q. Yu, A Natural-Language-Based Approach to 964
Intelligent Data Retrieval and Representation for Cloud BIM, Computer-Aided Civil and 965
Infrastructure Engineering. 31 (2016) pp. 18–33. https://doi.org/10.1111/mice.12151.
[14] S. Wu, Q. Shen, Y. Deng, J. Cheng, Natural-language-based intelligent retrieval engine 967
for BIM object database, Computers in Industry. 108 (2019) pp.73–88. 968
https://doi.org/10.1016/j.compind.2019.02.016. 969
[15] N. Wang, R.R.A. Issa, C.J. Anumba, NLP-based Query Answering System for 970
Information Extraction from Building Information Models, Journal of Computing in Civil 971
Engineering. 36 (2022). https://doi.org/10.1061/(ASCE)CP.1943-5487.0001019. 972
[16] I. Motawa, Spoken dialogue BIM systemsan application of big data in construction, 973
Facilities. (2017). https://doi.org/10.1108/F-01-2016-0001. 974
[17] M. Hoy, Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants, Medical 975
Reference Services Quarterly. 37 (2018) pp. 81–88.
https://doi.org/10.1080/02763869.2018.1404391. 977
[18] A. Kamath, R. Das, A Survey on Semantic Parsing, ArXiv Preprint ArXiv:1812.00978. 978
(2018). https://doi.org/10.48550/arXiv.1812.00978. 979
[19] P. Pasupat, P. Liang, Compositional semantic parsing on semi-structured tables, ArXiv 980
Preprint ArXiv:1508.00305. (2015). https://doi.org/10.48550/arXiv.1508.00305. 981
[20] T. Yu, R. Zhang, K. Yang, M. Yasunaga, D. Wang, Z. Li, J. Ma, I. Li, Q. Yao, S. Roman, 982
Spider: A large-scale human-labeled dataset for complex and cross-domain semantic 983
parsing and text-to-sql task, ArXiv Preprint ArXiv:1809.08887. (2018). 984
https://doi.org/10.48550/arXiv.1809.08887. 985
[21] J. Guo, Z. Zhan, Y. Gao, Y. Xiao, J.-G. Lou, T. Liu, D. Zhang, Towards complex text-to-986
sql in cross-domain database with intermediate representation, ArXiv Preprint 987
ArXiv:1905.08205. (2019). https://doi.org/10.48550/arXiv.1905.08205. 988
[22] C. Finegan-Dollak, J.K. Kummerfeld, L. Zhang, K. Ramanathan, S. Sadasivam, R. Zhang, 989
D. Radev, Improving text-to-sql evaluation methodology, ArXiv Preprint 990
ArXiv:1806.09029. (2018). https://doi.org/10.48550/arXiv.1806.09029. 991
[23] M. Bevilacqua, R. Blloshmi, R. Navigli, One SPRING to rule them both: Symmetric 992
AMR semantic parsing and generation without a complex pipeline, in: Proceedings of the 993
AAAI Conference on Artificial Intelligence, (2021): pp. 12564–12573. ISBN: 2374-3468.
[24] I. Konstas, S. Iyer, M. Yatskar, Y. Choi, L. Zettlemoyer, Neural amr: Sequence-to-995
sequence models for parsing and generation, ArXiv Preprint ArXiv:1704.08381. (2017). 996
https://doi.org/https://doi.org/10.48550/arXiv.1704.08381. 997
[25] N. Wang, R.R.A. Issa, C.J. Anumba, A Framework for Intelligent Building Information 998
Spoken Dialogue System (iBISDS), in: EG-ICE 2021 Workshop on Intelligent Computing 999
in Engineering, Universitätsverlag der TU Berlin, (2021): p. 228. ISBN: 3798332118. 1000
[26] F. Elghaish, J. Chauhan, S. Matarneh, F. Pour Rahimian, Artificial intelligence-based 1001
voice assistant for BIM data management, Automation in Construction. (2022). 1002
https://doi.org/10.1016/j.autcon.2022.104320. 1003
[27] R. Zhang, N. El-Gohary, Transformer-based approach for automated context-aware IFC-1004
regulation semantic information alignment, Automation in Construction. 145 (2023) 1005
pp.104540. https://doi.org/10.1016/j.autcon.2022.104540. 1006
[28] X. Han, L. Sun, J. Zhao, Collective Entity Linking in Web text: A graph-based method, 1007
SIGIR’11 - Proceedings of the 34th International ACM SIGIR Conference on Research 1008
and Development in Information Retrieval. (2011) pp. 765–774.
https://doi.org/10.1145/2009916.2010019. 1010
[29] M. Yin, L. Tang, C. Webster, S. Xu, X. Li, Data repository of the reviewed article “An 1011
ontology-aided, natural language-based approach for multi-constraint BIM model 1012
querying” https://github.com/MengtianYin/BIM-NLQI, 2021 (accessed December 16, 1013
2022). 1014
[30] S. Shin, R.R.A. Issa, BIMASR: Framework for Voice-Based BIM Information Retrieval, 1015
Journal of Construction Engineering and Management. 147 (2021) pp.4021124. 1016
https://doi.org/10.1061/(ASCE)CO.1943-7862.0002138. 1017
[31] P. Pauwels, W. Terkaj, EXPRESS to OWL for construction industry: Towards a 1018
recommendable and usable ifcOWL ontology, Automation in Construction. 63 (2016) 1019
pp. 100–133. https://doi.org/10.1016/j.autcon.2015.12.003.
[32] J. Beetz, J. Van Leeuwen, B. De Vries, IfcOWL: A case of transforming EXPRESS 1021
schemas into ontologies, Artificial Intelligence for Engineering Design, Analysis and 1022
Manufacturing: AI EDAM. 23 (2009) pp.89. 1023
https://doi.org/10.1017/S0890060409000122. 1024
[33] K. Janowicz, M.H. Rasmussen, M. Lefrançois, G.F. Schneider, P. Pauwels, “BOT: the 1025
Building Topology Ontology of the W3C Linked Building Data Group, Semantic Web. 12 1026
(2019) pp. 143–161. https://doi.org/10.3233/SW-200385.
[34] G.F. Schneider, M.H. Rasmussen, P. Bonsma, J. Oraskari, P. Pauwels, Linked building 1028
data for modular building information modelling of a smart home, in: 12th European 1029
Conference on Product and Process Modelling (ECPPM), CRC Press, (2018): pp. 407–414. ISBN: 042950621X.
[35] J. Zhang, N.M. El-Gohary, Automated information transformation for automated 1032
regulatory compliance checking in construction, Journal of Computing in Civil 1033
Engineering. 29 (2015) pp. 1–16. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000427.
[36] P. Zhou, N. El-Gohary, Ontology-based automated information extraction from building 1035
energy conservation codes, Automation in Construction. 74 (2017) pp. 103–117.
https://doi.org/10.1016/j.autcon.2016.09.004. 1037
[37] X. Xu, H. Cai, Ontology and rule-based natural language processing approach for 1038
interpreting textual regulations on underground utility infrastructure, Advanced 1039
Engineering Informatics. 48 (2021) pp.101288. https://doi.org/10.1016/j.aei.2021.101288. 1040
[38] Z. Zheng, Y.-C. Zhou, X.-Z. Lu, J.-R. Lin, Knowledge-informed semantic alignment and 1041
rule interpretation for automated compliance checking, Automation in Construction. 142 1042
(2022) pp.104524. https://doi.org/10.1016/j.autcon.2022.104524. 1043
[39] R. Zhang, N. El-Gohary, A deep neural network-based method for deep information 1044
extraction using transfer learning strategies to support automated compliance checking, 1045
Automation in Construction. 132 (2021) pp.103834. 1046
https://doi.org/10.1016/j.autcon.2021.103834. 1047
[40] J.B. Hamrick, V. Bapst, A. Sanchez-gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, 1048
D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, 1049
A. Vaswani, K. Allen, C. Nash, V. Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. 1050
Botvinick, Relational inductive biases, deep learning, and graph networks, ArXiv Preprint 1051
ArXiv:1806.01261. pp. 1–40. https://doi.org/10.48550/arXiv.1806.01261.
[41] ISO (International Organization for Standardization), ISO 16739:2018, Industry 1053
Foundation Classes (IFC) for Data Sharing in the Construction and Facility Management 1054
Industries Part 1: Data Schema. https://www.iso.org/standard/70303.html, 2018 1055
(accessed December 16, 2022). 1056
[42] MMXXI © RDF ltd, IFC Engine. https://rdf.bg/product-list/ifc-engine/, 2006 (accessed 1057
December 16, 2022). 1058
[43] S. Lockley, C. Benghi, M. Cerny, Xbim. Essentials: a library for interoperable building 1059
information applications, The Journal of Open Source Software. 2 (2017) pp.473. 1060
https://doi.org/10.21105/joss.00473. 1061
[44] J.K. Lee, Building environment rule and analysis (BERA) language and its application for 1062
evaluating building circulation and spatial program, Georgia Institute of Technology, 1063
2011. https://smartech.gatech.edu/bitstream/handle/1853/39482/Lee_Jin-1064
Kook_201105_PhD.pdf?sequence=1 (accessed March 29, 2023). 1065
[45] S. Daum, A. Borrmann, Processing of topological BIM queries using boundary 1066
representation based methods, Advanced Engineering Informatics. 28 (2014) pp. 272–286.
https://doi.org/10.1016/j.aei.2014.06.001. 1068
[46] W. Terkaj, A. Šojić, Ontology-based representation of IFC EXPRESS rules: An 1069
enhancement of the ifcOWL ontology, Automation in Construction. 57 (2015) pp. 188–201. https://doi.org/10.1016/j.autcon.2015.04.010.
[47] M. Bonduel, J. Oraskari, P. Pauwels, M. Vergauwen, R. Klein, The IFC to linked building 1072
data converter: current status, in: 6th Linked Data in Architecture and Construction 1073
Workshop, CEUR Workshop Proceedings, 2018: pp. 3443. 1074
[48] O. Lassila, R.R. Swick, Resource description framework (RDF) model and syntax 1075
specification. https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/, 1998 (accessed 1076
December 16, 2022). 1077
[49] Holger Knublauch, J.A. Hendler, K. Idehen, SPIN - SPARQL Inferencing Notation. 1078
https://spinrdf.org/, 2011 (accessed December 16, 2022). 1079
[50] BuildingSMART, IFD library white paper. https://www.buildingsmart.org/standards/bsi-1080
standards/standards-library/, 2008 (accessed December 16, 2022). 1081
[51] A. Chaudhary, A. Battan, Natural Language Interface to Databases-An Implementation., 1082
International Journal of Advanced Research in Computer Science. 5 (2014). 1083
https://doi.org/10.26483/ijarcs.v5i6.2248. 1084
[52] N.V. Divin, BIM by using Revit API and Dynamo. A review, AlfaBuild. (2020) pp.1404. 1085
https://doi.org/10.34910/ALF.14.4. 1086
[53] N. Wang, R.R.A. Issa, C.J. Anumba, Transfer learning-based query classification for 1087
intelligent building information spoken dialogue, Automation in Construction. 141 (2022) 1088
pp.104403. https://doi.org/10.1016/J.AUTCON.2022.104403. 1089
[54] N. Wang, R.R.A. Issa, C.J. Anumba, Named Entity Recognition Algorithm for iBISDS 1090
Using Neural Network, Construction Research Congress 2022. (2022) pp.521529. 1091
https://doi.org/10.1061/9780784483961.055. 1092
[55] J.A. Bondy, U.S.R. Murty, Graph theory with applications, Macmillan London, (1976). 1093
ISBN: 0444194517. 1094
44
[56] L. Tang, H. Liu, Graph mining applications to social network analysis, in: Managing and Mining Graph Data, Springer, 2010: pp. 487–513. ISBN: 978-1-4419-6044-3.
[57] Z. Liu, J. Zhou, Introduction to Graph Neural Networks, Synthesis Lectures on Artificial Intelligence and Machine Learning. 14 (2020) pp.1–127. https://doi.org/10.2200/S00980ED1V01Y202001AIM045.
[58] M. Yasunaga, H. Ren, A. Bosselut, P. Liang, J. Leskovec, QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering, ArXiv Preprint ArXiv:2104.06378. (2021). https://doi.org/10.48550/arXiv.2104.06378.
[59] S.-X. Zhang, X. Zhu, J.-B. Hou, C. Liu, C. Yang, H. Wang, X.-C. Yin, Deep relational reasoning graph network for arbitrary shape text detection, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: pp. 9699–9708.
[60] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun, Graph Neural Networks: A Review of Methods and Applications, ArXiv. (2018) pp.1–22. https://doi.org/10.1016/j.aiopen.2021.01.001.
[61] M. Schlichtkrull, T.N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, M. Welling, Modeling relational data with graph convolutional networks, in: European Semantic Web Conference, Springer, 2018: pp. 593–607.
[62] J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals, G.E. Dahl, Neural message passing for quantum chemistry, in: International Conference on Machine Learning, PMLR, 2017: pp. 1263–1272. ISSN: 2640-3498.
[63] Z. Wang, R. Sacks, T. Yeung, Exploring graph neural networks for semantic enrichment: Room type classification, Automation in Construction. (2021) pp.104039. https://doi.org/10.1016/j.autcon.2021.104039.
[64] W.L. Hamilton, R. Ying, J. Leskovec, Inductive representation learning on large graphs, ArXiv Preprint ArXiv:1706.02216. (2017). https://doi.org/10.48550/arXiv.1706.02216.
[65] F.C. Collins, A. Braun, M. Ringsquandl, D.M. Hall, A. Borrmann, Assessing IFC classes with means of geometric deep learning on different graph encodings, in: Proceedings of the 2021 European Conference on Computing in Construction, 2021: pp. 332–341. https://doi.org/10.35490/EC3.2021.168.
[66] Y. Hu, X. Cheng, S. Wang, J. Chen, T. Zhao, E. Dai, Time series forecasting for urban building energy consumption based on graph convolutional network, Applied Energy. 307 (2022) pp.118231. https://doi.org/10.1016/j.apenergy.2021.118231.
[67] J. Kim, S. Chi, Graph neural network-based propagation effects modeling for detecting visual relationships among construction resources, Automation in Construction. 141 (2022) pp.104443. https://doi.org/10.1016/j.autcon.2022.104443.
[68] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation. 9 (1997) pp.1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
[69] J. Chen, P. Hu, E. Jiménez-Ruiz, O. Holter, D. Antonyrajah, I. Horrocks, OWL2Vec*: Embedding of OWL Ontologies, Machine Learning, 2020. https://doi.org/10.1007/s10994-021-05997-6.
[70] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed Representations of Words and Phrases and Their Compositionality, in: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, Curran Associates Inc., Red Hook, NY, USA, 2013: pp. 3111–3119.
[71] P. Veličković, A. Casanova, P. Liò, G. Cucurull, A. Romero, Y. Bengio, Graph attention networks, in: 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, 2018: pp. 1–12. https://doi.org/10.1007/978-3-031-01587-8_7.
[72] D. Busbridge, D. Sherburn, P. Cavallo, N.Y. Hammerla, Relational Graph Attention Networks, ArXiv Preprint ArXiv:1904.05811. (2019) pp.1–21. https://doi.org/10.48550/arXiv.1904.05811.
[73] R. Cao, L. Chen, Z. Chen, Y. Zhao, S. Zhu, K. Yu, LGESQL: line graph enhanced text-to-SQL model with mixed local and non-local relations, ArXiv Preprint ArXiv:2106.01093. (2021). https://doi.org/10.48550/arXiv.2106.01093.
[74] K. Wang, W. Shen, Y. Yang, X. Quan, R. Wang, Relational graph attention network for aspect-based sentiment analysis, ArXiv Preprint ArXiv:2004.12362. (2020). https://doi.org/10.48550/arXiv.2004.12362.
[75] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, ArXiv Preprint ArXiv:1810.04805. (2018). https://doi.org/10.48550/arXiv.1810.04805.
[76] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems. 30 (2017) pp.5999–6009. https://doi.org/10.48550/arXiv.1706.03762.
[77] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: The 32nd International Conference on Machine Learning, PMLR, 2015: pp. 448–456. https://doi.org/10.48550/arXiv.1502.03167.
[78] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research. 15 (2014) pp.1929–1958. ISSN: 1532-4435.
[79] J.L. Ba, J.R. Kiros, G.E. Hinton, Layer normalization, ArXiv Preprint ArXiv:1607.06450. (2016). https://doi.org/10.48550/arXiv.1607.06450.
[80] D. Hendrycks, K. Gimpel, Gaussian error linear units (GELUs), ArXiv Preprint ArXiv:1606.08415. (2016). https://doi.org/10.48550/arXiv.1606.08415.
[81] J. Pollock, E. Waller, R. Politt, Speech and language processing, Day-to-Day Dyslexia in the Classroom. (2010) pp.16–28. https://doi.org/10.4324/9780203461891_chapter_3.
[82] World Wide Web Consortium (W3C), OWL 2 web ontology language document overview, 2012. https://www.w3.org/TR/owl2-overview/ (accessed March 29, 2023).
[83] V. Tablan, D. Damljanovic, K. Bontcheva, A natural language query interface to structured information, in: European Semantic Web Conference 2008: The Semantic Web: Research and Applications, 2008: pp. 361–375. https://doi.org/10.1007/978-3-540-68234-9_28.
[84] C. Zhang, J. Beetz, B. De Vries, BimSPARQL: Domain-specific functional SPARQL extensions for querying RDF building data, Semantic Web. 9 (2018) pp.829–855. https://doi.org/10.3233/SW-180297.
[85] P. Pauwels, S. Zhang, Y.C. Lee, Semantic web technologies in AEC industry: A literature overview, Automation in Construction. 73 (2017) pp.145–165. https://doi.org/10.1016/j.autcon.2016.10.003.
[86] buildingSMART International Ltd., IFC2x Edition 3 Technical Corrigendum 1. https://standards.buildingsmart.org/IFC/RELEASE/IFC2x3/TC1/HTML/, 2007 (accessed December 16, 2022).
[87] P. Pauwels, IFCtoRDF Converter. https://github.com/pipauwel/IFCtoRDF, 2017 (accessed December 16, 2022).
[88] The Apache Software Foundation, Apache Jena. https://jena.apache.org/ (accessed December 16, 2022).
[89] D. Krech, RDFLib: A Python library for working with RDF. https://github.com/RDFLib/rdflib, 2006 (accessed December 16, 2022).
[90] A. Hagberg, D. Conway, NetworkX: Network Analysis with Python. https://networkx.github.io, 2020 (accessed December 16, 2022).
[91] Y. Zhang, SuPar. https://github.com/yzhangcs/parser, 2020 (accessed December 16, 2022).
[92] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, PyTorch: An imperative style, high-performance deep learning library, Advances in Neural Information Processing Systems. 32 (2019).
[93] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, Scikit-learn: Machine learning in Python, The Journal of Machine Learning Research. 12 (2011) pp.2825–2830. https://doi.org/10.1145/2786984.2786995.
[94] M.S. Shelke, P.R. Deshmukh, V.K. Shandilya, A review on imbalanced data handling using undersampling and oversampling technique, International Journal of Recent Trends in Engineering & Research. 3 (2017) pp.444–449. https://doi.org/10.23883/ijrter.2017.3168.0uwxm.
[95] L. Liu, H. Jiang, P. He, W. Chen, X. Liu, J. Gao, J. Han, On the variance of the adaptive learning rate and beyond, ArXiv Preprint ArXiv:1908.03265. (2019). https://doi.org/10.48550/arXiv.1908.03265.
[96] P. McNamee, H.T. Dang, Overview of the TAC 2009 knowledge base population track, in: Text Analysis Conference (TAC), 2009: pp. 111–113.
[97] G. Zhu, C.A. Iglesias, Exploiting semantic similarity for named entity disambiguation in knowledge graphs, Expert Systems with Applications. 101 (2018) pp.8–24. https://doi.org/10.1016/j.eswa.2018.02.011.
[98] X. Han, L. Sun, A generative entity-mention model for linking entities with knowledge base, ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 1 (2011) pp.945–954. ISBN: 9781932432879.
[99] J.G. Zheng, D. Howsmon, B. Zhang, J. Hahn, D. McGuinness, J. Hendler, H. Ji, Entity linking for biomedical literature, BMC Medical Informatics and Decision Making. 15 (2015) pp.1–9. https://doi.org/10.1186/1472-6947-15-S1-S4.
[100] H. Zhu, Y. Lin, Z. Liu, J. Fu, T.S. Chua, M. Sun, Graph neural networks with generated parameters for relation extraction, in: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, 2020: pp. 1331–1339. https://doi.org/10.18653/v1/p19-1128.
[101] S. Wu, Y. He, Enriching pre-trained language model with entity information for relation classification, in: International Conference on Information and Knowledge Management, Proceedings, 2019: pp. 2361–2364. https://doi.org/10.1145/3357384.3358119.
[102] LEO ARCHITECTS, The North Lamma Public Library & Heritage and Cultural Showroom. https://www.leighorange.com/project/north-lamma-public-library-heritage-cultural-showroom/, 2019 (accessed December 16, 2022).
[103] Y. Xian, B. Schiele, Z. Akata, Zero-shot learning-the good, the bad and the ugly, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: pp. 4582–4591. https://doi.org/10.48550/arXiv.1703.04394.
Appendix I. List of relationships
Table 13 presents the list of relationships and their definitions for the second-stage RE in Section 3.5. Note that * marks relation definitions that come from BimSPARQL [84].
Table 13. List of relationships and definitions for the second-stage relation extraction.
No relation: There is no dependency relationship between entities.
Logic-OR: The conditions of two entities have a logical disjunction relation (inclusive-OR).
hasProperty*: Relationship between an object instance and its property.
isPropertyOf: Relationship between a property and an object instance.
hasTypeEnumeration: Relationship between an object instance and a predefined object type that is a specific enumeration (e.g., IfcSlab has predefined types FLOOR and ROOF).
isTypeEnumerationOf: Relationship between an enumerated predefined object type and an object instance.
hasSpaceBoundary*: Relationship between a space and its boundary elements.
isSpaceBoundaryOf: Relationship between boundary elements and the space they bound.
isContainedIn*: Relationship between a building element and the spatial structure that contains it.
hasContainedElement: Relationship between a spatial structure and its contained building elements.
hasSpatialDecomposition*: Relationship between a spatial element and the objects that decompose it.
hasSpatialComposition: Relationship between an object and the spatial elements it composes.
hasElementDecomposition*: Relationship between a building element and its composite element.
hasElementComposition*: Represents that a building element has an aggregation structure that can be decomposed into other elements (e.g., IfcStair and IfcMember).
hasQuantity*: Relationship between an object instance and its quantity.
isQuantityOf: Relationship between a quantity and an object instance.
hasNextSpace*: Relationship between spaces that are next to each other.
hasObjectType: Relationship between an object instance and its ObjectType attribute.
isObjectTypeOf: Relationship between the ObjectType attribute and an object instance.
hasSingleMaterial*: Relationship between an object instance and a single uniform material.
hasListMaterial: Relationship between an object instance and materials contained in a material list.
hasLayerMaterial: Relationship between an object instance and material contained in a material layer.
isMaterialOf: Relationship between a material and an object instance.
hasTypeObject*: Relationship between an object instance and its object type represented as IfcTypeObject.
isTypeObjectOf: Relationship between an object type (IfcTypeObject) and an object instance.
isPlacedIn*: Relationship between elements such as doors and windows and the elements in which they are placed (e.g., walls).
hasPlacedElement: Relationship between an element and the other elements that are placed in it.
hasLongName: Relationship between an object instance and its LongName attribute.
isLongNameOf: Relationship between the LongName attribute and the object instance.
hasPropertyValue: Relationship between a property and its value.
isPropertyValueOf: Relationship between a property value and its property.
hasTag: Relationship between an object instance and its Tag attribute.
isTagOf: Relationship between the Tag attribute and the object instance.
largerThan: The value of a property/quantity is greater than the value of another one.
lessThan: The value of a property/quantity is less than the value of another one.
equalTo: The value of a property/quantity is equal to the value of another one.
Appendix II. Data
The developed BIM-NLQ dataset can be accessed via https://github.com/MengtianYin/BIM-GNN-dataset.
... − to perform processing of data with complex structure. Parsing combined with the capabilities of neural networks [6][7] can be used for image retrieval, deep analysis of texts written in natural language. Neural network can help in extracting meaningful features of the ...
Article
Full-text available
As a rule, data parsing is used to quickly obtain information from various web resources for further study and use. For parsing, you can use both specialized online services and desktop applications. Unfortunately, existing parsing technologies have some limitations. For example, it is often difficult to parse dynamic web pages and classify information obtained through parsing. New approaches are needed in implementing data collection and analysis - using language models and software (web driver) that simulate human actions when working with websites. The web driver assists in accessing data from dynamically updated sites, while artificial intelligence technologies help correctly recognize and classify data. This technology can be used to create parsers for real estate agencies, employment services, university admission committees, advertising campaigns, and financial organizations.
... In light of these issues, this study will investigate the approaches, strategies, and applications of financial statement text information mining and key information extraction model development [7]. They hope to develop unique solutions for automating the extraction of critical financial insights from textual data by conducting a thorough examination of advanced NLP approaches, machine learning algorithms, and transdisciplinary ideas [8] [9]. ...
Article
Full-text available
Financial statement text information mining and key information extraction model design are critical areas of research that aim to use advanced computational approaches to extract important insights from textual data contained in financial documents. In this work, they look at methodologies, techniques, and applications that combine natural language processing (NLP) and machine learning to automate financial statement interpretation. To lay the groundwork for the research, researchers first conduct a thorough examination of existing literature in interdisciplinary domains such as computational linguistics, information retrieval, and finance. Building on insights from earlier studies, they design and use unique NLP approaches, such as named entity identification, syntactic parsing, sentiment analysis, and topic modelling, to extract essential financial metrics from textual data. Additionally, they create machine learning models that are suited to the peculiarities of financial terminology and reporting standards, combining domain-specific knowledge with linguistic experience to improve accuracy and reliability. They demonstrate the efficacy and scalability of the technique in automating the extraction of crucial financial information, such as revenue trends, cost patterns, and risk factors, through rigorous testing on real-world financial data. These results highlight the transformative power of natural language processing and machine learning in financial analysis, providing stakeholders in finance and accounting with actionable intelligence for informed decision-making, risk assessment, and compliance monitoring. By bridging the gap between computational linguistics and financial analysis, this study advances financial text analysis and provides the framework for future research and innovation in this emerging field.
Article
Full-text available
Industry Foundation Classes (IFCs), as the most recognized data schema for Building Information Modeling (BIM), are increasingly combined with ontology to facilitate data interoperability across the whole lifecycle in the Architecture, Engineering, Construction, and Facility Management (AEC/FM). This paper conducts a bibliometric analysis of 122 papers from the perspective of data, model, and application to summarize the modes of IFC and ontology integration (IFCOI). This paper first analyzes the data and models of the integration from IFC data formats and ontology development models to the IfcOWL data model. Next, the application status is summed up from objective and phase dimensions, and four frequent applications with maturity are identified. Based on the aforementioned multi-dimensional analysis, three integration modes are summarized, taking into account various data interoperability requirements. Accordingly, ontology behaves as the representation of domain knowledge, an enrichment tool for IFC model semantics, and a linkage between IFC data and other heterogeneous data. Finally, this paper points out the challenges and opportunities for IFCOI in the data, domain ontology, and integration process and proposes a building lifecycle management model based on IFCOI.
Article
While the adoption of open Building Information Modeling (open BIM) standards continues to grow, the inherent complexity and multifaceted nature of the built asset lifecycle data present a critical bottleneck for effective information retrieval. To address this challenge, the research community has started to investigate advanced natural language-based search for building information models. However, the accelerated pace of advancements in deep learning-based natural language processing research has introduced a complex landscape for domain-specific applications, making it challenging to navigate through various design choices that accommodate an effective balance between prediction accuracy and the accompanying computational costs. This study focuses on the semantic tagging of user queries, which is a cardinal task for the identification and classification of references related to building entities and their specific descriptors. To foster adaptability across various applications and disciplines, a semantic annotation scheme is introduced that is firmly rooted in the Industry Foundation Classes (IFC) schema. By taking a comparative approach, we conducted a series of experiments to identify the strengths and weaknesses of traditional and emergent deep learning architectures for the task at hand. Our findings underscore the critical importance of domain-specific and context-dependent embedding learning for the effective extraction of building entities and their respective descriptions.
Article
Text mining (TM) and natural language processing (NLP) have stirred interest within the construction field, as they offer enhanced capabilities for managing and analyzing text-based information. This highlights the need for a systematic review to identify the status quo, gaps, and future directions from the perspective of construction management. A review was conducted by aligning the objectives of 205 publications with the specific domains, areas, tasks, and processes outlined in construction management practices. This review reveals multiple facets of the construction sector empowered by TM/NLP approaches and highlights essential voids demanding consideration for automation possibilities and minimizing manual tasks. Ultimately, following identified obstacles, the review results indicate potential research opportunities: (1) strengthening overlooked construction aspects, (2) coupling diverse data formats, and (3) leveraging pre-trained language models and reinforcement learning. The findings will provide vital insights, fostering further progress in TM/NLP research and its applications in academia and industry.
Article
Full-text available
Construction project stakeholders often have to retrieve the required information in Building Information Models (BIMs) to support their design, engineering, and management activities. Natural language interface (NLI) systems are emerging as a time- and cost-effective way to query complex BIM models. However, the existing attempts cannot logically combine different constraints to perform fine-grained queries, dampening the usability of BIM-oriented NLIs. This paper presents a novel ontology-aided semantic parser to automatically map natural language queries (NLQs) that contain different attribute and relational constraints into computer-readable codes for BIM model retrieval in the context of building project development. A modular ontology was first developed to represent natural language expressions of Industry Foundation Classes (IFC) concepts, relationships, and reasoning rules; it was then populated with entities from target BIM models to assimilate project-specific information. After that, the ontology-aided semantic parser progressively extracts concepts, relationships, and value restrictions from NLQs to identify multi-level constraint conditions, resulting in standard SPARQL queries to successfully retrieve IFC-based BIM models. The approach was evaluated based on 225 NLQs collected from BIM users, with a 91% accuracy rate. Finally, a case study about the design-checking of a real-world residential building demonstrates the practicability of the proposed method in the construction industry.
Article
Full-text available
As an essential prodecure to improve design quality in the construction industry, automated rule checking (ARC) requires intelligent rule interpretation from regulatory texts and precise alignment of concepts from different sources. However, there still exists semantic gaps between design models and regulatory texts, hindering the exploitation of ARC. Thus, a knowledge-informed framework for improved ARC is proposed based on natural language processing. Within the framework, an ontology is first established to represent domain knowledge, including concepts, synonyms, relationships, constraints, etc. Then, semantic alignment and conflict resolution are introduced to enhance the rule interpretation process based on predefined domain knowledge and unsu-pervised learning techniques. Finally, an algorithm is developed to identify the proper SPARQL function for each rule, and then to generate SPARQL-based queries for model checking purposes, thereby making it possible to interpret complex rules where extra implicit data needs to be inferred. Experiments show that the proposed framework and methods successfully filled the semantic gaps between design models and regulatory texts with domain knowledge, which achieves a 90.1% accuracy and substantially outperforms the commonly used keyword matching method. In addition, the proposed rule interpretation method proves to be 5 times faster than the manual interpretation by domain experts. This research contributes to the body of knowledge of a novel framework and the corresponding methods to enhance automated rule checking with domain knowledge.
Article
Full-text available
Existing systems that employ Automatic Speech Recognition (ASR) technology to retrieve information from the BIM model fail to provide remote interaction, retrieve a wide range of data, and automate the entire process. This is particularly a problem for users with disabilities. The paper offers a two-way, automated, and agnostic solution to this theoretical and methodological gap. A ‘Proof of Concept’ prototype was developed using Amazon Alexa – as the AI voice assistant platform – to test the applicability. The outcome shows that the created and the retrieved information is valid. Furthermore, there is a high level of interoperability among the components of the proposed solution, including the AI voice assistant interface and mediation environment to convert verbal requests and retrieve information to CSV files. Future research will extend the created solution to retrieve and access information from a BIM cloud model.
Conference Paper
Full-text available
Conversational Artificial Intelligence (AI) systems have become more and more popular to provide information support for human daily life. However, the construction industry still lags other industries in developing a conversational AI system to support construction activities. The developed intelligent Building Information Spoken Dialogue System (iBISDS) is a conversational AI system that provides a speech-based virtual assistant for construction personnel with considerable building information to support construction activities. The iBISDS enables construction personnel to use flexible spoken natural language queries instead of detecting exact keywords. To build an iBISDS, it is necessary to understand the intents of natural language queries for building information. This research aims to develop a named entity recognition (NER) algorithm for iBISDS to recognize and classify keywords within natural language queries. A dataset with 2,008 building information-related natural language queries was developed and manually annotated for training and testing. A Neural Network (NN) deep learning method was trained to recognize named entities within natural language queries. After training, the developed NER algorithm was applied to the testing dataset which achieved a precision of 99.74, a recall of 99.87, and an F1-score of 99.81. The preliminary result indicated that the developed NER algorithm can recognize named entities within the natural language queries accurately. This research will facilitate the further development of conversational AI systems in the construction industry.
Article
Full-text available
Semantic enrichment of Building Information Modeling (BIM) models supplements models with the implicit semantics for further applications. In this paper, we use the room classification task to develop, test and illustrate a novel approach to semantic enrichment of BIM models-representation of models as graphs and application of graph neural networks (GNNs). A dedicated graph dataset consisting of 224 apartment layouts with nine room types and node/edge features was compiled. An improved GNN algorithm, SAGE-E, was developed for processing both node and edge features and a batch method was used to improve efficiency. The experiments showed that 1) The novel approach of adopting graphs and GNNs was feasible. 2) SAGE-E achieved higher accuracy (79%) and more balanced prediction (F1 = 0.79) when compared with other machine learning algorithms. 3) SAGE-E shortened the training and validation process. This work 2 pioneers the application of GNNs for semantic enrichment and opens the door to other possible applications. The dataset and source code are available for public access at https://github.com/ZijianWang1995/SAGE-E.
Article
One of the main challenges of automated compliance checking systems is aligning the semantics of the building information models (BIMs), in Industry Foundation Classes (IFC) format, and the semantics of the regulations, in natural language, to allow for checking the compliance of the BIM with the regulations. Existing information alignment methods typically require intensive manual effort and their ability to deal with the complex regulatory concepts in the regulations is limited. To address this gap, this paper proposes a deep learning method for IFC-regulation semantic information alignment. The proposed method uses a relation classification model to relate and align the IFC and regulatory concepts. The method uses a transformer-based model and leverages the definitions of the concepts and an IFC knowledge graph to provide additional contextual information and knowledge for improved classification and alignment. The proposed method was evaluated on IFC concepts from IFC 4 and regulatory concepts from different building codes and standards. The experimental results showed good information alignment performance.
Article
In Text-to-AMR parsing, current state-of-the-art semantic parsers use cumbersome pipelines integrating several different modules or components, and exploit graph recategorization, i.e., a set of content-specific heuristics that are developed on the basis of the training set. However, the generalizability of graph recategorization in an out-of-distribution setting is unclear. In contrast, state-of-the-art AMR-to-Text generation, which can be seen as the inverse to parsing, is based on simpler seq2seq. In this paper, we cast Text-to-AMR and AMR-to-Text as a symmetric transduction task and show that by devising a careful graph linearization and extending a pretrained encoder-decoder model, it is possible to obtain state-of-the-art performances in both tasks using the very same seq2seq approach, i.e., SPRING (Symmetric PaRsIng aNd Generation). Our model does not require complex pipelines, nor heuristics built on heavy assumptions. In fact, we drop the need for graph recategorization, showing that this technique is actually harmful outside of the standard benchmark. Finally, we outperform the previous state of the art on the English AMR 2.0 dataset by a large margin: on Text-to-AMR we obtain an improvement of 3.6 Smatch points, while on AMR-to-Text we outperform the state of the art by 11.2 BLEU points. We release the software at github.com/SapienzaNLP/spring.
Article
Detecting visual relationships among construction resources plays a pivotal role in understanding complex construction scenes and performing vision-based site monitoring and digitalization. Despite extensive efforts, the propagation effects of different resource-to-resource interactions have been overlooked, and it is therefore still challenging to precisely detect entangled and intertwined visual relationships in actual construction images. To address this challenge, this study proposes a semantic graph neural network approach that structures construction resources and their entangled interactions as a graph and simulates the propagation effects using a neural message-passing mechanism. The experimental results showed that the proposed approach achieved a 77.1% F-score, 11.5% higher than the baseline model. This suggests the positive impact of modeling propagation effects and the applicability of the proposed approach. These findings can help automatically understand what is actually happening at construction sites and provide valuable insights for future vision-based monitoring studies.
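The neural message-passing mechanism can be sketched with a single scalar state per node: each round, every node aggregates its neighbours' states and mixes them into its own, which is how an interaction's effect propagates across the scene graph. The node names, features, and the fixed 0.5/0.5 update below are illustrative assumptions, standing in for learned update functions:

```python
# Toy message-passing round over a construction scene graph: each node's
# state is updated with the mean of its neighbours' states, mimicking how
# resource-to-resource interaction effects propagate. Scalar features and
# the fixed mixing weights stand in for learned vector-valued updates.

def message_pass(features: dict, edges: list, rounds: int = 1) -> dict:
    """Mean-aggregation message passing on an undirected graph."""
    neigh = {n: [] for n in features}
    for a, b in edges:
        neigh[a].append(b)
        neigh[b].append(a)
    state = dict(features)
    for _ in range(rounds):
        new_state = {}
        for n, h in state.items():
            msgs = [state[m] for m in neigh[n]]
            agg = sum(msgs) / len(msgs) if msgs else 0.0
            new_state[n] = 0.5 * h + 0.5 * agg  # simple fixed-weight update
        state = new_state
    return state

scene = {"excavator": 1.0, "worker": 0.0, "truck": 0.0}
interactions = [("excavator", "worker"), ("excavator", "truck")]
print(message_pass(scene, interactions))
```

After one round, the excavator's activity has already "leaked" into the worker and truck states, which is the propagation effect the paper exploits for disentangling relationships.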
Article
Retrieving queried information from building information models (BIM) requires experience in structured query languages and manipulation of BIM software. Artificial Intelligence (AI)-based spoken dialogue systems provide more opportunities for information retrieval from building information models via natural language queries. This research developed a transfer learning-based text classification (TC) method that classifies queries into pre-defined categories for an intelligent building information spoken dialogue system (iBISDS), a virtual assistant that supports information retrieval for construction project team members. The architecture of the TC neural network (NN) was built on the pre-trained Robustly Optimized BERT Pretraining Approach (RoBERTa). After training, the fine-tuned RoBERTa NN achieved a precision of 99.76%, a recall of 99.76%, and an F1 score of 99.76% on the testing dataset. These results indicate that the developed NN can accurately classify building information-related queries into the pre-defined categories.
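The query-to-category step can be illustrated with a keyword-overlap classifier standing in for the fine-tuned RoBERTa model; the intent categories and keyword sets below are hypothetical, not the paper's actual taxonomy, which the real system learns from labelled data:

```python
# Keyword-overlap stand-in for a fine-tuned text classifier: map a BIM
# query to one of a few pre-defined intent categories. Categories and
# keyword sets are hypothetical; the described system learns them from
# data via transfer learning on RoBERTa.

CATEGORY_KEYWORDS = {
    "quantity": {"how", "many", "count", "number"},
    "property": {"height", "width", "area", "material", "fire", "rating"},
    "location": {"where", "floor", "level", "room"},
}

def classify_query(query: str) -> str:
    """Pick the category whose keyword set best overlaps the query."""
    tokens = set(query.lower().replace("?", "").split())
    scores = {c: len(tokens & kw) for c, kw in CATEGORY_KEYWORDS.items()}
    return max(scores, key=scores.get)

print(classify_query("How many doors are on the second floor?"))
```

A learned classifier generalizes far beyond such fixed keyword lists (paraphrases, misspellings, unseen vocabulary), which is why the transfer-learning approach reaches the reported accuracy.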
Article
The construction industry is information-intensive, and building information modeling (BIM) has been proposed as an information source to support decision making by construction project team members in the architecture, engineering, construction, and operation (AECO) industry. Because building information models aggregate rich building data, further use of this information to support construction and operation activities has become important. In Industry 4.0, similar-to-real-life virtual assistants, e.g., Apple’s Siri and Google Assistant, are becoming ever more popular. This research developed a query-answering (QA) system for BIM information extraction (IE) that uses natural language processing (NLP) methods to build a virtual assistant for construction project team members. The developed QA system consists of three major modules: natural language understanding, IE, and natural language generation. A Python-based prototype application was built on this architecture to evaluate the system's functionalities using several BIM/Industry Foundation Classes (IFC) models. Seven building information models and 127 test queries were used to evaluate accuracy, and the developed QA system achieved an accuracy score of 81.9. The NLP-based QA system can thus provide relatively accurate answers to natural language queries. This research facilitates the development of virtual assistants in the AECO industry, and the architecture of the developed QA system can be extended to queries in other areas.
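The three-module architecture (understanding, extraction, generation) can be sketched end to end with an in-memory stand-in for the building model; the tiny dictionary "BIM", the regex-based understanding step, and the answer templates are illustrative placeholders for the real IFC-backed modules:

```python
# Three-module QA pipeline sketch: natural language understanding ->
# information extraction -> natural language generation. The in-memory
# "BIM" and the pattern-based NLU are illustrative placeholders for the
# IFC-backed modules in the described system.

import re

BIM = {  # element type -> property -> value (toy model, not real IFC)
    "door": {"count": 12, "height": "2.1 m"},
    "window": {"count": 30, "height": "1.5 m"},
}

def understand(query: str):
    """NLU: map a query to an (element, property) pair via simple patterns."""
    q = query.lower()
    element = next((e for e in BIM if e in q), None)
    prop = "count" if re.search(r"how many", q) else "height"
    return element, prop

def extract(element: str, prop: str):
    """IE: look the requested value up in the building model."""
    return BIM[element][prop]

def generate(element: str, prop: str, value) -> str:
    """NLG: wrap the extracted value in a short answer sentence."""
    if prop == "count":
        return f"There are {value} {element}s in the model."
    return f"The {prop} of the {element}s is {value}."

def answer(query: str) -> str:
    element, prop = understand(query)
    return generate(element, prop, extract(element, prop))

print(answer("How many doors are in the building?"))
```

In the real system each module is substantially richer (intent and entity recognition, IFC traversal, template or model-based generation), but the data flow between the three modules is the same.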