Two-stage Text-to-BIMQL Semantic Parsing for Building Information Model Extraction Using Graph Neural Networks

Mengtian Yin1, Llewellyn Tang1, Chris Webster2, Jinyang Li3, Haotian Li1, Zhuoquan Wu1, Reynold C.K. Cheng3
1 Department of Real Estate and Construction, The University of Hong Kong, Hong Kong SAR
2 Faculty of Architecture, The University of Hong Kong, Hong Kong SAR
3 Department of Computer Science, The University of Hong Kong, Hong Kong SAR
u3006144@hku.hk, lcmtang@hku.hk, cwebster@hku.hk, jl0725@connect.hku.hk, hlidh@connect.hku.hk, u3006157@hku.hk, ckcheng@cs.hku.hk
Abstract
With the increasing complexity of the building process, it is difficult for project stakeholders to retrieve large and multi-disciplinary building information models (BIMs). A natural language interface (NLI) is beneficial for users to query BIM models using natural language. However, parsing natural language queries (NLQs) is challenging due to ambiguous name descriptions and intricate relationships between entities. To address these issues, this study proposes a graph neural network (GNN)-based semantic parsing method that automatically maps NLQs into executable queries. Firstly, ambiguous mentions are collectively linked to referent ontological entities via a GNN-based entity linking model. Secondly, the logical forms of NLQs are interpreted through a GNN-based relation extraction model, which predicts links between mentioned entities in a heterogeneous graph fusing ontology and NLQ texts. The experiment based on 786 queries shows its outstanding performance. Moreover, a real-world case verifies the practicability of the proposed method for BIM model retrieval.
1. Introduction
Building information modelling (BIM) provides a digital representation of the building product and process with semantic descriptions of different types of information [1]. BIM models can be applied in a range of engineering applications that add value to construction projects, such as clash detection [2] and maintenance management [3]. To facilitate lifecycle information exchange and management, the vendor-neutral Industry Foundation Classes (IFC) specifications [4] are widely adopted to represent BIM models, and they are constantly updated with the expanding body of concepts in the construction domain.

With the increasing complexity of the modern building process, more parties are involved in projects and increasing amounts of information are generated in BIM models. This has caused the IFC data schema to become structurally more complex. In fact, the number of entities has risen from 653 in the early version IFC2x3 to 1,880 in the latest version, IFC4.3 RC2 [5]. Moreover, model instances are also large, often spanning multiple disciplines. With multi-domain BIM models, being able to acquire the desired information in time is a key success factor in realizing the practical value of BIM [6]. Owing to their disparate disciplines, roles, and project contexts, the information demands of individual stakeholders differ noticeably. This calls for efficient retrieval techniques to extract BIM model subsets.
Current means of BIM model extraction can be grouped into two categories. First, there are frameworks (e.g., Model View Definition (MVD) [7]) to define domain-specific model views, and built-in/add-in tools (e.g., the Autodesk COBie Extension [8]) in BIM software to extract partial models for particular use cases (e.g., building energy modelling (BEM) [9]). They are oriented toward schema-level model extraction, which involves a large number of entities and a long development period [10]. However, these schema-level extraction approaches cannot address the ad hoc retrieval requirements of project practitioners enquiring about facility information in BIM models, which usually have changing conditions (e.g., question forms, attribute restrictions) [11]. For example, an equipment manager might make a query like "search sensors embedded in air ducts whose reading of supply air temperature is not in the interval of 13-15 degrees" to diagnose broken sensors in air conditioning (AC) systems. In practice, these project-level retrieval tasks mainly rely on the second stream of model extraction solutions: professional programming and query languages (e.g., the BIM Query Language (BIMQL) [6]). While this approach permits the successful extraction of model subsets, it requires users to have sufficient programming skills and to become acquainted with the complex IFC data schema. It is therefore difficult to use for the many practitioners in the construction industry who are non-experts in information technology (IT) [12].
Recently, a natural language interface (NLI) has been proposed by several studies [5–8] to efficiently query BIM models in construction projects. Inspired by artificial intelligence (AI)-driven voice assistants (e.g., Apple Siri [17]), it was envisioned that BIM users could directly input natural language (NL) texts to retrieve model information, which would hide all the formalities of BIM-oriented programming languages and data schemas. The key to NL-based BIM model retrieval is semantic parsing (SP), which is framed as converting NL to logical forms such as structured queries or programs [18,19]. Typical tasks include Text-to-SQL (Structured Query Language) [20–22] and Text-to-AMR (Abstract Meaning Representation) [23,24]. Following these, we define Text-to-BIMQL as the task of transforming NL texts into standard query languages used to retrieve IFC schema-compliant BIM models. However, current methods for Text-to-BIMQL are limited. The existing studies [11,15,25,26] tailored to BIM model retrieval rely heavily on hand-crafted rules to parse natural language queries (NLQs). Despite their high performance, the input must conform to rigid patterns (e.g., returning an attribute of an object) or strict requirements, which makes them incapable of handling queries with varying user-specified conditions.
There are several challenges in Text-to-BIMQL semantic parsing. First, there is a name ambiguity problem in aligning natural language texts with different levels of IFC model information (e.g., object class, instance, properties, property values, etc.) [27]. In other words, the same mention in NLQs can refer to different IFC concepts [11]. Consider the example: "find walls on the roof whose base offset is less than the base offset of 184944 (tag number of a wall instance)". Here, the word "roof" can be recognized as (a) IfcRoof, which represents roof elements; (b) IfcBuildingStorey, which refers to a building story called "roof"; (c) IfcSlab, which implies slab elements with a predefined type of ROOF; or (d) a literal property value. This requires an entity linking (EL) system [28] that links name mentions in NL texts to referent entities in BIM models, but such a system does not exist in the BIM field. Second, the logical forms of NLQs are hard to predict because they consist of complex relational paths connecting different IFC entities. A relational path between entities can be multi-hop, meaning that the two entities are separated by intermediate entities in BIM models or ontologies. As shown in Fig. 1, the relational path between the property "base offset" and tag "184944" is a 3-hop move in an IFC ontology [29]. Unfortunately, the existing methods [11,15,30] can only handle 1-hop relationships between entities.

Fig. 1. Example of a multi-hop relational path. The blue and red entities denote head and tail entities, respectively; the gray entities denote extra entities along the relational path.
To resolve the above problems, the contexts of queries and ontological knowledge models must be integrated for joint inference. Recently, ontologies have been widely used to represent BIM knowledge and information [31–34]. For name disambiguation, the neighboring concepts in NLQs and their relationships encoded in the ontology provide key evidence for inferring which entity the user refers to. For logical form prediction, an IFC ontology and NLQ texts together determine the entire relational path between entities. Many studies have proposed integrating natural language processing (NLP) and ontologies to parse NL texts for BIM model checking [35–38] and retrieval [11,30]. However, the existing rule-based methods cannot adapt to the different ontologies and text patterns [39] that are necessary to solve the identified problems. Furthermore, it is intractable to manually formalize rules that extract features of the ontology structure for text parsing.
Inspired by recent advances in graph-based machine learning (ML), this study proposes a novel graph neural network (GNN)-based approach that incorporates ontologies for Text-to-BIMQL semantic parsing. GNNs perform learning and computation directly over structured graph data [40], and thus have advantages in incorporating ontological BIM knowledge for enhanced query parsing. This is realized by using GNNs to fuse the contextual information of queries and an IFC ontology into heterogeneous graphs for unified representation learning and inference. The proposed method consists of two stages. In the first stage, the name mentions in NLQs are linked to the most relevant entities in the ontology by scoring the subgraphs of each candidate. In the second stage, the dependencies, logical connections, and semantic relationships between entities are extracted in one go by a GNN edge prediction layer. Finally, the results of the two stages are turned into structured queries to retrieve IFC-based BIM models. Compared with existing methods, our approach avoids developing enormous rule sets to parse NL texts via an ontology, which suffer from varying sentence patterns and conditions. This study therefore improves the robustness and practical value of NL-based BIM model retrieval in construction projects.

The remainder of this paper is structured as follows. Section 2 introduces the background of the study. Section 3 illustrates the proposed approach. Section 4 provides a performance evaluation. Section 5 discusses the advantages and limitations of the proposed method. Section 6 concludes by outlining the significance of this research.
2. Background
2.1 Traditional methods for IFC BIM model retrieval and extraction
Since IFC has become a widely used open data standard for the explanation, exchange, and sharing of BIM data [41], most existing BIM query systems have been implemented based on the IFC schema. For example, there are application programming interfaces (APIs) and toolkits, such as IFC Engine [42] and Xbim Essentials [43], that allow developers to use programming languages (e.g., C#) to augment the model repository. Furthermore, different professional query languages have been proposed for end-users to retrieve BIM models, with distinct functions, scopes, and purposes. The Building Environment Rule and Analysis (BERA) language [44] is a popular BIM query language for rule analysis and checking. The QL4BIM (Query Language for Building Information Models) framework [45] enables spatial reasoning in BIM models. Mazairac and Beetz [6] present BIMQL, an open query language that lets end users create, read, update, and delete (CRUD) IFC models.
Recently, there has been a trend toward using ontologies to represent BIM schemas and models [31,32,46,47], taking advantage of the strengths of semantic web technologies, such as query and reasoning engines. This has led to a group of studies exploring how to use ontologies to filter BIM models in the form of the Resource Description Framework (RDF) [48]. The most commonly used format of semantic BIM is the ifcOWL ontology [31], which is an equivalent transformation of the EXPRESS-based IFC data schema. Following this, Zhang et al. [47] proposed the BIMSPARQL framework to retrieve ifcOWL instances. Query functions, such as geometric information retrieval and spatial reasoning, are extended by SPIN (SPARQL Inference Notation) rules [49]. Similarly, de Farias et al. [46] utilize the Semantic Web Rule Language (SWRL) to extract partial views of ifcOWL instances, in which logic rules are pre-encoded according to data requirements. Our approach chooses SPARQL (SPARQL Protocol and RDF Query Language) as the output query language due to its well-structured syntax, functionality, and popularity. The resulting queries can be executed to retrieve IFC BIM models in RDF format.
2.2 Existing semantic parsing methods for natural language-based BIM querying
Due to the ambiguity of human language, a crucial success factor in NL-based BIM model retrieval is semantic parsing. Currently, most methods are rule-based or pattern-based and can only process simple queries with few variables and constraints. Lin et al. [13] present a seminal work that uses NL to retrieve IFC BIM models stored in a cloud database. The International Framework for Dictionaries (IFD) [50] is leveraged for keyword extraction before the relationships are discovered using syntactic parsing and IFC graphs. Shin and Issa [30] propose the BIMASR (building information modeling automatic speech recognition) framework, which uses voice input to collect and manipulate BIM models in a relational database management system (RDBMS). The open-source NLP2SQL library [51] was adopted for parsing NLQs, but the query level is limited to one-dimensional queries that only involve the wall entity and its properties. Elghaish et al. [26] propose an AI-based voice assistant for BIM data management. Their system remains at a "proof of concept" stage and can only translate one type of NLQ command ("create a room schedule") into a Dynamo script [52]. Wang et al. [15] propose an NLP-based question-answering (QA) system for BIM information extraction (IE). In their SP module, two fixed styles of NLQ can be interpreted by matching patterns in syntactic parse trees. Wang et al. [53,54] further propose a transfer learning-based text classification method to identify query types and apply a neural network to recognize named entities in NLQs. However, it only supports rigid query patterns that contain 1-3 variables.
If BIM users want to acquire objects in BIM models under customized constraints, pattern matching methods cannot cope with the changing conditions. Therefore, Yin et al. [11] propose an ontology-based SP pipeline for NL-based BIM model retrieval that allows users to compose queries with arbitrary combinations of (a) object class, type, and instance; (b) logical connections (e.g., disjunction); (c) relationships between contextual objects (e.g., placement); and (d) attribute constraints (e.g., property, material). However, as the extent of BIM model retrieval grows, it was found that a mention can be mapped to several places in IFC models, causing name ambiguity. Meanwhile, the mentioned entities are not always 1-hop related in the IFC model, which requires multi-hop relational reasoning to formulate executable queries.
2.3 Graph neural networks and their applications in the construction industry
A graph is an abstract data type in computer science, consisting of a set of vertices together with edges joining certain pairs of nodes [55]. Graphs are commonly used to model objects and their relationships in many real-world situations, such as social networks [56]. While current deep neural networks such as convolutional neural networks (CNNs) can effectively extract the hidden features of Euclidean data (e.g., images), they cannot handle non-Euclidean graph data structures [57]. Hence, GNNs were developed to learn representations from graphs by considering both the continuous node features and the graph structure itself. During training, a GNN propagates information across all nodes depending on the states of their neighborhoods and collectively aggregates the information in the graph. This makes GNNs effective ML models for processing graph data. Based on the message-passing mechanisms of GNNs, multi-hop relational reasoning can be performed with respect to graphs [58,59], and thus an inductive bias can be introduced to prioritize one solution over others for graph problems [34]. In practice, GNNs achieve high performance in many graph analytic tasks, including node classification, edge prediction, and graph classification [60]. Consequently, GNN models have been widely applied to different scenarios involving non-Euclidean data, such as knowledge graph completion [61] and molecule property prediction [62].
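The neighborhood aggregation described above can be illustrated with a toy example. The mean aggregation and the fixed 0.5/0.5 mixing weights below are stand-ins for the learned functions of a real GNN; they are our simplification, not part of any cited model:

```python
# Toy illustration of one round of message passing: each node keeps half of
# its own state and averages in its neighbours' states (a stand-in for the
# learned aggregation functions of a real GNN).
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}   # undirected toy graph
h = {"a": 1.0, "b": 3.0, "c": 5.0}                  # scalar node states

def propagate(graph, h):
    return {
        n: 0.5 * h[n] + 0.5 * sum(h[m] for m in graph[n]) / len(graph[n])
        for n in graph
    }

h1 = propagate(graph, h)
print(h1["a"])  # 0.5*1.0 + 0.5*(3+5)/2 = 2.5
```

Repeating `propagate` L times lets information travel L hops, which is the mechanism behind the multi-hop relational reasoning mentioned above.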
GNNs are being used in an increasing number of construction-industry applications to solve various graph-related problems. Wang et al. [63] propose SAGE-E, an improvement on GraphSAGE [64], for semantic enrichment of BIM models, where rooms and their connections in apartment layouts are represented as nodes and edges to automatically classify room types. Collins et al. [65] utilize Graph Convolutional Networks (GCNs) to classify semantic categories of IFC objects. Hu et al. [66] use spatial-temporal GCNs to model the interdependency relationships between buildings for reducing large-scale building energy consumption. Kim and Chi [67] propose a semantic GNN approach to simulate the propagation effects of different resource-to-resource interactions.
Compared with traditional NLP models that work on sequential data (e.g., text), such as LSTMs (long short-term memory networks) [68], GNNs can better capture the features of BIM ontology structure through graph learning. Hence, GNNs are leveraged for Text-to-BIMQL SP in this study.
2.4 Gaps in knowledge
Semantic parsing for BIM model retrieval and extraction is still at an early stage. Existing methods rely on predefined rules to achieve either (a) accurate parsing of simple queries with fixed patterns, or (b) moderate parsing of complex queries with varied expressions and conditions. While the latter type of query can support more fine-grained retrieval of BIM models, the problems surrounding name ambiguity and relational path extraction have not yet been sufficiently addressed to successfully interpret NLQs into executable queries. In addition, the existing rule-based methods [11,30] that integrate ontology and NLP for interpreting NL texts lack the adaptivity and flexibility to disambiguate entities and extract multi-hop relational paths in NLQs under changeable contexts. No method has used graph ML to capture the features of ontological graph structure to effectively infer the logical forms of text-based BIM queries.
To fill the above gaps, this study aims to deliver a new semantic parsing method that exploits GNNs to bridge NL texts and ontology for intelligently retrieving BIM models. The objectives are (a) to link all mentions in NLQs to referent entities in the ontology based on a contextual graph structure; and (b) to extract the entire relational path between entities over IFC ontological graphs to construct standard BIM queries.
3. The proposed approach
3.1 Scope
Following the scope defined in [11] for multi-constraint BIM querying, the NLQs addressed in this study support the following conditions for NL-based BIM retrieval:
• Attribute constraints that allow objects to be filtered by their types (e.g., "search windows with a type of 250mm x 500mm"), properties (e.g., "find load bearing walls"), quantities (e.g., "beams that have a gross volume of 0.5 m³"), and materials (e.g., "slabs made of concrete").
• Abstracted semantic relationships between objects, such as containment and composition. This study considers a total of 11 object relationships for demonstration and testing. The details can be found in Appendix I.
The above constraints can be arbitrarily combined with logical operations, including conditional conjunction and disjunction. Only one sentence is supported in each NLQ.
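As an illustration only, one such multi-constraint NLQ can be decomposed into the supported condition types roughly as follows; the field names and structure are ours, not the paper's internal representation:

```python
# Hypothetical decomposition of the NLQ
# "find load bearing walls on level 2 made of concrete"
# into the condition types listed above (all names illustrative).
parsed = {
    "object_class": "IfcWall",
    "attribute_constraints": [
        {"kind": "property", "name": "LoadBearing", "op": "=", "value": True},
        {"kind": "material", "name": "concrete"},
    ],
    "relationships": [
        {"type": "isContainedIn", "target_class": "IfcBuildingStorey",
         "target_name": "level 2"},
    ],
    "logic": "conjunction",  # all constraints must hold
}

n_conditions = len(parsed["attribute_constraints"]) + len(parsed["relationships"])
print(n_conditions)  # 3
```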
3.2 Overview
Although the state-of-the-art (SOTA) semantic parsers are Sequence-to-Sequence (Seq2Seq) models that directly generate structured queries [18], building such a system to query BIM models is not possible due to the lack of sufficiently annotated datasets. Instead, our method decomposes the SP task into two primary parts: (a) linking all name mentions in NLQs to entities in a BIM ontology; and (b) extracting relational paths between entities to derive a logical form of the query. The BIM ontology refers to ontologies that structurally describe building entities (classes and individuals), the class hierarchy, properties, and relationships.

An overview of the proposed GNN-based semantic parsing method for BIM model extraction (GSP4BIM) is presented in Fig. 2. To begin with, there are several preprocessing jobs, including embedding all ontological entities into vectors and simplifying the original RDF graph of the ontology into an ontology graph (OG).
The first stage of semantic parsing acquires candidate entities by matching surface strings in NLQs against the ontology. Ambiguous mentions are automatically linked to referent entities via a GNN-based entity linking (GNN-EL) model. For each candidate entity, a subgraph gathering related entities is retrieved and scored by GNN-EL to estimate the probability of it being the referent entity.

Having obtained all the entities, the second stage extracts the relational paths between entities. The dependency parsing (DP) graph and the OG subgraph are concatenated into a heterogeneous graph. The dependency, logical, and semantic relations between the mentioned entities are simultaneously predicted by a GNN link prediction layer. After that, the existence of multi-hop relationships is detected, and supplementary relationships are extracted to find the entire relational path.

Through this two-stage semantic parsing, the results are organized into a graph-based logical form containing the entities and relational paths. They are then automatically transformed into standard SPARQL queries to retrieve IFC-based BIM models.
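To make that last transformation step concrete, a minimal sketch of turning a parsed logical form into SPARQL text is shown below. The `inle:` namespace and the triple-pattern property names are placeholders of our own, not the actual INLE/ifcOWL identifiers used by GSP4BIM:

```python
def to_sparql(object_class: str, storey_name: str) -> str:
    """Compose a minimal SPARQL query from a parsed logical form.

    The namespace and triple patterns are illustrative placeholders,
    not the exact identifiers used by the proposed method.
    """
    return (
        "PREFIX inle: <http://example.org/inle#>\n"
        "SELECT ?obj WHERE {\n"
        f"  ?obj a inle:{object_class} .\n"
        "  ?obj inle:isContainedIn ?storey .\n"
        f'  ?storey inle:name "{storey_name}" .\n'
        "}"
    )

query = to_sparql("Wall", "roof")
print(query)
```

A full composer would walk every entity and relational path in the logical form and emit one triple pattern (or FILTER clause) per constraint.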
Fig. 2. Overview of the proposed GNN-based two-stage semantic parsing approach. Contributions are highlighted.

3.3 Preprocessing of BIM ontology graphs
In our study, the IFC Natural Language Expression (INLE) ontology [11], an open modular ontology specialized for NL-based retrieval of IFC data models, was employed as the source BIM ontology. It provides wrapped classes representing NL expressions of IFC concepts (e.g., synonyms and hyponyms), a simplified class hierarchy, and abstracted semantic relationships (e.g., isContainedIn). The scope of the INLE ontology encompasses IfcElement (e.g., IfcWall, IfcBeam, IfcSpace, and IfcBuildingStorey), IfcProperty, IfcPhysicalQuantity, IfcMaterial (including material layer and list), IfcTypeObject, and object attributes like tags and long names. This study additionally models the class for IfcPropertySingleValue to identify literal data values in NLQs.
As discussed in [11], there are project-specific concepts in BIM models that are beyond the scope of IFC-related semantic models. For example, the property "HeadHeight" for doors in BIM models is not covered by the IFC predefined property sets. Hence, a model-based ontology population (MOP) method [11] was exploited to assimilate entity names from the target BIM model and populate the INLE ontology with instances and synonyms. This study additionally extracts literal property values from BIM models.

To input ontology data files into GNNs, we adopted OWL2Vec* [69], an ontology embedding method based on random walks and Word2Vec [70], to encode the semantics of ontological entities into vector representations. Consequently, the class and instance entities in the INLE ontology were embedded into 100-dimensional vectors.
Finally, considering that the original RDF graph of the INLE ontology is too large and complex for graph learning, our approach converts it into a simplified multi-relational directed graph, i.e., a directed graph that can have multiple typed edges between nodes. In addition to basic RDF and OWL relationships such as "rdfs:subClassOf" and "rdf:type" preserved as edges, we simplified the object properties used to encode semantic relationships into new typed directed edges. To this end, the ontology graph is denoted as OG = (V, E), where V is the set of entity nodes; E ⊆ V × R × V is the set of edges; and R represents the set of relation types.
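A minimal in-memory form of such a multi-relational graph might look like the sketch below; the entities and relation names are toy examples, not the full INLE vocabulary:

```python
from collections import defaultdict

class OntologyGraph:
    """Directed graph allowing multiple typed edges between nodes:
    OG = (V, E) with E a set of (head, relation, tail) triples."""

    def __init__(self):
        self._out = defaultdict(list)   # head -> [(relation, tail), ...]

    def add(self, head, relation, tail):
        self._out[head].append((relation, tail))

    def neighbors(self, node, relation=None):
        # Tails reachable from `node`, optionally filtered by edge type.
        return [t for r, t in self._out[node] if relation is None or r == relation]

og = OntologyGraph()
og.add("IfcWall", "subClassOf", "IfcElement")
og.add("IfcWall", "hasProperty", "LoadBearing")
og.add("wall_184944", "type", "IfcWall")

print(og.neighbors("IfcWall"))                 # ['IfcElement', 'LoadBearing']
print(og.neighbors("IfcWall", "subClassOf"))   # ['IfcElement']
```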
3.4 Entity linking for first-stage semantic parsing
The initial stage of semantic parsing identifies and disambiguates the mentioned IFC entities in NLQs. The NLQ text is first matched against the ontology (Section 3.4.1) to generate candidate entities. Ontology subgraphs are then extracted for the candidates (Section 3.4.2) and processed by GNNs for name disambiguation (Section 3.4.3).
3.4.1 Surface string matching and candidate generation
The named entity recognition (NER) method proposed in [11] was first applied to match the surface strings of NLQs against entity names in the ontology. The set of matched entities is denoted as the topic entities T.

If a name mention can be mapped to only one entity, there is no need to perform EL. Otherwise, all candidate entities are collected for that name mention. For each ambiguous name mention, the candidate entities are denoted as C = {c1, c2, ..., cn}, and the entities mapped to the other name mentions (including other ambiguous name mentions) are denoted as the question entities Q = {q1, q2, ..., qm}, where C, Q ⊆ T and C ∩ Q = ∅.
3.4.2 Ontology-guided subgraph extraction
For each candidate entity of a name mention, a subgraph linking the candidate entity and the question entities is retrieved from the ontology graph. To reduce the noisy nodes that cause overfitting and heavy computation, this study proposes an ontology-guided subgraph retrieval method that extracts informative nodes through specified edges. As shown in Fig. 3, the method consists of four steps.
(a) For every topic entity node, its neighbor nodes in the OG are retrieved, except for its instances.
(b) Up to 3 higher-order superclasses of the entities are retrieved via "subClassOf" edges.
(c) Nodes linked to the nodes newly extracted in (b) are retrieved via edges that are neither "subClassOf" nor "type".
(d) Up to 3 higher-order superclass nodes of the nodes newly extracted in (c) are extracted.
Fig. 3. Ontology-guided subgraph extraction for a single entity node.
Finally, all retrieved nodes for the topic entities are gathered, and the edges between these nodes are retrieved to generate an OG subgraph OGsub = (Vsub, Esub). The extra nodes in the OG subgraph that are not mentioned in the NLQ are denoted as Vextra.
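Under the assumption that the OG is held as an adjacency list of (relation, tail) pairs, steps (a)-(d) above can be sketched as follows. The entities are toy examples, and the real traversal also merges results across all topic entities and restores the edges between them:

```python
# Toy adjacency list: node -> [(relation, tail), ...]
EDGES = {
    "IfcWall": [("subClassOf", "IfcElement"), ("hasProperty", "LoadBearing")],
    "IfcElement": [("subClassOf", "IfcProduct"),
                   ("isContainedIn", "IfcSpatialElement")],
    "IfcSpatialElement": [("subClassOf", "IfcProduct")],
}

def out_edges(node):
    return EDGES.get(node, [])

def super_classes(node, max_hops=3):
    # Follow "subClassOf" edges upward for at most max_hops levels.
    chain, cur = [], node
    for _ in range(max_hops):
        nxt = [t for r, t in out_edges(cur) if r == "subClassOf"]
        if not nxt:
            break
        cur = nxt[0]
        chain.append(cur)
    return chain

def extract_subgraph(topic):
    nodes = {topic}
    # (a) direct neighbours, excluding instances ("type" edges)
    step_a = [t for r, t in out_edges(topic) if r != "type"]
    nodes.update(step_a)
    # (b) up to 3 superclasses via "subClassOf"
    step_b = super_classes(topic)
    nodes.update(step_b)
    # (c) nodes linked to step-(b) nodes via edges other than subClassOf/type
    step_c = [t for n in step_b for r, t in out_edges(n)
              if r not in ("subClassOf", "type")]
    nodes.update(step_c)
    # (d) up to 3 superclasses of the step-(c) nodes
    for n in step_c:
        nodes.update(super_classes(n))
    return nodes

print(sorted(extract_subgraph("IfcWall")))
```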
3.4.3 Graph neural network for entity linking
The GNN-EL model for linking mentions to IFC entities is based on the relational graph attention network (RGAT) [71,72], which uses self-attention mechanisms to iteratively compute node representations by attending over each node's relation-aware neighbors. Recently, RGATs have been widely applied to various language understanding problems, such as Text-to-SQL [73], sentiment analysis [74], and question-answering systems [58]. Inspired by these works, GNN-EL adapts the RGAT network structure to choose the most appropriate IFC entity from among all the candidate entities when resolving ambiguous name mentions.

An overview of the GNN-EL model is presented in Fig. 4. Given an NLQ and its ambiguous name mentions ("roof" in this example), the GNN module systematically disambiguates each name mention by predicting the probability of every candidate entity. The probability score of a candidate c is derived by joint reasoning over the NLQ context and the OG subgraph. Owing to the different modalities of the NLQ text and the OG, a working graph is created that unifies the representation of both sources of information. The representations of the nodes in the working graph are iteratively updated through several rounds of message passing. Finally, the features of the context node and the pooled working graph are concatenated and fed into a multilayer perceptron (MLP), which outputs the predicted probability score of the candidate entity.
Fig. 4. GNN-based approach to linking ambiguous name mentions to appropriate IFC entities. Gray edges in the graph denote the original edges in the OG, such as "subClassOf" and "isContainedIn".
3.4.3.1 Working graph construction for joint representation
The NLQ is first encoded into a vector representation using language models (LMs). In this study, RoBERTa (Robustly Optimized BERT Pretraining Approach) [75] was adopted due to its superior performance. The embedding of the [CLS] token in the BERT output was utilized as the vector representation of the NLQ context.

The NLQ is then connected to the OG subgraph by injecting a context node that represents the NLQ context (the orange node in Fig. 4). This context node is linked to the candidate nodes and the question nodes with new relationship types that specify the type of the connected node, and with position edges that represent the position of the mentioned entities in the NLQ sentence.
To encode the position information of entities in an NLQ context, the relative position ratios of entities are first calculated by dividing the start and end indexes of the entities by the total length of the NLQ. Next, a distance bin method [62] is applied to map each position ratio into one of ten bins within the interval [0, 1] (see "Position bins" in Fig. 5). Each i-th bin represents a distinct positional edge type used to connect the context node and entity nodes.
Analogous to position edges, our method also encodes the distance between topic entities in NLQs as typed edges. The absolute difference between the start indexes of each pair of topic nodes is computed and divided by the sentence length to obtain a distance ratio. A distance edge of type j is then added between the topic nodes, where j depends on which distance bin the ratio falls into. The numbers of intervals for position edges and distance edges are determined based on a fine-tuning experiment. Fig. 5 provides an example of both edge types for the nodes "base offset" and "BuildingStorey".
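The bin mapping described above can be sketched as follows. This is an illustrative implementation, not the paper's code: function names are assumptions, and a ratio of exactly 1.0 is clamped into the last bin.

```python
# Map an entity mention's position (and the distance between two mentions)
# in an NLQ into one of ten discrete bins over [0, 1].

def position_bin(start, end, sent_len, n_bins=10):
    """Return bin indices for the start/end position ratios of a mention."""
    start_ratio = start / sent_len
    end_ratio = end / sent_len
    # min(..., n_bins - 1) keeps a ratio of exactly 1.0 inside the last bin
    return (min(int(start_ratio * n_bins), n_bins - 1),
            min(int(end_ratio * n_bins), n_bins - 1))

def distance_bin(start_a, start_b, sent_len, n_bins=10):
    """Bin index for the distance between the start indexes of two topic entities."""
    ratio = abs(start_a - start_b) / sent_len
    return min(int(ratio * n_bins), n_bins - 1)
```

The returned bin index selects which typed position (or distance) edge is added to the working graph.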
Fig. 5. Examples of position edges and distance edges between the context node and the entity nodes "base offset" and "BuildingStorey".
As a result, a working graph G_w = (V, E) was obtained, which was then input into the GNNs to update the representations of the NLQ context and the OG subgraph.
3.4.3.2 Message passing in the GNN architecture
As shown in Fig. 6, the next step after constructing the working graph is to perform multiple rounds of message passing in the RGAT network to update the representations of the nodes. The input is a set of node features h = {h_1, h_2, ..., h_N}, where h_i denotes the initial node embedding (dimension D = 100), which comes from a linear transformation of the ontology embeddings in Section 3.3 and the sentence vector of the NLQ context in Section 3.4.3.1; N denotes the total number of nodes in the working graph G_w.
The hidden state of each node is updated through L layers of message passing in RGAT. Specifically, the hidden state h_t^(l+1) of a target node t at the (l+1)-th layer is computed as follows:

q_t = W_q [h_t^(l) || u_t],   k_s = W_k [h_s^(l) || u_s || r_st]   (1)

α_st = Softmax_{s ∈ N_t} ( q_t^T k_s / √(D/K) )   (2)

m_st = W_m (h_s^(l) || u_s || r_st)   (3)

a_t = ||_{k=1..K} Σ_{s ∈ N_t} α_st^k m_st^k   (4)

h_t^(l+1) = MLP( a_t + h_t^(l) W )   (5)

where m_st represents the message passed from a relation-aware neighboring node s (a source node whose hidden state is h_s^(l) at the l-th layer) to the target node t, and α_st is an attention weight that indicates the importance of the message from s to t; || denotes vector concatenation; the matrices W_q, W_k, W_m, and W are learnable parameters; K is the number of heads of the multi-head attention [76] used to stabilize the learning process; N_t denotes the receptive field of node t; Softmax represents the Softmax function operating over N_t; and MLP denotes a 2-layer multilayer perceptron [58] with batch normalization [77]. Furthermore, both the message m_st and the attention α_st incorporate the node type embeddings (u_s, u_t) and the edge embedding (r_st), where the node type embeddings come from a linear transformation of one-hot vectors over the |T| node types. Herein, |T| is 4, representing the four node types: question nodes, candidate nodes, extra nodes, and context nodes. The edge embedding, on the other hand, is computed as follows:

r_st = MLP( e_st || u_s || u_t )   (6)

where e_st ∈ {0,1}^|R| is a one-hot vector that symbolizes the type of the relationship between s and t.

Fig. 6. GNN-EL architecture for structured reasoning over working graph.
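A single-head, NumPy-only sketch of the relation-aware attention step in Eqs. (1)-(5) is given below. It is an illustration, not the paper's implementation: the output MLP, batch normalization, and multi-head concatenation are omitted, and all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # node feature dimension (100 in the paper; small here for illustration)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rgat_layer(h, u, r, edges, Wq, Wk, Wm):
    """One simplified single-head relation-aware attention step.
    h: (N, D) node hidden states; u: (N, D) node-type embeddings;
    r: dict mapping (s, t) -> (D,) edge embedding; edges: list of (s, t)."""
    N = h.shape[0]
    h_new = h.copy()                                # residual term h_t^{(l)}
    for t in range(N):
        nbrs = [s for (s, tt) in edges if tt == t]  # receptive field N_t
        if not nbrs:
            continue
        q = Wq @ np.concatenate([h[t], u[t]])
        ks = np.stack([Wk @ np.concatenate([h[s], u[s], r[(s, t)]]) for s in nbrs])
        ms = np.stack([Wm @ np.concatenate([h[s], u[s], r[(s, t)]]) for s in nbrs])
        alpha = softmax(ks @ q / np.sqrt(D))        # attention over N_t
        h_new[t] = h[t] + alpha @ ms                # aggregate weighted messages
    return h_new
```

Stacking L such layers (with the omitted MLP between them) corresponds to the L rounds of message passing described above.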
3.4.3.3 Prediction and training
Given a candidate entity e for an ambiguous name mention in an NLQ, the probability of e being the intended entity is estimated as follows:

p(e | c) = MLP( Dropout( z_LM || h_ctx^(L) || g ) )   (7)

where z_LM is the LM representation of the NLQ text; h_ctx^(L) denotes the hidden state of the context node in the final RGAT layer; and g is a multi-head attention pooling over the hidden states of all nodes in the OG subgraph. The concatenated features are passed through a dropout layer [78] and a 2-layer MLP with layer normalization [79] before the final probability scores are output. The activation function adopts the GELU (Gaussian Error Linear Units) [80] function.
Since a probability score can be estimated for each candidate entity, the name mention is linked to the entity with the highest score. During the training phase, the graph data are obtained by converting NLQs and ontology graphs into working graphs. The model parameters are optimized by backpropagation. The loss function is the cross-entropy loss, defined as:

L(y, p) = − Σ_i y(e_i) log p(e_i)   (8)

where y(e_i) and p(e_i) are the ground truth and the estimated probability of candidate entity e_i being the referent entity in the NLQ context.
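At inference and training time, Eqs. (7)-(8) reduce to normalizing per-candidate scores into probabilities and computing a cross-entropy loss. The sketch below illustrates this step only; the scores themselves would come from the MLP in Eq. (7), and all names are assumptions.

```python
import numpy as np

def candidate_probs(scores):
    """Normalize per-candidate scores into a probability distribution;
    the linked entity is the candidate with the highest probability."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

def cross_entropy(y_true, probs, eps=1e-12):
    """Eq. (8): -sum_i y(e_i) * log p(e_i) over the candidate entities."""
    return -np.sum(y_true * np.log(probs + eps))
```

For example, scores of [2.0, 0.5, -1.0] link the mention to the first candidate, and a one-hot ground truth on that candidate yields a small loss.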
3.5 Relational path extraction for second-stage semantic parsing
After the IFC entities in NLQs are resolved, they are passed on to the second-stage model for relation extraction (RE). Prior to RE, nearby entities that refer to the same object instance (e.g., the IfcSpace class entity and the space instance entity "S101" for "room S101") are merged into one entity, as their relationship is deterministic.
Based on the scope in Section 3.1, this study considers a total of 36 types of directed relationships in the following aspects:
(a) dependency: if two entities in an NLQ do not have any connection, they have a "No relation" relationship.
(b) logical relationship: a logical disjunction relationship (inclusive Logic-OR) between two entities. The entities on both sides can be data values (e.g., "list objects with phase created at construction phase or operation phase"), object instances (e.g., "find walls at level 1 or level 2"), or object attributes (e.g., "retrieve walls whose area is less than 8 and length less than 4 m").
(c) semantic relationships between contextual objects, such as containment and spatial adjacency.
(d) attribute value constraints that require relationships like "hasProperty" for object entities and "hasPropertyValue" for property entities.
Each relationship can be mapped to an object property in the INLE ontology. Given an NLQ and a pair of a head entity e_head and a tail entity e_tail, a GNN model first predicts the major relationship r_major for (e_head, e_tail) from among the 36 relationships in Section 3.5.1. Afterward, supplementary relationships are sought to trace an entire relational path between the entities in Section 3.5.2. The full list of relationships and definitions is presented in Appendix I.
3.5.1 Graph neural networks for link prediction
In contrast to the first-stage GNN-EL model, which uses a context node to bridge the NLQ context and the ontology graph, the second-stage GNN-based relation extraction (GNN-RE) model uses a DP graph to represent NLQ contextual information, which more accurately captures the dependencies between entities in an NLQ. A DP graph comes from dependency parsing, which analyzes the syntactic structure of a sentence and establishes grammatical relations between words [81]. The resulting DP graph, as shown in Fig. 7, consists of a set of nodes representing sentence tokens and edges denoting the dependency relations between head and dependent. Additionally, extra edges are appended between neighboring tokens to encode the sequential information of words into the DP graph. The resulting DP graph is concatenated with the OG subgraph to form a new working graph, which is then passed through GNNs to predict the edge types (relationships) between every pair of head and tail entities in the NLQ context.
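The DP-graph construction can be sketched as follows. The paper obtains dependency arcs from the SuPar parser and manipulates graphs with NetworkX; this minimal sketch takes the arcs as given and uses plain dictionaries, so all names are illustrative.

```python
def build_dp_graph(tokens, arcs):
    """Build a dependency-parse (DP) graph: nodes are token indices,
    directed edges are head -> dependent relations, plus sequential
    edges between neighboring tokens to keep word order.
    `arcs` is a list of (head_idx, dep_idx, label) with 0-based indices."""
    edges = {}
    for head, dep, label in arcs:
        edges[(head, dep)] = label
    for i in range(len(tokens) - 1):
        # sequential word-order edge; an existing dependency arc takes priority
        edges.setdefault((i, i + 1), "next")
    return {"tokens": list(tokens), "edges": edges}
```

For instance, the tokens ["show", "the", "walls"] with arcs (0, 2, "obj") and (2, 1, "det") yield a graph with two dependency edges and two sequential edges.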
3.5.1.1 Working graph construction
For each entity pair (e_head, e_tail), a working graph is constructed that combines the OG subgraph and the DP graph. Here, the disambiguated entities are used to extract a new OG subgraph using the same method as shown in Section 3.4.2.
The strategy for constructing the working graph is as follows. As shown in Fig. 8, the head node and tail node are linked to their corresponding tokens with dedicated relation types. Since the content between the name mentions of two entities is informative for inferring relationships, the tokens sandwiched between the two mentions are linked to the head and tail entities with their own relation types (see the red and blue dotted lines in Fig. 8). In addition, the other extracted entities, termed question nodes, are linked to their mapped tokens in the NLQs with a question-node relation type.

Fig. 7. The DP graph of an NLQ, representing the grammatical relations between tokens.
3.5.1.2 GNN architecture for relation classification
As shown in Fig. 8, the proposed GNN-RE model is a link prediction model that predicts the type of relationship between the head and tail nodes. In general, the architecture of GNN-RE is close to that of GNN-EL, but it differs in the working graph structure, the node types, and the final output layer.
(a) Initial embeddings of nodes in the working graph
In the GNN-RE model, the initial node embeddings that represent the NLQ context utilize the embeddings of each word in the NLQ from the RoBERTa outputs, noted as {w_1, w_2, ..., w_n}. All 768-dimensional word vectors are converted into 100-dimensional vectors through a linear transformation. The initial hidden states of entity nodes in the OG subgraph are the same as in the first-stage setup.
(b) Node types
In contrast to GNN-EL, the GNN-RE model has five node types: token node, head entity node, tail entity node, question entity node, and extra entity node.
(c) Output layer for multi-class classification
The hidden states of all nodes in the working graph are updated through an L-layer RGAT. After message passing, the hidden states of the head entity node and the tail entity node at the last RGAT layer, noted as h_head^(L) and h_tail^(L), are extracted. The two vectors are then concatenated and passed through a 2-layer MLP with layer normalization (see Fig. 8). The output size is the number of relation types (36), which gives the probability score of each relation type for the entity pair (e_head, e_tail). Finally, the multi-class classifier predicts the edge by choosing the relationship with the highest score.
In the training data, each entity pair is labelled with one ground-truth relationship. If multi-hop relationships exist between entities, the major relationship is adopted. Herein, the major relationship is defined as the mainly intended relationship in the NLQ, as opposed to the supplementary relationships that arise from the lack of a complete mention of variables. Consider the NLQ example in Fig. 8: the major relationship between the second "base offset" entity and "184944" is "isPropertyOf", which implies that the "base offset" property is possessed by an object instance. In contrast, "hasTag" is a supplementary relationship that arises from the missing mention of the wall entity. In this case, the GNN-based classifier is responsible for predicting the major relationships between entities. The model parameters are optimized using the cross-entropy loss.
3.5.2 Extra relation finding for multi-hop relational paths
Once the major relationship between entities has been found by the GNN-RE model, the connectivity between the relationship and the head/tail entity nodes in the ontology graph must be checked to identify whether supplementary relationships are needed. If so, the extra relationships are extracted based on the ontology.
3.5.2.1 Connectivity checking
The connectivity between a relationship and an entity node depends on whether the entity is an instance of, or is inherited from, the domain/range of the relationship. In the semantic web, domain and range assert that the subjects and objects of object property statements must belong to the class extension of the indicated class description [82]. Therefore, it is important to check whether the class restrictions are violated, to prevent invalid output queries.
Given the major relationship for an entity pair, the connectivity to the head and tail entities is checked separately. Specifically, the head entity is compared with the relationship's domain class, and the tail entity is compared with the range class. As shown in Fig. 9, the head entity "base offset" is connected to r_major, whereas the tail entity "184944" is disconnected because "Tag" is not inherited from the range class "Object" in the ontology.

Fig. 9. Process of supplementary relationship extraction.
3.5.2.2 Supplementary relationship extraction
Following the connectivity checking, the disconnected side(s) of r_major must acquire supplementary relationships. In this study, only one extra relationship is taken for each disconnected side, because the experimental results showed that this was enough to process the multi-hop relationships in the NLQs.
The extraction process starts by iterating over all relationships with their domain and range classes. The connectivity of each candidate supplementary relationship is evaluated as follows:
(a) For a disconnected head entity, the supplementary relationship should connect the head entity and the domain class of the major relationship.
(b) For a disconnected tail entity, the supplementary relationship should connect the tail entity and the range class of the major relationship.
Finally, the candidate relationship is selected if it passes connectivity checking on both sides. As shown in Fig. 9, "hasTag" is chosen as the supplementary relationship because it connects with "Object" (the range of r_major) and "184944" (an instance of the "Tag" entity). If more than one candidate relationship is returned, a graph distance score [83] is employed to prioritize relationships.
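The connectivity check and one-hop supplementary-relation search can be sketched over a toy ontology mirroring the Fig. 9 example. The class hierarchy and relation table below are assumptions for illustration, not the paper's data structures.

```python
# child class -> parent class (self-mapping marks a hierarchy root)
SUBCLASS = {"Wall": "Object", "Object": "Object", "Tag": "Tag",
            "PropertySingleValue": "PropertySingleValue"}

# relation -> (domain class, range class)
RELATIONS = {
    "isPropertyOf": ("PropertySingleValue", "Object"),
    "hasTag": ("Object", "Tag"),
}

def is_instance_of(entity_cls, target_cls):
    """True if entity_cls equals target_cls or is inherited from it."""
    seen, cls = set(), entity_cls
    while cls not in seen:
        if cls == target_cls:
            return True
        seen.add(cls)
        cls = SUBCLASS.get(cls, cls)
    return False

def find_supplementary(major_rel, head_cls, tail_cls):
    """For each disconnected side of the major relationship, search for one
    extra relationship that bridges the gap (one hop only)."""
    domain, rng_cls = RELATIONS[major_rel]
    result = {}
    if not is_instance_of(head_cls, domain):        # head side disconnected
        for rel, (d, r) in RELATIONS.items():
            if is_instance_of(head_cls, d) and is_instance_of(r, domain):
                result["head"] = rel
                break
    if not is_instance_of(tail_cls, rng_cls):       # tail side disconnected
        for rel, (d, r) in RELATIONS.items():
            if is_instance_of(d, rng_cls) and is_instance_of(tail_cls, r):
                result["tail"] = rel
                break
    return result
```

Running this on the Fig. 9 configuration ("isPropertyOf" between a PropertySingleValue head and a Tag tail) selects "hasTag" as the tail-side supplementary relationship.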
3.6 Automatic BIM query language generation
Once all the entities and relational paths are extracted from the NLQ, the graph-based logical form (see Fig. 10) can be derived and easily transformed into different types of query languages for retrieving IFC-based BIM models. This study employs the template-based method [11] to automatically generate SPARQL queries by filling the slots of prepared query templates with the identified variables, classes, instances, object properties, and data values. The resulting queries are executed by the BIMSPARQL framework [84] to retrieve ifcOWL-based BIM model instances. The ifcOWL format is used to represent IFC models because of its recent popularity [85], and BIMSPARQL provides extension functions to load and process ifcOWL instances.
Fig. 10 shows an example of the resulting SPARQL query. All recognized entities are regarded as variables with the asserted classes. The extracted semantic relationships are replaced with the corresponding SPIN (SPARQL Inference Notation) inference rules [49] from BIMSPARQL [84] and NLQ4BIM [11].
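A minimal illustration of the template-filling idea follows. The template, prefixes, and property name are simplified placeholders, not the actual BIMSPARQL or NLQ4BIM templates.

```python
# A one-slot-family SPARQL template; real templates would carry prefix
# declarations and SPIN-rule-backed predicates.
TEMPLATE = """SELECT ?{var} WHERE {{
  ?{var} a {cls} .
  ?{var} {prop} {value} .
}}"""

def fill_template(var, cls, prop, value):
    """Fill the slots of the template with the extracted variable, class,
    object property, and data value."""
    return TEMPLATE.format(var=var, cls=cls, prop=prop, value=value)

# hypothetical property/prefix names for illustration only
query = fill_template("wall", "ifc:IfcWall", "ex:hasBaseOffset", '"0.052"')
```

The filled string is then executable against an ifcOWL model once the appropriate prefixes and inference rules are in place.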
4. Approach evaluation and validation
This research assesses the efficiency and effectiveness of the proposed approach in two aspects. First, a laboratory experiment is conducted for performance evaluation in comparison with SOTA models. Second, a case study is carried out in a real-world project to validate the practicability of the approach. In the following parts, Section 4.1 introduces the implementation details, Sections 4.2 and 4.3 illustrate the experiment design and test results, respectively, and Section 4.4 demonstrates the real-world case.
4.1 Implementation
Our study implemented all algorithms in Python (ML model development) and Java (ontology and BIM model processing). The BIM models in the IFC2×3 TC1 specification [86] were converted into ifcOWL instances [87] in RDF format for ontology population and SPARQL query execution, based on the Apache Jena framework [88]. The INLE ontology was converted into graph data using the RDFlib toolkit [89] and was further manipulated using the NetworkX package [90] for graph simplification. For text processing, the SuPar toolkit [91] was used to carry out dependency parsing. Finally, all GNN models were built with the PyTorch library [92]. The performance evaluation of the EL and RE models was based on the Scikit-learn library [93].
Fig. 10. The automatically generated SPARQL query.
4.2 Experiment design
To evaluate the proposed method for BIM model retrieval, our study extends the BIM-NLQ dataset published in [11,29]. The scale of the dataset was enlarged from around 200 queries to 786 queries over five architectural and structural BIM models. The newly added NLQs were manually created by the first and sixth authors, with an emphasis on ambiguous entities and multi-hop relations. A comparison with the only large publicly available BIM-NLQ dataset, the iBISDS dataset [53], is outlined in Table 1.
Table 1. Description of the developed dataset.

| Aspect | iBISDS dataset | Our dataset |
| Number of mentioned variables | 1-3 variables per query, with an average of 2.57 | 1-8 variables per query, with an average of 3.02 |
| Data annotation | Entity type and question type | Mentioned entities in ontology, relationships, and SPARQL queries |
| Logical connections | Not addressed | Logical conjunction and disjunction |
| Semantic relations | Not addressed | 34 types of semantic relationships |
| Attribute value restrictions | Not addressed | Literal, quantitative, and Boolean value restrictions |
The dataset was split into a training set, a development set, and a test set at a ratio of 8:1:1. The training set was used to train the ML models; the development set was used to tune model parameters; and the test set was used to evaluate the final performance. As shown in Fig. 11, since the proposed method consists of two GNN models, the raw BIM-NLQ dataset must be further transformed into two datasets. Thereafter, the models trained on these two datasets were put together to form a holistic two-stage semantic parsing model.
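The 8:1:1 split can be sketched as follows (a minimal illustration; the function name and seed are assumptions, and the paper does not state its exact splitting procedure):

```python
import random

def split_dataset(examples, seed=42):
    """Shuffle and split examples into train/dev/test at an 8:1:1 ratio."""
    ex = list(examples)
    random.Random(seed).shuffle(ex)
    n = len(ex)
    n_dev = n // 10
    n_test = n // 10
    train = ex[: n - n_dev - n_test]
    dev = ex[n - n_dev - n_test : n - n_test]
    test = ex[n - n_test :]
    return train, dev, test
```

On the 786 queries this yields 630/78/78 examples for training, development, and testing.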
In the following sections, Sections 4.2.1 and 4.2.2 introduce the preprocessing for deriving the EL and RE datasets. Section 4.3.1 first reports the overall performance of the proposed two-stage SP method on the test set of the BIM-NLQ dataset. Afterward, Sections 4.3.2 and 4.3.3 demonstrate the performance of the two stages, respectively.
4.2.1 Dataset preprocessing for first-stage entity linking
In the EL dataset, each NLQ has an ambiguous name mention, a set of candidate entities, and a ground-truth entity. To obtain this, the NLQs were processed via the candidate generation programs introduced in Section 3.4.1. Although a total of 911 ambiguous name mentions were found in the 786 queries, this scale of dataset was too small to train GNN models. Therefore, additional training data were synthesized using the following strategy: for each NLQ without ambiguous name mentions, one grounded entity was randomly selected, and three other entities were randomly extracted from the INLE ontology; these four entities formed the candidate set of the corresponding name mention. As a result, the final EL dataset was expanded to 1116 data examples, and Table 2 shows their statistical properties.
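The synthesis strategy can be sketched as follows (illustrative names; the sampling seed and shuffle are assumptions):

```python
import random

def synthesize_candidates(grounded_entity, ontology_entities, k=3, seed=0):
    """Pair an NLQ's grounded entity with k entities drawn at random from
    the ontology to form a 4-way candidate set for a synthetic EL example."""
    pool = [e for e in ontology_entities if e != grounded_entity]
    rng = random.Random(seed)
    candidates = rng.sample(pool, k) + [grounded_entity]
    rng.shuffle(candidates)  # hide the position of the ground-truth entity
    return candidates
```

Each synthetic example keeps the grounded entity as the ground-truth label among the four candidates.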
Table 2. Statistical properties of the EL dataset. "Avg." denotes "average"; "Num." denotes "number".

| Dataset division | Size | Avg. word count | Avg. Num. of candidate nodes (excluding synthesized data) | Avg. Num. of question nodes | Avg. Num. of extra nodes |
| Training set | 893 | 11.59 | 2.34 | 3.19 | 30.99 |
| Development set | 111 | 13.14 | 2.22 | 3.92 | 30.44 |
| Test set | 112 | 12.12 | 2.48 | 3.67 | 31.39 |
4.2.2 Dataset preprocessing for second-stage relation extraction
In the RE dataset, each data example has a head entity, a tail entity, and an annotated relationship. By pairing all entities in the NLQ dataset, the RE dataset contained 2113 training examples, 252 development examples, and 271 test examples. Nevertheless, an imbalanced class distribution was observed in the dataset: among the 36 classes, 13 classes had fewer than 10 training examples, which would bias a multi-class classifier towards the major classes. To alleviate this problem, this study applies an oversampling technique [94] that randomly replicates examples of the minority classes until each has at least 100. A synonym dictionary was prepared to replace words in the duplicated data with alternative expressions. Consequently, the training set of the RE dataset was expanded with 2564 new examples. The statistical properties of the resulting RE dataset are shown in Table 3.

Fig. 11. Experiment design.
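The oversampling step can be sketched as follows; the function names and synonym dictionary are illustrative, not the paper's hand-prepared resources:

```python
import random
from collections import Counter

def oversample(examples, labels, synonyms, target=100, seed=0):
    """Randomly replicate minority-class examples until every class has at
    least `target` examples, swapping words for synonyms in the duplicates.
    `synonyms` maps a word to its alternative expressions."""
    rng = random.Random(seed)
    out = list(zip(examples, labels))
    counts = Counter(labels)
    by_label = {}
    for x, y in zip(examples, labels):
        by_label.setdefault(y, []).append(x)
    for y in list(counts):
        while counts[y] < target:
            text = rng.choice(by_label[y])
            # replace each word with a random synonym where one exists
            words = [rng.choice(synonyms.get(w, [w])) for w in text.split()]
            out.append((" ".join(words), y))
            counts[y] += 1
    return out
```

Majority classes are left untouched, while each minority class is padded with paraphrased duplicates up to the target count.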
Table 3. Statistical properties of the RE dataset. "Avg." denotes "average"; "Num." denotes "number".

| Dataset division | Size | Avg. word count | Avg. Num. of head-tail entity pairs | Avg. Num. of question nodes | Avg. Num. of extra nodes |
| Training set | 4667 | 11.59 | 1 | 1.49 | 31.03 |
| Development set | 252 | 13.14 | 1 | 1.95 | 30.06 |
| Test set | 271 | 12.12 | 1 | 2.24 | 29.87 |
4.2.3 Training details
This study sets the dimension of node embeddings to 100 and the number of RGAT layers (L) to 5 for both GNN models. Rectified Adam (RAdam) [95] was used as the optimizer, with a batch size of 32 and learning rates of 2e-5 and 1e-3 for the LM (RoBERTa) and GNN modules, respectively. The maximum number of epochs is 200, and training terminates when there has been no performance improvement over the past 50 epochs. All hyperparameters were determined based on tuning experiments.
4.3 Test results
4.3.1 The accuracy of the overall query results
This part evaluates the overall performance of the integrated two-stage SP method on the test set of the BIM-NLQ dataset. The trained GNN-EL and GNN-RE models are combined into one program to parse the input NLQs. For value restrictions in NLQs, the numerical and Boolean data values are extracted based on [11] due to its perfect performance.
The accuracy of the semantic parsing depends on whether the resulting SPARQL query is valid and the correct retrieval results can be obtained, which are manually checked against ground-truth answers. NLQ4BIM [11] is chosen as the comparison method owing to its suitable functional scope for the dataset and its high performance.
Table 4. Accuracy of the overall query results. "QResult" stands for query result.

| Model | Correct QResult | Accuracy |
| GSP4BIM (proposed method) | 64 | 81.01% |
| NLQ4BIM [11] | 49 | 62.03% |
The overall SP results are shown in Table 4. The total accuracy of the proposed SP method was 81.01% over 79 queries. By contrast, NLQ4BIM achieved only 62.03% accuracy, due to the frequent occurrence of ambiguous mentions and multi-hop relationships. Table 5 presents several qualitative results that compare our method with NLQ4BIM. In the first sample, GSP4BIM correctly identified "level 2" and "up to level:Roof" as property values of "base constraint" and "top constraint", whereas NLQ4BIM superficially treated them as IfcBuildingStorey and IfcRoof, which resulted in an invalid query. This shows the strength of GSP4BIM in value restriction extraction, owing to its disambiguation function. In the second sample, GSP4BIM could extract the two-hop relationship between "ceiling" and "TotalThickness", where "ceiling" is a predefined type of IfcCovering that has the property "TotalThickness". In contrast, NLQ4BIM cannot predict more than one relationship between entities because of its rigid rule-based principle.
Table 5. Sample predictions from our method and the comparison method. The recognized entities (colored) and relationships (directional connectors) between entities are presented.

| NLQ sentence | NLQ4BIM [11] | GSP4BIM (ours) | Ground truth |
| Return walls with a base constraint of Level 2 and a top constraint of up to level: Roof. | 1(a) | 1(b) | 1(c) |
| Get all ceilings with a TotalThickness equal to 0.052. | 2(a) | 2(b) | 2(c) |
| Get curtain walls composed of mullions with a span larger than the span of 1227576. | 3(a) | 3(b) | 3(c) |
| Show me the levels that have Floor:Hangover Shading_100mm. | 4(a) | 4(b) | 4(c) |
All errors in GSP4BIM stem from the EL and RE stages. In the first stage, it was found that the GNN-EL model makes mistakes in distinguishing between literal data values and class entities. For example, one error occurs in disambiguating the mention "roof" in the NLQ "the beams with reference level to be roof". While the ground-truth answer is a literal value of the property "reference level", the GNN-EL model wrongly classified it as an IfcRoof entity. In the second stage, it was observed that the GNN-RE model occasionally predicts supplementary relationships instead of major relationships, meaning that the full query path cannot be traced. As shown in Table 5.4(a), the GNN-RE model returns "hasObjectType" between "levels" and "Floor:Hangover Shading_100mm". In this example, "hasContainedProduct" is the major relationship, while "hasObjectType" is the supplementary relationship. However, the former cannot be extracted because the latter has already passed connectivity checking.
Provided that the major relationships were correctly classified by GNN-RE, there were no errors in extracting the remaining supplementary relationships. The performance of the two proposed GNN models is further elaborated in the next two sections.
4.3.2 Performance of first-stage entity linking
As illustrated in Fig. 11, the performance of the GNN-EL model was tested on 111 data examples. The standard metrics of micro accuracy and macro accuracy from [96] are adopted to evaluate EL models, which are defined as:

Micro accuracy = NumCorrect / NumMentions   (9)

Macro accuracy = (1 / NumEntities) Σ_e ( NumCorrect(e) / NumMentions(e) )   (10)

where NumMentions, NumEntities, and NumCorrect denote the total number of mentions, the total number of entities, and the total number of correct predictions, respectively, and NumCorrect(e) and NumMentions(e) are counted per ground-truth entity e. The second metric measures EL accuracy averaged over all entities.
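Eqs. (9)-(10) can be computed in a few lines of Python (an illustrative sketch; names are assumptions):

```python
from collections import defaultdict

def micro_macro_accuracy(preds, golds):
    """Micro accuracy over all mentions (Eq. 9) and macro accuracy
    averaged per ground-truth entity (Eq. 10)."""
    micro = sum(p == g for p, g in zip(preds, golds)) / len(golds)
    per_entity = defaultdict(lambda: [0, 0])   # entity -> [correct, total]
    for p, g in zip(preds, golds):
        per_entity[g][1] += 1
        per_entity[g][0] += int(p == g)
    macro = sum(c / t for c, t in per_entity.values()) / len(per_entity)
    return micro, macro
```

Macro accuracy weights rare entities equally with frequent ones, which is why it falls below micro accuracy on imbalanced data.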
Most SOTA EL models mainly work on document-level texts, or leverage Wikipedia's knowledge base [28] and knowledge graph [97] as supplementary data sources. Hence, there is a technical gap in applying these models to entity linking in Text-to-BIMQL. This study selects two baseline models that are compatible with our dataset:
(a) Baseline 1: a generative entity-mention model [98] that leverages entity popularity knowledge, name knowledge, and context knowledge to estimate the likelihood. The EL training set serves as the resource to train the probabilistic models.
(b) Baseline 2: an ontology-based EL model [99] that uses rich semantic information and ontology structures to rank entities. The INLE ontology was used as the source ontology for calculating the entropy scores of candidate entities.
The test results are presented in Table 6. Our GNN-EL model achieved a micro accuracy of 91.45% and a macro accuracy of 75.87%, significantly outperforming the baseline models by at least 29.06% in micro accuracy and at least 41.49% in macro accuracy. These results demonstrate the strength of the proposed GNN model in handling EL tasks in this domain-specific context. Moreover, because the expanded EL dataset is still small, 10-fold cross validation was used to evaluate the GNN-EL model's generalization ability. The entire EL dataset was re-divided into 10 parts, and each part was used as the test set in one iteration, while the remaining parts were used for model training and convergence. As shown in Table 7, the average micro and macro accuracy over the 10 iterations are 89.21% and 74.6%, respectively, with a small variance (standard deviation < 5%). This indicates that the model generalizes well to new data.
Table 6. Micro and macro accuracy of the GNN-EL model and the baseline models.

| Model | Micro accuracy | Macro accuracy |
| Baseline 1 | 62.39% | 34.38% |
| Baseline 2 | 51.28% | 25.43% |
| GNN-EL (ours) | 91.45% | 75.87% |
Table 7. Results of 10-fold cross validation for evaluating the GNN-EL model.

| Iteration | Micro accuracy | Macro accuracy |
| 1 | 89.29% | 74.35% |
| 2 | 91.96% | 80.45% |
| 3 | 91.07% | 76.38% |
| 4 | 88.39% | 70.64% |
| 5 | 86.61% | 75.5% |
| 6 | 90.18% | 77.92% |
| 7 | 85.71% | 69.04% |
| 8 | 89.29% | 75.02% |
| 9 | 89.29% | 73.69% |
| 10 | 90.35% | 72.79% |
Although the generative entity-mention model performs well (around 80% accuracy) on large-scale common-domain datasets [98], it does not work well in our domain-specific EL task, because in common EL tasks the semantics of different candidate entities vary widely. By contrast, the candidate entities of ambiguous mentions in the BIM environment are semantically close and usually differ only in their data representations. For example, "level 2" could represent an IfcBuildingStorey instance or the data value of the property "base offset", depending on whether it is associated with property entities. This makes it difficult for traditional EL models to distinguish candidate entities based purely on the NLQ context. By contrast, our model captures the relevance between entities over the BIM ontological structure through graph learning.
Although the second baseline model [99] also utilizes ontologies, its original focus was on biomedical ontologies, so the model lacks sufficient adaptivity to the BIM ontology, which contains many schema-level constructs. By comparison, our supervised learning-based model can better capture the features of ontology structures because the node representations are updated according to the EL objective.
Table 6 shows that the macro accuracy is lower than the micro accuracy for both the GNN-EL and baseline models, a situation that can be attributed to the imbalanced dataset, in which some types of entities are much more abundant than others. To investigate whether our model was trained to simply prioritize more frequently occurring entities, the entity-mention model [98] was modified to use only the popularity score, which rewards the most frequently mentioned entities in the training set. The resulting micro and macro accuracy values were 82.05% and 53.5%, respectively, showing that GNN-EL does learn features of the minor types of entities for entity linking. Apart from the dataset problem, it was observed that the current GNN architecture is weak in capturing complex logical assertions in the ontology. An error occurs in "The floor where the column called Pile - 011 locates", where "floor" indicates IfcBuildingStorey but the prediction is an enumerated type of IfcSlab. Here, the machine incorrectly interprets "floor" as describing the type of the column, whereas IfcSlab and IfcColumn are disjoint in the ontology, so their types cannot be shared.
4.3.2.1 Ablation study
An ablation study was conducted to investigate the contributions of the components of the GNN-EL model, including extra nodes, context nodes, position edges, and distance edges. The performance of the model after the removal of different components is shown in Table 8. Without extra entity nodes, the model's micro and macro accuracy decline by 6.84% and 23.86%, respectively, demonstrating the importance of the ontology-guided OG subgraph extraction.
The influence of the LM is studied by removing the context node from the working graph, with the final output obtained by applying a 2-layer MLP to the final hidden state of the candidate entity node. Consequently, the micro and macro accuracy are reduced by 3.54% and 11.17%, respectively. The model performance without position edges and without distance edges also declines.
Table 8. Micro and macro accuracy of the GNN-EL model under different ablation setups.

| Model setup | Micro accuracy | Macro accuracy |
| GNN-EL (original) | 91.45% | 75.87% |
| Without extra nodes | 84.61% | 52.11% |
| Without LM | 88.03% | 64.7% |
| Without position edges | 90.6% | 75.2% |
| Without distance edges | 87.18% | 68.94% |
4.3.3 Performance of second-stage relation extraction
The GNN-RE model for the second-stage semantic parsing predicts the major relationships between entities. Its performance was evaluated on 271 data examples that specified head entities and tail entities in queries. Following [100], the metrics used to measure RE models are accuracy and macro F1 score:

Accuracy = NumCorrect / NumRelations   (11)

Macro F1 score = ( Σ_{i=1}^{NumClasses} F1 score(i) ) / NumClasses   (12)

where NumCorrect is the number of correctly predicted entity pairs; NumRelations represents the total number of entity pairs to be predicted; NumClasses stands for the number of relationship types. F1 score(i) denotes the F1 score for the classification of the i-th type of relationship in the test data.
Two SOTA models for RE tasks were adopted as baselines in this study:
Baseline 1: R-BERT [101], which leverages the pretrained BERT model and incorporates the target entities' information for relation classification.
Baseline 2: GNNs with Generated Parameters (GP-GNNs) [100], which also rely on GNNs to conduct relational reasoning over unstructured texts.
The test results are shown in Table 9. The proposed GNN-RE model yielded an accuracy of 90.04% and a macro F1 score of 78.84%, outperforming all the baseline models and demonstrating the superior performance of our model in relationship extraction for NL-based BIM model retrieval. Although GP-GNNs also employ a GNN architecture, they only model name mentions as nodes in the working graph, which lacks sufficient background information about the associated entities. By contrast, our GNN-RE model combines name mentions and entity nodes in a unified graph representation for joint inference, which is more effective because it makes the RE model aware of the semantics of the entities and how they interact in the ontology.
Table 9. Performance of our GNN-RE model and the baseline models.

Model             Accuracy    Macro F1 score
Baseline 1        86.71%      68.19%
Baseline 2        65.35%      53.61%
GNN-RE (ours)     90.04%      78.84%
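The unified graph representation described above can be illustrated with a minimal sketch; the node labels, edge kinds, and example triples below are hypothetical, not the paper's actual schema. One adjacency map holds dependency-parse edges between tokens, ontology edges between entities, and linking edges from mentions to their entities, so message passing can travel from the text into the ontology and back:

```python
from collections import defaultdict

def build_working_graph(dp_edges, ontology_edges, mention_links):
    """Merge token-level and entity-level edges into one undirected,
    edge-typed graph: node -> set of (neighbor, edge_kind)."""
    graph = defaultdict(set)
    for kind, edges in [("dp", dp_edges), ("onto", ontology_edges),
                        ("link", mention_links)]:
        for u, v in edges:
            graph[u].add((v, kind))
            graph[v].add((u, kind))
    return graph

# Hypothetical example for "the columns on the second floor".
dp_edges = [("tok:columns", "tok:on"), ("tok:on", "tok:floor")]
ontology_edges = [("ent:IfcColumn", "ent:IfcBuildingStorey")]
mention_links = [("tok:columns", "ent:IfcColumn"),
                 ("tok:floor", "ent:IfcBuildingStorey")]

g = build_working_graph(dp_edges, ontology_edges, mention_links)
```

With both edge sets in one graph, a GNN layer updating "tok:columns" receives messages from its linked entity node as well as its syntactic neighbors, which is the background information GP-GNNs lack.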
Like the test results of the EL models, the macro F1 scores of the three models are generally lower than their accuracy, which results from the imbalanced class distribution in the dataset. Table 9 shows that this is a pervasive problem that hampers all RE models.
Fig. 12 presents a confusion matrix that details the results of classifying the major relationships of entity pairs in the test dataset. As can be seen, the false positives (FPs) and false negatives (FNs) relevant to "No relation" account for most errors. This arises because the number of entity pairs labeled "No relation" in the test dataset is large. Moreover, when the NLQ sentence is long and contains many variables, the GNN-RE model is more error-prone in identifying the dependency relationships between distant entities. For example, an error occurs in identifying the relationship between "column" and "width" in the sentence "the columns that satisfy its gross volume < 0.3 cubic meter, depth > 200 mm, width < 300 mm". The model wrongly outputs "No relation" because of the many intermediate words between the entities. Another source of error stems from DP: a problematic DP graph leads to incorrect message passing between the nodes. Apart from the misidentification of "No relation", there are seldom FP and FN errors when identifying semantic relationships, which shows that the model makes accurate classifications for entity pairs that have confident dependencies. Typical errors include predicting "lessThan" when the correct answer is "largerThan" or "equalTo". These three relationships are confused because they have similar semantics regarding quantitative comparison. More importantly, the scarce training data (fewer than 10 examples) for the "less than" and "equal to" relations led to incomplete model training.
Fig. 12. Confusion matrix of multi-class classification for relationship extraction. The vertical axis and horizontal axis represent the ground-truth relationships and the predicted relationships, respectively. The classes that do not appear in the test set are not shown.
4.3.3.1 Ablation study
The ablation analysis is presented in Table 10. Without extracting extra entity nodes, the accuracy and macro F1 score of the model decrease by 5.54% and 9.84%, respectively. This again demonstrates the effectiveness of GNN-based reasoning over the BIM ontology graph. On the other hand, if the DP graph is replaced with a single context node in the working graph, the accuracy and macro F1 score drop to 72.69% and 36.74%, respectively. This dramatic decline illustrates the importance of combining the DP graph and the OG to reveal the dependencies between entities. In addition, without connecting the token nodes of the middle text segment to the head entity node and the tail entity node, the accuracy and macro F1 score drop to 87.82% and 75.49%, respectively. The degradation is not obvious because the messages of the middle text segment can also be passed through nearby token nodes, but the head and tail entity nodes can pass more significant messages.
Table 10. Ablation results: accuracy and macro F1 score of the GNN-RE model after removing individual components.

Model setup                             Accuracy    Macro F1 score
GNN-RE (origin)                         90.04%      78.84%
Without extra nodes                     84.50%      69.00%
Without DP graph                        72.69%      36.74%
Without head/tail entity connections    87.82%      75.49%
4.3.4 Computation cost
The computation cost is evaluated in terms of training time, loading time, and model inference time. All the models were trained and tested on a server with an AMD EPYC 7252 8-core processor (3.09 GHz), Windows Server 2019, and an NVIDIA A30 GPU.
Table 11. Computation time. "s" and "h" denote seconds and hours.

Category          Item                                                   Duration
Training time     Ontology embedding model training                      0.83 h
                  GNN-EL model training                                  3 h
                  GNN-RE model training                                  20 h
Loading time      Data loading for text, graph, and ontology embedding   20 s
                  Loading language models and GNN models                 21 s
Inference time    First-stage subgraph extraction                        0.75 s/graph
                  First-stage inference for entity linking               0.017 s/graph
                  Second-stage subgraph extraction                       0.6 s/graph
                  Second-stage inference for relation extraction         0.091 s/graph
                  Extra-relationship finding                             0.06 s
                  Standard SPARQL query generation                       5 s
Table 11 reports the computation time of each phase. Training an ontology embedding model with the populated INLE ontology takes approximately 0.83 h. The GNN-EL model converges in 3 h, with its optimal test accuracy occurring at the 8th epoch. The GNN-RE model takes around 20 h to reach convergence, with its best test accuracy occurring at the 72nd epoch. The latter takes more time because of more training data and a more complex working graph structure.
The total loading time of the text data and graph ML models for processing a single NLQ is around 41 s. Batch processing a group of NLQs can substantially reduce the average loading time. By contrast, the proposed approach is efficient in model inference. The average times for the GNN-EL model to extract an OG subgraph and to process a graph for each candidate entity are 0.75 s and 0.017 s, respectively. The total disambiguation time for a query depends on the number of ambiguous names and candidate entities. In the second stage, the average computation times of subgraph extraction and model inference are around 0.6 s and 0.091 s per graph (entity pair). Finally, it takes around 5 s to automatically generate a SPARQL query from the SP results. In comparison, the average loading time and inference time of the NLQ4BIM [11] system are 30 s and 25 s, respectively. Overall, the total processing times of the two systems are close.
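The batching argument can be made concrete with a small back-of-the-envelope calculation from the Table 11 figures, assuming for simplicity one graph per stage and the hypothetical batch size below:

```python
def per_query_time(load_s, infer_s, batch_size):
    """Average end-to-end seconds per NLQ when one model load
    serves a whole batch of queries."""
    return load_s / batch_size + infer_s

LOAD_S = 41.0  # loading data, language models, and GNN models (Table 11)
# Per-query pipeline: EL subgraph + EL inference + RE subgraph
# + RE inference + extra-relationship finding + SPARQL generation.
INFER_S = 0.75 + 0.017 + 0.6 + 0.091 + 0.06 + 5.0

single = per_query_time(LOAD_S, INFER_S, 1)    # one query pays the full load
batched = per_query_time(LOAD_S, INFER_S, 50)  # load amortized over 50 NLQs
```

Under these assumptions a lone query costs about 47.5 s, while amortizing the load over 50 queries brings the average below 7.5 s, dominated by SPARQL generation rather than model loading.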
4.4 Case study
To demonstrate the practicability of the proposed method, a case study was conducted on a real-world construction project, a public library building [102] located in Hong Kong, as shown in Fig. 13. The three-story building has a gross building area of around 600 m². The Revit BIM model used in the project coordination was collected, which integrates the architecture, structure, and MEP parts. In all, the library BIM model comprises 1776 building and facility components. Two project engineers, with backgrounds in architectural engineering and mechanical engineering, were asked to create NLQs based on their respective information needs. Based on the open-source BIMSPARQL-GUI framework [84], a web-based NLI prototype was developed, with the trained SP models deployed on the server. As shown in Fig. 14(a), the participants use the NLI by simply inputting NLQs into the left-side textbox. The retrieval results are then returned in the form of a table and a graphical representation (see Fig. 14(b)).
As presented in Table 12, a total of 10 NLQs were created by the participants. Most questions search for elements under several conditions, which is useful in various management and engineering tasks. For example, the 2nd query identifies walls with poor thermal insulation, and the 9th query diagnoses leaking pipe segments. There are also questions that count elements (the 4th and 5th queries) and return attributes of physical or spatial elements (the 8th query), both of which improve situational awareness of the building.
Table 12. Natural language queries generated by participants.

Query — Result
1. Search the mullion with a thickness of 175 mm. — True
2. Search the exterior wall with a thermal transmittance higher than 10 W/(m2*K). — True
3. Search the walls that contain 150 mm aluminum on the second floor. — True
4. Count the number of the glazed panels with a thickness of 25 mm on the first floor. — False
5. Count the number of risers of the stairs on the second floor. — True
6. Search the walls with a height greater than 400 mm on the UR floor. — True
7. Search the double flush panel doors on the ground floor whose material is glass. — True
8. Return the gross area of floor slabs on the ground floor. — True
9. Pipes with the system type of Supply Air whose friction pressure is lower than 5 pa/m. — False
10. Select the windows that have a width > 1 m and have the minimum offset on the first floor. — True
As a result, 8 out of 10 queries were correctly parsed and executed within 1–1.5 minutes. The performance of the GNN-EL model and the GNN-RE model is evaluated with the metrics employed in Section 4.3.2 and Section 4.3.3, respectively. In terms of EL, a total of 15 ambiguous mentions were processed, with a micro and macro accuracy of 93.33% and 95%, respectively. The error occurs in the 4th query when distinguishing whether "panel" refers to a property or IfcPlate, probably because of the scarce training data relevant to the plate entity.

Fig. 13. The case study of the library building: (a) the real photograph [102]; (b) the rendered BIM model.
The evaluation of the GNN-RE model is based on 37 head–tail entity pairs, arising from the entities correctly identified in the first stage. The resulting accuracy and macro F1 score are 94.59% and 96.6%, respectively. The error occurs in the 9th query when predicting relationships for "pipes", because the training data used to develop the GNN models in Section 4.2 do not cover any NLQs related to MEP concepts. This reveals a limitation of the GNN-based approach in processing NLQs with unseen domain concepts, which will be tackled by zero-shot learning [103] in future studies.
Compared with manually searching for objects in a multi-domain BIM model with thousands of elements, the NLI helps the engineers retrieve model information much more quickly. Both participants stated that they prefer to use an NLI to search for building elements when there are constraints on attributes or relationships. By contrast, programming languages are considered by the participants to be impractical for use in the project because of the tedious process and their lack of IT skills. Moreover, the proposed SP method efficiently interprets questions submitted by users, which conform to general expression habits but are often not in line with IFC semantics. For example, in the first query, "mullion" is an enumerated type of IfcMember, but users unaware of this would not phrase NLQs like "search mullion members". In this scenario, the proposed method successfully extracts the multi-hop relational path between entities even though IfcMember is missing from the query.
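The mullion example can be illustrated with a hypothetical SPARQL sketch; the prefix, class, and property IRIs below are placeholders in the style of ifcOWL, not the system's actual generated output. The point is that the query reaches mullions through IfcMember even though the word "member" never appears in the NLQ:

```python
# Hypothetical SPARQL for "Search the mullion with a thickness of 175 mm".
# The implicit entity hop (IfcMember) inferred by the parser is made explicit.
MULLION_QUERY = """
PREFIX ifc: <http://example.org/ifcOWL#>
SELECT ?member WHERE {
  ?member a ifc:IfcMember ;                # entity absent from the NLQ text
          ifc:predefinedType ifc:MULLION . # enumerated type mentioned as "mullion"
}
"""

# The multi-hop logical form surfaces IfcMember explicitly:
assert "IfcMember" in MULLION_QUERY and "MULLION" in MULLION_QUERY
```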
In summary, the assessment results of the proposed SP method in this case study are promising, with the two GNN models achieving over 90% on the different metrics. The result verifies the practicability of the graph ML-based approach in real-world projects, implying that practitioners can effectively retrieve BIM models by using an NLI that deploys the developed models.

Fig. 14. Web-based NLI for BIM model retrieval: (a) interface with the uploaded BIM model; (b) the retrieval results after inputting the second query in Table 12.
5. Discussion
5.1 Effectiveness of GNN-based Text-to-BIMQL semantic parsing
The task of transforming NL texts into common BIM query languages encounters two critical problems: name ambiguity and relational reasoning. Both require jointly examining the BIM ontology and the NLQ context to determine the entities and relationships referred to in BIM queries. The proposed GNN-based approach addresses these problems by fusing both kinds of information into a single graph representation for joint reasoning. The test results in Section 4.3 show that our method accurately parses text-based queries with various ambiguous name descriptions and complex constraints. They also show the benefits of GNN-based models for automatically learning the BIM ontology and estimating the logical form of unstructured texts.
By deploying the GNN models in NLIs or voice assistant systems, the proposed SP method can be effectively used to translate NLQs and retrieve BIM models. The retrieval results can be further presented as graphics or NL responses to address the different information needs of BIM users in construction projects.
5.2 Limitations and future works
While the research's achievements are promising, several limitations should also be noted.
(a) The GNN-EL model still makes mistakes on confusable entities. For example, the EL model often fails to distinguish whether "area" refers to a property or a space in the test data. A potential reason is that position edges and distance edges are not sufficient for capturing the position information of entities in an NLQ and the dependencies between entities, respectively. Consequently, the working graph for GNN-EL cannot effectively pass contextual information to the entity nodes. In future work, a better strategy for constructing working graphs and scoring subgraphs for candidate entities will be explored.
(b) Affected by the features of the head entity or tail entity, the GNN-RE model sometimes predicts supplementary relationships rather than major relationships between entities in multi-hop relational paths. Even though such a prediction is reasonable, it is difficult to conversely distinguish major relationships from supplementary relationships, since the latter often only describe the identification of objects. Future work will attempt to solve this problem by investigating multi-label classification that directly returns multi-hop relationships.
(c) This study mainly focuses on extracting entities and their relationships from NLQs, and pays less attention to code generation. Although the obtained logical form of an NLQ contains entities and relational paths, the proposed method cannot extract and represent more complex logic and operations in NLQs, such as negation (e.g., "return spaces that are not on the first floor") and summation (e.g., "Does the sum of the floor areas of Bath RM and Kitchen exceed 50 m²?"). How to identify these implicit operations and turn them into the logical forms of a query will be explored in the future.
(d) Since the method involves two GNN models, the total computation cost is higher than that of Seq2Seq models, which use one holistic neural network. Several factors slow down the computation. First, the large LMs are loaded twice in the two stages. Second, the total number of parameters in the two GNN models is large. Third, the OG subgraphs are re-extracted during the second stage. Future studies will explore a GNN architecture that can accomplish both tasks simultaneously. Also, more annotated data needs to be created so that deep learning-based SP models can be better trained, which would also make it possible to develop Seq2Seq models that directly output code.
6. Conclusion
As the building process becomes more complex, project practitioners need to quickly and flexibly compose ad hoc views and extract partial subsets of BIM models. Emerging natural language-based query interface systems have the potential to allow BIM users to retrieve BIM models in a time- and cost-efficient manner. However, the existing methods cannot reliably predict the logical forms of natural language queries that contain various user-specified conditions: name ambiguity and multi-hop relational path extraction are two formidable problems. Therefore, this study proposes a novel graph neural network-based semantic parsing method for NL-based BIM model retrieval. The method consists of two stages. In the first stage, the candidate entities in NLQs are recognized in an encoded ontological context. The ambiguous name mentions are then processed by the proposed GNN-based entity linking model to match the correct entities. GNN-EL conducts joint reasoning over the NLQ context and the ontology graph for each candidate entity. Having linked all mentions to ontological entities, the second stage extracts the relational paths between the entities in the NLQ. GNN-based link prediction is exploited to extract the relationships between each entity pair based on a heterogeneous graph that concatenates a dependency parsing graph and an ontology graph. Finally, the logical forms of NLQs are derived and transformed into standard SPARQL queries for retrieving BIM models.
The proposed approach was developed and evaluated on a new BIM-NLQ dataset containing 786 queries over five BIM models. The overall accuracy of semantic parsing was 81.01%, which outperformed the existing NL-based BIM query systems. Furthermore, a case study was carried out on a real-world building project. It was found that a natural language interface that deploys the developed models can be used by project engineers to retrieve BIM models with different constraint conditions.
The main contributions of this research are threefold.
(a) A new GNN-based entity-linking model is proposed to automatically align ambiguous name mentions in natural language texts with the BIM ontology.
(b) A novel GNN link prediction approach is presented that integrates ontologies to parse NL-based BIM queries by extracting multi-categorical relationships.
(c) Multi-hop relational paths between IFC entities in NLQs can be fully extracted to generate executable queries for retrieving BIM models.
Based on the above innovations, complex NLQs that include different constraint conditions can be posed to perform fine-grained queries over BIM models. Finally, the proposed method has several limitations. First, it cannot recognize complex logic (e.g., negation) and engage the relevant operators in queries. Second, the use of two GNN models incurs heavy computation costs. In future studies, an end-to-end SP architecture that handles the various tasks simultaneously will be devised to improve performance.
Acknowledgements 920
We sincerely thank Architectural Technology and Innovation Services Limited for providing us 921
with the data used in the case study. 922
References 926
[1] C.C.M.C. Eastman, C.C.M.C. Eastman, P. Teicholz, R. Sacks, K. Liston, BIM handbook: 927
A guide to building information modeling for owners, managers, designers, engineers and 928
contractors, 2nd ed., John Wiley & Sons, Hoboken, NJ, USA, (2011). ISBN: 0470541377. 929
[2] Y. Hu, D. Castro-Lacouture, C.M. Eastman, Holistic clash detection improvement using a 930
component dependent network in BIM projects, Automation in Construction. 105 (2019) 931
pp.102832. https://doi.org/10.1016/j.autcon.2019.102832. 932
[3] I. Motawa, A. Almarshad, A knowledge-based BIM system for building maintenance, 933
Automation in Construction. 29 (2013) pp. 173–182.
https://doi.org/10.1016/j.autcon.2012.09.008. 935
[4] buildingSMART International Ltd., Industry Foundation Classes: Version 4.2 bSI Draft 936
Standard IFC Bridge proposed extension. 937
https://standards.buildingsmart.org/IFC/DEV/IFC4_2/FINAL/HTML/, 2017 (accessed 938
December 16, 2022). 939
[5] buildingSMART International Ltd., IFC4.3 RC2 - Release Candidate 2 [Draft]. 940
https://standards.buildingsmart.org/IFC/DEV/IFC4_3/RC2/HTML/, 2020 (accessed 941
December 16, 2022). 942
[6] W. Mazairac, J. Beetz, BIMQL - An open query language for building information 943
models, Advanced Engineering Informatics. 27 (2013) pp. 444–456.
https://doi.org/10.1016/j.aei.2013.06.001. 945
This is the manuscript version of the paper:
Mengtian Yin, Llewellyn Tang, Chris Webster, Jinyang Li, Haotian Li, Zhuoqian Wu, Reynold Cheng.
(2023) ‘Two-stage Text-to-BIMQL semantic parsing for building information model extraction using
graph neural networks’. Automation in Construction. Elsevier, 152, p. 104902. doi:
https://doi.org/10.1016/j.autcon.2023.104902.
The final version of this paper is available at: https://doi.org/10.1016/j.autcon.2023.104902
The use of this file must follow the Creative Commons Attribution Non-Commercial No Derivatives
License, as required by Elsevier’s policy.
[7] buildingSMART International Ltd., Model View Definitions (MVD). 946
https://www.buildingsmartusa.org/standards/bsi-standards/model-view-definitions-mvd/, 947
2021 (accessed December 16, 2022). 948
[8] E.W. East, S. O’Keeffe, R. Kenna, E. Hooper, Delivering COBie Using Autodesk Revit 949
(Perfect Bound), Lulu. com, (2017). ISBN: 1387200917. 950
[9] H. Ying, S. Lee, Generating second-level space boundaries from large-scale IFC-951
compliant building information models using multiple geometry representations, 952
Automation in Construction. 126 (2021) pp.103659. 953
https://doi.org/10.1016/j.autcon.2021.103659. 954
[10] M. Venugopal, C.M. Eastman, R. Sacks, J. Teizer, Semantics of model views for 955
information exchanges using the industry foundation class schema, Advanced Engineering 956
Informatics. 26 (2012) pp. 411–428. https://doi.org/10.1016/j.aei.2012.01.005.
[11] M. Yin, L. Tang, C. Webster, S. Xu, X. Li, H. Ying, An ontology-aided, natural language-958
based approach for multi-constraint BIM model querying, ArXiv Preprint 959
arXiv:2303.15116. (2023). https://doi.org/10.48550/arXiv.2303.15116. 960
[12] C. Preidel, S. Daum, A. Borrmann, Data retrieval from building information models based 961
on visual programming, Visualization in Engineering. 5 (2017) pp. 1–14.
https://doi.org/10.1186/s40327-017-0055-0. 963
[13] J.R. Lin, Z.Z. Hu, J.P. Zhang, F.Q. Yu, A Natural-Language-Based Approach to 964
Intelligent Data Retrieval and Representation for Cloud BIM, Computer-Aided Civil and 965
Infrastructure Engineering. 31 (2016) pp. 18–33. https://doi.org/10.1111/mice.12151.
[14] S. Wu, Q. Shen, Y. Deng, J. Cheng, Natural-language-based intelligent retrieval engine 967
for BIM object database, Computers in Industry. 108 (2019) pp.73–88. 968
https://doi.org/10.1016/j.compind.2019.02.016. 969
[15] N. Wang, R.R.A. Issa, C.J. Anumba, NLP-based Query Answering System for 970
Information Extraction from Building Information Models, Journal of Computing in Civil 971
Engineering. 36 (2022). https://doi.org/10.1061/(ASCE)CP.1943-5487.0001019. 972
[16] I. Motawa, Spoken dialogue BIM systemsan application of big data in construction, 973
Facilities. (2017). https://doi.org/10.1108/F-01-2016-0001. 974
[17] M. Hoy, Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants, Medical 975
Reference Services Quarterly. 37 (2018) pp. 81–88.
https://doi.org/10.1080/02763869.2018.1404391. 977
[18] A. Kamath, R. Das, A Survey on Semantic Parsing, ArXiv Preprint ArXiv:1812.00978. 978
(2018). https://doi.org/10.48550/arXiv.1812.00978. 979
[19] P. Pasupat, P. Liang, Compositional semantic parsing on semi-structured tables, ArXiv 980
Preprint ArXiv:1508.00305. (2015). https://doi.org/10.48550/arXiv.1508.00305. 981
[20] T. Yu, R. Zhang, K. Yang, M. Yasunaga, D. Wang, Z. Li, J. Ma, I. Li, Q. Yao, S. Roman, 982
Spider: A large-scale human-labeled dataset for complex and cross-domain semantic 983
parsing and text-to-sql task, ArXiv Preprint ArXiv:1809.08887. (2018). 984
https://doi.org/10.48550/arXiv.1809.08887. 985
[21] J. Guo, Z. Zhan, Y. Gao, Y. Xiao, J.-G. Lou, T. Liu, D. Zhang, Towards complex text-to-986
sql in cross-domain database with intermediate representation, ArXiv Preprint 987
ArXiv:1905.08205. (2019). https://doi.org/10.48550/arXiv.1905.08205. 988
[22] C. Finegan-Dollak, J.K. Kummerfeld, L. Zhang, K. Ramanathan, S. Sadasivam, R. Zhang, 989
D. Radev, Improving text-to-sql evaluation methodology, ArXiv Preprint 990
ArXiv:1806.09029. (2018). https://doi.org/10.48550/arXiv.1806.09029. 991
[23] M. Bevilacqua, R. Blloshmi, R. Navigli, One SPRING to rule them both: Symmetric 992
AMR semantic parsing and generation without a complex pipeline, in: Proceedings of the 993
AAAI Conference on Artificial Intelligence, (2021): pp. 12564–12573. ISBN: 2374-3468.
[24] I. Konstas, S. Iyer, M. Yatskar, Y. Choi, L. Zettlemoyer, Neural amr: Sequence-to-995
sequence models for parsing and generation, ArXiv Preprint ArXiv:1704.08381. (2017). 996
https://doi.org/https://doi.org/10.48550/arXiv.1704.08381. 997
[25] N. Wang, R.R.A. Issa, C.J. Anumba, A Framework for Intelligent Building Information 998
Spoken Dialogue System (iBISDS), in: EG-ICE 2021 Workshop on Intelligent Computing 999
in Engineering, Universitätsverlag der TU Berlin, (2021): p. 228. ISBN: 3798332118. 1000
[26] F. Elghaish, J. Chauhan, S. Matarneh, F. Pour Rahimian, Artificial intelligence-based 1001
voice assistant for BIM data management, Automation in Construction. (2022). 1002
https://doi.org/10.1016/j.autcon.2022.104320. 1003
[27] R. Zhang, N. El-Gohary, Transformer-based approach for automated context-aware IFC-1004
regulation semantic information alignment, Automation in Construction. 145 (2023) 1005
pp.104540. https://doi.org/10.1016/j.autcon.2022.104540. 1006
[28] X. Han, L. Sun, J. Zhao, Collective Entity Linking in Web text: A graph-based method, 1007
SIGIR’11 - Proceedings of the 34th International ACM SIGIR Conference on Research 1008
and Development in Information Retrieval. (2011) pp. 765–774.
https://doi.org/10.1145/2009916.2010019. 1010
[29] M. Yin, L. Tang, C. Webster, S. Xu, X. Li, Data repository of the reviewed article “An 1011
ontology-aided, natural language-based approach for multi-constraint BIM model 1012
querying” https://github.com/MengtianYin/BIM-NLQI, 2021 (accessed December 16, 1013
2022). 1014
[30] S. Shin, R.R.A. Issa, BIMASR: Framework for Voice-Based BIM Information Retrieval, 1015
Journal of Construction Engineering and Management. 147 (2021) pp.4021124. 1016
https://doi.org/10.1061/(ASCE)CO.1943-7862.0002138. 1017
[31] P. Pauwels, W. Terkaj, EXPRESS to OWL for construction industry: Towards a 1018
recommendable and usable ifcOWL ontology, Automation in Construction. 63 (2016) 1019
pp. 100–133. https://doi.org/10.1016/j.autcon.2015.12.003.
[32] J. Beetz, J. Van Leeuwen, B. De Vries, IfcOWL: A case of transforming EXPRESS 1021
schemas into ontologies, Artificial Intelligence for Engineering Design, Analysis and 1022
Manufacturing: AI EDAM. 23 (2009) pp.89. 1023
https://doi.org/10.1017/S0890060409000122. 1024
[33] K. Janowicz, M.H. Rasmussen, M. Lefrançois, G.F. Schneider, P. Pauwels, “BOT: the 1025
Building Topology Ontology of the W3C Linked Building Data Group, Semantic Web. 12 1026
(2019) pp. 143–161. https://doi.org/10.3233/SW-200385.
[34] G.F. Schneider, M.H. Rasmussen, P. Bonsma, J. Oraskari, P. Pauwels, Linked building 1028
data for modular building information modelling of a smart home, in: 12th European 1029
Conference on Product and Process Modelling (ECPPM), CRC Press, (2018): pp. 407–414. ISBN: 042950621X.
[35] J. Zhang, N.M. El-Gohary, Automated information transformation for automated 1032
regulatory compliance checking in construction, Journal of Computing in Civil 1033
Engineering. 29 (2015) pp. 1–16. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000427.
[36] P. Zhou, N. El-Gohary, Ontology-based automated information extraction from building 1035
energy conservation codes, Automation in Construction. 74 (2017) pp. 103–117.
https://doi.org/10.1016/j.autcon.2016.09.004. 1037
[37] X. Xu, H. Cai, Ontology and rule-based natural language processing approach for 1038
interpreting textual regulations on underground utility infrastructure, Advanced 1039
Engineering Informatics. 48 (2021) pp.101288. https://doi.org/10.1016/j.aei.2021.101288. 1040
[38] Z. Zheng, Y.-C. Zhou, X.-Z. Lu, J.-R. Lin, Knowledge-informed semantic alignment and 1041
rule interpretation for automated compliance checking, Automation in Construction. 142 1042
(2022) pp.104524. https://doi.org/10.1016/j.autcon.2022.104524. 1043
[39] R. Zhang, N. El-Gohary, A deep neural network-based method for deep information 1044
extraction using transfer learning strategies to support automated compliance checking, 1045
Automation in Construction. 132 (2021) pp.103834. 1046
https://doi.org/10.1016/j.autcon.2021.103834. 1047
[40] J.B. Hamrick, V. Bapst, A. Sanchez-gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, 1048
D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, 1049
A. Vaswani, K. Allen, C. Nash, V. Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. 1050
Botvinick, Relational inductive biases, deep learning, and graph networks, ArXiv Preprint 1051
ArXiv:1806.01261. pp. 1–40. https://doi.org/10.48550/arXiv.1806.01261.
[41] ISO (International Organization for Standardization), ISO 16739:2018, Industry 1053
Foundation Classes (IFC) for Data Sharing in the Construction and Facility Management 1054
Industries Part 1: Data Schema. https://www.iso.org/standard/70303.html, 2018 1055
(accessed December 16, 2022). 1056
[42] MMXXI © RDF ltd, IFC Engine. https://rdf.bg/product-list/ifc-engine/, 2006 (accessed 1057
December 16, 2022). 1058
[43] S. Lockley, C. Benghi, M. Cerny, Xbim. Essentials: a library for interoperable building 1059
information applications, The Journal of Open Source Software. 2 (2017) pp.473. 1060
https://doi.org/10.21105/joss.00473. 1061
[44] J.K. Lee, Building environment rule and analysis (BERA) language and its application for 1062
evaluating building circulation and spatial program, Georgia Institute of Technology, 1063
2011. https://smartech.gatech.edu/bitstream/handle/1853/39482/Lee_Jin-1064
Kook_201105_PhD.pdf?sequence=1 (accessed March 29, 2023). 1065
[45] S. Daum, A. Borrmann, Processing of topological BIM queries using boundary 1066
representation based methods, Advanced Engineering Informatics. 28 (2014) pp. 272–286.
https://doi.org/10.1016/j.aei.2014.06.001. 1068
[46] W. Terkaj, A. Šojić, Ontology-based representation of IFC EXPRESS rules: An 1069
enhancement of the ifcOWL ontology, Automation in Construction. 57 (2015) pp. 188–201. https://doi.org/10.1016/j.autcon.2015.04.010.
[47] M. Bonduel, J. Oraskari, P. Pauwels, M. Vergauwen, R. Klein, The IFC to linked building 1072
data converter: current status, in: 6th Linked Data in Architecture and Construction 1073
Workshop, CEUR Workshop Proceedings, 2018: pp. 3443. 1074
[48] O. Lassila, R.R. Swick, Resource description framework (RDF) model and syntax 1075
specification. https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/, 1998 (accessed 1076
December 16, 2022). 1077
[49] Holger Knublauch, J.A. Hendler, K. Idehen, SPIN - SPARQL Inferencing Notation. 1078
https://spinrdf.org/, 2011 (accessed December 16, 2022). 1079
[50] BuildingSMART, IFD library white paper. https://www.buildingsmart.org/standards/bsi-1080
standards/standards-library/, 2008 (accessed December 16, 2022). 1081
[51] A. Chaudhary, A. Battan, Natural Language Interface to Databases-An Implementation., 1082
International Journal of Advanced Research in Computer Science. 5 (2014). 1083
https://doi.org/10.26483/ijarcs.v5i6.2248. 1084
[52] N.V. Divin, BIM by using Revit API and Dynamo. A review, AlfaBuild. (2020) pp.1404. 1085
https://doi.org/10.34910/ALF.14.4. 1086
[53] N. Wang, R.R.A. Issa, C.J. Anumba, Transfer learning-based query classification for 1087
intelligent building information spoken dialogue, Automation in Construction. 141 (2022) 1088
pp.104403. https://doi.org/10.1016/J.AUTCON.2022.104403. 1089
[54] N. Wang, R.R.A. Issa, C.J. Anumba, Named Entity Recognition Algorithm for iBISDS 1090
Using Neural Network, Construction Research Congress 2022. (2022) pp.521529. 1091
https://doi.org/10.1061/9780784483961.055. 1092
[55] J.A. Bondy, U.S.R. Murty, Graph theory with applications, Macmillan London, (1976). 1093
ISBN: 0444194517. 1094
44
[56] L. Tang, H. Liu, Graph mining applications to social network analysis, in: Managing and Mining Graph Data, Springer, 2010: pp. 487–513. ISBN: 978-1-4419-6044-3.
[57] Z. Liu, J. Zhou, Introduction to Graph Neural Networks, Synthesis Lectures on Artificial Intelligence and Machine Learning. 14 (2020) pp.1–127. https://doi.org/10.2200/S00980ED1V01Y202001AIM045.
[58] M. Yasunaga, H. Ren, A. Bosselut, P. Liang, J. Leskovec, QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering, ArXiv Preprint ArXiv:2104.06378. (2021). https://doi.org/10.48550/arXiv.2104.06378.
[59] S.-X. Zhang, X. Zhu, J.-B. Hou, C. Liu, C. Yang, H. Wang, X.-C. Yin, Deep relational reasoning graph network for arbitrary shape text detection, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: pp. 9699–9708.
[60] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun, Graph Neural Networks: A Review of Methods and Applications, ArXiv. (2018) pp.1–22. https://doi.org/10.1016/j.aiopen.2021.01.001.
[61] M. Schlichtkrull, T.N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, M. Welling, Modeling relational data with graph convolutional networks, in: European Semantic Web Conference, Springer, 2018: pp. 593–607.
[62] J. Gilmer, S.S. Schoenholz, P.F. Riley, O. Vinyals, G.E. Dahl, Neural message passing for quantum chemistry, in: International Conference on Machine Learning, PMLR, 2017: pp. 1263–1272. ISSN: 2640-3498.
[63] Z. Wang, R. Sacks, T. Yeung, Exploring graph neural networks for semantic enrichment: Room type classification, Automation in Construction. (2021) pp.104039. https://doi.org/10.1016/j.autcon.2021.104039.
[64] W.L. Hamilton, R. Ying, J. Leskovec, Inductive representation learning on large graphs, ArXiv Preprint ArXiv:1706.02216. (2017). https://doi.org/10.48550/arXiv.1706.02216.
[65] F.C. Collins, A. Braun, M. Ringsquandl, D.M. Hall, A. Borrmann, Assessing IFC classes with means of geometric deep learning on different graph encodings, in: Proceedings of the 2021 European Conference on Computing in Construction, 2021: pp. 332–341. https://doi.org/10.35490/EC3.2021.168.
[66] Y. Hu, X. Cheng, S. Wang, J. Chen, T. Zhao, E. Dai, Time series forecasting for urban building energy consumption based on graph convolutional network, Applied Energy. 307 (2022) pp.118231. https://doi.org/10.1016/j.apenergy.2021.118231.
[67] J. Kim, S. Chi, Graph neural network-based propagation effects modeling for detecting visual relationships among construction resources, Automation in Construction. 141 (2022) pp.104443. https://doi.org/10.1016/j.autcon.2022.104443.
[68] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation. 9 (1997) pp.1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
[69] J. Chen, P. Hu, E. Jiménez-Ruiz, O. Holter, D. Antonyrajah, I. Horrocks, OWL2Vec*: Embedding of OWL Ontologies, Machine Learning, 2020. https://doi.org/10.1007/s10994-021-05997-6.
[70] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed Representations of Words and Phrases and Their Compositionality, in: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, Curran Associates Inc., Red Hook, NY, USA, 2013: pp. 3111–3119.
[71] P. Veličković, A. Casanova, P. Liò, G. Cucurull, A. Romero, Y. Bengio, Graph attention networks, in: 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, 2018: pp. 1–12. https://doi.org/10.1007/978-3-031-01587-8_7.
[72] D. Busbridge, D. Sherburn, P. Cavallo, N.Y. Hammerla, Relational Graph Attention Networks, ArXiv Preprint ArXiv:1904.05811. (2019) pp.1–21. https://doi.org/10.48550/arXiv.1904.05811.
[73] R. Cao, L. Chen, Z. Chen, Y. Zhao, S. Zhu, K. Yu, LGESQL: line graph enhanced text-to-SQL model with mixed local and non-local relations, ArXiv Preprint ArXiv:2106.01093. (2021). https://doi.org/10.48550/arXiv.2106.01093.
[74] K. Wang, W. Shen, Y. Yang, X. Quan, R. Wang, Relational graph attention network for aspect-based sentiment analysis, ArXiv Preprint ArXiv:2004.12362. (2020). https://doi.org/10.48550/arXiv.2004.12362.
[75] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, ArXiv Preprint ArXiv:1810.04805. (2018). https://doi.org/10.48550/arXiv.1810.04805.
[76] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems. 30 (2017) pp.5999–6009. https://doi.org/10.48550/arXiv.1706.03762.
[77] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: The 32nd International Conference on Machine Learning, PMLR, 2015: pp. 448–456. https://doi.org/10.48550/arXiv.1502.03167.
[78] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research. 15 (2014) pp.1929–1958. ISSN: 1532-4435.
[79] J.L. Ba, J.R. Kiros, G.E. Hinton, Layer normalization, ArXiv Preprint ArXiv:1607.06450. (2016). https://doi.org/10.48550/arXiv.1607.06450.
[80] D. Hendrycks, K. Gimpel, Gaussian error linear units (GELUs), ArXiv Preprint ArXiv:1606.08415. (2016). https://doi.org/10.48550/arXiv.1606.08415.
[81] J. Pollock, E. Waller, R. Politt, Speech and language processing, Day-to-Day Dyslexia in the Classroom. (2010) pp.16–28. https://doi.org/10.4324/9780203461891_chapter_3.
[82] World Wide Web Consortium (W3C), OWL 2 web ontology language document overview, 2012. https://www.w3.org/TR/owl2-overview/ (accessed March 29, 2023).
[83] V. Tablan, D. Damljanovic, K. Bontcheva, A natural language query interface to structured information, in: European Semantic Web Conference 2008: The Semantic Web: Research and Applications, 2008: pp. 361–375. https://doi.org/10.1007/978-3-540-68234-9_28.
[84] C. Zhang, J. Beetz, B. De Vries, BimSPARQL: Domain-specific functional SPARQL extensions for querying RDF building data, Semantic Web. 9 (2018) pp.829–855. https://doi.org/10.3233/SW-180297.
[85] P. Pauwels, S. Zhang, Y.C. Lee, Semantic web technologies in AEC industry: A literature overview, Automation in Construction. 73 (2017) pp.145–165. https://doi.org/10.1016/j.autcon.2016.10.003.
[86] buildingSMART International Ltd., IFC2x Edition 3 Technical Corrigendum 1. https://standards.buildingsmart.org/IFC/RELEASE/IFC2x3/TC1/HTML/, 2007 (accessed December 16, 2022).
[87] P. Pauwels, IFCtoRDF Converter. https://github.com/pipauwel/IFCtoRDF, 2017 (accessed December 16, 2022).
[88] The Apache Software Foundation, Apache Jena. https://jena.apache.org/ (accessed December 16, 2022).
[89] D. Krech, RDFLib: A Python library for working with RDF. https://github.com/RDFLib/rdflib, 2006 (accessed December 16, 2022).
[90] A. Hagberg, D. Conway, NetworkX: Network Analysis with Python. https://networkx.github.io, 2020 (accessed December 16, 2022).
[91] Y. Zhang, SuPar. https://github.com/yzhangcs/parser, 2020 (accessed December 16, 2022).
[92] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, PyTorch: An imperative style, high-performance deep learning library, Advances in Neural Information Processing Systems. 32 (2019).
[93] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, Scikit-learn: Machine learning in Python, The Journal of Machine Learning Research. 12 (2011) pp.2825–2830. https://doi.org/10.1145/2786984.2786995.
[94] M.S. Shelke, P.R. Deshmukh, V.K. Shandilya, A review on imbalanced data handling using undersampling and oversampling technique, International Journal of Recent Trends in Engineering & Research. 3 (2017) pp.444–449. https://doi.org/10.23883/ijrter.2017.3168.0uwxm.
[95] L. Liu, H. Jiang, P. He, W. Chen, X. Liu, J. Gao, J. Han, On the variance of the adaptive learning rate and beyond, ArXiv Preprint ArXiv:1908.03265. (2019). https://doi.org/10.48550/arXiv.1908.03265.
[96] P. McNamee, H.T. Dang, Overview of the TAC 2009 knowledge base population track, in: Text Analysis Conference (TAC), 2009: pp. 111–113.
[97] G. Zhu, C.A. Iglesias, Exploiting semantic similarity for named entity disambiguation in knowledge graphs, Expert Systems with Applications. 101 (2018) pp.8–24. https://doi.org/10.1016/j.eswa.2018.02.011.
[98] X. Han, L. Sun, A generative entity-mention model for linking entities with knowledge base, ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 1 (2011) pp.945–954. ISBN: 9781932432879.
[99] J.G. Zheng, D. Howsmon, B. Zhang, J. Hahn, D. McGuinness, J. Hendler, H. Ji, Entity linking for biomedical literature, BMC Medical Informatics and Decision Making. 15 (2015) pp.1–9. https://doi.org/10.1186/1472-6947-15-S1-S4.
[100] H. Zhu, Y. Lin, Z. Liu, J. Fu, T.S. Chua, M. Sun, Graph neural networks with generated parameters for relation extraction, in: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, 2020: pp. 1331–1339. https://doi.org/10.18653/v1/p19-1128.
[101] S. Wu, Y. He, Enriching pre-trained language model with entity information for relation classification, in: International Conference on Information and Knowledge Management, Proceedings, 2019: pp. 2361–2364. https://doi.org/10.1145/3357384.3358119.
[102] LEO ARCHITECTS, The North Lamma Public Library & Heritage and Cultural Showroom. https://www.leighorange.com/project/north-lamma-public-library-heritage-cultural-showroom/, 2019 (accessed December 16, 2022).
[103] Y. Xian, B. Schiele, Z. Akata, Zero-shot learning-the good, the bad and the ugly, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: pp. 4582–4591. https://doi.org/10.48550/arXiv.1703.04394.
Appendix I. List of relationships
Table 13 presents the list of relationships and their definitions for the second-stage RE in Section 3.5. Note that * marks relation definitions that come from BimSPARQL [84].
Table 13. List of relationships and definitions for the second-stage relation extraction.
No relation: There is no dependency relationship between entities.
Logic-OR: The conditions of two entities have a logical disjunction relation (inclusive-OR).
hasProperty*: Relationship between an object instance and its property.
isPropertyOf: Relationship between a property and an object instance.
hasTypeEnumeration: Relationship between an object instance and a predefined object type that is a specific enumeration (e.g., IfcSlab has predefined types FLOOR and ROOF).
isTypeEnumerationOf: Relationship between an enumerated predefined object type and an object instance.
hasSpaceBoundary*: Relationship between a space and its boundary elements.
isSpaceBoundaryOf: Relationship between boundary elements and the space they bound.
isContainedIn*: Relationship between a building element and the spatial structure that contains it.
hasContainedElement: Relationship between a spatial structure and its contained building elements.
hasSpatialDecomposition*: Relationship between a spatial element and the objects that decompose it.
hasSpatialComposition: Relationship between an object and the spatial elements it composes.
hasElementDecomposition*: Relationship between a building element and its composite element.
hasElementComposition*: Represents that a building element has an aggregation structure that can be decomposed into other elements (e.g., IfcStair and IfcMember).
hasQuantity*: Relationship between an object instance and its quantity.
isQuantityOf: Relationship between a quantity and an object instance.
hasNextSpace*: Relationship between spaces that are next to each other.
hasObjectType: Relationship between an object instance and its ObjectType attribute.
isObjectTypeOf: Relationship between the ObjectType attribute and an object instance.
hasSingleMaterial*: Relationship between an object instance and a single uniform material.
hasListMaterial: Relationship between an object instance and materials contained in a material list.
hasLayerMaterial: Relationship between an object instance and material contained in a material layer.
isMaterialOf: Relationship between a material and an object instance.
hasTypeObject*: Relationship between an object instance and its object type represented as IfcTypeObject.
isTypeObjectOf: Relationship between an object type (IfcTypeObject) and an object instance.
isPlacedIn*: Relationship between elements such as doors and windows and the elements in which they are placed (e.g., walls).
hasPlacedElement: Relationship between an element and the other elements that are placed in it.
hasLongName: Relationship between an object instance and its LongName attribute.
isLongNameOf: Relationship between the LongName attribute and the object instance.
hasPropertyValue: Relationship between a property and its value.
isPropertyValueOf: Relationship between a property value and its property.
hasTag: Relationship between an object instance and its Tag attribute.
isTagOf: Relationship between the Tag attribute and the object instance.
largerThan: The value of a property/quantity is greater than the value of another one.
lessThan: The value of a property/quantity is less than the value of another one.
equalTo: The value of a property/quantity is equal to the value of another one.
Appendix II. Data
The developed BIM-NLQ dataset can be accessed via https://github.com/MengtianYin/BIM-GNN-dataset.
... − to perform processing of data with complex structure. Parsing combined with the capabilities of neural networks [6][7] can be used for image retrieval, deep analysis of texts written in natural language. Neural network can help in extracting meaningful features of the ...
Article
Full-text available
As a rule, data parsing is used to quickly obtain information from various web resources for further study and use. For parsing, you can use both specialized online services and desktop applications. Unfortunately, existing parsing technologies have some limitations. For example, it is often difficult to parse dynamic web pages and classify information obtained through parsing. New approaches are needed in implementing data collection and analysis - using language models and software (web driver) that simulate human actions when working with websites. The web driver assists in accessing data from dynamically updated sites, while artificial intelligence technologies help correctly recognize and classify data. This technology can be used to create parsers for real estate agencies, employment services, university admission committees, advertising campaigns, and financial organizations.
... In light of these issues, this study will investigate the approaches, strategies, and applications of financial statement text information mining and key information extraction model development [7]. They hope to develop unique solutions for automating the extraction of critical financial insights from textual data by conducting a thorough examination of advanced NLP approaches, machine learning algorithms, and transdisciplinary ideas [8] [9]. ...
Article
Full-text available
Financial statement text information mining and key information extraction model design are critical areas of research that aim to use advanced computational approaches to extract important insights from textual data contained in financial documents. In this work, they look at methodologies, techniques, and applications that combine natural language processing (NLP) and machine learning to automate financial statement interpretation. To lay the groundwork for the research, researchers first conduct a thorough examination of existing literature in interdisciplinary domains such as computational linguistics, information retrieval, and finance. Building on insights from earlier studies, they design and use unique NLP approaches, such as named entity identification, syntactic parsing, sentiment analysis, and topic modelling, to extract essential financial metrics from textual data. Additionally, they create machine learning models that are suited to the peculiarities of financial terminology and reporting standards, combining domain-specific knowledge with linguistic experience to improve accuracy and reliability. They demonstrate the efficacy and scalability of the technique in automating the extraction of crucial financial information, such as revenue trends, cost patterns, and risk factors, through rigorous testing on real-world financial data. These results highlight the transformative power of natural language processing and machine learning in financial analysis, providing stakeholders in finance and accounting with actionable intelligence for informed decision-making, risk assessment, and compliance monitoring. By bridging the gap between computational linguistics and financial analysis, this study advances financial text analysis and provides the framework for future research and innovation in this emerging field.
Article
Full-text available
Industry Foundation Classes (IFCs), as the most recognized data schema for Building Information Modeling (BIM), are increasingly combined with ontology to facilitate data interoperability across the whole lifecycle in the Architecture, Engineering, Construction, and Facility Management (AEC/FM). This paper conducts a bibliometric analysis of 122 papers from the perspective of data, model, and application to summarize the modes of IFC and ontology integration (IFCOI). This paper first analyzes the data and models of the integration from IFC data formats and ontology development models to the IfcOWL data model. Next, the application status is summed up from objective and phase dimensions, and four frequent applications with maturity are identified. Based on the aforementioned multi-dimensional analysis, three integration modes are summarized, taking into account various data interoperability requirements. Accordingly, ontology behaves as the representation of domain knowledge, an enrichment tool for IFC model semantics, and a linkage between IFC data and other heterogeneous data. Finally, this paper points out the challenges and opportunities for IFCOI in the data, domain ontology, and integration process and proposes a building lifecycle management model based on IFCOI.
Article
While the adoption of open Building Information Modeling (open BIM) standards continues to grow, the inherent complexity and multifaceted nature of the built asset lifecycle data present a critical bottleneck for effective information retrieval. To address this challenge, the research community has started to investigate advanced natural language-based search for building information models. However, the accelerated pace of advancements in deep learning-based natural language processing research has introduced a complex landscape for domain-specific applications, making it challenging to navigate through various design choices that accommodate an effective balance between prediction accuracy and the accompanying computational costs. This study focuses on the semantic tagging of user queries, which is a cardinal task for the identification and classification of references related to building entities and their specific descriptors. To foster adaptability across various applications and disciplines, a semantic annotation scheme is introduced that is firmly rooted in the Industry Foundation Classes (IFC) schema. By taking a comparative approach, we conducted a series of experiments to identify the strengths and weaknesses of traditional and emergent deep learning architectures for the task at hand. Our findings underscore the critical importance of domain-specific and context-dependent embedding learning for the effective extraction of building entities and their respective descriptions.
Article
Text mining (TM) and natural language processing (NLP) have stirred interest within the construction field, as they offer enhanced capabilities for managing and analyzing text-based information. This highlights the need for a systematic review to identify the status quo, gaps, and future directions from the perspective of construction management. A review was conducted by aligning the objectives of 205 publications with the specific domains, areas, tasks, and processes outlined in construction management practices. This review reveals multiple facets of the construction sector empowered by TM/NLP approaches and highlights essential voids demanding consideration for automation possibilities and minimizing manual tasks. Ultimately, following identified obstacles, the review results indicate potential research opportunities: (1) strengthening overlooked construction aspects, (2) coupling diverse data formats, and (3) leveraging pre-trained language models and reinforcement learning. The findings will provide vital insights, fostering further progress in TM/NLP research and its applications in academia and industry.
Article
Full-text available
Construction project stakeholders often have to retrieve the required information in Building Information Models (BIMs) to support their design, engineering, and management activities. Natural language interface (NLI) systems are emerging as a time- and cost-effective way to query complex BIM models. However, the existing attempts cannot logically combine different constraints to perform fine-grained queries, dampening the usability of BIM-oriented NLIs. This paper presents a novel ontology-aided semantic parser to automatically map natural language queries (NLQs) that contain different attribute and relational constraints into computer-readable codes for BIM model retrieval in the context of building project development. A modular ontology was first developed to represent natural language expressions of Industry Foundation Classes (IFC) concepts, relationships, and reasoning rules; it was then populated with entities from target BIM models to assimilate project-specific information. After that, the ontology-aided semantic parser progressively extracts concepts, relationships, and value restrictions from NLQs to identify multi-level constraint conditions, resulting in standard SPARQL queries to successfully retrieve IFC-based BIM models. The approach was evaluated based on 225 NLQs collected from BIM users, with a 91% accuracy rate. Finally, a case study about the design-checking of a real-world residential building demonstrates the practicability of the proposed method in the construction industry.
Article
Full-text available
As an essential prodecure to improve design quality in the construction industry, automated rule checking (ARC) requires intelligent rule interpretation from regulatory texts and precise alignment of concepts from different sources. However, there still exists semantic gaps between design models and regulatory texts, hindering the exploitation of ARC. Thus, a knowledge-informed framework for improved ARC is proposed based on natural language processing. Within the framework, an ontology is first established to represent domain knowledge, including concepts, synonyms, relationships, constraints, etc. Then, semantic alignment and conflict resolution are introduced to enhance the rule interpretation process based on predefined domain knowledge and unsu-pervised learning techniques. Finally, an algorithm is developed to identify the proper SPARQL function for each rule, and then to generate SPARQL-based queries for model checking purposes, thereby making it possible to interpret complex rules where extra implicit data needs to be inferred. Experiments show that the proposed framework and methods successfully filled the semantic gaps between design models and regulatory texts with domain knowledge, which achieves a 90.1% accuracy and substantially outperforms the commonly used keyword matching method. In addition, the proposed rule interpretation method proves to be 5 times faster than the manual interpretation by domain experts. This research contributes to the body of knowledge of a novel framework and the corresponding methods to enhance automated rule checking with domain knowledge.
Article
Full-text available
Existing systems that employ Automatic Speech Recognition (ASR) technology to retrieve information from the BIM model fail to provide remote interaction, retrieve a wide range of data, and automate the entire process. This is particularly a problem for users with disabilities. The paper offers a two-way, automated, and agnostic solution to this theoretical and methodological gap. A ‘Proof of Concept’ prototype was developed using Amazon Alexa – as the AI voice assistant platform – to test the applicability. The outcome shows that the created and the retrieved information is valid. Furthermore, there is a high level of interoperability among the components of the proposed solution, including the AI voice assistant interface and mediation environment to convert verbal requests and retrieve information to CSV files. Future research will extend the created solution to retrieve and access information from a BIM cloud model.
Conference Paper
Full-text available
Conversational Artificial Intelligence (AI) systems have become more and more popular to provide information support for human daily life. However, the construction industry still lags other industries in developing a conversational AI system to support construction activities. The developed intelligent Building Information Spoken Dialogue System (iBISDS) is a conversational AI system that provides a speech-based virtual assistant for construction personnel with considerable building information to support construction activities. The iBISDS enables construction personnel to use flexible spoken natural language queries instead of detecting exact keywords. To build an iBISDS, it is necessary to understand the intents of natural language queries for building information. This research aims to develop a named entity recognition (NER) algorithm for iBISDS to recognize and classify keywords within natural language queries. A dataset with 2,008 building information-related natural language queries was developed and manually annotated for training and testing. A Neural Network (NN) deep learning method was trained to recognize named entities within natural language queries. After training, the developed NER algorithm was applied to the testing dataset which achieved a precision of 99.74, a recall of 99.87, and an F1-score of 99.81. The preliminary result indicated that the developed NER algorithm can recognize named entities within the natural language queries accurately. This research will facilitate the further development of conversational AI systems in the construction industry.
Article
Full-text available
Semantic enrichment of Building Information Modeling (BIM) models supplements models with the implicit semantics for further applications. In this paper, we use the room classification task to develop, test and illustrate a novel approach to semantic enrichment of BIM models-representation of models as graphs and application of graph neural networks (GNNs). A dedicated graph dataset consisting of 224 apartment layouts with nine room types and node/edge features was compiled. An improved GNN algorithm, SAGE-E, was developed for processing both node and edge features and a batch method was used to improve efficiency. The experiments showed that 1) The novel approach of adopting graphs and GNNs was feasible. 2) SAGE-E achieved higher accuracy (79%) and more balanced prediction (F1 = 0.79) when compared with other machine learning algorithms. 3) SAGE-E shortened the training and validation process. This work 2 pioneers the application of GNNs for semantic enrichment and opens the door to other possible applications. The dataset and source code are available for public access at https://github.com/ZijianWang1995/SAGE-E.
Article
One of the main challenges of automated compliance checking systems is aligning the semantics of the building information models (BIMs), in Industry Foundation Classes (IFC) format, and the semantics of the regulations, in natural language, to allow for checking the compliance of the BIM with the regulations. Existing information alignment methods typically require intensive manual effort and their ability to deal with the complex regulatory concepts in the regulations is limited. To address this gap, this paper proposes a deep learning method for IFC-regulation semantic information alignment. The proposed method uses a relation classification model to relate and align the IFC and regulatory concepts. The method uses a transformer-based model and leverages the definitions of the concepts and an IFC knowledge graph to provide additional contextual information and knowledge for improved classification and alignment. The proposed method was evaluated on IFC concepts from IFC 4 and regulatory concepts from different building codes and standards. The experimental results showed good information alignment performance.
Article
In Text-to-AMR parsing, current state-of-the-art semantic parsers use cumbersome pipelines integrating several different modules or components, and exploit graph recategorization, i.e., a set of content-specific heuristics that are developed on the basis of the training set. However, the generalizability of graph recategorization in an out-of-distribution setting is unclear. In contrast, state-of-the-art AMR-to-Text generation, which can be seen as the inverse to parsing, is based on simpler seq2seq. In this paper, we cast Text-to-AMR and AMR-to-Text as a symmetric transduction task and show that by devising a careful graph linearization and extending a pretrained encoder-decoder model, it is possible to obtain state-of-the-art performances in both tasks using the very same seq2seq approach, i.e., SPRING (Symmetric PaRsIng aNd Generation). Our model does not require complex pipelines, nor heuristics built on heavy assumptions. In fact, we drop the need for graph recategorization, showing that this technique is actually harmful outside of the standard benchmark. Finally, we outperform the previous state of the art on the English AMR 2.0 dataset by a large margin: on Text-to-AMR we obtain an improvement of 3.6 Smatch points, while on AMR-to-Text we outperform the state of the art by 11.2 BLEU points. We release the software at github.com/SapienzaNLP/spring.
Article
Detecting visual relationships among construction resources plays a pivotal role in understanding complex construction scenes and performing vision-based site monitoring and digitalization. Despite extensive efforts, the propagation effects of different resource-to-resource interactions have been overlooked, and it is therefore still challenging to precisely detect entangled and intertwined visual relationships in actual construction images. To address this challenge, this study proposes a semantic graph neural network approach that structures construction resources and their entangled interactions as a graph and simulates the propagation effects using a neural message-passing mechanism. The experimental results showed that the proposed approach achieved a 77.1% F-score, 11.5% higher than the baseline model. This suggests the positive impact of modeling propagation effects and the applicability of the proposed approach. These findings can help automatically understand what is actually happening at construction sites and provide valuable insights for future vision-based monitoring studies.
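The neural message-passing mechanism can be sketched with a single scalar state per node: each round, every node aggregates its neighbours' states and mixes them into its own, which is how an interaction's effect propagates across the scene graph. The node names, features, and the fixed 0.5/0.5 update below are illustrative assumptions, standing in for learned update functions:

```python
# Toy message-passing round over a construction scene graph: each node's
# state is updated with the mean of its neighbours' states, mimicking how
# resource-to-resource interaction effects propagate. Scalar features and
# the fixed mixing weights stand in for learned vector-valued updates.

def message_pass(features: dict, edges: list, rounds: int = 1) -> dict:
    """Mean-aggregation message passing on an undirected graph."""
    neigh = {n: [] for n in features}
    for a, b in edges:
        neigh[a].append(b)
        neigh[b].append(a)
    state = dict(features)
    for _ in range(rounds):
        new_state = {}
        for n, h in state.items():
            msgs = [state[m] for m in neigh[n]]
            agg = sum(msgs) / len(msgs) if msgs else 0.0
            new_state[n] = 0.5 * h + 0.5 * agg  # simple fixed-weight update
        state = new_state
    return state

scene = {"excavator": 1.0, "worker": 0.0, "truck": 0.0}
interactions = [("excavator", "worker"), ("excavator", "truck")]
print(message_pass(scene, interactions))
```

After one round, the excavator's activity has already "leaked" into the worker and truck states, which is the propagation effect the paper exploits for disentangling relationships.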
Article
Retrieving queried information from building information models (BIM) requires experience in structured query languages and manipulation of BIM software. Artificial Intelligence (AI)-based spoken dialogue systems provide more opportunities for information retrieval from building information models via natural language queries. This research developed a transfer learning-based text classification (TC) method that classifies queries into pre-defined categories for an intelligent building information spoken dialogue system (iBISDS), a virtual assistant that supports information retrieval for construction project team members. The architecture of the TC neural network (NN) was built on the pre-trained Robustly Optimized BERT Pretraining Approach (RoBERTa). After training, the fine-tuned RoBERTa NN achieved a precision of 99.76%, a recall of 99.76%, and an F1 score of 99.76% on the testing dataset. These results indicate that the developed NN can accurately classify building information-related queries into the pre-defined categories.
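The query-to-category step can be illustrated with a keyword-overlap classifier standing in for the fine-tuned RoBERTa model; the intent categories and keyword sets below are hypothetical, not the paper's actual taxonomy, which the real system learns from labelled data:

```python
# Keyword-overlap stand-in for a fine-tuned text classifier: map a BIM
# query to one of a few pre-defined intent categories. Categories and
# keyword sets are hypothetical; the described system learns them from
# data via transfer learning on RoBERTa.

CATEGORY_KEYWORDS = {
    "quantity": {"how", "many", "count", "number"},
    "property": {"height", "width", "area", "material", "fire", "rating"},
    "location": {"where", "floor", "level", "room"},
}

def classify_query(query: str) -> str:
    """Pick the category whose keyword set best overlaps the query."""
    tokens = set(query.lower().replace("?", "").split())
    scores = {c: len(tokens & kw) for c, kw in CATEGORY_KEYWORDS.items()}
    return max(scores, key=scores.get)

print(classify_query("How many doors are on the second floor?"))
```

A learned classifier generalizes far beyond such fixed keyword lists (paraphrases, misspellings, unseen vocabulary), which is why the transfer-learning approach reaches the reported accuracy.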
Article
The construction industry is information-intensive, and building information modeling (BIM) has been proposed as an information source to support decision making by construction project team members in the architecture, engineering, construction, and operation (AECO) industry. Because building information models aggregate rich building data, further use of this information to support construction and operation activities has become important. In Industry 4.0, similar-to-real-life virtual assistants, e.g., Apple’s Siri and Google Assistant, are becoming ever more popular. This research developed a query-answering (QA) system for BIM information extraction (IE) that uses natural language processing (NLP) methods to build a virtual assistant for construction project team members. The developed QA system consists of three major modules: natural language understanding, IE, and natural language generation. A Python-based prototype application was built on this architecture to evaluate the system's functionalities using several BIM/Industry Foundation Classes (IFC) models. Seven building information models and 127 test queries were used to evaluate accuracy, and the developed QA system achieved an accuracy score of 81.9. The NLP-based QA system can thus provide relatively accurate answers to natural language queries. This research facilitates the development of virtual assistants in the AECO industry, and the architecture of the developed QA system can be extended to queries in other areas.
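The three-module architecture (understanding, extraction, generation) can be sketched end to end with an in-memory stand-in for the building model; the tiny dictionary "BIM", the regex-based understanding step, and the answer templates are illustrative placeholders for the real IFC-backed modules:

```python
# Three-module QA pipeline sketch: natural language understanding ->
# information extraction -> natural language generation. The in-memory
# "BIM" and the pattern-based NLU are illustrative placeholders for the
# IFC-backed modules in the described system.

import re

BIM = {  # element type -> property -> value (toy model, not real IFC)
    "door": {"count": 12, "height": "2.1 m"},
    "window": {"count": 30, "height": "1.5 m"},
}

def understand(query: str):
    """NLU: map a query to an (element, property) pair via simple patterns."""
    q = query.lower()
    element = next((e for e in BIM if e in q), None)
    prop = "count" if re.search(r"how many", q) else "height"
    return element, prop

def extract(element: str, prop: str):
    """IE: look the requested value up in the building model."""
    return BIM[element][prop]

def generate(element: str, prop: str, value) -> str:
    """NLG: wrap the extracted value in a short answer sentence."""
    if prop == "count":
        return f"There are {value} {element}s in the model."
    return f"The {prop} of the {element}s is {value}."

def answer(query: str) -> str:
    element, prop = understand(query)
    return generate(element, prop, extract(element, prop))

print(answer("How many doors are in the building?"))
```

In the real system each module is substantially richer (intent and entity recognition, IFC traversal, template or model-based generation), but the data flow between the three modules is the same.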