Deep Learning-Based Scenario Recognition with GNSS Measurements on Smartphones

Zhiqiang Dai, Chunlei Zhai, Fang Li, Weixiang Chen, Xiangwei Zhu, and Yanming Feng
Abstract—Smartphones are in everyone's hands for applications including navigation and localization-based services, and scenario recognition is critical for seamless indoor and outdoor navigation. How to use smartphone sensing data to recognize different scenarios is a meaningful but challenging problem. To address this issue, we propose a structured grid-based deep-learning scenario recognition technique that uses smartphone GNSS measurements (satellite position, pseudorange, Doppler shift, and C/N0). In this work, the scenarios are grouped into four categories: deep indoors, shallow indoors, semi-outdoors, and open outdoors. The proposed approach utilizes Voronoi tessellations to obtain structured-grid representations from satellite positions and performs computations using convolutional neural networks (CNNs) and convolutional long short-term memory (ConvLSTM) networks. With only spatial information being considered, the CNN model is used to extract the features of Voronoi tessellations for scenario recognition, achieving a high accuracy of 98.82%. Then, to enhance the robustness of the algorithm, the ConvLSTM network is adopted, which treats the measurements as spatiotemporal sequences, improving the accuracy to 99.92%. Compared with existing methods, the proposed algorithm is simple and efficient, using only GNSS measurements without the need for additional sensors. Furthermore, the latencies of the CNN and ConvLSTM models on a CPU are only 16.82 ms and 27.94 ms, respectively. Therefore, the proposed algorithm has potential for real-time applications.
Index Terms—Scenario recognition, GNSS measurements, CNN, ConvLSTM, smartphone

This work was supported by the Science and Technology Planning Project of Guangdong Province of China (Grant No. 2021A0505030030). The associate editor coordinating the review of this article and approving it for publication was Fabrizio Santi. (Corresponding author: Zhiqiang Dai.)
Zhiqiang Dai, Chunlei Zhai, Fang Li, and Xiangwei Zhu are with the School of Electronic and Communication Engineering, Sun Yat-sen University, Guangzhou 510006, China (e-mail: daizhiqiang@mail.sysu.edu.cn; zhaichlei@mail2.sysu.edu.cn; lifang63@mail2.sysu.edu.cn; zhuxw666@mail.sysu.edu.cn).
Weixiang Chen is with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: cwx22@mails.tsinghua.edu.cn).
Yanming Feng is with the School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia (e-mail: y.feng@qut.edu.au).
I. INTRODUCTION
HUMAN activities in modern cities emphasize the importance of mobile terminals in navigation and positioning services, placing high requirements on accurate, robust, and efficient positioning in diverse environments. Smartphone-based seamless indoor and outdoor navigation and localization has received increasing attention [1]. However, a single technology cannot satisfy all navigation and positioning requirements in different scenarios. For example, the widely used Global Navigation Satellite System (GNSS) can achieve accurate
positioning in open environments. However, in some complex environments (such as urban canyons, wooded areas, and tunnels), GNSS signals can easily be blocked, reflected, or attenuated by the surrounding buildings and vegetation [2]–[4]. Thus, the inherent vulnerability of GNSS in complex environments severely restricts the navigation and positioning performance of these systems.
A practical solution to this problem is to design a multi-source fusion positioning system that employs additional sensors to obtain auxiliary information to improve positioning accuracy and robustness [5]–[9]. Although such a scheme performs well in challenging environments, considerable hardware resources and power are wasted in open environments that do not require these additional sensors. Therefore, a more efficient method is to switch between different positioning technologies in different navigation environments. Navigation environments are usually divided into four categories: deep indoors, shallow indoors, semi-outdoors, and open outdoors [10]–[12]. Many algorithms have been developed to adapt to these scenarios. In open outdoor scenarios, GNSS is the most widely used technology and can meet the needs of high-precision positioning. In semi-outdoor scenarios, such as in urban canyons and under tree canopies, the shadow matching method has been proposed to achieve meter-level positioning precision [13]. In shallow indoor scenarios, positioning can usually be based on cellular signals [14] or the fusion of satellite and terrestrial wireless systems [15]. In deep indoor scenarios, WiFi and Bluetooth-based positioning methods are the most widely used technologies [16]–[18]. Ultrawideband (UWB) and visible light positioning also provide practical solutions for indoor positioning problems [19]–[21]. However, choosing the appropriate positioning strategy according to the given scenario depends heavily on the location of the user. Thus, scenario recognition plays a key role in seamless indoor and outdoor positioning technology.
Scenario recognition methods can be divided into two
categories: vision-based methods and signal-based methods.
Vision-based methods identify the scenario type by extracting
features from photos of the environment [22]–[24]. Although
vision-based methods can achieve high accuracy, they need
cameras to continuously capture environmental images, which
is not suitable for smartphones. Some signal-based recognition
methods use various types of sensors that can be integrated
into smartphones [25]–[27], including GNSS, accelerometers,
gyroscopes, barometers, magnetometers and light sensors.
However, these multi-sensor scenario recognition algorithms
may misjudge when switching sensors to adapt to changes in
scenarios. Therefore, a desirable solution is to use a single
sensor for scenario recognition. Modern smartphones capable
of receiving GNSS measurements hold rich information about
surrounding environments. Therefore, GNSS-based scenario
recognition methods offer great potential. Several GNSS-based
scenario recognition algorithms have been developed in recent
years. Chen and Tan [28] simply extracted the number of
visible satellites from raw GPS data as a feature for scenario
recognition. Experimental results from various environments
showed a recognition accuracy of 85.6%. Gao and Groves
[29] extracted the number of visible satellites and the sum
of C/N0 over 25 dB-Hz as features and then used a hidden
Markov model (HMM) for scenario recognition. This method
was tested in various locations across the city of London and
achieved an overall accuracy of 88.2%. Lai et al. [30] input
eight multi-satellite GNSS measurement features to a support
vector machine (SVM) to divide the positioning environment
into three categories: open outdoors, occluded outdoors, and
indoors. The recognition accuracy of this method on the testing
dataset reached 90.3%. Zhu et al. [11] proposed a scenario
detection method that used satellite geometric distributions, the
number of visible satellites, the dilution of precision (DOP),
and C/N0 as the observed variables. Moreover, they utilized
a stacked machine learning model and an HMM for scenario
detection, achieving an accuracy of 97.02%. Xia et al. [12]
introduced a deep learning method that utilized a feature
vector containing the number of visible satellites, C/N0, and
its statistics. This vector was then fed into an LSTM network,
achieving an overall accuracy of 98.65%.
The above GNSS-based scenario recognition methods have achieved excellent results. However, the recognition accuracy can be further improved because they do not utilize both the spatial and temporal characteristics of GNSS measurements. Due to the spatial distribution of satellites and the continuity in their measurements, various kinds of GNSS measurements can be regarded as spatiotemporal sequences. Making full use of this spatiotemporal information can effectively improve the accuracy and robustness of scenario recognition algorithms. Thus, in this paper, we utilize four types of GNSS measurements: satellite position, pseudorange, Doppler shift, and C/N0. These features are mapped to image sequences using Voronoi tessellations to fully consider the spatiotemporal characteristics of the measurements. Then, the images are fed into a neural network, a convolutional neural network (CNN) or a convolutional long short-term memory (ConvLSTM) network, to recognize the navigation scenarios. The results of extensive experiments demonstrate that the proposed algorithm performs well in terms of accuracy, robustness, and real-time performance.
The remainder of this paper is organized as follows. Section II introduces the proposed scenario recognition method, as well as the Voronoi tessellations, the CNN model, the ConvLSTM model, and the scenario categories. Section III presents the dataset and an analysis of the results of the proposed method. Section IV summarizes our conclusions and provides directions for future work.
II. METHODOLOGY
In this section, we first present the flow of our scenario recognition algorithm. Then, we introduce the scenario categories and the GNSS measurements used in the algorithm.
A. Overview
To optimize the spatiotemporal information contained in the GNSS measurements, we propose a deep learning-based scenario recognition method. Fig. 1 presents the concept and processing flow of the proposed method. First, a satellite projection module maps visible satellites (white dots) and blocked satellites (black dots) onto a square image, which serves as the fourth channel in the 4-channel images. Then, the Voronoi tessellation module utilizes three GNSS measurements (the pseudorange, Doppler shift, and C/N0) to interpolate the satellite projection image and obtain the other three channels. Finally, the 4-channel images are fed into a neural network (CNN or ConvLSTM) to identify four typical scenarios.
B. Satellite Projection Module
This module generates the projection images that are used as the fourth channel in the 4-channel images and assists the Voronoi tessellations. During each epoch, all received and blocked satellites are projected onto a square image plane based on the elevation and azimuth, which are computed according to the broadcast ephemeris. As shown in Fig. 2, the coordinates of a satellite on the image plane $\mathbf{x}_s = [x, y]^T$ can be expressed as

$$x = r\cos\theta\sin\varphi, \qquad y = r\cos\theta\cos\varphi \tag{1}$$

where $r$ represents half of the side length of the square image and $\theta$ and $\varphi$ are the elevation and azimuth of the satellite, respectively.
Fig. 1. Concept and processing flow of the proposed system, which maps GNSS measurements to 4-channel images. The images are then fed
into a CNN or ConvLSTM to identify the four typical scenarios.
Then, the projection image $I_P = I_P(\{\mathbf{x}_{s_i}\}_{i=1}^{n}) \in \mathbb{R}^{2r \times 2r}$, $i = 1, 2, \cdots, n$ (Fig. 3(a)), which indicates the projection of the satellites on the square image, is defined as

$$I_P(\{\mathbf{x}_{s_i}\}_{i=1}^{n})(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{x} = \mathbf{x}_{s_i} \text{ for any } i \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Fig. 2. Satellite projection method, where the blue dot is the satellite
and the green dot is its projection. The gray areas represent locations
without satellite distribution.
C. Voronoi Tessellation Module
This module utilizes three GNSS measurements to interpolate the projection image and determine the other three channels in the 4-channel image. The tool we employ is the Voronoi tessellation, which is widely used in meteorology and geography. Voronoi tessellation is a simple and spatially optimal projection of GNSS measurements onto the spatial domain [31]. We first give the typical generic definition of this tool. Let $S$ denote a set of $n$ points $s_i$ (called sites) in the Euclidean plane $E$ [32], and let $p$ be any point in $E$. This tessellation approach optimally partitions the space $E$ into $n$ regions $G = \{g_1, g_2, \cdots, g_n\}$ using boundaries determined by the distances $d$ between $p$ and $s_i$. Using a distance function $d$, the Voronoi tessellation can be expressed as

$$g_i = \{p \in E \mid d(p, s_i) \le d(p, s_j),\ \forall j \ne i\} \tag{3}$$

Then, let the projection of each satellite in $I_P$ be a site. The GNSS measurements from satellite $s_i$ can then be interpolated for each region $g_i$. Since there are no satellites outside the circle of radius $r$, we define the Voronoi image $I_V \in \{I_R, I_D, I_C\}$ (Fig. 3) as follows:

$$I_V = G \odot M \tag{4}$$

where $\odot$ denotes the Hadamard product; $I_R, I_D, I_C \in \mathbb{R}^{2r \times 2r}$ denote the pseudorange, Doppler shift, and C/N0 images, respectively; and $M$ is the masked image, which can be expressed as

$$M(x, y) = \begin{cases} 1 & \text{if } x^2 + y^2 \le r^2 \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$
(a) Projection (b) Pseudorange (c) Doppler (d) C/N0
Fig. 3. Example of a 4-channel image, which consists of a projection
image, a pseudorange image, a Doppler image, and a C/N0 image.
Then, we obtain a 4-channel image $I = [I_P, I_R, I_D, I_C] \in \mathbb{R}^{2r \times 2r \times 4}$, which is fed into a neural network (CNN or ConvLSTM) to recognize the four scenarios.
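Because a Voronoi tessellation over point sites coincides with nearest-site interpolation, the measurement channels can be rendered without constructing polygons explicitly. The sketch below is a minimal illustration of one channel under that assumption, using a k-d tree for the nearest-site query; it is not the authors' implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def voronoi_channel(sites, values, r=50):
    """Render one measurement channel as a Voronoi image (Eqs. (3)-(5)).

    sites  : (n, 2) pixel coordinates of satellite projections.
    values : (n,) normalized measurement per satellite (0 for blocked ones).
    """
    yy, xx = np.mgrid[0:2 * r, 0:2 * r]
    pixels = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    # Eq. (3): each pixel belongs to the region of its nearest site.
    _, nearest = cKDTree(sites).query(pixels)
    img = values[nearest].reshape(2 * r, 2 * r)
    # Eq. (5): mask out pixels outside the circle of radius r,
    # then Eq. (4): Hadamard product of the region image with the mask.
    mask = (xx - r) ** 2 + (yy - r) ** 2 <= r ** 2
    return img * mask
```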
D. CNN Model
The transformation from raw GNSS measurements to image-like data allows us to apply a convolutional neural network (CNN) [33]. A CNN is a powerful neural network that is designed for image processing and dominates the computer vision field. A CNN can be intuitively understood as a stack of convolutional (conv), pooling, and fully connected (FC) layers. The main operation of the CNN is the convolution, which extracts the features of the input image:

$$I_j^l(m, n) = \Psi\left[ b_j + \sum_{i=0}^{C_{l-1}-1} \sum_{p=0}^{K-1} \sum_{q=0}^{K-1} W_{ij}^l(p, q)\, I_i^{l-1}(m + p - s,\ n + q - s) \right] \tag{6}$$
Fig. 4. Block diagram of the CNN model. A 4-channel image is fed into the network. Two conv layers and two max pooling layers are then used for feature extraction, and two FC layers are added to the network. Finally, a one-hot vector $Y_i$ that represents the scenario label is obtained.
where $s = \lfloor K/2 \rfloor$, $W$ denotes a convolution kernel of size $K \times K$, and $b_j$ indicates the bias. In the $l$-th layer, $I_i^{l-1}$ is the $i$-th channel in the input feature map, and $I_j^l$ is the output of the $j$-th channel. $\Psi$ represents the activation function. Our CNN model is shown in Fig. 4.
In the input layer, the radius $r$ is set to 50 pixels, yielding an image size of 100×100×4. To reduce the algorithm complexity, we reduce the number of conv layers and the number of channels of the feature map as much as possible without losing accuracy, because deeper and wider networks have higher hardware requirements and larger latencies. To ensure the real-time ability of the algorithm, we add only 2 conv layers. The numbers of feature map channels in the first and second layers are 16 and 32, respectively. The conv kernel size in each layer is set to 3×3. The conv layers are followed by two FC layers. The output of the network $Y_i$ is a one-hot vector corresponding to a particular scenario. We use softmax as the nonlinear activation function in the last layer and the rectified linear unit (ReLU) function in all other layers:

$$\mathrm{ReLU}(x) = \max(x, 0) \tag{7}$$
Our model uses the cross-entropy loss function, which usually performs well in multi-classification tasks. The parameters (weights and biases) of the network are optimized by stochastic gradient descent (SGD). We train the network on a training and validation dataset for approximately 2000 epochs. Throughout the training process, a batch size of 32 and a learning rate of $10^{-6}$ are used. To prevent overfitting, batch normalization (BN) is added after each conv layer. Another regularization technique is early stopping with a patience of 15, which means that if the validation loss is not reduced within 15 epochs, the training is stopped early.
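The description above pins down most of the architecture (two 3×3 conv layers with 16 and 32 channels, two max pooling layers, BN after each conv layer, two FC layers, softmax) and the training setup (cross-entropy, SGD, batch size 32, learning rate 10^-6, early stopping with patience 15). A Keras sketch consistent with that description follows; details the paper leaves open, such as the pooling size and the width of the first FC layer, are our assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(r=50, num_classes=4, fc_width=128):
    # fc_width and the 2x2 pooling are assumptions; the paper does not state them.
    model = models.Sequential([
        layers.Input(shape=(2 * r, 2 * r, 4)),           # 100x100x4 input image
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),                     # BN after each conv layer
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(fc_width, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # one-hot scenario output
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=15, restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           batch_size=32, epochs=2000, callbacks=[early_stop])
```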
E. ConvLSTM Model
Although CNNs exhibit excellent performance in scenario recognition tasks, they do not consider correlations among consecutive scenarios over time. GNSS measurements are sequential data, and 4-channel images are generated at each sampling time. Therefore, these correlations can be included in the algorithm to further improve the robustness. We define scenario recognition as a spatiotemporal sequential classification problem. To model the spatiotemporal relationship, we utilize ConvLSTM [34], which embeds conv structures into an LSTM network [35]. In the ConvLSTM model, a processor known as a cell determines whether the information is useful (Fig. 5). Three gates are included in each cell: an input gate $I_t$, a forget gate $F_t$, and an output gate $O_t$. $H_{t-1}$ is the hidden state, and $C_t$ is the memory passed to the next cell. The key expressions of the ConvLSTM cell are shown below:

$$\begin{aligned}
I_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i) \\
F_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f) \\
C_t &= F_t \odot C_{t-1} + I_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c) \\
O_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_{t-1} + b_o) \\
H_t &= O_t \odot \tanh(C_t)
\end{aligned} \tag{8}$$

where $\odot$ is the Hadamard product and $*$ represents the convolution operation. $W_{xi}$, $W_{xf}$, $W_{xc}$, and $W_{xo}$ are the weight matrices connected to the input vector $X_t$ in each layer; $W_{hi}$, $W_{hf}$, $W_{hc}$, and $W_{ho}$ are the weight matrices connected to the previous hidden state $H_{t-1}$; $W_{ci}$, $W_{cf}$, and $W_{co}$ are the weight matrices connected to $C_{t-1}$ in each layer; and $b_i$, $b_f$, $b_c$, and $b_o$ are the biases.
Fig. 5. Block diagram of the ConvLSTM cell. The difference between the ConvLSTM and LSTM models is that the ConvLSTM introduces the convolution operation.
To perform scenario recognition with the ConvLSTM network, the 4-channel image sequence needs to be transformed into time segments with a sliding window. Consider the following sequence:

$$\{I_0, I_1, I_2, \cdots, I_t, I_{t+1}, I_{t+2}, \cdots\} \tag{9}$$

Let the sliding window size be $T$. The $t$-th ($t \ge 0$) time segment is defined as

$$\{I_t, I_{t+1}, \cdots, I_{t+T-2}, I_{t+T-1}\} \tag{10}$$
This approach has a drawback: scenario recognition cannot be performed within the first $T-1$ sampling times. However, as we discuss later, $T$ is usually very small. Therefore, this factor has little effect on the algorithm.
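As an illustration, a minimal sliding-window segmentation consistent with Eq. (10) might look as follows; the array shapes and names are our own assumptions.

```python
import numpy as np

def sliding_windows(images, T=4):
    """Turn a chronologically ordered image sequence into time segments.

    images : (N, H, W, 4) array of 4-channel images, one per sampling time.
    Returns (N - T + 1, T, H, W, 4); segment t covers times t .. t+T-1
    (Eq. (10)), so no prediction exists for the first T-1 epochs.
    """
    return np.stack([images[t:t + T] for t in range(len(images) - T + 1)])
```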
A ConvLSTM network can be built from ConvLSTM cells. Its structure is shown in Fig. 6. A 4-channel image sequence with a sliding window width $T$ is fed into $T$ ConvLSTM cells with the same parameters. These $T$ cells form a layer in the network. A stronger feature extraction capability can be obtained by stacking $n$ layers. After several FC layers and a softmax layer, the final one-hot vector representing the scenario category is obtained.
Fig. 6. Block diagram of the ConvLSTM network with $T$ time steps, including $n$ ConvLSTM layers, 2 FC layers, a softmax layer, and an output layer.
To ensure the real-time performance of the algorithm, we add only two ConvLSTM layers. The numbers of feature map channels in the first and second layers are 8 and 16, respectively. The size of the conv kernel in each layer is set to 3×3. The sliding window size $T$ is set to 4. To prevent overfitting, we add a BN layer after each ConvLSTM layer. As with the CNN, we use the cross-entropy loss function and optimize the parameters of the network with SGD. The network is trained on the training and validation datasets for approximately 1,000 epochs. The batch size is set to 32, and the learning rate is set to $10^{-6}$. Moreover, the early stopping regularization technique with a patience of 15 is applied.
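Keras provides a ConvLSTM2D layer implementing a cell of the form in Eq. (8) (without the peephole terms $W_{c\cdot} \odot C$, which Keras omits). A sketch matching the stated configuration (two layers with 8 and 16 filters, 3×3 kernels, T = 4, BN after each ConvLSTM layer, SGD with learning rate 10^-6) follows; the FC width is again our assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_convlstm(r=50, T=4, num_classes=4, fc_width=64):
    # fc_width is an assumption; Fig. 6 only labels the first FC layer "m neurons".
    model = models.Sequential([
        layers.Input(shape=(T, 2 * r, 2 * r, 4)),   # T-step 4-channel image sequence
        layers.ConvLSTM2D(8, 3, padding="same", return_sequences=True),
        layers.BatchNormalization(),                # BN after each ConvLSTM layer
        layers.ConvLSTM2D(16, 3, padding="same",
                          return_sequences=False),  # keep only the final time step
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(fc_width, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```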
F. Scenario Categories
Complex environments are divided into four categories: deep indoors, shallow indoors, semi-outdoors, and open outdoors, as shown in Fig. 7.
Deep indoors are almost completely closed environments in which only a few satellite signals are received. It should be noted that scenarios where no satellites are visible do not need to be considered in the algorithm: once a scenario has been identified as not receiving any signals, it is immediately determined to be deep indoors. Shallow indoors are scenarios near windows, balconies, and doors. Semi-outdoors include half-obstructed outdoor areas that are typically near tall buildings. Open outdoors are scenarios with almost no obstruction; thus, open outdoors have the highest GNSS signal strength and the most visible satellites. In the proposed algorithm, we encode the scenario categories as one-hot vectors.
(a) Deep indoors (b) Shallow indoors
(c) Semi-outdoors (d) Open outdoors
Fig. 7. Scenario categories. Complex environments are divided into four categories: deep indoors, shallow indoors, semi-outdoors, and open outdoors.
$$Y_0 = [1, 0, 0, 0]^T, \quad Y_1 = [0, 1, 0, 0]^T, \quad Y_2 = [0, 0, 1, 0]^T, \quad Y_3 = [0, 0, 0, 1]^T \tag{11}$$
G. Feature Vector Definition
In the proposed algorithm, the satellite position (elevation and azimuth), pseudorange, Doppler shift, and C/N0 are utilized as features for scenario recognition. In the above four scenarios, the number of visible satellites and their distributions vary substantially due to different locations and occlusions (Fig. 8). For example, satellite signals are rarely received in the deep indoor scenario, while dozens of satellites are visible in the open outdoor scenario. The pseudorange between the satellite and the smartphone also varies when a smartphone is at different latitudes, longitudes, or altitudes. Moreover, the Doppler shift is sensitive to the relative velocity between the satellite and the smartphone. Different relative velocities, that is, different Doppler shifts, can be used to distinguish distinct scenarios to some extent. In terms of C/N0, when the scenario changes, the average C/N0 varies considerably due to the difference in the number of visible satellites. In addition, due to the multipath effect and NLOS signals, diverse occlusions also have a large impact on C/N0 [11], [12]. Therefore, C/N0 is an ideal feature for scenario recognition. Another reason for using these measurements, and no others, is that they can be read directly from the RINEX file logged by the smartphone without additional calculations. Obtaining other measurements requires additional computation, which increases the algorithm complexity. These measurements already give our algorithm high accuracy, so adding more would only increase the amount of computation without improving performance.
Crucially, both visible and blocked satellites are used in the Voronoi tessellations. The measurements of the latter are set to 0; that is, the Voronoi polygons containing these blocked satellites are black after interpolation. The value ranges of the features vary widely. To ensure that the network converges, the training and testing sets need to be normalized before the Voronoi tessellations are performed [36]. The normalization formula is defined as follows:

$$m_t' = \frac{m_t - m_{\min}}{m_{\max} - m_{\min}} \tag{12}$$

where $m_{\max}$ and $m_{\min}$ are the maximum and minimum values of the features among all theoretically visible satellites at time $t$, respectively, and $m_t$ is the original feature value at that time.

Fig. 8. Sky plots of the four scenarios.
III. EXPERIMENTS AND ANALYSIS
In this section, extensive experiments are conducted to analyze the performance of our algorithm. First, the dataset is introduced. Then, we study the results of the CNN and ConvLSTM models. Finally, we analyze their real-time performance.
A. Dataset
The experimental data were collected using an Android
smartphone (HONOR 30) with a sampling interval of 1 s.
The data were divided into training and testing datasets. The
training set was used for training and validation, and the testing
set was used to evaluate the generalizability of the proposed
algorithm. To increase the robustness of the algorithm, the
training set needs to be obtained over a wide time range.
Therefore, we collected data over three time periods in a day.
During each time period, we collected data for more than 10
minutes for each scenario, yielding a total duration of 2 hours.
The ratio of the training and validation sets is 7:3; that is, 1.4 hours of data were used for model training, and the remainder of the data were used for validation. For the testing set, 10 minutes
of data were collected for each scenario, yielding a total time
of more than 40 minutes. The collection time distribution of
the dataset is shown in Fig. 9.
Fig. 9. Distribution of the data collection time. The width of each bar
represents the amount of data. Note that the training and testing data
were not collected on the same day.
To demonstrate the generalizability of our method, we
consider two challenging measures. First, the testing set was
collected a few days after the training set. Moreover, as shown
in Fig. 9, the collection time points of the testing set did
not coincide with those of the training set. Second, the data
were not static. The volunteers stopped and walked freely in
the various scenarios while collecting the data. Therefore, our
algorithm has strong adaptability to time and space.
B. CNN-Based Scenario Recognition Results
In this section, we utilize all constellations (GPS, BDS, Galileo, GLONASS, and QZSS) and all measurements (position, pseudorange, Doppler shift, and C/N0) to determine the optimal scenario recognition accuracy. The accuracy and loss curves of the training and validation sets are shown in Fig. 10.
Fig. 10. Accuracy and loss of the CNN at different epochs.
Fig. 10 shows the accuracy and loss at different epochs. As the epochs increase, the loss curve falls smoothly, which means that the recognition results of the CNN model tend to be consistent with the ground truth and the model tends to converge; accordingly, the accuracy gradually improves, which is why the accuracy and loss curves are almost mirror images of each other. This behavior indicates that our model is stable. The training and validation curves are essentially consistent, demonstrating that the model is not overfitted. The accuracy on the validation set converges to 99.51% after the CNN has been trained for approximately 100 epochs. However, since the loss continues to decrease slowly, training was continued to improve the generalizability of the model. After training, the testing data were fed into the model, and the confusion matrix of the testing set (Table I) was obtained.
TABLE I
CONFUSION MATRIX OF THE TESTING SET BASED ON THE CNN

True \ Predicted | Deep indoor | Shallow indoor | Semi-outdoor | Open outdoor
Deep indoor      | 100.00%     | 0.00%          | 0.00%        | 0.00%
Shallow indoor   | 0.00%       | 95.74%         | 4.26%        | 0.00%
Semi-outdoor     | 0.00%       | 0.49%          | 99.51%       | 0.00%
Open outdoor     | 0.00%       | 0.00%          | 0.00%        | 100.00%
The recognition accuracy for the deep indoor and open outdoor scenarios is 100%, which is consistent with the expected results. The deep indoor scenario is an almost completely closed environment with very few visible satellites; thus, the channels in the images of this scenario are expected to be nearly black. In contrast, the open outdoor scenario is almost entirely unobstructed and can usually receive dozens of satellite signals. Therefore, these two scenarios are easier to identify than the other two. The recognition accuracies of the shallow indoor and semi-outdoor scenarios are 95.74% and 99.51%, respectively. The overall accuracy of the CNN model reaches 98.82%, which is higher than that of existing scenario recognition algorithms. It is worth mentioning that the testing set was acquired several days after the training set, and the smartphone was randomly moving and stopping during data collection. Therefore, the proposed algorithm is robust in both time and space.
C. Optimal Image Resolution
One of the key steps in our approach is to use the satellites' elevations and azimuths to project them onto an image. The resolution of this image has a great impact on the scenario recognition accuracy. If the resolution is too low, the Voronoi polygons cannot be clearly distinguished. In contrast, if the resolution is too high, the calculation time increases while the accuracy is not improved. Therefore, using a smaller image resolution can substantially improve the real-time performance of our algorithm while maintaining a high recognition accuracy. Thus, we studied radii between 10 and 100 pixels in steps of 10 pixels in our experiment. The experimental results are shown in Fig. 11.
Fig. 11. The impact of the image resolution on accuracy.
The highest recognition accuracy is obtained when r = 50. If the radius is less than 50, the accuracy may be reduced because the neighboring edges of the Voronoi polygons are not clear, thereby decreasing the ability of the model to characterize spatial information. If the radius is larger than 50, the higher-resolution images require a larger perceptual field to extract effective features; thus, the network depth must be increased. To ensure that the algorithm can meet the real-time requirements, increasing the network depth is not desirable. Therefore, 100×100 (r = 50) is a relatively ideal image resolution. The subsequent experiments are all performed at this resolution.
D. Contribution of Multi-Constellations
In this section, we successively increase the number of constellations to evaluate the contribution of each constellation to the scenario recognition task. Fig. 12 shows the stacked bar chart of this experiment, and Fig. 13 shows the confusion matrices. As the number of constellations increases, the recognition accuracy of individual scenarios and the overall accuracy both increase. When only GPS is used, the accuracy for shallow indoors is only 53.68% and that for semi-outdoors is only 73.54%. These low accuracies occur because the number of visible GPS satellites is relatively small, while more signals are received from the BDS satellites. Therefore, the recognition accuracy is greatly improved by introducing BDS. Similarly, few Galileo, GLONASS, and QZSS signals were received; however, these signals still contribute to improving the accuracy. Moreover, even when only GPS signals are used, the accuracy for deep indoors remains high. The reason is that the 4-channel images of this scenario are obviously distinct because few satellites are visible in the deep indoor scenario.
Fig. 12. Stacked bar chart for the contribution of multi-constellations.
All: GPS, BDS, Galileo, GLONASS, and QZSS.
The confusion matrices show that scenario recognition errors occur mainly in the shallow indoor and semi-outdoor scenarios. The former is easily misidentified as deep indoors or semi-outdoors, while the latter is easily misidentified as shallow indoors or open outdoors.
Fig. 13. Confusion matrices for the contribution of multi-constellations. Heatmaps are used to characterize the confusion matrices; the darker the color, the higher the accuracy.

E. Contribution of Multi-Measurements
In addition to exploring the contribution of multiple measurements to scenario recognition, we investigated the role of individual measurements. We increased the number of measurements one by one in the order of satellite projection, pseudorange, Doppler shift, and C/N0 to explore their roles. Fig. 14 shows the stacked bar chart of the recognition accuracy, and Fig. 15 shows the confusion matrices. When only satellite projection images are used, the recognition accuracy is very low, reaching only 39.42%. After the pseudorange images are added, the overall accuracy improves greatly, reaching 71.68%. This is because, for the same satellite, diverse pseudoranges are obtained when the smartphone is at different latitudes, longitudes, or altitudes. The Doppler shift images also differ because of the large difference in the relative velocity between the smartphone and the satellite when the smartphone is at different positions. Because the distribution of occlusions differs in each scenario, the signals emitted by the same satellite are diverse, resulting in different C/N0 values. Therefore, the accuracy tends to increase incrementally as the Doppler shift and C/N0 images are considered.
Fig. 14. Stacked bar chart for the contribution of multi-measurements. Proj: satellite projection, PR: pseudorange, Doppler: Doppler shift, All: satellite projection, pseudorange, Doppler shift, and C/N0.
Fig. 15. Confusion matrices for the contribution of multi-measurements. Heatmaps are used to characterize the confusion matrices; the darker the color, the higher the accuracy.

F. ConvLSTM-Based Scenario Recognition Results
Since CNNs do not consider correlations among consecutive scenarios over time, we utilize the ConvLSTM network for scenario recognition to ensure that spatial and temporal information are both considered. The GNSS measurements are therefore transformed into image sequences and fed into a ConvLSTM network. The output of the last ConvLSTM cell at the final time step is used as the recognition result.
Fig. 16. Accuracy and loss of the ConvLSTM network at different epochs.
Fig. 16 shows the accuracy and loss curves of the training and validation sets for the ConvLSTM network. Although these curves exhibit smooth convergence trends, they are insufficient for proving the advantage of the ConvLSTM network over the CNN. As shown in the confusion matrix (Table II), the recognition accuracy of the ConvLSTM network for each scenario is higher than that of the CNN. The accuracy improves because the ConvLSTM network considers correlations among consecutive scenarios over time.

Since the time step is a key factor affecting the performance of the ConvLSTM network, its effect was analyzed. With a time step $T$ and the current time $t$, the algorithm considers the GNSS measurements from $t-T+1$ to $t$. If the time step is too short, the information before the current moment cannot be fully utilized. If the time step is too long, some useless information will also be used for scenario recognition. Both cases will reduce the recognition accuracy, so it is necessary
to find the most suitable time step $T$ through experiments. We adjusted the time step from 1 to 10 and trained each model separately. The overall recognition accuracy is shown in Fig. 17.
TABLE II
CONFUSION MATRIX OF THE TESTING SET BASED ON THE CONVLSTM NETWORK

True \ Predicted | Deep indoor | Shallow indoor | Semi-outdoor | Open outdoor
Deep indoor      | 100.00%     | 0.00%          | 0.00%        | 0.00%
Shallow indoor   | 0.00%       | 100.00%        | 0.00%        | 0.00%
Semi-outdoor     | 0.00%       | 0.33%          | 99.67%       | 0.00%
Open outdoor     | 0.00%       | 0.00%          | 0.00%        | 100.00%
Fig. 17. The impact of time steps on accuracy.
The highest recognition accuracy of 99.92% is achieved when the time step is 4, which is 1.1% higher than that of the CNN model and previous work (Table III). Using more time steps requires a larger computational effort. However, even with fewer time steps, the accuracy exceeds 99%. Nonetheless, the ConvLSTM network has an unavoidable drawback: scenario recognition cannot be performed in the first $T-1$ sampling moments. One solution to this issue is to use the CNN model for scenario recognition at these moments. Although the recognition accuracy decreases slightly, the model is at least usable during these moments.
TABLE III
SCENARIO RECOGNITION ALGORITHMS BASED ON GNSS MEASUREMENTS IN RECENT YEARS

Authors             | Year | Method                   | Accuracy
Chen and Tan [28]   | 2017 | Threshold judgment       | 85.6%
Gao and Groves [29] | 2018 | HMM                      | 88.2%
Lai et al. [30]     | 2021 | SVM                      | 90.3%
Zhu et al. [11]     | 2019 | Stacked machine learning | 97.02%
Xia et al. [12]     | 2020 | HMM and LSTM             | 98.65%
Ours                | 2022 | CNN                      | 98.82%
Ours                | 2022 | ConvLSTM                 | 99.92%
G. Real-Time Ability
In this section, we investigate the computing time of the CNN and ConvLSTM models (Fig. 18). The times for constructing the images and for model inference are both considered. We first calculated the total computing time for the testing set of approximately 2,400 data samples and then determined the average computing time per sample. It is worth mentioning that we tested the real-time performance only on a CPU (Intel Xeon Platinum 8160T), not on a GPU. Nevertheless, our model still takes only tens of milliseconds.
Fig. 18. Real-time ability of the CNN and ConvLSTM models.
As $r$ increases, the number of pixels ($2r \times 2r$) grows quadratically, and the computing time of the CNN shows the same increasing trend; hence, its time complexity is $O(r^2)$. For a ConvLSTM network of depth $n$, each additional time step $T$ means an increase of $n$ ConvLSTM cells (Fig. 6). Since the computation time of each cell is fixed, the time complexity of the ConvLSTM is $O(T)$. Therefore, we observe a linear growth trend.
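The per-sample latency can be estimated by averaging wall-clock inference time over the test set. A simple sketch of that measurement (our own, not the paper's benchmarking code) is given below, assuming a Keras model as in the earlier sketches:

```python
import time
import numpy as np

def mean_latency_ms(model, samples):
    """Average single-sample inference time in milliseconds on the CPU."""
    model.predict(samples[:1])           # warm-up run, excluded from timing
    start = time.perf_counter()
    for s in samples:
        model.predict(s[np.newaxis])     # one sample per call, as in deployment
    return 1000 * (time.perf_counter() - start) / len(samples)
```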
As shown in Fig. 18, when r = 50, the computation time of the CNN is only 16.82 ms. The computation time of the ConvLSTM is longer than that of the CNN (the red broken line is above the blue dashed line); even so, it is still only 27.94 ms. If a delay of 27.94 ms is unacceptable in practical applications, the time step can be reduced to obtain a lower delay, and the recognition accuracy can still be guaranteed to be above 99%. Therefore, the CNN and ConvLSTM models both have the potential to run in real time.
IV. CONCLUSION
Using smartphones for seamless indoor and outdoor navigation and positioning usually requires switching between different positioning techniques in different indoor and outdoor scenarios. Therefore, how to use built-in GNSS measurements for efficient scenario recognition has become a key issue. In this paper, we develop a deep learning-based scenario recognition algorithm using GNSS measurements from smartphones. It maps GNSS measurements to 4-channel images with Voronoi tessellations. The images are then fed into a CNN to recognize four scenarios. Alternatively, considering correlations among consecutive scenarios over time, the ConvLSTM model is introduced to further improve the recognition accuracy. Experiments were designed to evaluate the proposed algorithm in terms of recognition accuracy and robustness. The results showed that the CNN and ConvLSTM algorithms achieved overall recognition accuracies of 98.82% and 99.92%, respectively, which are higher than those of existing algorithms. We also compared the computation time of the two networks and analyzed their real-time performance. The latencies of the CNN and ConvLSTM models on a CPU were only 16.82 ms and 27.94 ms, respectively. In future work, hardware resource and energy consumption issues need to be examined. In addition, how to combine our algorithm with navigation methods into a complete system requires further research.
ACKNOWLEDGMENT
This work was supported by the Science and Technology
Planning Project of Guangdong Province of China (Grant
No. 2021A0505030030). The authors would like to thank the
developers of the Android application Geo++ RINEX Logger.
REFERENCES
[1] H. S. Maghdid, I. A. Lami, K. Z. Ghafoor, and J. Lloret, "Seamless outdoors-indoors localization solutions on smartphones: Implementation and challenges," ACM Computing Surveys (CSUR), vol. 48, no. 4, pp. 1-34, 2016.
[2] A. T. Balaei, "Statistical inference technique in pre-correlation interference detection in GPS receivers," in Proceedings of the 19th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS 2006), 2006, pp. 2232-2240.
[3] F. Bastide, E. Chatre, and C. Macabiau, "GPS interference detection and identification using multicorrelator receivers," in Proceedings of the 14th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GPS 2001), 2001, pp. 872-881.
[4] Y. Dong, T. Arslan, and Y. Yang, "Real-time NLOS/LOS identification for smartphone-based indoor positioning systems using WiFi RTT and RSS," IEEE Sensors Journal, vol. 22, no. 6, pp. 5199-5209, 2022.
[5] S. Cao, X. Lu, and S. Shen, "GVINS: Tightly coupled GNSS-visual-inertial fusion for smooth and consistent state estimation," IEEE Transactions on Robotics, 2022.
[6] P. D. Groves, Z. Jiang, L. Wang, and M. K. Ziebart, "Intelligent urban positioning using multi-constellation GNSS with 3D mapping and NLOS signal detection," in Proceedings of the 25th International Technical Meeting of The Satellite Division of the Institute of Navigation (ION GNSS 2012), 2012, pp. 458-472.
[7] F. Santi, F. Pieralice, and D. Pastina, "Joint detection and localization of vessels at sea with a GNSS-based multistatic radar," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 8, pp. 5894-5913, 2019.
[8] T. Li, H. Zhang, Z. Gao, Q. Chen, and X. Niu, "High-accuracy positioning in urban environments using single-frequency multi-GNSS RTK/MEMS-IMU integration," Remote Sensing, vol. 10, no. 2, p. 205, 2018.
[9] K. Xu, Y. Chen, T. A. Okhai et al., "Micro optical sensors based on avalanching silicon light-emitting devices monolithically integrated on chips," Optical Materials Express, vol. 9, no. 10, pp. 3985-3997, 2019.
[10] W. Wang, Q. Chang, Q. Li, Z. Shi, and W. Chen, "Indoor-outdoor detection using a smart phone sensor," Sensors, vol. 16, no. 10, p. 1563, 2016.
[11] Y. Zhu et al., "A fast indoor/outdoor transition detection algorithm based on machine learning," Sensors, vol. 19, no. 4, p. 786, 2019.
[12] Y. Xia et al., "Recurrent neural network based scenario recognition with multi-constellation GNSS measurements on a smartphone," Measurement, vol. 153, p. 107420, 2020.
[13] P. D. Groves, "Shadow matching: A new GNSS positioning technique for urban canyons," The Journal of Navigation, vol. 64, no. 3, pp. 417-430, 2011.
[14] Z. Z. M. Kassas, J. Khalife, K. Shamaei, and J. Morales, "I hear, therefore I know where I am: Compensating for GNSS limitations with cellular signals," IEEE Signal Processing Magazine, vol. 34, no. 5, pp. 111-124, 2017.
[15] M. A. Caceres, F. Penna, H. Wymeersch, and R. Garello, "Hybrid cooperative positioning based on distributed belief propagation," IEEE Journal on Selected Areas in Communications, vol. 29, no. 10, pp. 1948-1958, 2011.
[16] C. Yang and H.-R. Shao, "WiFi-based indoor positioning," IEEE Communications Magazine, vol. 53, no. 3, pp. 150-157, 2015.
[17] D. Yu and C. Li, "An accurate WiFi indoor positioning algorithm for complex pedestrian environments," IEEE Sensors Journal, vol. 21, no. 21, pp. 24440-24452, 2021.
[18] J.-H. Huh and K. Seo, "An indoor location-based control system using bluetooth beacons for IoT systems," Sensors, vol. 17, no. 12, p. 2917, 2017.
[19] J. Tiemann, F. Schweikowski, and C. Wietfeld, "Design of an UWB indoor-positioning system for UAV navigation in GNSS-denied environments," in 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2015: IEEE, pp. 1-7.
[20] B. Yang, J. Li, Z. Shao, and H. Zhang, "Robust UWB indoor localization for NLOS scenes via learning spatial-temporal features," IEEE Sensors Journal, vol. 22, no. 8, pp. 7990-8000, 2022.
[21] W. Gu, M. Aminikashani, P. Deng, and M. Kavehrad, "Impact of multipath reflections on the performance of indoor visible light positioning systems," Journal of Lightwave Technology, vol. 34, no. 10, pp. 2578-2587, 2016.
[22] I. Tang and T. P. Breckon, "Automatic road environment classification," IEEE Transactions on Intelligent Transportation Systems, vol. 12, no. 2, pp. 476-484, 2010.
[23] A. Bosch, A. Zisserman, and X. Munoz, "Scene classification using a hybrid generative/discriminative approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 4, pp. 712-727, 2008.
[24] L. Fei-Fei and P. Perona, "A Bayesian hierarchical model for learning natural scene categories," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, vol. 2: IEEE, pp. 524-531.
[25] M. Ali, T. ElBatt, and M. Youssef, "SenseIO: Realistic ubiquitous indoor outdoor detection system using smartphones," IEEE Sensors Journal, vol. 18, no. 9, pp. 3684-3693, 2018.
[26] P. Zhou, Y. Zheng, Z. Li, M. Li, and G. Shen, "IODetector: A generic service for indoor outdoor detection," in Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems, 2012, pp. 113-126.
[27] S. Li et al., "A lightweight and aggregated system for indoor/outdoor detection using smart devices," Future Generation Computer Systems, vol. 107, pp. 988-997, 2020.
[28] K. Chen and G. Tan, "SatProbe: Low-energy and fast indoor/outdoor detection based on raw GPS processing," in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, 2017: IEEE, pp. 1-9.
[29] H. Gao and P. D. Groves, "Environmental context detection for adaptive navigation using GNSS measurements from a smartphone," NAVIGATION: Journal of the Institute of Navigation, vol. 65, no. 1, pp. 99-116, 2018.
[30] Q. Lai et al., "Research on GNSS/INS combined positioning method in urban environment based on scenario detection," Navigation Positioning and Timing, vol. 8, no. 1, pp. 151-162, 2021.
[31] K. Fukami, R. Maulik, N. Ramachandra, K. Fukagata, and K. Taira, "Global field reconstruction from sparse sensors with Voronoi tessellation-assisted deep learning," Nature Machine Intelligence, vol. 3, no. 11, pp. 945-951, 2021.
[32] F. Aurenhammer, "Voronoi diagrams—a survey of a fundamental geometric data structure," ACM Computing Surveys (CSUR), vol. 23, no. 3, pp. 345-405, 1991.
[33] Y. LeCun et al., "Handwritten digit recognition with a back-propagation network," Advances in Neural Information Processing Systems, vol. 2, 1989.
[34] X. Shi et al., "Convolutional LSTM network: A machine learning approach for precipitation nowcasting," Advances in Neural Information Processing Systems, vol. 28, 2015.
[35] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[36] K. Xu, "Silicon electro-optic micro-modulator fabricated in standard CMOS technology as components for all silicon monolithic integrated optoelectronic systems," Journal of Micromechanics and Microengineering, vol. 31, no. 5, p. 054001, 2021.
Zhiqiang Dai received his Ph.D. degree from Wuhan University, China. He is currently an assistant professor at the School of Electronics and Communication Engineering, Sun Yat-sen University. He is mainly engaged in the theory of GNSS/SBAS precise data processing, real-time PPP, multi-sensor navigation data fusion, and algorithm and software development.
Chunlei Zhai received his B.S. degree in electronic information science and technology from Sun Yat-sen University, China, in 2020. He is currently pursuing an M.S. degree in electronic information at the School of Electronics and Communication Engineering, Sun Yat-sen University, China. His research interests include intelligent navigation and vision-based autonomous navigation of UAVs.

Fang Li received her B.S. degree from Central South University in 2020 and her M.S. degree from Sun Yat-sen University in 2022. She is currently pursuing a Ph.D. degree in communication engineering at the School of Electronics and Communication Engineering, Sun Yat-sen University, China. Her current research interests include reliable navigation in complex scenarios and multi-source fusion navigation.

Weixiang Chen received his B.S. degree from the School of Electronic and Communication Engineering, Sun Yat-sen University, China, in 2021. He is currently pursuing an M.S. degree in the Department of Electronic Engineering at Tsinghua University, China. His research interests include pseudolite positioning, multi-sensor information fusion, and SLAM.

Xiangwei Zhu received his Ph.D. degree from the National University of Defense Technology, China. He is currently a professor at the School of Electronics and Communication Engineering, Sun Yat-sen University. His current research interests include global navigation satellite systems, time synchronization, intelligent signal processing, and instrument design.

Yanming Feng received his Ph.D. degree in satellite geodesy and spatial science from the Wuhan Technical University of Surveying and Mapping (Wuhan University since 2000), Wuhan, China. He is currently a professor in data science and navigation with the School of Computer Science, Queensland University of Technology, Brisbane, Australia. His active research areas include global navigation satellite system (GNSS) algorithms, geodetic data analytics, satellite orbit determination and space debris monitoring, the Internet of Things, precise positioning and deformation monitoring, vehicular networks and communications, and machine learning applications. He has published articles in journals on topics such as geodesy, navigation, aerospace, sensors, remote sensing, vehicular networks, and intelligent transport systems.
... For specific navigation applications, such as autonomous driving, a dedicated context categorization framework should be proposed based on the navigation requirements and the characteristics of different contexts. To the best of our knowledge, the existing literature (Dai et al., 2022;Xia et al., 2020;Y. Zhu et al., 2020) mainly divides the environments under consideration into four categories or less, namely deep indoor, shallow indoor, semi-outdoor, and open-sky. ...
... Existing classification models mainly include fuzzy inference (Zadeh, 1996), support vector machine (SVM) (Suthaharan, 2016), and long-short term memory (LSTM) (Sherstinsky, 2020). With the development of big data technology, large-scale parallel computing, and the popularity of graphics processing unit (GPU) devices, deep learning has emerged as a promising field, with algorithms such as Convolutional Neural Networks (CNN, Dai et al., 2022), Transformers (Vaswani et al., 2017) and Gated Recurrent Unit (GRU, Chung et al., 2014). ...
... Moreover, most smartphones' GNSS modules have a sampling rate of only 1Hz, which hampers the responsiveness to scenario transition. More recently, Dai et al. (2022) proposed a grid-based recognition approach that utilizes GNSS measurements such as pseudorange, Doppler shift, and C/N0. They represented the GNSS measurements with Voronoi diagrams and fed them into CNN networks, and achieved an accuracy of 99.92%. ...
Conference Paper
Full-text available
Recent years, people have put forward higher and higher requirements for context-adaptive navigation (CAN). CAN system realizes seamless navigation in complex environments by recognizing the ambient surroundings of vehicles, and it is crucial to develop a fast, reliable, and robust navigational context recognition (NCR) method to enable CAN systems to operate effectively. Environmental context recognition based on Global Navigation Satellite System (GNSS) measurements has attracted widespread attention due to its low cost because it does not require additional infrastructure. The performance and application value of NCR methods depend on three main factors: context categorization, feature extraction, and classification models. In this paper, a finegrained context categorization framework comprising seven environment categories (open sky, tree-lined avenue, semi-outdoor, urban canyon, viaduct-down, shallow indoor, and deep indoor) is proposed, which currently represents the most elaborate context categorization framework known in this research domain. To improve discrimination between categories, a new feature called the C/N0-weighted azimuth distribution factor, is designed. Then, to ensure real-time performance, a lightweight gated recurrent unit (GRU) network is adopted for its excellent sequence data processing capabilities. A dataset containing 59,996 samples is created and made publicly available to researchers in the NCR community on Github. Extensive experiments have been conducted on the dataset, and the results show that the proposed method achieves an overall recognition accuracy of 99.41% for isolated scenarios and 94.95% for transition scenarios, with an average transition delay of 2.14 seconds.
... Demonstrating the algorithm's broad applicability, datasets for both training and testing were gathered from various cities, achieving an overall recognition accuracy of 89.3 % across diverse environments. In [124], the authors improve scenario recognition for mobile applications by classifying environments into four categories and using a Hidden Markov Model and an RNN. The RNN method effectively handles scenario transitions and environmental changes, achieving an overall accuracy of 98.65 % and a transition recognition accuracy of 90.94 %, with minimal transition delay. ...
Preprint
Full-text available
Global Navigation Satellite Systems (GNSS)-based positioning plays a crucial role in various applications, including navigation, transportation, logistics, mapping, and emergency services. Traditional GNSS positioning methods are model-based and they utilize satellite geometry and the known properties of satellite signals. However, model-based methods have limitations in challenging environments and often lack adaptability to uncertain noise models. This paper highlights recent advances in Machine Learning (ML) and its potential to address these limitations. It covers a broad range of ML methods, including supervised learning, unsupervised learning, deep learning, and hybrid approaches. The survey provides insights into positioning applications related to GNSS such as signal analysis, anomaly detection, multi-sensor integration, prediction, and accuracy enhancement using ML. It discusses the strengths, limitations, and challenges of current ML-based approaches for GNSS positioning, providing a comprehensive overview of the field.
Conference Paper
The interest in machine learning (ML) research and its potential applications in many fields has also led to several studies on its use in the Indian Regional Navigation Satellite System (IRNSS). Traditional IRNSS algorithms and models are being further developed with machine learning techniques to improve their reliability and efficiency. We review how ML can improve the efficiency and usability of IRNSS and discuss the areas of IRNSS where ML algorithms have already been applied. Potential areas of IRNSS where ML can be applied to improve efficiency, accuracy, and robustness are explored, providing fertile ground for new research. The results show reasonable performance of machine learning techniques for several IRNSS applications; however, the use of ML models in industry remains limited. In addition, we discuss the application areas, challenges, risks, and future of using ML techniques in IRNSS.
Article
Sidewalk-level positions are required for a growing number of pedestrian applications. However, in urban canyons, buildings along both sides of the street severely obstruct Global Navigation Satellite System (GNSS) signals, and the lack of redundant fault-free measurements leads to poor accuracy in the cross-street direction, posing challenges in determining the side of the street solely from GNSS positions. While 3D building models have been utilized to improve position accuracy, particularly in the cross-street direction, techniques relying on these models face issues such as position ambiguity, high computational load, and low accuracy in the along-street direction. In this study, we aim to develop a novel intelligent urban positioning system using smartphone sensors and the pedestrian network. An algorithm is proposed to determine the side of the street by analyzing in which half of the sky most of the line-of-sight (LOS) signals are observed. An additional virtual measurement derived from the sidewalk is combined with real measurements to solve for the GNSS position. The system can achieve sidewalk-level positioning because the redundancy in the cross-street direction is significantly improved. It offers several advantages, including eliminating the need for LOS/NLOS signal identification for each satellite and for 3D building models. Extensive datasets were utilized to train the classification model and evaluate the system's performance. The results demonstrate a correct identification rate of better than 96% using single-epoch GNSS observations. More importantly, the proposed positioning system achieves an accuracy of better than 5 meters in urban canyons.
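The side-of-street cue described above can be illustrated with a short sketch: split the sky in two along the street axis and count the LOS satellite azimuths falling in each half. The simple majority vote below is a stand-in for the trained classification model the study actually uses.

```python
import numpy as np

def open_sky_half(los_az_deg, street_heading_deg):
    """Vote on which half of the sky, split by the street axis, holds
    the majority of LOS signals. In an urban canyon most LOS signals
    arrive from the open half of the sky above the street, which is
    the evidence the paper's classifier exploits; this majority vote
    is an illustrative assumption, not the paper's model.
    """
    rel = (np.asarray(los_az_deg, dtype=float) - street_heading_deg) % 360.0
    right = int(np.count_nonzero(rel < 180.0))  # right of the street axis
    left = len(rel) - right
    return ('right', right) if right >= left else ('left', left)

# Street running north-south (heading 0 deg); most LOS azimuths eastward.
print(open_sky_half([20, 60, 100, 300], street_heading_deg=0.0))  # ('right', 3)
```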
Article
Full-text available
The automated inspection and mapping of engineering structures are mainly based on photogrammetry and laser scanning. Mobile robotic platforms like unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs), but also handheld platforms, allow efficient automated mapping. Engineering structures like bridges shadow global navigation satellite system (GNSS) signals, which complicates precise localization. Simultaneous localization and mapping (SLAM) algorithms offer a sufficient solution, since they do not require GNSS. However, testing and comparing SLAM algorithms in GNSS-denied areas is difficult due to missing ground-truth data. This work presents an approach to measuring the performance of SLAM in indoor and outdoor GNSS-denied areas using a Leica RTC360 terrestrial scanner and a tachymeter to acquire point cloud and trajectory information. The proposed method is independent of time synchronization between robot and tachymeter and also works on sparse SLAM point clouds. For the evaluation of the proposed method, three LiDAR-based SLAM algorithms, KISS-ICP, SC-LIO-SAM, and MA-LIO, are tested using a UGV equipped with two light detection and ranging (LiDAR) sensors and an inertial measurement unit (IMU). KISS-ICP relies solely on a single LiDAR scanner, and SC-LIO-SAM also uses an IMU. MA-LIO, which allows multiple (different) LiDAR sensors, is tested with a horizontal and a vertical LiDAR and an IMU. Time synchronization between the tachymeter and SLAM data during post-processing allows calculating the root mean square (RMS) absolute trajectory error, the mean relative trajectory error, and the mean point cloud to reference point cloud distance. The results show that the proposed method is an efficient approach to measure the performance of SLAM in GNSS-denied areas. Additionally, the method shows the superior performance of MA-LIO in four of six test tracks with 5 to 7 cm RMS trajectory error, followed by SC-LIO-SAM and KISS-ICP in last place. SC-LIO-SAM reaches the lowest point cloud to reference point cloud distance in four of six test tracks, with 4 to 12 cm.
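For reference, once the tachymeter trajectory and a SLAM trajectory are time-synchronized and expressed in a common frame, the RMS absolute trajectory error reported above reduces to a short computation; linear interpolation of the reference onto the SLAM timestamps is an assumption of this sketch.

```python
import numpy as np

def rms_ate(t_slam, p_slam, t_ref, p_ref):
    """RMS absolute trajectory error between a SLAM trajectory and a
    tachymeter reference, both assumed already expressed in the same
    frame (e.g., after rigid alignment). The reference is linearly
    interpolated onto the SLAM timestamps.

    t_*: (N,) timestamps in seconds; p_*: (N, 3) positions in meters.
    """
    p_ref_i = np.column_stack([np.interp(t_slam, t_ref, p_ref[:, k])
                               for k in range(3)])
    err = np.linalg.norm(p_slam - p_ref_i, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

# Synthetic check: a straight 60 s track with 5 cm noise per axis.
t = np.linspace(0.0, 60.0, 120)
ref = np.column_stack([t, np.zeros_like(t), np.zeros_like(t)])
slam = ref + np.random.default_rng(0).normal(scale=0.05, size=ref.shape)
print(f"RMS ATE: {rms_ate(t, slam, t, ref):.3f} m")  # close to 0.05 * sqrt(3)
```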
Article
Full-text available
Achieving accurate and robust global situational awareness of a complex time-evolving field from a limited number of sensors has been a long-standing challenge. This reconstruction problem is especially difficult when sensors are sparsely positioned in a seemingly random or unorganized manner, which is often encountered in a range of scientific and engineering problems. Moreover, these sensors could be in motion and could come online or go offline over time. The key leverage in addressing this scientific issue is the wealth of data accumulated from the sensors. As a solution to this problem, we propose a data-driven spatial field recovery technique founded on a structured grid-based deep-learning approach for arbitrarily positioned sensors of any number. It should be noted that naive use of machine learning becomes prohibitively expensive for global field reconstruction and is furthermore not adaptable to an arbitrary number of sensors. In this work, we consider the use of Voronoi tessellation to obtain a structured-grid representation from sensor locations, enabling the computationally tractable use of convolutional neural networks. One of the central features of our method is its compatibility with deep learning-based super-resolution reconstruction techniques for structured sensor data that are established for image processing. The proposed reconstruction technique is demonstrated for unsteady wake flow, geophysical data, and three-dimensional turbulence. The current framework is able to handle an arbitrary number of moving sensors and thereby overcomes a major limitation of existing reconstruction methods. Our technique opens a new pathway toward the practical use of neural networks for real-time global field estimation.
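A minimal sketch of the structured-grid input construction this abstract describes, assuming a unit-square domain and a 64x64 grid: each cell takes its nearest sensor's value (the Voronoi tessellation), and a second channel marks the sensor locations. Treat the exact channel layout as an assumption of this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def voronoi_input(sensor_xy, sensor_vals, nx=64, ny=64):
    """Two-channel structured-grid input: channel 0 holds each cell's
    nearest-sensor value (a Voronoi tessellation of the domain) and
    channel 1 is a binary mask of cells containing a sensor. The grid
    resolution and unit domain are illustrative assumptions.
    """
    gx, gy = np.meshgrid(np.linspace(0.0, 1.0, nx), np.linspace(0.0, 1.0, ny))
    cells = np.column_stack([gx.ravel(), gy.ravel()])

    _, idx = cKDTree(sensor_xy).query(cells)      # nearest sensor per cell
    field = np.asarray(sensor_vals, dtype=float)[idx].reshape(ny, nx)

    mask = np.zeros((ny, nx))
    ix = np.clip(np.round(sensor_xy[:, 0] * (nx - 1)).astype(int), 0, nx - 1)
    iy = np.clip(np.round(sensor_xy[:, 1] * (ny - 1)).astype(int), 0, ny - 1)
    mask[iy, ix] = 1.0
    return np.stack([field, mask])                # (2, ny, nx) CNN input

rng = np.random.default_rng(0)
xy = rng.random((8, 2))                           # 8 arbitrarily placed sensors
x = voronoi_input(xy, rng.standard_normal(8))
print(x.shape)  # (2, 64, 64)
```

Because the grid is rebuilt at every snapshot, sensors may move, appear, or drop out without changing the input shape the CNN sees.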
Article
Full-text available
Silicon avalanche light-emitting devices (Si Av LEDs) offer various possibilities for realizing micro- and even nano-optical biosensors directly on chip. The light-emitting devices (LEDs) operate in the wavelength range of about 450-850nm, and their emitted optical power is on the order of a few hundred nW/µm². These LEDs can be fabricated at micro- and nano-scale dimensions using mainstream silicon semiconductor fabrication technologies. Through a series of experiments, dispersion phenomena in the Si Av LED were observed, and its light emission point was shown to be located about one micron below the silicon-silicon oxide interface. Subsequently, a micro-fluidic channel sensor was designed using the dispersion characteristics exhibited by the Si Av LED. Analytes flowing through a micro-fluidic channel can be studied via their specific transmittance and absorption spectra. Moreover, simulations verify that a novel waveguide-based sensor could be fabricated on chip between the Si optical source and the Si P-I-N detector.
Article
Full-text available
The widespread popularity of smartphones makes it possible to provide Location-Based Services (LBS) in a variety of complex scenarios. The location and contextual status, especially indoor/outdoor (IO) switching, provide a direct indicator for seamless indoor and outdoor positioning and navigation. It is challenging to quickly detect indoor and outdoor transitions with high confidence due to the variety of signal variations in complex scenarios and the similarity of indoor and outdoor signal sources in IO transition regions. In this paper, we address the challenge of switching quickly in IO transition regions with high detection accuracy in complex scenarios. Towards this end, we analyze and extract spatial geometry distribution, time-sequence, and statistical features under different sliding windows from GNSS measurements on Android smartphones and present a novel IO detection method employing a stacking-based ensemble model, filtering the detection results with a Hidden Markov Model (HMM). We evaluated our algorithm on four datasets. The results show that our proposed algorithm is capable of identifying the IO state with 99.11% accuracy in indoor and outdoor environments where we had collected data and 97.02% accuracy in new indoor and outdoor scenarios. Furthermore, in IO transition scenarios where we had collected data, the recognition accuracy reaches 94.53% and the probability of a switching delay within 3 s exceeds 80%. In the new scenario, the recognition accuracy reaches 92.80% and the probability of a switching delay within 4 s exceeds 80%.
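The HMM filtering stage this abstract mentions can be illustrated with a two-state Viterbi smoother over the ensemble's per-epoch probabilities. The sticky transition prior and the use of classifier outputs as emission likelihoods are simplifying assumptions.

```python
import numpy as np

def viterbi_smooth(probs, p_stay=0.95):
    """Smooth per-epoch classifier outputs with a two-state HMM.

    probs: (T, 2) per-epoch [P(indoor), P(outdoor)] from a classifier;
    p_stay: prior probability of remaining in the same state between
    epochs. Returns the most likely state path (0 = indoor, 1 = outdoor).
    """
    logA = np.log(np.array([[p_stay, 1.0 - p_stay],
                            [1.0 - p_stay, p_stay]]))
    logB = np.log(np.clip(probs, 1e-9, 1.0))
    T = logB.shape[0]
    delta = logB[0] + np.log(0.5)         # uniform initial state prior
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        trans = delta[:, None] + logA     # (from_state, to_state)
        back[t] = np.argmax(trans, axis=0)
        delta = np.max(trans, axis=0) + logB[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):         # backtrace
        path.append(back[t][path[-1]])
    return path[::-1]

noisy = np.array([[.9, .1], [.8, .2], [.4, .6], [.85, .15], [.2, .8], [.1, .9]])
print(viterbi_smooth(noisy))  # the isolated flip at epoch 3 is suppressed
```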
Article
Ultra-wideband (UWB) localization systems suffer from deteriorating performance in complex scenes, especially in non-line-of-sight (NLOS) conditions. To improve the accuracy and robustness of localization in NLOS environments, we propose an end-to-end deep neural network that uses both distance and received signal strength (RSS) measurements. On one hand, high-level spatial-temporal features can be learned by the proposed network from both RSS and distance data, which benefits localization performance. On the other hand, the proposed network is robust to variation in the number of available anchors, giving it high adaptability to different scenes. Specifically, three modules are designed in the deep network: 1) a module based on a convolutional neural network (CNN) extracts local spatial features from the input data, and the structure of this module lends itself to varying-dimension input; 2) to capture the correlations between consecutive frames, a deep long short-term memory (LSTM) model extracts temporal features and provides a high-level representation for a series of input data; 3) finally, fully-connected layers estimate the 3D position of the UWB tag. We conduct extensive experiments in three real-world scenarios to evaluate the proposed deep network. The experimental results indicate that the proposed network significantly improves the accuracy and robustness of UWB localization results, especially in NLOS situations.
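A minimal PyTorch sketch of the three-module design summarized above; all layer sizes are assumptions, and global pooling over anchors stands in for the paper's handling of varying-dimension input.

```python
import torch
import torch.nn as nn

class UwbNet(nn.Module):
    """Sketch of the CNN + LSTM idea: a small Conv1d block extracts
    spatial features from each frame's per-anchor (distance, RSS) pairs,
    pooling over anchors makes the feature independent of the anchor
    count, an LSTM models consecutive frames, and a fully-connected
    head regresses the 3D tag position. Layer sizes are assumptions."""

    def __init__(self, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(2, 32, kernel_size=1), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1))               # pool over anchors
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)

    def forward(self, x):                          # x: (batch, frames, 2, anchors)
        b, t, c, a = x.shape
        f = self.conv(x.reshape(b * t, c, a)).squeeze(-1)
        out, _ = self.lstm(f.reshape(b, t, -1))
        return self.head(out[:, -1])               # 3D position at last frame

# 4 sequences of 20 frames with 6 anchors each; the anchor count may vary
# between batches without changing the network.
pos = UwbNet()(torch.randn(4, 20, 2, 6))
print(pos.shape)  # torch.Size([4, 3])
```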
Article
Visual–inertial odometry (VIO) is known to suffer from drifting, especially over long-term runs. In this article, we present GVINS, a nonlinear optimization-based system that tightly fuses global navigation satellite system (GNSS) raw measurements with visual and inertial information for real-time and drift-free state estimation. Our system aims to provide accurate global six-degree-of-freedom estimation under complex indoor–outdoor environments, where GNSS signals may be intermittent or even inaccessible. To establish the connection between global measurements and local states, a coarse-to-fine initialization procedure is proposed to efficiently calibrate the transformation online and initialize GNSS states from only a short window of measurements. The GNSS code pseudorange and Doppler shift measurements, along with visual and inertial information, are then modeled and used to constrain the system states in a factor graph framework. For complex and GNSS-unfriendly areas, the degenerate cases are discussed and carefully handled to ensure robustness. Thanks to the tightly coupled multisensor approach and system design, our system fully exploits the merits of three types of sensors and is able to seamlessly cope with the transition between indoor and outdoor environments, where satellites are lost and reacquired. We extensively evaluate the proposed system by both simulation and real-world experiments, and the results demonstrate that our system substantially suppresses the drift of the VIO and preserves the local accuracy in spite of noisy GNSS measurements. The versatility and robustness of the system are verified on large-scale data collected in challenging environments. In addition, experiments show that our system can still benefit from the presence of only one satellite, whereas at least four satellites are required for its conventional GNSS counterparts.
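As background for the factor-graph formulation mentioned above, the standard GNSS code-pseudorange model that such a system constrains its states with looks as follows; atmospheric delay terms are omitted, and this is the textbook model rather than GVINS's exact residual.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def pseudorange_residual(p_rx_ecef, rx_clk_s, p_sat_ecef, sat_clk_s, pr_m):
    """Residual of one code-pseudorange factor: geometric range plus
    clock terms minus the measured pseudorange. Ionospheric and
    tropospheric delays are omitted for brevity.
    """
    geom = np.linalg.norm(p_sat_ecef - p_rx_ecef)        # geometric range, m
    return geom + C * (rx_clk_s - sat_clk_s) - pr_m

p_sat = np.array([15_600e3, 7_540e3, 20_140e3])          # satellite ECEF, m
p_rx = np.array([-2_694e3, -4_293e3, 3_857e3])           # receiver ECEF, m
pr = np.linalg.norm(p_sat - p_rx) + C * 1e-3             # 1 ms receiver bias
print(pseudorange_residual(p_rx, 1e-3, p_sat, 0.0, pr))  # ~0 for a perfect fit
```

In a tightly coupled system, residuals of this form (and analogous Doppler residuals) enter the factor graph alongside visual reprojection and IMU preintegration factors.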
Article
The accuracy of smartphone-based positioning systems using WiFi usually suffers from ranging errors caused by non-line-of-sight (NLOS) conditions. Previous research usually exploits several distribution features from a long time series (hundreds of samples) of WiFi received signal strength (RSS) or WiFi round-trip time (RTT) to achieve high identification accuracy. However, the long time series, or large sample size, leads to high power and time consumption in data collection for both training and testing. This is also detrimental to the user experience, as the wait to collect enough samples is long. Therefore, this paper proposes three new real-time NLOS/LOS identification methods for smartphone-based indoor positioning systems using WiFi RSS and RTT distance measurements (RDM). Based on our extensive analysis of RSS and RDM dispersion features, three machine learning algorithms were chosen and developed to separate the samples into NLOS/LOS conditions. Experiments show that our best method achieves a discrimination accuracy of over 96% with a sample size of 10. Given the theoretical minimum WiFi ranging interval of 100ms on RTT-enabled smartphones, our algorithm is able to deliver a result within a latency of 1s, the shortest among state-of-the-art methods.
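A sketch of the short-window idea in this abstract: dispersion features computed from only 10 samples of RSS and RDM, fed to an off-the-shelf classifier. The concrete feature set, the random-forest choice, and the synthetic data are all assumptions for illustration, not the paper's three algorithms.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def dispersion_features(rss_dbm, rdm_m):
    """Mean/std/range over one short window (10 samples) of RSS and of
    RTT distance measurements (RDM); representative assumptions rather
    than the paper's exact feature set."""
    feats = []
    for x in (np.asarray(rss_dbm, dtype=float), np.asarray(rdm_m, dtype=float)):
        feats += [x.mean(), x.std(), x.max() - x.min()]
    return np.array(feats)

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)            # 0 = LOS, 1 = NLOS (synthetic)
X = np.array([dispersion_features(
        rng.normal(-55.0, 1.0 + 3.0 * y, 10),    # NLOS -> larger RSS spread
        rng.normal(8.0, 0.3 + 1.5 * y, 10))      # NLOS -> noisier ranging
    for y in labels])
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X[:150], labels[:150])
print(f"held-out accuracy: {clf.score(X[150:], labels[150:]):.2f}")
```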
Article
This paper proposes a precise WiFi fingerprinting indoor positioning algorithm for complex pedestrian environments. We transform the disturbed received signal strength (RSS) from the original space to a latent space using improved probabilistic linear discriminant analysis (PLDA). In the latent space, Bayes' rule is used to calculate the posterior probability of the similarity between the test point and the reference points, and the K reference points with the highest posterior probability are weighted to estimate the position. Actual on-site experiments involving three floors demonstrate that the mean localization error of the proposed algorithm is 1.38m, outperforming the Horus algorithm by 29% under the same test conditions. In addition, by studying the variability of the mean RSS in different pedestrian environments, fingerprint maps under different states of personnel movement are simulated. Using these simulated maps, the average localization error of the proposed algorithm increases slightly to 1.63m, while the workload required during the offline training phase is significantly reduced.
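The final weighting step described above, estimating the position from the K reference points with the highest posterior probability, can be sketched independently of the PLDA front end, which is assumed here to have already produced the log-posteriors.

```python
import numpy as np

def weighted_k_estimate(ref_xy, log_posterior, k=4):
    """Weighted position estimate from the K reference points with the
    highest posterior probability. Computing the posteriors themselves
    (via PLDA in the latent space) is outside this sketch."""
    top = np.argsort(log_posterior)[-k:]            # indices of the best K
    w = np.exp(log_posterior[top] - log_posterior[top].max())
    w /= w.sum()                                    # normalized weights
    return w @ ref_xy[top]

refs = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [5.0, 5.0]])
logp = np.array([-1.2, -0.3, -0.5, -2.0, -8.0])     # hypothetical posteriors
print(weighted_k_estimate(refs, logp))              # pulled toward the best points
```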
Article
In this paper, the optoelectronic characteristics and related switching behavior of a monolithically integrated silicon light-emitting device (LED) covering the interesting wavelength range of 400–900 nm are studied. By comparing two geometries, a Si avalanche-based LED and a Si field-effect LED (Si FE LED), in the same device, we establish the dimensional dependence of the LED's switching speed. An almost-linear modulation curve, implying lower distortion, is observed for the Si FE LED along with enhanced light emission, and technology computer-aided design (TCAD) simulations are in line with the experimental results. Our findings indicate that ON-OFF keying up to GHz frequencies should be feasible with such diodes. Potential applications include Si FE LEDs integrated into micro-photonic systems.
Article
As an upper-layer context-aware mobile application capability, fast and accurate scenario recognition is essential for seamless indoor and outdoor localization and robust positioning in complex environments. With the popularity of multi-constellation smartphones, scenario recognition based on smartphone Global Navigation Satellite System (GNSS) measurements has become desirable. In this paper, we divide complex environments into four categories (deep indoor, shallow indoor, semi-outdoor, and open outdoor) and conduct research in two areas. First, we analyze in detail the influence of multi-constellation satellite signals on scenario recognition performance using a Hidden Markov Model (HMM) algorithm. The experimental results show that scenario recognition accuracy improves significantly as the number of constellations received by the smartphone increases. Second, to address the degradation of traditional models caused by scenario transitions and by environmental changes around the scenario, we propose a new scenario recognition method based on a Recurrent Neural Network (RNN). Considering computational complexity and the availability of feature values, we use position-independent features as the input to the RNN model and then evaluate the model's performance using test sets from new places. The results indicate that our proposed algorithm has high recognition accuracy in both isolated scenarios and transition regions, with an overall accuracy of 98.65%. In particular, the recognition accuracy in scenario transitions reaches 90.94%, and among the three correctly recognized scenario transitions (out of four in total) the maximum transition delay is only 3 s.
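The abstract does not enumerate its position-independent features; a plausible minimal set built only from the per-epoch C/N0 values of tracked satellites, suitable as RNN input, might look like the following sketch (every feature choice here is an assumption).

```python
import numpy as np

def epoch_features(cn0_dbhz):
    """Position-independent per-epoch features from the C/N0 values of
    all tracked satellites (assumed non-empty); an illustrative feature
    set, not the paper's."""
    c = np.asarray(cn0_dbhz, dtype=float)
    return np.array([
        c.size,                                   # number of tracked satellites
        c.mean(),                                 # mean C/N0
        c.max(),                                  # strongest signal
        np.count_nonzero(c >= 35.0) / c.size,     # share of strong signals
    ])

# A 3-epoch sequence, e.g., walking from outdoors toward deep indoors.
seq = np.stack([epoch_features(e)
                for e in ([45, 41, 38, 22], [30, 25, 19], [18, 15])])
print(seq.shape)  # (3, 4): one feature vector per epoch for the RNN
```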
Article
This paper addresses the exploitation of global navigation satellite systems as opportunistic sources for the joint detection and localization of vessels at sea in a passive multistatic radar system. A single receiver mounted on a suitable platform (e.g., a moored buoy) can collect the signals emitted by multiple navigation satellites and reflected from ship targets of interest. This paper puts forward a single-stage approach to jointly detect and localize ship targets by making use of long integration times (tens of seconds) and properly exploiting the spatial diversity offered by such a configuration. A strategy is defined to form a long-time, multistatic range and Doppler (RD) map, in which the total target power is reinforced with respect to both an RD map obtained over a short dwell and one obtained with a single transmitter. Exploiting both the long integration time and the multiple transmitters can greatly enhance the performance of the system, counteracting the low power budget provided by the considered sources, which represents the main bottleneck of this technology. Moreover, the proposed single-stage approach can reach detection performance superior to a conventional two-stage process, in which peripheral decisions are taken at each bistatic link and localization is subsequently achieved by multilateration methods. A theoretical and simulated performance analysis is presented and validated by means of experimental results with Galileo transmitters and different types of targets of opportunity in different scenarios. The results prove the effectiveness of the proposed method for the detection and localization of ship targets of interest.
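A heavily simplified sketch of forming one bistatic RD map by correlating the surveillance channel against Doppler-shifted replicas of the reference signal. The random-phase stand-in code, snapshot length, and Doppler grid are assumptions; the paper's long-time, multistatic scheme would combine such maps across satellites and dwells before detection.

```python
import numpy as np

def range_doppler_map(rx, ref, fs, dopplers_hz):
    """One bistatic range-Doppler map: circular cross-correlation of the
    surveillance signal with Doppler-shifted copies of the reference,
    giving one range profile per Doppler hypothesis."""
    t = np.arange(len(rx)) / fs
    rows = []
    for fd in dopplers_hz:
        shifted = ref * np.exp(2j * np.pi * fd * t)
        rows.append(np.fft.ifft(np.fft.fft(rx) * np.conj(np.fft.fft(shifted))))
    return np.abs(np.array(rows))                 # (n_doppler, n_range_lags)

fs, n = 1.0e6, 1 << 16                            # ~65.5 ms snapshot at 1 MS/s
rng = np.random.default_rng(2)
ref = np.exp(1j * rng.uniform(0, 2 * np.pi, n))   # stand-in pseudo-random code
echo = 0.5 * np.roll(ref, 100) * np.exp(2j * np.pi * 50.0 * np.arange(n) / fs)
rx = echo + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
rd = range_doppler_map(rx, ref, fs, np.arange(-100, 101, 25))
print(np.unravel_index(rd.argmax(), rd.shape))    # (6, 100): 50 Hz, lag 100
```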