FAST, ACCURATE BARCODE DETECTION IN ULTRA HIGH-RESOLUTION IMAGES
Jerome Quenum, Kehan Wang, Avideh Zakhor
Department of Electrical Engineering and Computer Science
University of California, Berkeley
{jquenum, wang.kehan, avz}@berkeley.edu
(This work was supported by Amazon.com, Inc.)
ABSTRACT
Object detection in Ultra High-Resolution (UHR) images has long
been a challenging problem in computer vision due to the varying
scales of the targeted objects. When it comes to barcode detection,
resizing UHR input images to smaller sizes often leads to the loss of
pertinent information, while processing them directly is highly in-
efficient and computationally expensive. In this paper, we propose
using semantic segmentation to achieve a fast and accurate detec-
tion of barcodes of various scales in UHR images. Our pipeline
involves a modified Region Proposal Network (RPN) on images of
size greater than 10k×10k and a newly proposed Y-Net segmenta-
tion network, followed by a post-processing workflow for fitting a
bounding box around each segmented barcode mask. The end-to-
end system has a latency of 16 milliseconds, which is 2.5× faster
than YOLOv4 and 5.9× faster than Mask R-CNN. In terms of ac-
curacy, our method outperforms YOLOv4 and Mask R-CNN by a
mAP of 5.5% and 47.1% respectively, on a synthetic dataset. We
have made available the generated synthetic barcode dataset and its
code at http://www.github.com/viplab/BSBD/.
Index Terms: Barcode detection with deep neural networks,
barcode segmentation, Ultra High-Resolution images.
1. INTRODUCTION
Barcodes are digital signs, often made of adjacent, alternating
black and white rectangles, that have become an intrinsic part
of human society. In administration, for example, they are used to
encode, save, and retrieve various users’ information. At grocery
stores, they are used to track sales and inventories. More interest-
ingly in e-commerce, they are used to track and speed up processing
time in warehouses and fulfillment centers.
In classical signal processing, filters used for detection are
image-specific since input images are not all necessarily acquired
with the same illumination, brightness, angle, or camera. Conse-
quently, adaptive image processing algorithms are required, which
can impact detection accuracy [1]. In addition, because classical
signal processing methods often run on Central Processing Units,
they tend to be much slower compared with deep learning imple-
mentations that are easily optimized on Graphics Processing Units
(GPUs).
Over the years, a number of methods have been proposed to de-
tect barcodes using classical signal processing [1, 2, 3, 4, 5], but
nearly all of them take too long to process Ultra High-Resolution
(UHR) images. More specifically, [5] used parallel segment detec-
tors which improved on their previous work [6] of finding imaginary
perpendicular lines in Hough space with maximal stable extremal
regions to detect barcodes. Katona et al. [3] used morphological ma-
nipulation for barcode detection, but this method did not generalize
well as different barcode types have varying performances. Simi-
larly, [7] proposed using x and y derivative differences, but varying
input images yielded different outputs, and using such an operation on
UHR images would be highly inefficient.
Though neural networks have brought much improvement to barcode
detection tasks, few methods have addressed the fast and accurate
detection problem in UHR images. Zamberletti et al. [8]
paved the way for using neural networks to detect barcodes by in-
vestigating Hough spaces. This was followed by [9] which adapted
the You Only Look Once (YOLO) detector to find barcodes in
Low Resolution (LR) images, but the YOLO algorithm is known to
perform poorly on long-shaped objects such as Code 39 barcodes.
Instance segmentation methods such as Mask R-CNN [10] perform
better on images of size 1024 × 1024 pixels, but on smaller images
the output Regions of Interest (RoIs) do not align well with long
1D-barcode structures. This is because Mask R-CNN typically predicts
masks at 28 × 28 pixels irrespective of object size, and thereby gen-
erates "wiggly" artifacts on some barcode predictions, losing spatial
resolution. Likewise, dedicated object detection pipelines such as
YOLOv4 [11], though they perform well at lower Intersection over
Union (IoU) thresholds, lose accuracy at higher IoU
thresholds. Among those using segmentation on LR images as a
means for detection, [12] also tends to not perform well at higher
IoU thresholds.
In this paper, we propose a pipeline for detecting barcodes using
deep neural networks, shown in Fig. 1, which consists of two stages
trained separately. When compared with classical signal processing
methods, neural networks not only provide a faster inference time,
but also yield higher accuracy because they learn meaningful filters
for optimal feature extraction. As seen in Fig. 1, in the first stage, we
expand on the Region Proposal Network (RPN) introduced in Faster
R-CNN [13] to extract high definition regions of potential locations
where barcodes might be. This stage allows us to significantly re-
duce inference computation time that would have been required oth-
erwise in the second stage. In the second stage, we introduce Y-Net,
a semantic segmentation network that detects all instances of bar-
codes in a given 400 × 400 RoI image output by the first stage. We then apply
morphological operations on the predicted masks to separate and ex-
tract the corresponding bounding boxes as shown in Fig. 2.
One of the limitations of existing work on barcode detection is
the insufficient number of training examples. ArTe-Lab 1D Medium
Barcode Dataset [8] and the WWU Muenster Barcode Database [14]
are two examples of existing available datasets. They contain 365
and 595 images respectively, with ground truth masks at a resolution
of 640 ×480. Most of the samples in the ArTe-Lab dataset have
only one EAN13 barcode per sample image, and few of them in the
Muenster database have more than one barcode instance on a given
image.

Fig. 1. Proposed approach: the modified RPN is followed by Y-Net and the bounding box extractor.

To address this dataset availability problem, we have released
synthetic datasets of 100,000 UHR and 100,000 LR barcode images,
along with their corresponding ground truth bounding boxes and masks,
to facilitate further studies. The outline of this paper
is as follows: in Section 2, we describe details of our approach; in
Section 3, we summarize our experimental results and in Section 4,
we end with conclusions and future work.
2. PROPOSED APPROACH
As seen in Fig. 1, our proposed method consists of three stages:
modified Region Proposal Network, Y-Net segmentation network,
and bounding box extraction.
2.1. Modified Region Proposal Network
Region proposals have been influential in computer vision and more
so when it comes to object detection in UHR images. It is common in
UHR images that barcodes are clustered in a small region of the im-
age. To filter out most of the non-barcode backgrounds, we modified
the RPN introduced in Faster R-CNN [13] to propose regions of bar-
codes for the next stages. The UHR input image is first downscaled to
an LR image of size 256 × 256, and the RPN is trained to identify
barcode blobs in these LR images. Once a bounding box is placed around the
identified blobs, the resulting proposed bounding box is remapped
to the input UHR image by a perspective transformation, and the re-
sulting regions are cropped out. The LR input to the RPN is chosen
to be of size 256 ×256 as a lower resolution results in the loss of
pertinent information. Non-Max Suppression (NMS) is used on the
predictions to select the most probable regions.
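
To make the remapping step concrete, the following is a minimal sketch (not the authors' implementation) of how boxes proposed on the 256 × 256 LR image can be mapped back to UHR pixel coordinates and cropped. The function name, the (x1, y1, x2, y2) box format, and the use of a pure scaling transform instead of a general perspective transform are our own simplifying assumptions.

```python
import numpy as np

def remap_and_crop(uhr_image, lr_boxes, lr_size=256):
    """Map boxes predicted by the RPN on the 256x256 LR image back to
    UHR pixel coordinates and crop the proposed regions.

    uhr_image: numpy array of shape (H, W) or (H, W, C).
    lr_boxes: iterable of (x1, y1, x2, y2) boxes in LR pixel coordinates.
    """
    H, W = uhr_image.shape[:2]
    sy, sx = H / lr_size, W / lr_size              # LR -> UHR scale factors
    crops = []
    for (x1, y1, x2, y2) in lr_boxes:
        X1, Y1 = int(x1 * sx), int(y1 * sy)
        X2, Y2 = int(np.ceil(x2 * sx)), int(np.ceil(y2 * sy))
        crops.append(uhr_image[Y1:Y2, X1:X2])      # cropped UHR region
    return crops
```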
2.2. Y-Net Segmentation Network
As depicted in Fig. 3, Y-Net is made up of three main modules
distributed across two branches: a Regular Convolution Module, shown
in blue, which constitutes the left branch, and a Pyramid Pooling
Module, shown in brown, along with a Dilated Convolution Module,
shown in orange, which after concatenation and convolution
constitute the right branch.
Fig. 2. Sample outputs of our pipeline; yellow: segmented barcode pixels; purple: segmented background pixels; boxes: extracted bounding boxes; (a) synthetic barcode image; (b) real barcode image; (c) prediction results on (a); (d) prediction results on (b).

The Regular Convolution Module takes in the 400 × 400 output
images of the RPN and consists of convolutional and pooling layers.
It starts with 64-channel 3 × 3 kernels and doubles the number of
channels at each layer. We alternate between convolution and max-pooling until
we reach a feature map size of 25 ×25 pixels. This module allows
the model to learn general pixel-wise information anywhere in the
input image.
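
For concreteness, below is a minimal sketch of such a regular convolution branch, written in PyTorch (the paper does not specify a framework); the exact channel counts beyond the initial 64 and the precise layer ordering are our assumptions.

```python
import torch
import torch.nn as nn

class RegularConvBranch(nn.Module):
    """Sketch of the Regular Convolution Module: 3x3 convolutions whose
    channel count doubles at each stage, alternated with 2x2 max-pooling,
    reducing a 400x400 input to a 25x25 feature map."""
    def __init__(self, in_channels=1, base_channels=64):
        super().__init__()
        layers, ch = [], in_channels
        for stage in range(4):                        # 400 -> 200 -> 100 -> 50 -> 25
            out_ch = base_channels * (2 ** stage)     # 64, 128, 256, 512 (assumed doubling)
            layers += [nn.Conv2d(ch, out_ch, 3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            ch = out_ch
        self.body = nn.Sequential(*layers)

    def forward(self, x):                             # x: (N, 1, 400, 400)
        return self.body(x)                           # -> (N, 512, 25, 25)
```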
The Dilated Convolution Module takes advantage of the fact
that barcodes have alternating black and white rectangles to learn
sparse features in their structure. The motivation for this module
comes from the fact that dilated convolution operators play a signif-
icant role in the "algorithme à trous" for biorthogonal wavelet de-
composition [15]. Therefore, the discontinuities in alternating pat-
terns and sharp edges in barcodes are more accurately learned by
such filters. In addition, they leverage a multiresolution and multi-
scale decomposition, as they allow the kernels to widen their receptive
fields with dilation rates from 1 up to 16. Here too a 400 × 400 input
image is used, and we maintain 32-channel 3 × 3 kernels throughout the
module while the dimensions of the layers are gradually reduced using a
stride of 2 until a 25 × 25 feature map is obtained.

Fig. 3. Y-Net architecture. (Diagram: regular convolution blocks, pyramid pooling blocks, and dilated convolution blocks, combined through addition, transposed convolution, and up-sampling, map a 400 × 400 × 1 input image to a 400 × 400 × 1 output mask.)
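
A minimal sketch of such a dilated convolution branch is shown below, again in PyTorch; the exact dilation/stride schedule used to go from 400 × 400 down to 25 × 25 is our assumption.

```python
import torch
import torch.nn as nn

class DilatedConvBranch(nn.Module):
    """Sketch of the Dilated Convolution Module: 32-channel 3x3 kernels
    with dilation rates growing from 1 to 16, using stride 2 to reduce a
    400x400 input to a 25x25 feature map."""
    def __init__(self, in_channels=1, channels=32):
        super().__init__()
        layers, ch = [], in_channels
        # (dilation, stride) per layer; this exact schedule is our assumption
        for d, s in [(1, 1), (2, 2), (4, 2), (8, 2), (16, 2)]:
            layers += [nn.Conv2d(ch, channels, 3, stride=s, dilation=d, padding=d),
                       nn.ReLU(inplace=True)]
            ch = channels
        self.body = nn.Sequential(*layers)

    def forward(self, x):                  # x: (N, 1, 400, 400)
        return self.body(x)                # -> (N, 32, 25, 25)
```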
The Pyramid Pooling Module allows the model to learn global
information about potential locations of the barcodes at different
scales; its layers are concatenated with those of the dilated
convolution module in order to preserve the features extracted from
both modules.
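
The following sketch illustrates one common way to realize such a pyramid pooling module (average pooling to a few coarse grids, projection, up-sampling, and concatenation); the bin sizes and channel counts are our assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPoolingBranch(nn.Module):
    """Sketch of a pyramid pooling module: the 25x25 feature map is
    average-pooled to several coarse grids, projected with 1x1 convolutions,
    up-sampled back to 25x25, and concatenated with the input features."""
    def __init__(self, in_channels=32, out_channels=32, bins=(1, 2, 4, 8)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_channels, out_channels, 1),
                          nn.ReLU(inplace=True))
            for b in bins])

    def forward(self, x):                               # x: (N, 32, 25, 25)
        h, w = x.shape[-2:]
        pooled = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                align_corners=False) for stage in self.stages]
        return torch.cat([x] + pooled, dim=1)           # concatenated feature map
```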
The resulting feature maps from the right branch are then added
to the output of the Regular Convolution Module, which allows for
the correction of features that would have been missed by either
branch. In other words, the output of each branch constitutes a resid-
ual correction for the other thereby refining the result at each node
as shown in white. The nodes are then up-sampled and concatenated
with transposed convolution feature maps shown in red and yellow
of the corresponding dimension. Throughout the network, we use
ReLU as the non-linearity after each layer and add L2 regularization to
guard against over-fitting during training. On all datasets, we use 80%
for the training set, 10% for the validation set, and the remaining 10%
for the testing set. We use one NVIDIA Tesla V100 GPU for training.
Since this is a segmentation network and we are interested in classifying
background and barcode pixels, we use binary cross-entropy as the loss
function.
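
As an illustration of this training setup, the sketch below shows an 80/10/10 split, binary cross-entropy on pixel logits, and L2 regularization via weight decay; the optimizer choice, learning rate, and weight decay value are our assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import random_split

def make_splits(dataset):
    """80% / 10% / 10% train / validation / test split of a dataset
    yielding (image, mask) pairs."""
    n = len(dataset)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    n_test = n - n_train - n_val
    return random_split(dataset, [n_train, n_val, n_test])

def make_training_objects(model):
    criterion = nn.BCEWithLogitsLoss()                  # binary cross-entropy on pixel logits
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=1e-3,               # assumed learning rate
                                 weight_decay=1e-4)     # L2 regularization via weight decay
    return criterion, optimizer
```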
2.3. Bounding Box Extraction
Since some images contain barcodes that are very close to each
other, their Y-Net output masks are equally close or touching, which
makes the extraction of individual barcode bounding boxes difficult,
as shown in Fig. 4(a). To separate them effectively, we perform
an erosion, contour extraction, and bounding box expansion with a
pixel correction margin. As shown in Fig. 4(b), the erosion stage
allows the algorithm to widen gaps between segmented barcodes
that may be separated by 1 or more pixels. The resulting mask is
then used to infer individual barcode bounding boxes in the contour
extraction stage in Fig. 4(c) through border following. A pixel
correction margin is used to recover the original bounding boxes’
dimensions during the expansion stage as shown in Fig. 4(d). This
post-processing stage of our pipeline has an average processing time
of 1.5 milliseconds (ms) because it is made of a set of Python ma-
trix operations to efficiently extract bounding boxes from predicted
masks.
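
The post-processing just described can be sketched with standard OpenCV operations as below; the erosion kernel size, number of iterations, and pixel correction margin are illustrative values, not the paper's.

```python
import cv2
import numpy as np

def extract_boxes(mask, erode_size=3, erode_iters=2, margin=4):
    """Sketch of the post-processing: erode the predicted mask to widen gaps
    between nearby barcodes, follow contours to get one box per blob, and
    expand each box by a pixel correction margin to recover its original extent."""
    binary = (mask > 0).astype(np.uint8)
    kernel = np.ones((erode_size, erode_size), np.uint8)
    eroded = cv2.erode(binary, kernel, iterations=erode_iters)

    contours, _ = cv2.findContours(eroded, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    H, W = binary.shape
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        boxes.append((max(x - margin, 0), max(y - margin, 0),
                      min(x + w + margin, W), min(y + h + margin, H)))
    return boxes
```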
3. DATASETS AND RESULTS
For the synthetic dataset, we use treepoem¹ and random-word² to
generate UHR and LR barcode images. We use Code 39, Code 93,
Code 128, UPC, EAN, PDF417, ITF, Data Matrix, Aztec, and QR codes,
among others. We model the number of barcodes in a given image
using a Poisson process and a combination of perspective transforms
is used to make the barcodes vary in shape and position from one im-
age to the other. We have also added random black blobs at random
locations on the original UHR and LR canvases. The real UHR barcode
dataset obtained from Amazon.com, Inc. consists of 3.8 million grayscale
UHR images with resolutions of up to 30k × 30k pixels and could not be
released for confidentiality reasons.
¹ https://github.com/adamchainz/treepoem
² https://github.com/vaibhavsingh97/random-word
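
As a rough illustration of the generation procedure, the sketch below draws a Poisson-distributed number of barcodes with treepoem and pastes them at random positions on a blank canvas. The barcode type names follow treepoem/BWIPP conventions, and the data strings, canvas size, and placement logic are our assumptions; the perspective warping and random blob insertion described above are omitted for brevity.

```python
import random
import numpy as np
import treepoem
from PIL import Image

# treepoem/BWIPP encoder names; a 12-digit numeric payload is valid for all of them
BARCODE_TYPES = ["code39", "code93", "code128", "ean13", "pdf417",
                 "datamatrix", "azteccode", "qrcode"]

def make_synthetic_image(canvas_size=(10000, 10000), mean_barcodes=4):
    """Place a Poisson-distributed number of barcodes at random positions on
    a blank grayscale canvas; returns the image and its ground truth boxes."""
    canvas = Image.new("L", canvas_size, color=255)
    boxes = []
    n = max(1, np.random.poisson(mean_barcodes))        # number of barcodes in this image
    for _ in range(n):
        data = "".join(random.choices("0123456789", k=12))
        code = treepoem.generate_barcode(
            barcode_type=random.choice(BARCODE_TYPES), data=data).convert("L")
        x = random.randint(0, canvas_size[0] - code.width)
        y = random.randint(0, canvas_size[1] - code.height)
        canvas.paste(code, (x, y))
        boxes.append((x, y, x + code.width, y + code.height))
    return canvas, boxes
```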
                  mAP    AP50   AP75   mAP      mAP       AR50   AR70   AR80   AR90   Latency  Resolution
                  (all)  (all)  (all)  (small)  (medium)  (all)  (all)  (all)  (all)  (ms)     (px)
Mask R-CNN [10]   .466   .985   .317   .340     .489      .990   .740   .279   .023   94.8     448 × 448
YOLOv4 [11]       .882   .990   .989   .815     .897      1.     1.     .995   .873   40.5     320 × 320
Ours              .937   .990   .990   .903     .945      1.     1.     1.     .972   16.0     400 × 400
Table 1. Average Precision for Max Detection of 100 and Average Recall for Max Detection of 10 computed using MS COCO API.
                     Muenster Dataset                  ArTe-Lab Dataset
                     DR     Precision  Recall  mIoU    DR     Precision  Recall  mIoU
Creusot et al. [5]   .982   -          -       -       .989   -          -       -
Hansen et al. [9]    .991   -          -       .873    .926   -          -       .816
Namane et al. [1]    .966   -          -       .882    .930   -          -       .860
Zharkov et al. [12]  .980   .777       .990    .842    .989   .814       .995    .819
Ours                 1.     .984       1.      .921    1.     .974       1.      .934

Table 2. Mean IoU (mIoU), Precision, Recall, and Detection Rate (DR) at an IoU threshold of 0.5 on the Muenster and ArTe-Lab datasets.
                  Px Acc   Px mIoU   Px Prec   Px Rec
Mask R-CNN [10]   .993     .990      .989      .890
Ours              1.       1.        .999      .999

Table 3. Pixel-wise metrics.
Additionally, the Muenster and ArTe-Lab datasets are used with some
data augmentation schemes to obtain more samples.
For the RPN, we count the number of ground truth bounding boxes
that fall inside the proposed regions and divide it by the total number
of ground truth bounding boxes. Our implementation yields an accuracy
of 98.03% on the synthetic dataset at 10 ms per image and 96.8% on the
real dataset at 13 ms per image, while the baseline [13] yields the same
accuracies with an average latency of over 2.5 seconds (s) per image on
both datasets.
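
In code, the region-coverage accuracy just described amounts to the following; treating a ground truth box as covered when it lies entirely inside some proposed region is our interpretation.

```python
def rpn_coverage_accuracy(gt_boxes, proposed_regions):
    """Fraction of ground truth boxes (x1, y1, x2, y2) that fall entirely
    inside at least one proposed region."""
    if not gt_boxes:
        return 1.0                                   # nothing to cover

    def inside(box, region):
        bx1, by1, bx2, by2 = box
        rx1, ry1, rx2, ry2 = region
        return rx1 <= bx1 and ry1 <= by1 and bx2 <= rx2 and by2 <= ry2

    covered = sum(any(inside(b, r) for r in proposed_regions) for b in gt_boxes)
    return covered / len(gt_boxes)
```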
For Y-Net, we use the Microsoft (MS) COCO API and pixel-wise
metrics to evaluate against [10, 11]. By default, the MS COCO
API configuration evaluates objects of small, medium, and large areas,
but in our application the largest detected barcode area is medium.
Since Y-Net is a segmentation network and does not output confidence
scores for each segmented barcode, we propose
using pseudo scores, the ratio of the total number of nonzero pixels
in a predicted mask to the total number of nonzero pixels in the
corresponding ground truth mask at the location of a given object.
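
A minimal sketch of this pseudo score for a single object is given below; evaluating both masks inside the object's ground truth bounding box and clipping the ratio to 1 are our interpretation.

```python
import numpy as np

def pseudo_score(pred_mask, gt_mask, box):
    """Pseudo confidence score for one object: ratio of nonzero predicted
    pixels to nonzero ground truth pixels at the object's location."""
    x1, y1, x2, y2 = box
    pred = np.count_nonzero(pred_mask[y1:y2, x1:x2])
    gt = np.count_nonzero(gt_mask[y1:y2, x1:x2])
    return min(pred / gt, 1.0) if gt > 0 else 0.0    # clipped to [0, 1] (our choice)
```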
Table 1 shows the mAP and mAR values of the models on the synthetic
dataset. As seen, our pipeline outperforms [10] and [11] by a mAP of
47.1% and 5.5%, and an AP75 of 67.3% and 0.1%, respectively. Also shown
in Table 1 is a mAR90 improvement of 94.9% and 9.9% over [10] and [11]
respectively, which highlights that Y-Net continues to yield better mAR
results even at higher IoU thresholds. Both our approach and [11] achieve
an AR50 of 100% and outperform [10] by 1%. For small area barcodes, Y-Net
outperforms [10] and [11] by a mAP of 56.3% and 8.8%, and for medium area
barcodes, Y-Net displays a mAP increase of 45.6% and 4.8% over [10] and
[11] respectively. In addition, Table 3 reveals that Y-Net has a much
better semantic segmentation performance than [10]. Table 1 also shows
that Y-Net runs at least 2.5× faster than the faster of [10] and [11] on
LR images.
Similarly, we have used the Detection Rate (DR), mIoU, Precision,
and Recall, as described in [1, 5, 9, 12], on the ArTe-Lab and
Muenster datasets, and as can be seen in Table 2, our method outper-
forms previous works on all of the mentioned metrics. This indicates
that our bounding box extraction algorithm is working as expected
to detect accurate bounding boxes. However, while it is successful
in separating barcodes that are relatively close to each other, it has
limitations when barcodes are overlapping as shown in Fig. 4(e). For
those occlusion scenarios, the algorithm tends to group the overlap-
ping barcodes into one bounding box instead of separate bounding
boxes as shown in Fig. 4(f).
Fig. 4. (a) Y-Net output; (b) Y-Net output after erosion; (c) extracted
bounding boxes (red) and ground truth bounding boxes (green) on the eroded
output; (d) final bounding boxes after the pixel correction margin on the
Y-Net output; (e) Y-Net output for occluded barcode scenarios; (f) final
extracted bounding boxes are grouped after the pixel correction margin
due to overlapping barcodes in the input image.
4. CONCLUSION
In this paper, we showed that barcodes can be detected efficiently and
accurately in UHR images using Y-Net. With pseudo scores as confidence
scores, our approach outperforms existing detection pipelines with much
lower latency. In future work, we aim
to extend this method to the multi-class detection task for small ob-
jects in UHR images and videos in a weakly supervised fashion.
5. REFERENCES
[1] A. Namane and M. Arezki, “Fast real time 1d barcode detec-
tion from webcam images using the bars detection method,”
in Proceedings of the World Congress on Engineering (WCE),
2017, vol. 1.
[2] L. Hock, H. Hanaizumi, and E. Ohbuchi, "Barcode readers using
the camera device in mobile phones," in 2004 International Conference
on Cyberworlds, 2004, pp. 260–265, IEEE Computer Society.
[3] M. Katona and L. G. Nyúl, "Efficient 1d and 2d barcode detection
using mathematical morphology," in Mathematical Morphology and Its
Applications to Signal and Image Processing, 2013, pp. 464–475,
Springer Berlin Heidelberg.
[4] G. Sörös and C. Flörkemeier, "Blur-resistant joint 1d and 2d
barcode localization for smartphones," in Proceedings of the 12th
International Conference on Mobile and Ubiquitous Multimedia,
MUM '13, 2013, Association for Computing Machinery.
[5] C. Creusot and A. Munawar, “Low-computation egocentric
barcode detector for the blind,” in 2016 IEEE International
Conference on Image Processing (ICIP), 2016, pp. 2856–2860.
[6] C. Creusot and A. Munawar, "Real-time barcode detection in
the wild," IEEE Winter Conference on Applications of Computer
Vision, pp. 239–245, 2015.
[7] O. Gallo and R. Manduchi, "Reading 1d barcodes with mobile
phones using deformable templates," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 33, pp. 1834–1843, 2011.
[8] A. Zamberletti, I. Gallo, M. Carullo, and E. Binaghi, "Neural
image restoration for decoding 1-d barcodes using common camera
phones," 2010, vol. 1, pp. 5–11.
[9] D. K. Hansen, K. Nasrollahi, C. B. Rasmussen, and T. B. Moeslund,
"Real-time barcode detection and classification using deep learning,"
IJCCI, vol. 1, pp. 321–327, 2017.
[10] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN,"
in 2017 IEEE International Conference on Computer Vision (ICCV),
2017, pp. 2980–2988.
[11] A. Bochkovskiy, C. Wang, and H. M. Liao, “YOLOv4: Opti-
mal speed and accuracy of object detection,” 2020.
[12] A. Zharkov and I. Zagaynov, "Universal barcode detector via
semantic segmentation," 2019 International Conference on Document
Analysis and Recognition (ICDAR), pp. 837–843, 2019.
[13] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN:
Towards real-time object detection with region proposal net-
works,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 39, no. 6, pp. 1137–1149, 2017.
[14] S. Wachenfeld, S. Terlunen, and X. Jiang, Robust 1-D Barcode
Recognition on Camera Phones and Mobile Product Information Display,
pp. 53–69, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010.
[15] M. Holschneider, R. Kronland-Martinet, J. Morlet, and Ph.
Tchamitchian, "A real-time algorithm for signal analysis with the help
of the wavelet transform," in Wavelets: Time-Frequency Methods and
Phase Space, Proceedings of the International Conference, 1987.
In this paper we present a robust algorithm for the recognition of 1-D barcodes using camera phones. The recognition algorithm is highly robust regarding the typical image distortions and was tested on a database of barcode images, which covers typical distortions, such as inhomogeneous illumination, reflections, or blurs due to camera movement. We present results from experiments with over 1,000 images from this database using a MATLAB implementation of our algorithm, as well as experiments on the go, where a Symbian C++ implementation running on a camera phone is used to recognize barcodes in daily life situations. The proposed algorithm shows a close to 100% accuracy in real life situations and yields a very good resolution dependent performance on our database, ranging from 90.5% (640 ×480) up to 99.2% (2592 ×1944). The database is freely available for other researchers. Further we shortly present MobilePID, an application for mobile product information display on web-enabled camera phones. MobilePID uses product information services on the internet or locally stored on-device data.