Research Article

High-precision robotic assembly system using three-dimensional vision

Shaohua Yan¹,², Xian Tao¹,² and De Xu¹,²
Abstract
The design of a high-precision robot assembly system is a great challenge. In this article, a robotic assembly system is developed to assemble two components with six degrees of freedom in three-dimensional space. It consists of two manipulators and a structured light camera, which is mounted on the end-effector beside component A to measure the pose of component B. Firstly, the features of irregular components are extracted based on U-NET network training with few labeled images. Secondly, an algorithm is proposed to calculate the pose of component B based on the image features and the corresponding three-dimensional coordinates on its ellipse surface. Thirdly, six errors, including two position errors and one orientation error in image space, and one position error and two orientation errors in Cartesian space, are computed to control the motions of component A to align with component B. The hybrid visual servoing method is used in the control system. The experimental results verify the effectiveness of the designed system.
Keywords
3D vision, feature extraction, pose estimation, hybrid visual servoing, robotic assembly system
Date received: 15 February 2021; accepted: 01 June 2021
Topic Area: Vision Systems
Topic Editor: Antonio Fernandez-Caballero
Associate Editor: Grazia Cicirelli
Introduction
With the development of technology, the demand for high-
precision assembly in industrial manufacturing and space
exploration is increasing.
1–3
Industrial assembly devices are
generally divided into two categories. One is the specific
translation and rotation mechanism.
4,5
For example, Luo
et al.
4
used a linear drive mechanism for precision threading
operations. The translation error and rotation error of the
platform reached 3 mm and 0.005, respectively. Yu et al.
5
used the feature constraint relationship between components
to control translation and rotation devices completing com-
ponent assembly simulation. However, the working range of
specific translation and rotation mechanisms is small, and its
flexibility is low. The other is based on a general manipu-
lator.
6,7
For example, Wang et al.
6
added an elastic displace-
ment device to the manipulator to achieve peg-in-hole
assembly, which improved the success rate of each
assembly. Meng et al.
7
realized precise robot assembly for
large-scale spacecraft components based on computer-aided
design models of aircraft components and key geometric
features located by ranging sensors and binocular vision.
Generally, a manipulator has six degree-of-freedoms
(DOFs). Therefore, it is very helpful for manipulator-based
assembly systems to realize high-precision assembly with
six DOFs in three-dimensional (3D) space.
In the robotic assembly system, the target pose is usually
measured with vision-based methods.
8,9
Generally, the
1 Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China

Corresponding author: Xian Tao, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100080, China. Email: taoxian2013@ia.ac.cn
International Journal of Advanced Robotic Systems
May-June 2021: 1–12
© The Author(s) 2021
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/17298814211027029
journals.sagepub.com/home/arx
Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Generally, the point feature, line feature, and circle feature are employed in the pose estimation methods. For example, in Liu et al.,10 the end position of the dispensing needle was obtained through the point feature and then the precision dispensing operation was completed. In Liu et al.,11 line features were used to measure the pose of a long cylindrical component. In Liu and Xu,12 a fast and effective circle detection algorithm was proposed for target position estimation. However, the above pose estimation methods require two or three cameras placed in different directions. Sun et al.13 measured the target pose with a camera based on the projection relationship between the circle and the ellipse, but the accuracy of this method relies on the ellipse fitting. Another kind of target pose measurement method, based on the structured light camera, is becoming popular.14–16 For example, Kim et al.14 accurately estimated the surface normal vector of the target based on a structured light camera and then completed the object-grasping task. In Satorres et al.,15 the relative position relationship between the manipulator and the object was obtained from the 3D data of the 3D camera. Litvak et al.16 assembled randomly distributed components based on a depth camera and a convolutional neural network, and the success rate reached 91%. Therefore, pose measurement based on a structured light camera is a better choice.
Visual servoing methods are very popular in many applications including automatic assembly systems.17,18 They are classified as image-based visual servoing,19 position-based visual servoing,20 and hybrid visual servoing methods.21 Xu et al.19 proposed an image-based visual servoing method, in which point features and line features are used for position control and attitude control, respectively. Image-based visual servoing has certain robustness to camera calibration errors and robot model errors. Through comparative experiments on position-based and image-based visual servoing systems, Peng et al.20 found that position-based visual servoing has a faster convergence speed. What's more, some advanced control methods for tracking control of mechanical servo systems help improve convergence speed. For example, Deng and Yao22 designed a high-performance tracking controller without velocity measurement for electrohydraulic servomechanisms, which achieves asymptotic tracking performance when facing time-invariant modeling uncertainties. Aiming at mechanical servo systems with mismatched uncertainties, Deng and Yao23 proposed a novel recursive robust integral of the sign of the error control method, which achieves excellent asymptotic tracking performance. Therefore, it is necessary to combine the advantages of image-based visual servoing and position-based visual servoing methods to realize the precision assembly of two components.
The purpose of this article is to achieve precise assembly of irregular components. A robotic assembly system is developed to assemble two components with six DOFs in 3D space, which consists of two manipulators and a structured light camera. Image-space information and 3D-space information acquired by the structured light camera are effectively combined to measure the pose of component B. Considering the advantages of image-based and position-based visual servoing methods, this article proposes a hybrid visual servoing method with higher convergence speed and accuracy. The manipulators can control components with different initial positions and postures for automatic assembly. The main contributions of this article are as follows:
1. A robotic assembly system with two manipulators is
developed to assemble two components with six
DOFs in 3D space. The hybrid visual servoing
method combining errors in Cartesian space and
image space is used in the control system.
2. A feature extraction algorithm for the images of
irregular components is proposed, which is based
on U-NET network training with few labeled images.
3. The pose of component B is calculated from the
image features and the corresponding 3D coordi-
nates on its ellipse surface.
The rest of this article is organized as follows. The first section describes the assembly task and system. Then, an image feature extraction and pose measurement method is proposed, followed by a hybrid visual servoing method to align the two components. The details of the automated assembly process are also introduced. The experiments and results with the proposed assembly method are given. Finally, this article is concluded.
Assembly task and system
Assembly task
The two components to be assembled are shown in Figure 1.
They are metal connectors with an outer diameter of about
43 mm, which are divided into component A and compo-
nent B. As shown in Figure 1(a), the left side is component
A and the right side is component B. There are five groove
areas on the inner side of component B, as shown in
Figure 1(b). The positions and sizes of the grooves are
unevenly distributed. Correspondingly, there are five pro-
truding areas on the upper surface of component A, as
shown in Figure 1(c).
When assembling, it is necessary to align the groove areas of component B to the protruding areas of component A with six DOFs, including the 3D position and the orientation angles about three axes. Our task is to realize the precise assembly of these two components.
Assembly system
The automated precision assembly system is designed as
given in Figure 2. Manipulator 1 is a seven-DOF robot with
a clamping device and component A connected to it. A
structured light camera is fixed at the end of manipulator 1.
Manipulator 2 is a universal robot (UR3) with a gripping
device and component B connected to it.
Manipulator 1 can translate along and rotate around the X, Y, and Z axes to align component A to component B. The
poses of manipulator 1 and manipulator 2 can be adjusted
to initialize the pose of component B in the structured light
camera. The computer can control the entire assembly pro-
cess including image capture with the camera, image pro-
cessing, feature extraction, pose estimation, and alignment
and insertion of the two components.
The coordinate frames are established as shown in Figure 2. O_R1-X_R1Y_R1Z_R1 is the base frame of manipulator 1, O_R2-X_R2Y_R2Z_R2 is the base frame of manipulator 2, O_D-X_DY_DZ_D is the end-effector frame of manipulator 2, O_C-X_CY_CZ_C is the camera frame, and O_F-X_FY_FZ_F is the end-effector frame of manipulator 1. The camera is carefully adjusted so that the axes of the camera frame are as parallel as possible to the axes of the end-effector frame of manipulator 1.
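Because the camera axes are only approximately parallel to the axes of the end-effector frame of manipulator 1, measurements expressed in the camera frame can, in general, be mapped into the end-effector frame by a fixed hand-eye rotation. The short Python sketch below illustrates this mapping with a hypothetical calibration matrix R_FC; with the careful mounting described above, this matrix is close to the identity, so the article can apply camera-frame errors almost directly.

```python
# Sketch: map a position error measured in the camera frame O_C into the
# end-effector frame O_F of manipulator 1 (hypothetical hand-eye rotation R_FC).
import numpy as np

# With the near-parallel mounting described above, R_FC is close to the identity.
R_FC = np.eye(3)  # replace with the calibrated rotation from frame C to frame F

def camera_error_to_end_effector(err_cam):
    """err_cam: 3-vector position error expressed in the camera frame (mm)."""
    return R_FC @ np.asarray(err_cam)

print(camera_error_to_end_effector([1.5, -0.8, 12.0]))
```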
Image feature extraction
Elliptic ring region extraction
Figure 3 shows the image of component B captured by the structured light camera. To get the current pose of component B in the camera frame, its inherent features such as the ring circles should be extracted. As shown in Figure 3(a), there is noise in the gray image of component B, which disturbs the edges.

Detecting the ring contour of the component through edge detection and ellipse fitting therefore produces large errors. Another option is to obtain the ring area via threshold segmentation. But the gray value of the ring area is not evenly distributed due to the influence of light, so it is difficult to accurately segment the ring area with threshold segmentation.

Therefore, this article uses data labeling and deep learning to solve the problem of inaccurate feature extraction. As shown in Figure 3(b), the elliptical ring area on the surface of component B is marked; the outside of the ring is an ellipse, and the inside is an ellipse containing the edges of the grooves. A U-NET network is designed, and its structure diagram is shown in Figure 4. It includes a contraction path for capturing semantics and an asymmetrical expansion path for precise positioning. The contraction path consists of four convolutional layers and pooling layers for down-sampling, and the expansion path consists of four deconvolutional layers and convolutional layers for up-sampling. This U-NET network is trained with the labeled data. Then it is used to segment the ring area from the image of component B. As shown in Figure 3(c), the elliptical ring area containing the groove information is accurately extracted.
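To make the segmentation step concrete, the following is a minimal sketch of a four-level U-Net-style encoder-decoder in PyTorch, in the spirit of the network in Figure 4. The channel widths, loss choice, and training details are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal U-Net-style segmentation network (sketch, not the authors' exact model).
# Assumes 512 x 512 single-channel input images and a binary ring-area mask.
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, as used at every U-Net level.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class RingUNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1, widths=(16, 32, 64, 128, 256)):
        super().__init__()
        # Contraction path: four down-sampling steps (conv + max-pool).
        self.enc = nn.ModuleList()
        prev = in_ch
        for w in widths[:-1]:
            self.enc.append(double_conv(prev, w))
            prev = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(widths[-2], widths[-1])
        # Expansion path: four up-sampling steps (deconv + conv), with skip connections.
        self.up = nn.ModuleList()
        self.dec = nn.ModuleList()
        for w_skip, w_in in zip(reversed(widths[:-1]), reversed(widths[1:])):
            self.up.append(nn.ConvTranspose2d(w_in, w_skip, 2, stride=2))
            self.dec.append(double_conv(w_skip * 2, w_skip))
        self.head = nn.Conv2d(widths[0], out_ch, 1)  # per-pixel ring/background logit

    def forward(self, x):
        skips = []
        for enc in self.enc:
            x = enc(x)
            skips.append(x)
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([skip, x], dim=1))
        return self.head(x)

# Usage sketch: train on the few labeled (image, ring-mask) pairs plus augmentations.
if __name__ == "__main__":
    net = RingUNet()
    dummy = torch.randn(1, 1, 512, 512)           # resized gray image of component B
    logits = net(dummy)                           # (1, 1, 512, 512) ring-area logits
    mask = (torch.sigmoid(logits) > 0.5).float()  # segmented elliptical ring area
    print(mask.shape)
```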
Groove feature extraction
General methods cannot effectively detect the groove features on the ring. Therefore, the inner and outer ellipses are combined to detect the groove features.

As shown in Figure 5(a), the contour of the elliptical ring area is output by the U-NET network. The two contours containing the most edge points on the inner and outer sides are considered as the inner ellipse and the outer ellipse of the ring. Then the least-squares method is used to fit the inner ellipse parametric equations (1) and the outer ellipse parametric equations (2), respectively. The ellipse fitting result is shown in Figure 5(b)
Figure 1. Components: (a) components A and B, (b) component A and its surface structure, and (c) component B and its surface
structure.
Figure 2. Assembly system configuration.
u_in = u_0 + (a_in / 2) cos θ_0 cos θ − (b_in / 2) sin θ_0 sin θ
v_in = v_0 + (a_in / 2) sin θ_0 cos θ + (b_in / 2) cos θ_0 sin θ    (1)

where (u_0, v_0) is the pixel coordinate of the center point of the ellipse, (u_in, v_in) is the pixel coordinate of a point on the inner ellipse, a_in and b_in are the long and short axis lengths of the inner ellipse, θ_0 represents the initial angle of the ellipse, and θ ∈ [0, 2π) is the parameter variable.

u_out = u_0 + (a_out / 2) cos θ_0 cos θ − (b_out / 2) sin θ_0 sin θ
v_out = v_0 + (a_out / 2) sin θ_0 cos θ + (b_out / 2) cos θ_0 sin θ    (2)

where (u_out, v_out) is the pixel coordinate of a point on the outer ellipse, and a_out and b_out are the long and short axis lengths of the outer ellipse.
According to the inner and outer ellipse equations, the parametric equations (3) of a similar ellipse passing through the groove area are obtained

u_e = u_0 + ((a_in + k(a_out − a_in)) / 2) cos θ_0 cos θ − ((b_in + k(b_out − b_in)) / 2) sin θ_0 sin θ
v_e = v_0 + ((a_in + k(a_out − a_in)) / 2) sin θ_0 cos θ + ((b_in + k(b_out − b_in)) / 2) cos θ_0 sin θ    (3)
Figure 3. Image of component B: (a) original image, (b) image with manually marked ring area, and (c) image with segmented ring area.
In (b) and (c), the ring area is indicated with red color.
Figure 4. U-NET network structure diagram.
Figure 5. Groove feature extraction process: (a) contours
detection, (b) ellipse fitting, (c) groove feature extraction, and (d)
image with extracted groove features.
where (u_e, v_e) is the pixel coordinate of a point on the similar ellipse and k ∈ [0, 1) is a coefficient; the closer k is to 1, the closer the similar ellipse is to the outer ellipse.

The parameter angle θ in the similar ellipse equation (3) is gradually increased to find the continuous points along the similar ellipse whose pixel values are significantly different from those of the ring area. The corresponding parameter angle set (θ_11, θ_12, ..., θ_1k) is recorded. After traversing θ ∈ [0, 2π), we can get five parameter angle sets

{(θ_11, θ_12, ..., θ_1k1), (θ_21, θ_22, ..., θ_2k2), ..., (θ_51, θ_52, ..., θ_5k5)}    (4)

Feature extraction results via searching along similar ellipses are shown in Figure 5(c). Finally, the average (θ_1, θ_2, θ_3, θ_4, θ_5) of the five parameter angle sets is considered as the angle of each groove. The results of feature extraction on the original image are shown in Figure 5(d).
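As an illustration of the search in equations (1) to (4), the following Python sketch fits the inner and outer ellipses from the U-NET contour output and scans a similar ellipse for groove angles. The function names, the intensity threshold, and the gap used to group consecutive samples are illustrative assumptions rather than the authors' exact implementation.

```python
# Sketch of groove-angle extraction along a "similar ellipse" (assumed parameters).
import cv2
import numpy as np

def fit_ring_ellipses(ring_mask):
    """Fit inner and outer ellipses to the segmented ring area (binary mask)."""
    contours, _ = cv2.findContours(ring_mask.astype(np.uint8),
                                   cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
    # The two contours with most edge points are taken as inner and outer boundaries.
    contours = sorted(contours, key=len, reverse=True)[:2]
    ellipses = [cv2.fitEllipse(c) for c in contours]   # ((u0, v0), (a, b), angle_deg)
    ellipses.sort(key=lambda e: e[1][0] + e[1][1])     # smaller axes -> inner ellipse
    return ellipses[0], ellipses[1]                    # inner, outer

def groove_angles(gray, inner, outer, k=0.25, thresh=40, n=3600):
    """Scan the similar ellipse of equation (3) and return the groove angles (deg)."""
    (u0, v0), (a_in, b_in), ang = inner
    (_, _), (a_out, b_out), _ = outer
    a = a_in + k * (a_out - a_in)                      # interpolated long axis length
    b = b_in + k * (b_out - b_in)                      # interpolated short axis length
    t0 = np.deg2rad(ang)                               # ellipse rotation (initial angle)
    theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    u = u0 + a / 2 * np.cos(t0) * np.cos(theta) - b / 2 * np.sin(t0) * np.sin(theta)
    v = v0 + a / 2 * np.sin(t0) * np.cos(theta) + b / 2 * np.cos(t0) * np.sin(theta)
    vals = gray[np.clip(v.astype(int), 0, gray.shape[0] - 1),
                np.clip(u.astype(int), 0, gray.shape[1] - 1)]
    ring_gray = np.median(vals)                        # reference gray level of the ring
    is_groove = np.abs(vals.astype(float) - ring_gray) > thresh
    # Group consecutive groove samples into angle sets and average each set.
    idx = np.flatnonzero(is_groove)
    sets, current = [], [idx[0]] if idx.size else []
    for i in idx[1:]:
        if i - current[-1] <= 2:
            current.append(i)
        else:
            sets.append(current); current = [i]
    if current:
        sets.append(current)
    return [float(np.rad2deg(theta[s]).mean()) for s in sets]
```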
Automatic assembly
Automatic assembly is divided into three parts, namely the
desired image capture stage, the camera alignment stage,
and the component insertion stage. The whole assembly
process is given in Figure 6.
The desired image capture stage mainly obtains the desired image and the displacement of the manipulator between the alignment and insertion positions via one manually controlled assembly. The desired image features are extracted from the desired image. During the camera alignment stage, the features of component B in image space and Cartesian space are acquired, and a hybrid visual servoing control method is designed for precise alignment. In the component insertion stage, component A is translated by the displacements D_1 and D_2, and then component A is inserted into component B.
Desired image capture stage
As shown in stage A in Figure 6, manipulator 1 is manually controlled to complete one assembly. Manipulator 1 is then translated by the given displacement D_1 along the z-axis in its end-effector frame to move component A away from component B. Manipulator 1 is translated along the x-axis in its end-effector frame until the camera can capture the image of component B. The displacement along the x-axis is recorded as D_2. This state is called the camera alignment state.
The images captured in the camera alignment state are considered as the desired image. The elliptical ring area containing the grooves of the desired image is extracted by the trained U-NET network. The image coordinates {(u_1, v_1), (u_2, v_2), ..., (u_n, v_n)} of sampled points in the elliptical ring area are obtained. Correspondingly, the 3D coordinates {(x_1, y_1, z_1), (x_2, y_2, z_2), ..., (x_n, y_n, z_n)} in the camera frame are recorded. The random sample consensus (RANSAC) algorithm is used to fit the plane (5) of the ring area of component B

a_d x + b_d y + c_d z + e_d = 0    (5)

where a_d, b_d, c_d, and e_d are the parameters of the fitting plane.
Figure 6. The program flow chart of the assembly procedure.
The desired normal vector [a_d, b_d, c_d]^T is obtained. The desired normal vector is normalized to the desired unit normal vector n_d

n_d = [a_d, b_d, c_d]^T / √(a_d² + b_d² + c_d²) = [n_dx, n_dy, n_dz]^T    (6)

The desired posture angles θ_dx and θ_dy are calculated with the desired plane unit normal vector of formula (6). Posture angle θ_mdz is an angle sequence, which contains the groove angle information. It is obtained by the above groove feature extraction algorithm

θ_dx = arcsin(n_dx)
θ_dy = arcsin(n_dy)
θ_mdz = [θ_d1, θ_d2, θ_d3, θ_d4, θ_d5]^T    (7)

The desired center point image coordinate P_d = (u_ad, v_ad) of component B is obtained through ellipse fitting. Correspondingly, the 3D coordinate P_ad = (x_ad, y_ad, z_ad) in the camera frame is read from the 3D camera.

In this way, the desired features P_d, θ_dx, and θ_dy are acquired in image space, and the desired features n_d, P_ad, and θ_mdz are acquired in Cartesian space.
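The following sketch illustrates how the desired Cartesian features of equations (5) to (7) could be computed from the sampled ring points. The RANSAC inlier tolerance and iteration count are illustrative assumptions, not values reported in the article.

```python
# Sketch: fit the ring-area plane with RANSAC and derive the desired posture angles.
import numpy as np

def ransac_plane(points, n_iter=500, inlier_tol=0.2, rng=None):
    """Fit a_d*x + b_d*y + c_d*z + e_d = 0 to Nx3 points (mm) with RANSAC."""
    rng = rng or np.random.default_rng(0)
    best_inliers = None
    for _ in range(n_iter):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        if np.linalg.norm(normal) < 1e-9:
            continue                                   # degenerate (collinear) sample
        normal = normal / np.linalg.norm(normal)
        dist = np.abs(points @ normal - normal @ p1)
        inliers = dist < inlier_tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Least-squares refinement on the inliers via SVD.
    inl = points[best_inliers]
    centroid = inl.mean(axis=0)
    _, _, vt = np.linalg.svd(inl - centroid)
    n = vt[-1]
    return np.array([n[0], n[1], n[2], -n @ centroid])  # [a_d, b_d, c_d, e_d]

def desired_posture(plane):
    """Unit normal n_d and posture angles theta_dx, theta_dy (equations (6), (7))."""
    n_d = plane[:3] / np.linalg.norm(plane[:3])
    theta_dx = np.arcsin(n_d[0])
    theta_dy = np.arcsin(n_d[1])
    return n_d, np.rad2deg(theta_dx), np.rad2deg(theta_dy)

# Usage sketch with synthetic points on a slightly tilted plane.
pts = np.random.default_rng(1).uniform(-20, 20, (500, 2))
z = 180.0 + 0.05 * pts[:, 0] - 0.02 * pts[:, 1]
plane = ransac_plane(np.column_stack([pts, z]))
print(desired_posture(plane))
```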
Camera alignment stage

The current image of component B is acquired in real time. According to the above method of feature extraction, the current features P_c = (u_ac, v_ac), θ_cx, and θ_cy are acquired from the current image, and the current features n_c = [a_c, b_c, c_c]^T, P_ac = (x_ac, y_ac, z_ac), and θ_mcz = [θ_c1, θ_c2, θ_c3, θ_c4, θ_c5]^T are acquired in Cartesian space, as described in the "Desired image capture stage" section.
A hybrid visual servoing control system is designed, in which the features from image space and Cartesian space are combined to realize the alignment between component B and the camera. The block diagram of the hybrid visual servoing automatic control system is shown in Figure 7. The pose of the end-effector of manipulator 1 is adjusted in its end-effector frame according to formula (8). The features from image space are used to control the translations along the x-axis and y-axis and the rotation around the z-axis. The features from Cartesian space are used to control the translation of the end-effector along the z-axis and the rotations around the x-axis and y-axis
[Δx, Δy, Δz, Δθ_x, Δθ_y, Δθ_z]^T = [k_1(u_ac − u_ad), k_1(v_ac − v_ad), k_2(z_ac − z_ad), k_2(θ_cx − θ_dx), k_2(θ_cy − θ_dy), k_2 Δθ_mz]^T    (8)

where k_1 and k_2 are coefficients and Δθ_mz is the best angle error calculated from θ_mcz = [θ_c1, θ_c2, θ_c3, θ_c4, θ_c5]^T and θ_mdz = [θ_d1, θ_d2, θ_d3, θ_d4, θ_d5]^T.
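A minimal sketch of the control update in formula (8) follows. How Δθ_mz is chosen (here, the wrapped mean of the per-groove angle differences) is an assumed interpretation of the "best angle error", the dictionary-based interface is hypothetical, and the gains are simply the 0.6 values reported later in the experiments; the made-up current features are for illustration only.

```python
# Sketch of one hybrid visual servoing step per formula (8) (assumed conventions).
import numpy as np

def best_groove_angle_error(theta_mcz, theta_mdz):
    """Assumed interpretation of the 'best angle error' between the two groove
    angle sequences: average wrapped difference over the five grooves (degree)."""
    diff = (np.asarray(theta_mcz) - np.asarray(theta_mdz) + 180.0) % 360.0 - 180.0
    return float(diff.mean())

def hybrid_step(current, desired, k1=0.6, k2=0.6):
    """Return [dx, dy, dz, dtheta_x, dtheta_y, dtheta_z] for the end-effector of
    manipulator 1, mixing image-space and Cartesian-space errors."""
    du = current["u_ac"] - desired["u_ad"]           # image-space errors (pixel)
    dv = current["v_ac"] - desired["v_ad"]
    dz = current["z_ac"] - desired["z_ad"]           # depth error from the 3D camera (mm)
    dtx = current["theta_cx"] - desired["theta_dx"]  # orientation errors (degree)
    dty = current["theta_cy"] - desired["theta_dy"]
    dtz = best_groove_angle_error(current["theta_mcz"], desired["theta_mdz"])
    return np.array([k1 * du, k1 * dv, k2 * dz, k2 * dtx, k2 * dty, k2 * dtz])

# Usage sketch with the desired features of Table 1 and made-up current features.
desired = dict(u_ad=627.08, v_ad=1233.53, z_ad=180.04, theta_dx=2.72, theta_dy=-0.86,
               theta_mdz=[10.26, 39.06, 138.60, 174.96, 270.54])
current = dict(u_ac=650.0, v_ac=1200.0, z_ac=195.0, theta_cx=4.1, theta_cy=0.4,
               theta_mcz=[12.1, 41.0, 140.3, 176.7, 272.4])
print(hybrid_step(current, desired))
```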
As shown in stage B in Figure 6, the camera alignment state is achieved after hybrid visual servoing control. At this point, the errors between the current pose and the desired pose approach 0. The displacement between component A and component B along the z-axis in the end-effector frame of manipulator 1 is D_1. The displacement between component A and component B along the x-axis in the end-effector frame of manipulator 1 is D_2.
Component insertion stage

In the component insertion stage, component alignment and component insertion are completed. At first, as shown in stage C in Figure 6, component A is translated by the displacement (D_1 − d) along the z-axis and the displacement D_2 along the x-axis in the end-effector frame of manipulator 1, where d is a small displacement. After ensuring the safety of assembly, component A is translated by the displacement d along the z-axis in the end-effector frame of manipulator 1. Then component A is inserted into component B. The entire assembly is completed precisely and efficiently.
Figure 7. Block diagram of automatic control system.
Experiments and results
Experiment system
An experiment system was established according to the
scheme given in the “Assembly System” section, as shown
in Figure 8. In this experiment system, there were two
manipulators including one seven-DOF robotic arm and
one six-DOF manipulator. Manipulator 1 had a clamping
device and component A connected to it. Manipulator 2
was a UR3 (universal robots company) manipulator with
a gripping device and component B connected to it. A
structured light camera was fixed at the end of manipulator
1. The structured light camera was an LMI Gocator 3210 (LMI Technologies) binocular snapshot sensor. The resolution of the camera in the x-axis and y-axis directions is 60–90 μm, the field of view is 71 × 98 mm to 100 × 154 mm, and the working distance is 164 mm.
Figure 8. Experiment system.
Figure 9. Feature extraction results of images at different angles and distances: (a) image after rotating around the positive directions of
the x-axis, (b) image after rotating around the negative directions of the x-axis, (c) image after rotating around the positive directions
of the y-axis, (d) image after rotating around the negative directions of the y-axis, (e) image after translating along the positive directions
of the z-axis, and (f) image after translating along the negative directions of the z-axis.
U-Net network and feature extraction results
The training set for the U-NET network consisted of 60
images with different angles and distances and 600 images
generated by data augmentation. Each image was a gray
image obtained by the structured light camera in an actual
environment. The size of the original images was 1251 × 1925 pixels, and they were resized to 512 × 512 pixels when training the U-NET network.
New images at different angles and distances were input into the trained U-NET network for testing. The feature extraction experiments with the method described in the "Groove feature extraction" section were conducted. The
extracted features for the images captured at different angles
and distances are shown in Figure 9. Figure 9(a) and (b) are
the feature extraction results of the image after rotating
around the positive and negative directions of the x-axis,
respectively. Figure 9(c) and (d) are the feature extraction
results of the image after rotating around the positive and
negative directions of the y-axis, respectively. Figure 9(e)
and (f) are the feature extraction results of the image after
translating along the positive and negative directions of the
z-axis, respectively. It can be seen from Figure 9 that the five
grooves on the ring area are all accurately extracted.
Automatic assembly
Before the assembly experiment, the desired features of component B had been obtained by the methods in the "Groove feature extraction" and "Desired image capture stage" sections. The coefficient k of the similar ellipse was equal to 0.25. The prior information obtained in the desired image capture stage is presented in Table 1.

In the assembly experiments, the poses of component A and component B were initialized randomly within a certain range, and the structured light camera obtained the current image of component B in real time. The current features of component B were obtained by the methods in the "Groove feature extraction" and "Desired image capture stage" sections. The errors between the current features and the desired features were used as the input of the hybrid visual servoing system. The coefficients k_1 and k_2 in the hybrid visual servoing system were both set to 0.6. The error curves of component B between the current pose and the desired pose are shown in Figure 10. It can be seen that after about eight steps, the position error and orientation error approached 0.
The trajectory of component B in image space during the assembly process is shown in Figure 11(a). It can be seen that the center point image coordinates of component B gradually approached the desired center point image coordinates. The trajectory of component B in Cartesian space during the assembly process is shown in Figure 11(b). It can be seen that the center point 3D coordinates of component B gradually approached the desired center point 3D coordinates.
Table 1. Prior information.

Desired center point image coordinates P_d (pixel): (627.08, 1233.53)
3D coordinates of desired center point P_ad (mm): (0.17, 21.72, 180.04)
Desired attitude angle θ_dx (°): 2.72
Desired attitude angle θ_dy (°): −0.86
Groove angles θ_mdz (°): (10.26, 39.06, 138.60, 174.96, 270.54)
Displacement D_1 of the evacuation (mm): 180
Displacement D_2 of the captured image (mm): 135
Figure 10. Error curves with the proposed method: (a) position error of component B and (b) orientation error of component A.
The actual scenes of the desired image capture stage are shown in Figure 12. As shown in Figure 12(a), the manipulator was manually controlled to complete one assembly. Manipulator 1 was translated by the given displacement D_1 along the z-axis in its end-effector frame to move component A away from component B. Manipulator 1 was translated along the x-axis in its end-effector frame until the camera could capture the image of component B. The displacement along the x-axis was recorded as D_2.
After the desired features had been obtained, we initialized the poses of component A and component B, as shown in Figure 13(a). As shown in Figure 13(b), the camera alignment state was achieved after hybrid visual servoing control.

As shown in Figure 14(a), after component A had moved up by D_2, it was aligned with component B. Then component A was translated by the displacement (D_1 − d) along the z-axis in the end-effector frame of manipulator 1, where d was equal to 3 mm. After ensuring the safety of assembly, component A was translated by the displacement d along the z-axis in the end-effector frame of manipulator 1. Then component A was inserted into component B, as shown in Figure 14(b).
Figure 11. The trajectory of component B in assembly: (a) trajectory in image space and (b) trajectory in Cartesian space.
Figure 12. Desired image capture stage: (a) the direction of movement of the end-effector of the manipulator and (b) the displacement D_1 of the evacuation.
Figure 13. The camera alignment stage: (a) initial state and (b)
camera alignment state.
The total time cost of one assembly was about 18 s: camera alignment took about 16 s and component insertion took about 2 s. Fifty assembly experiments were conducted, and all were successful. It can be found that the alignment and insertion achieved good results.
Comparative experiments
The position-based method in ref. 20 was selected as the comparative method. The position-based visual servoing control was realized according to formula (9), and the features were all from Cartesian space
[Δx, Δy, Δz, Δθ_x, Δθ_y, Δθ_z]^T = [k_a(x_ac − x_ad), k_a(y_ac − y_ad), k_a(z_ac − z_ad), k_b(θ_cx − θ_dx), k_b(θ_cy − θ_dy), k_b Δθ_z]^T    (9)

where the difference from formula (8) was that x_ac, x_ad, y_ac, and y_ad were obtained by directly reading the 3D coordinates of the desired point and the current point in the camera, and Δθ_z was calculated from the 3D coordinates.
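For comparison with the hybrid law above, a corresponding sketch of one position-based step per formula (9) is given below; as in the previous sketch, the naming, the dictionary interface, and the plain proportional update are assumptions for illustration.

```python
# Sketch of one position-based visual servoing step per formula (9).
import numpy as np

def pbvs_step(current, desired, ka=0.6, kb=0.6):
    """All errors come from Cartesian-space (3D camera) measurements."""
    dpos = np.array([current["x_ac"] - desired["x_ad"],
                     current["y_ac"] - desired["y_ad"],
                     current["z_ac"] - desired["z_ad"]])          # mm
    dang = np.array([current["theta_cx"] - desired["theta_dx"],
                     current["theta_cy"] - desired["theta_dy"],
                     current["dtheta_z"]])                        # degree
    return np.concatenate([ka * dpos, kb * dang])
```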
The coefficients k_a and k_b in equation (9) were both set to 0.6. A series of comparative experiments were conducted. Component A was also well aligned with component B in orientation and position and was successfully inserted into component B to form an assembled component with the method in ref. 20. In one experiment with the comparative method, the error curves of component B between the current pose and the desired pose are shown in Figure 15. It can be seen that after about 10 steps, the position error and orientation error approached 0. The error curves of the comparative method oscillate more times, and our method has a faster convergence speed.

The errors and steps of eight groups of comparative experiments in orientation alignment and position alignment are listed in Table 2. It can be found that the errors of our method are in a smaller range. Because the method in ref. 20 can suddenly produce a large error in a certain dimension, our proposed method is steadier.
Conclusions
A robotic assembly system with two manipulators is
designed to assemble two components with six DOFs in
3D space. A feature extraction algorithm for the images of
components is designed with the U-NET network. A hybrid
visual servoing method combining the errors in image
Figure 14. The component insertion stage: (a) translation D_2 and (b) component insertion state.
Figure 15. Error curves with the comparative method: (a) position error of component B and (b) orientation error of component A.
space and Cartesian space is proposed. Three DOFs are
controlled in image space, which are the center’s position
on the image plane and the rotation of component B around
the z-axis. The other three DOFs are controlled in Cartesian
space, which are the depth and the rotations around the x-
axis and y-axis.
A series of complete assembly experiments have been conducted in a real environment. The pose error is reduced to a small range in a few steps, and the success rate in 50 assembly experiments is 100%. Subsequently, a series of comparative experiments comparing the proposed method with the method in ref. 20 were conducted. The error curves of the method in ref. 20 oscillate more times, and our method has a faster convergence speed. The errors of our method are in a smaller range. Our method can improve the steadiness and efficiency of the alignment process. The alignment of component A to component B converges quickly and accurately with our method.
In the future, we will pay more attention to more intelligent assembly control methods.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with
respect to the research, authorship, and/or publication of this
article.
Funding
The author(s) disclosed receipt of the following financial support
for the research, authorship, and/or publication of this article: This
work was supported in part by the National Key Research and
Development Program of China under Grant 2018AAA0103004,
the National Natural Science Foundation of China under Grant
61873266, the Beijing Municipal Natural Science Foundation
under Grant 4212044, and the Science and Technology Program
of Beijing Municipal Science and Technology Commission under
Grant Z191100008019004.
ORCID iDs
Xian Tao https://orcid.org/0000-0001-5834-5181
De Xu https://orcid.org/0000-0002-7221-1654
References
1. Tsenev V. Robot assembly with flexible automatic control
according to INDUSTRY 4.0. In: IEEE XXVIII international
scientific conference electronics (ET), Sozopol, Bulgaria, 12–
14 September 2019, pp. 1–4. DOI: 10.1109/ET.2019.
8878551.
2. Zeng F, Xiao J, and Liu H. Force/torque sensorless compliant
control strategy for assembly tasks using a 6-DOF collabora-
tive robot. IEEE Access 2019; 7: 108795–108805.
3. Yu Y, Xu Z, Lv Y, et al. Design and analysis of space docking
mechanism for on-orbit assembly with application to space
telescopes. In: IEEE international conference on mechatro-
nics and automation (ICMA), Changchun, China, 5–8 August
2018, pp. 1867–1871. DOI: 10.1109/ICMA.2018.8484668.
4. Luo Y, Chen M, Wang X, et al. Precision assembly system
based on position-orientation decoupling design. In: 2nd
world conference on mechanical engineering and intelligent
manufacturing (WCMEIM), Shanghai, China, 22–24 Novem-
ber 2019, pp. 685–688. DOI: 10.1109/WCMEIM48965.2019.
00145.
5. Yu H, Ma T, Wang M, et al. Feature-based pose optimization
method for large component alignment. In: 4th international
conference on control, robotics and cybernetics (CRC),
Tokyo, Japan, 27–30 September 2019, pp. 152–156. DOI:
10.1109/CRC.2019.00039.
Table 2. The errors and steps in camera alignment. Each cell gives (Δx, Δy, Δz) in mm followed by (Δθ_x, Δθ_y, Δθ_z) in degrees; steps are listed as proposed method / method in ref. 20.

No. | Initial | Proposed method (after alignment) | Method in ref. 20 (after alignment) | Steps
1 | (19.41, 10.42, 28.62), (3.96, 0.96, 20.65) | (0.15, 0.16, 0.15), (0.05, 0.06, 0.73) | (0.23, 0.17, 0.21), (0.06, 0.04, 0.89) | 7 / 9
2 | (−21.72, 26.43, 42.40), (1.03, 0.66, 10.23) | (0.17, 0.05, 0.26), (0.05, 0.06, 0.23) | (0.21, 0.07, 0.15), (0.06, 0.01, 1.52) | 8 / 10
3 | (30.07, 8.59, 7.68), (3.12, 1.36, 5.03) | (0.15, 0.14, 0.23), (0.01, 0.04, 0.41) | (0.11, 0.18, 0.06), (0.02, 0.02, 0.07) | 7 / 7
4 | (−5.05, 4.26, 40.56), (1.64, 1.87, 9.74) | (0.05, 0.23, 0.26), (0.01, 0.01, 0.68) | (0.01, 0.10, 0.16), (0.01, 0.01, 0.03) | 6 / 7
5 | (−12.11, 10.72, 7.80), (3.86, 0.42, 30.89) | (0.03, 0.06, 0.29), (0.01, 0.01, 0.03) | (−0.05, −0.09, −0.16), (−0.01, −0.02, −0.19) | 6 / 8
6 | (1.21, 17.32, 60.63), (6.46, 0.71, 10.51) | (0.11, 0.02, 0.04), (0.02, 0.02, 0.51) | (0.13, 0.07, 0.14), (0.01, 0.01, 0.29) | 6 / 7
7 | (−9.15, 26.29, 92.58), (9.87, 0.03, 4.62) | (0.04, 0.08, 0.05), (0.01, 0.01, 0.13) | (0.13, 0.14, 0.17), (0.02, 0.02, 0.25) | 7 / 7
8 | (−10.08, 10.53, 76.86), (8.87, 2.15, 10.09) | (0.04, 0.12, 0.13), (0.01, 0.01, 0.48) | (0.25, 0.14, 0.23), (0.02, 0.01, 0.53) | 8 / 9
6. Wang S, Chen G, Xu H, et al. A robotic peg-in-hole assembly
strategy based on variable compliance center. IEEE Access
2019; 7: 167534–167546.
7. Meng S, Ruiqin H, Lijian Z, et al. Precise robot assembly for
large-scale spacecraft components with a multi-sensor sys-
tem. In: 5th international conference on mechanical, auto-
motive and materials engineering (CMAME), Guangzhou,
China, 1–3 August 2017, pp. 254–258. DOI: 10.1109/
CMAME.2017.8540181.
8. Lei Y, Xu J, Zhou W, et al. Vision-based position/impedance
control for robotic assembly task. In: Chinese control confer-
ence (CCC), Guangzhou, China, 27–30 July 2019, pp.
4620–4625. DOI: 10.23919/ChiCC.2019.8865406.
9. Taptimtong P, Mitsantisuk C, Sripattanaon K, et al. Multi-
objects detection and classification using vision builder for
autonomous assembly. In: 10th International conference of
information and communication technology for embedded
systems (IC-ICTES), Bangkok, Thailand, 25–27 March
2019, pp. 1–4. DOI: 10.1109/ICTEmSys.2019.8695970.
10. Liu S, Xu D, Li Y, et al. Nanoliter fluid dispensing based on
microscopic vision and laser range sensor. IEEE Trans Ind
Electron 2017; 64(2): 1292–1302.
11. Liu S, Xu D, Liu F, et al. Relative pose estimation for alignment
of long cylindrical components based on microscopic vision.
IEEE/ASME Trans Mechatron 2016; 21(3): 1388–1398.
12. Liu S and Xu D. Fast and accurate circle detection algorithm
for porous components. J Electr Eng Electron Technol 2014;
03(01): 1–8.
13. Sun S, Yin Y, Wang X, et al. Robust landmark detection and
position measurement based on monocular vision for auton-
omous aerial refueling of UAVs. IEEE Trans Cybern 2019;
49(12): 4167–4179.
14. Kim J, Nguyen H, Lee Y, et al. Structured light camera base
3D visual perception and tracking application system with
robot grasping task. In: IEEE international symposium on
assembly and manufacturing (ISAM),Xian,China,30
July–2 August 2013, pp. 187–192. DOI: 10.1109/ISAM.
2013.6643524.
15. Satorres M, G´omez O, G ´amez G, et al. Visual predictive
control of robot manipulators using a 3D ToF camera. In:
IEEE international conference on systems, man, and cyber-
netics, Manchester, UK, 13–16 October 2013, pp. 3657–3662.
DOI: 10.1109/SMC.2013.623.
16. Litvak Y, Biess A, and Bar-Hillel A. Learning pose estima-
tion for high-precision robotic assembly using simulated
depth images. In: International conference on robotics and
automation (ICRA), Montreal, QC, Canada, 20–24 May 2019,
pp. 3521–3527. DOI: 10.1109/ICRA.2019.8794226.
17. Chaumette F and Hutchinson S. Visual servo control, part I:
basic approaches. IEEE Robot Autom Mag 2006; 13: 82–90.
18. Chaumette F and Hutchinson S. Visual servo control, part II:
advanced approaches. IEEE Robot Autom Mag 2007; 14:
109–118.
19. Xu D, Lu J, Wang P, et al. Partially decoupled image-based
visual servoing using different sensitive features. IEEE Trans
Syst Man Cybern Syst 2017; 47(8): 2233–2243.
20. Peng Y, Jivani D, Radke RJ, et al. Comparing position- and
image-based visual servoing for robotic assembly of large
structures. In: IEEE 16th international conference on auto-
mation science and engineering (CASE), Hong Kong, China,
20–21 August 2020, pp. 1608–1613. DOI: 10.1109/
CASE48305.2020.9217028.
21. Corke P and Hutchinson SA. A new hybrid image-based
visual servo control scheme. In: Proceedings of the 39th
IEEE conference on decision and control (Cat. No.
00CH37187), Sydney, NSW, 12–15 December 2000, pp.
2521–2526, vol. 3. DOI: 10.1109/CDC.2000.914182.
22. Deng W and Yao J. Extended-state-observer-based adaptive
control of electrohydraulic servomechanisms without velo-
city measurement. IEEE/ASME Trans Mechatron 2020;
25(3): 1151–1161.
23. Deng W and Yao J. Asymptotic tracking control of mechan-
ical servosystems with mismatched uncertainties. IEEE/
ASME Trans Mechatron 1–1. DOI: 10.1109/TMECH.2020.
3034923.