Computer-Aided Comparison of Thesauri Extracted from Complementary Patent Classes as a Means to Identify Relevant Field Parameters

February 2011

DOI: 10.1007/978-3-642-15973-2_56

Authors:

Gaetano Cascini

Politecnico di Milano

Manuel Zini

University of Florence

Patents are gaining importance as a complementary source of technical information, since most of the information they disclose is not accessible in scientific and technical literature. Text mining technologies are emerging as a possible solution to increase the efficiency of patent analysis activities; however, most existing systems are derived from general-purpose applications that only marginally leverage the peculiarities of patents. The authors are developing algorithms and tools fully dedicated to patent mining, i.e. information extraction from patent literature. The present paper aims at identifying the relevant technical parameters of a given domain through the comparison of thesauri automatically extracted from the given field of application and from its complementary patent classes.

Keywords: Patent mining · Field thesaurus · Patent classification · OTSM-TRIZ · Network of parameters · Network of evolutionary trends

1 Introduction

The definition of competitive R&D strategies requires monitoring the evolution of technical systems in order to assess the maturity level of current solutions and to check the emergence of new technologies. Nevertheless, although more than fifty methodologies with different characteristics and specific purposes have been proposed so far in this field [1], no universal method is known; rather, complementary instruments must be integrated according to the specific goal and data availability. Moreover, due to the huge amount of scientific and technical documentation nowadays produced in any field of application, these analyses are extremely time consuming.

G. Cascini, M. Zini

Dipartimento di Meccanica, Politecnico di Milano, 20156 Milan, Italy

e-mail: gaetano.cascini@polimi.it

Among the recent research developments that deserve attention as a way to improve the efficiency of innovation-related activities, TRIZ (the Theory of Inventive Problem Solving) is gaining popularity as a means to systematize the analysis of a technical system and to identify opportunities for its evolution.

A promising direction of research in this area is the definition of structured models that represent in a concise format the challenges of a certain field of application, either in the form of networks of contradictions, as proposed in [2], or in the form of networks of evolutionary trends, as described in [3].

In both cases the domain knowledge is represented in terms of parameters and relationships between these parameters; the identification of the relevant parameters and of the related links is a critical task, which requires questionnaires addressed to subject meta-experts and aimed at making their knowledge explicit.

A complementary and extremely valuable source of information to support these analyses is constituted by patent databases: in fact, several studies have demonstrated that 80% of the information contained in patents is not available in any other source [4]. A further advantage of patents is their semi-structured format, which allows customized text-mining techniques to be adopted to improve the efficiency of information extraction.

The authors are developing a set of complementary algorithms for patent mining. The goal of the present article is to show the preliminary results of research aimed at building a model of domain knowledge in the form of a network of parameters. In more detail, the paper describes the construction of a domain thesaurus through automatic patent analysis and the criteria used to compare thesauri extracted from complementary patent classes as a means to identify relevant field parameters.

A. Bernard (ed.), Global Product Development,

DOI 10.1007/978-3-642-15973-2_56, © Springer-Verlag Berlin Heidelberg 2011

2 Related Art

Before detailing the methodological approach and the computer-based system proposed in the present paper, it is worth recalling some fundamentals of the International Patent Classification. This section also summarizes some relevant outcomes of previous research activities carried out by the authors in the field of patent text mining. Finally, previous works related to automatic thesaurus construction are critically surveyed, to highlight opportunities and limits for their application in the patent field.

2.1 International Patent Classiﬁcation

The International Patent Classification (IPC) system is a language-independent hierarchical classification of patents and utility models according to the different areas of technology to which they pertain.

Inventions from any field are classified into 8 sections (A to H) and further subdivided into classes, subclasses, main groups and subgroups (5th and lower levels).
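The hierarchy can be made concrete by splitting an IPC symbol into its levels. The following is only a minimal sketch: the function name `parse_ipc` and the regular expression are illustrative assumptions, and the accepted spellings ("C02F 1/02", "C02F-1/02") are those that appear in this paper rather than a complete treatment of IPC notation.

```python
import re

# Hypothetical helper (not part of any patent-office tool): split a full IPC
# symbol such as "C02F 1/02" or "C02F-1/02" into its hierarchical levels.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*[-\s]?\s*(\d{1,4})/(\d{2,6})$")

def parse_ipc(symbol):
    match = IPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,                      # e.g. "C"
        "class": section + cls,                  # e.g. "C02"
        "subclass": prefix,                      # e.g. "C02F"
        "main_group": f"{prefix} {group}/00",    # e.g. "C02F 1/00"
        "subgroup": f"{prefix} {group}/{subgroup}",
    }
```

Two symbols that share the same `main_group` value but differ in `subgroup` are the kind of sibling branches whose thesauri are compared later in the paper.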

The primary purpose of IPC is supporting patent docu-

ments retrieval, in order to establish the novelty and evaluate

the inventive step or non-obviousness of technical disclo-

sures in patent applications. The Classiﬁcation, furthermore,

has the important purposes of serving as [5]:

i. an instrument for the orderly arrangement of patent doc-

uments in order to facilitate access to the technological

and legal information contained therein;

ii. a basis for selective dissemination of information to all

users of patent information;

iii. a basis for investigating the state of the art in given ﬁelds

of technology;

iv. a basis for the preparation of industrial property statis-

tics which in turn permit the assessment of technological

development in various areas.

According to the third objective of the IPC, it is assumed that the patents belonging to a specific class constitute a meaningful sample of documents from which to extract the terminology of a certain field of application and the main technical parameters of that technological area.

2.2 The PAT-Analyzer Project

The authors are working on the development of new tech-

niques and algorithms for patent analysis and comparison

[6–8]. As a result of these previous experiences a prototype

software system (named PatAnalyzer) has been developed

with the following functionalities:

– identify the components of the invention;

– classify the identified components in terms of detail/abstraction level and their compositional relationships in terms of supersystem/subsystem links;

– identify positional and functional interactions between the components, both internal and external to the system;

– identify the most relevant components of each patent for a given project according to a ranking criterion which combines the detail level of the description with the components’ occurrences in patent claims and with the Inverse Document Frequency, i.e. the “rarity” of each synset of the Thesaurus.
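The paper does not give the formula of the ranking criterion, so only its last ingredient is sketched here. The example assumes the common logarithmic form of the Inverse Document Frequency, idf = log(N / df); the function name and the input layout are hypothetical.

```python
import math

def inverse_document_frequency(synset_to_patents, total_patents):
    """Assumed standard IDF: log(N / df), df = patents containing the synset.

    synset_to_patents: {synset label: set of patent ids} (illustrative layout).
    """
    return {
        synset: math.log(total_patents / len(patents))
        for synset, patents in synset_to_patents.items()
        if patents  # skip synsets that never occur, to avoid dividing by zero
    }
```

A synset found in every patent of the corpus gets a score of zero, while a synset found in a single patent gets the highest score, which matches the notion of "rarity" used above.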

2.3 Automatic Thesaurus Construction

The word thesaurus derives from Greek and Latin and means “treasury or storehouse; hence, a repository, especially of knowledge; often applied to a comprehensive work, like a dictionary or encyclopedia”. Numerous definitions of thesauri exist across fields such as computer science, artificial intelligence, and library and information science [9–11]. They vary from quite modest definitions that do not specify the types of conceptual relations to more specific definitions that clearly define them. An example of a modest definition is given in [12]: “we define a thesaurus as simply a mapping from words to other closely related words”. In contrast, Miller gives a more elaborate definition of a thesaurus as “a lexical-semantic model of a conceptual reality or its constituent, which is expressed in the form of a system of terms and their relations, offers access via multiple aspects and is used as a processing and searching tool of an information retrieval unit” [13]. The ISO 2788:1986 standard (Guidelines for the establishment and development of monolingual thesauri) defines a thesaurus as a “vocabulary of a controlled indexing language, formally organized so that a priori relationships between concepts (for example as ‘broader’ and ‘narrower’) are made explicit”. For the purpose of this work, the broader definition is adopted: a thesaurus is defined as a structured system of concepts identified by collections of terms and hierarchical relationships between these concepts.

Manual thesaurus construction is a huge, time-consuming task of term selection, conceptual analysis and relational structuring of concepts and terms [10]; moreover, it is subject to problems of bias, inconsistency and limited coverage.

In addition, thesaurus compilers cannot keep up with constantly evolving language use and cannot afford to build new thesauri for the many sub-domains to which NLP techniques are being applied.

There is a clear need for methods to extract thesauri automatically, or for tools that assist in the manual creation and updating of these semantic resources.

Methods for automatic thesaurus extraction can be roughly divided into two categories: statistical methods and linguistic pattern methods.

Statistical methods rely on the observation that semantically related terms appear in similar contexts. These systems differ primarily in their definition of context (e.g. window of text, sentence, paragraph, grammatical context, entire document) and in the way they calculate similarity from the contexts in which each term appears [14]. The simplest contexts to extract are the words surrounding a term up to some fixed distance. Some approaches take the whole document as the context and consider term co-occurrence at the document level.

In [15] grammatical relations are extracted such as:

• term is subject of a verb;

• term is the (direct/indirect) object of the verb;

• term is modiﬁed by noun or adjective;

• term is modiﬁed by a prepositional phrase.

The relations for each term are then collected and counted

producing a context vector for each term. Once these con-

texts have been deﬁned, these systems deﬁne measures of

similarity between context vectors and then use clustering or

nearest neighbor methods to ﬁnd related terms.
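The counting and comparison steps described above can be sketched as follows. This is a minimal illustration, not the method of [15]: it assumes the grammatical relations have already been extracted by a parser and arrive as triples, and it uses cosine similarity as one possible measure among many. All names are hypothetical.

```python
import math
from collections import Counter

def context_vectors(relations):
    """Build one context vector per term.

    relations: iterable of (term, relation_type, related_word) triples,
    e.g. ("tank", "object_of", "fill"); this layout is an assumption.
    """
    vectors = {}
    for term, relation, word in relations:
        vectors.setdefault(term, Counter())[(relation, word)] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Terms whose vectors score close to 1 share most of their grammatical contexts and are candidates for the same cluster; terms with no shared context score 0.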

Linguistic pattern methods are based on the observation

that patterns of co-occurring terms carry information about

their semantic relations. These systems extract related terms

directly by recognizing linguistic patterns which connect

synonyms and hyponyms [16,17]. In the pioneering work

of Hearst [16], the use of linguistic patterns was suggested to

discover hyponymy relations from unstructured text.

Patterns like

such NP as {NP,}* {(or|and)} NP

as in: “Works by such authors as Herrick, Goldsmith, and Shakespeare”, or like

NP {, NP}* {,} or other NP

as in: “Bruises, wounds, broken bones or other injuries” can be used to extract hyponymy relations. From the examples it is possible to infer that “Herrick”, “Goldsmith” and “Shakespeare” are all hyponyms of the term “author”, and that “bruise”, “wound” and “broken bone” are hyponyms of “injury”.
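The two patterns can be approximated with regular expressions. The sketch below is a deliberate simplification: Hearst's patterns operate on noun phrases identified by a parser, whereas here list items are taken as raw text between commas and conjunctions, the hypernym is a single word, and no lemmatization is applied. The names `SUCH_AS`, `OR_OTHER`, `SPLIT` and `hearst_pairs` are illustrative.

```python
import re

# "such NP as NP, NP, and NP": the hypernym is the word after "such".
SUCH_AS = re.compile(r"\bsuch (?P<hyper>\w+) as (?P<list>[^.;:]+)")
# "NP, NP or other NP": the hypernym is the word after "other".
OR_OTHER = re.compile(r"(?P<list>[^.;:]+?),? (?:or|and) other (?P<hyper>\w+)")
# List separators: a comma (optionally followed by and/or) or a bare and/or.
SPLIT = re.compile(r",\s*(?:and |or )?|\s+(?:and|or)\s+")

def hearst_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the two simplified patterns."""
    pairs = []
    for pattern in (SUCH_AS, OR_OTHER):
        for match in pattern.finditer(sentence):
            for item in SPLIT.split(match.group("list")):
                item = item.strip()
                if item:
                    pairs.append((item, match.group("hyper")))
    return pairs
```

Because the list group runs to the next sentence delimiter, trailing text after the enumeration would be captured as part of the last item; a real implementation would bound each item with a noun-phrase chunker.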

The methods described above are general purpose: they rely on plain text only, without any further information on the ontological structure of the concepts to be extracted and organized.

In the work by Shinzato [18], itemizations in HTML documents taken from the Web are exploited to identify hyponym candidate sets; statistical measures and heuristics are then used to select the actual hyponyms.

This last work suggests that, where available, the information conveyed by the peculiar structure of the analyzed documents can be exploited.

The approach proposed in this work leverages patent structure and text patterns to provide a semi-automated thesaurus generation system.

By taking the denominations of the invention components as thesaurus terms, it is possible to exploit the semantic information conveyed by the alternative denominations sets defined in Sect. 3.1 to discover synonymy and hyponymy relations. The proposed approach is described in the following section.

3 Thesaurus Construction and Comparison

The present section is divided into two subsections: the first focuses on the original algorithm developed by the authors for computer-aided thesaurus construction; the second details the proposed procedure to compare thesauri extracted from complementary patent classes, with the aim of identifying the main technical parameters of the related fields.

In more detail, in order to build a model of domain knowledge according to either of the approaches described in [2] and [3], it is necessary to identify two different kinds of parameters:

• Evaluation Parameters, i.e. parameters to measure the

level of satisfaction of system requirements;

• Control Parameters, i.e. any kind of design variable, prop-

erty or feature controllable by the designer, which might

impact on at least one Evaluation Parameter.

Control Parameters and Evaluation Parameters related to a specific technical field will be referred to as domain parameters hereafter in the paper.

3.1 Semi-Automated Thesaurus Construction

Most text mining systems applied to patent analysis are affected by the language style and terminology of the writer; in other words, when different inventors adopt different terms or expressions to describe the same components and functions, existing text mining applications are rarely able to identify the semantic link between these concepts.

As described in [6], the authors identify the components of an invention by means of their reference characters, according to the universal patent writing rule which states that “the same part of an invention appearing in more than one view of the drawing must always be designated by the same reference character, and the same reference character must never be used to designate different parts” [19].

According to this rule, different denominations associated with the same part must be semantically related, at least within the given patent text. Moreover, when comparing two different patents, it is necessary to determine whether component $x$ of patent $X$ is to be considered the same as component $y$ of patent $Y$. Chances are that the two components have different names in the two patents while referring to the same type of object.

In order to be able to compare components between dif-

ferent patents it is required to build a component denomina-

tions thesaurus, which deﬁnes concepts as sets of synonyms

(synsets) and hierarchical semantic relationships (hyponymy

and hypernymy) between those concepts. The proposed

approach to semi-automatically build such a thesaurus lever-

ages the extracted components and their alternative denomi-

nations, which are then processed through a heuristic of text

patterns to identify synonymy and hyponymy relationships.

3.1.1 Alternative Denominations

In order to provide an unambiguous description of the algo-

rithm for thesaurus construction it is helpful to introduce

some formal deﬁnitions.

Denomination $d_k$ of a component $k$ is a word, or a set of words, that denotes the component $k$ in the patent text.

Alternative denominations set $A_{k,p} = \{d_1, d_2, \ldots, d_n\}$ of a component $k$ is defined as the set of $n$ denominations referring to the same component $k$ within the patent $p$ (an exemplary set of alternative denominations for a few components of patent US 5.328.488 is shown in Table 1).

The set of all the alternative denominations sets extracted from every component and every patent in a given invention set $I$ is referred to as $A_I$. Finally, $D_I$ is defined as the set of all the component denominations extracted from $I$.
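These definitions map naturally onto a dictionary of sets. The sketch below assumes that the component extraction step of [6] has already produced (patent, reference character, denomination) triples; the function names and the lower-casing of denominations are illustrative choices, not part of the original system.

```python
from collections import defaultdict

def alternative_denominations(mentions):
    """Group denominations by component and patent, i.e. build the sets A_{k,p}.

    mentions: iterable of (patent_id, reference_character, denomination).
    Returns {(reference_character, patent_id): set of denominations}.
    """
    sets = defaultdict(set)
    for patent_id, ref_char, denomination in mentions:
        # Normalizing case and whitespace is an assumption made for this sketch.
        sets[(ref_char, patent_id)].add(denomination.lower().strip())
    return dict(sets)

def all_denominations(alt_sets):
    """D_I: the union of every alternative denominations set in A_I."""
    return set().union(*alt_sets.values())
```

With the rows of Table 1 as input, the key `("20", "US5328488")` would hold the three denominations listed for the particle component.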

3.1.2 Synonymy, Hyponymy and Hypernymy

Synonymy is usually defined as the relation between different lexemes with the same meaning, leaving open the question of what it means to have the same meaning. If the definition had to hold in any context, only a few words would be true synonyms.

Table 1 Patent US 5.328.488 “Laser light irradiation apparatus for medical treatment”: excerpt from the list of components and their alternative denominations

Component (ref. character) | Alternative denominations

Laser light transmissive probe (1) | laser light transmissive probe; probe; right side laser light transmissive probe; opposite laser light transmissive probe; laser light penetrating probe; transmissive probe; light transmissive probe; penetrating probe

optical fiber (8) | optical fiber; single optical fiber holder;

particle (20) | particle; laser light scattering particle; scattering particle

laser light emitting portion (54a) | laser light emitting portion; flat emitting portion

According to [20], the notion of substitutability has been adopted: two lexemes are considered synonyms if they can be substituted for one another in a sentence without changing either the meaning or the acceptability of the sentence.

So, for the same purpose, it is assumed that two nominal syntagms (either formed by a single word or by several words) are synonyms if they are substitutable in some environment. The environment will be that of an invention set disclosed in a given patent corpus $I$. Hyponymy is the relation between two lexemes that holds when one lexeme denotes a subclass of the other. $A$ is said to be a hyponym of $B$ if $B$ denotes a more general class; in this case $B$ is said to be a hypernym of $A$. Thus, car is a hyponym of vehicle and vehicle is a hypernym of car [20]. Hereafter, this kind of relation will be referred to as a generalization relation. Hence a thesaurus built according to the algorithm described below will always refer to a single patent corpus $I$ and will be composed of synsets, every synset being a set of nominal syntagms representing component denominations considered synonyms in the context of the said invention set. A synset can be thought of as a single concept described by the denominations it contains and representing a common meaning or sense for those denominations.

3.1.3 Co-occurrence Graph

To represent the information about component denominations conveyed by the alternative denominations sets, an undirected weighted graph is proposed, defined as follows: the co-occurrence graph is $G_I = (V, E)$, where $V = \{d_1, d_2, \ldots, d_n\}$ is the set $D_I$ of all the component denominations defined above.

Let $A_I$ be the set of all the alternative denominations sets found in the invention set $I$, and let $A_{k,p} \in A_I$.

An edge $e \in E$ between node $d_i$ and node $d_j$ exists if $\exists A_{k,p} \in A_I : d_i, d_j \in A_{k,p}$.

Let $W$ be the weight function on $G_I$, $W : E \to \mathbb{N} \times \mathbb{N}$: the weight of the edge $e \in E$, $W(e) = (w_1, w_2)$, is calculated such that $w_1$ represents the number of alternative denominations sets in which $d_i$ and $d_j$ co-occur (in one or more patents of the corpus), and $w_2$ represents the number of different patents in which this happens.
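A direct transcription of this definition is sketched below. The input layout (a dictionary keyed by component and patent) and the function name are assumptions made for the example; edges are stored as unordered pairs because the graph is undirected.

```python
from itertools import combinations

def cooccurrence_graph(alt_sets):
    """Build the weighted co-occurrence graph.

    alt_sets: {(component, patent): set of denominations}, i.e. the sets A_{k,p}.
    Returns {frozenset({d_i, d_j}): (w1, w2)} where w1 counts the alternative
    denominations sets containing both terms and w2 the distinct patents.
    """
    set_counts = {}
    patents = {}
    for (component, patent), denominations in alt_sets.items():
        for d_i, d_j in combinations(sorted(denominations), 2):
            edge = frozenset((d_i, d_j))
            set_counts[edge] = set_counts.get(edge, 0) + 1
            patents.setdefault(edge, set()).add(patent)
    return {edge: (set_counts[edge], len(patents[edge])) for edge in set_counts}
```

Two denominations that co-occur for two different components of the same patent therefore get the weight (2, 1), which is why $w_1 \geq w_2$ always holds.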

3.1.4 Component Denominations Thesaurus

According to the deﬁnition adopted for the present work

mentioned in Sect. 2.3, the thesaurus can be represented

as a directed graph, in which nodes represent synsets and

directed edges represent generalization relations. As shown

below, not only the internal representation of the thesaurus

is a graph, but it is possible also to represent it graphically

in the user interface, allowing for a clear representation of

the conveyed information. The user is also able to interact

with the graph to modify it, in order to correct errors of the

algorithm or to add or modify relationships that could not be

discovered by the system automatically.

In order to build such a thesaurus the following algorithm

is applied.

1. Co-occurrence graph construction

Component denominations and alternative denominations sets are extracted from the entire patent corpus $I$ and a co-occurrence graph is built according to the definitions provided above.

2. Generalization edges transformation

The first step to build the final thesaurus graph consists in the transformation of co-occurrence edges into generalization edges. If two co-occurring denominations are in a generalization relation, the corresponding edge is transformed into a generalization edge. To identify generalizations a simple heuristic is proposed: if component denomination $d_i$ co-occurs with denomination $d_j$ (there is an edge in the co-occurrence graph) and if $d_i \subset d_j$, then $d_i$ is considered to be a hypernym of $d_j$.

3. Synsets merging

Once every generalization relationship has been transformed, co-occurring denominations are merged into synset nodes. If a merge operation leads to an inconsistency (a cycle would be created in the graph), the nodes are not merged and the co-occurrence edge is left for the user to disambiguate. To reduce the number of inconsistencies, the merging algorithm processes the edges in an ordered way: edges are sorted by increasing number of words of the connected nodes and by decreasing edge weight, so that the merging operation starts from the shortest denominations and the highest weights.

4. User disambiguation

In this step the user disambiguates the remaining co-occurrence edges, corrects wrong relations and, if necessary, merges or separates synsets according to his/her specific knowledge of the field.
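Steps 2 and 3 can be partially sketched as follows. Two points are assumptions of this example rather than statements of the paper: the inclusion $d_i \subset d_j$ is read as "the words of $d_i$ form a contiguous part of $d_j$", and only the ordering of the merge queue is shown, while the merge itself and its cycle check are omitted. The function names are hypothetical.

```python
def is_hypernym(d_i, d_j):
    """True if the words of d_i are a proper contiguous sub-sequence of d_j."""
    a, b = d_i.split(), d_j.split()
    if not a or len(a) >= len(b):
        return False
    return any(b[k:k + len(a)] == a for k in range(len(b) - len(a) + 1))

def split_edges(graph):
    """Separate generalization edges from the edges still to be merged.

    graph: {frozenset({d_i, d_j}): (w1, w2)}.
    Returns (generalizations, merge_queue); generalizations holds
    (hypernym, hyponym) pairs, merge_queue holds ordered merge candidates.
    """
    generalizations, candidates = [], []
    for edge, weight in graph.items():
        # Put the shorter denomination first; ties are broken alphabetically.
        d_i, d_j = sorted(edge, key=lambda d: (len(d.split()), d))
        if is_hypernym(d_i, d_j):
            generalizations.append((d_i, d_j))
        else:
            candidates.append((d_i, d_j, weight))
    # Shortest denominations first, then heaviest edges first.
    candidates.sort(
        key=lambda e: (len(e[0].split()) + len(e[1].split()), -e[2][0], -e[2][1])
    )
    return generalizations, [(a, b) for a, b, _ in candidates]
```

With this reading, "core" is a hypernym of "movable core", whereas "coil" and "solenoid" remain merge candidates.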

An exemplary excerpt from a thesaurus construction task is shown in Figs. 1, 2, 3, 4, 5 and 6. The first (Fig. 1) represents the co-occurrence graph (step 1). In Fig. 2 the edges representing generalizations have been transformed (step 2); notice the arrow that points to the hyponym.

In Fig. 3 single-word co-occurrences have been merged (step 3). It is worth noting that in this trivial example, constituted by a small number of patents and a subset of components, no threshold has been used to merge the nodes; in the general case, synsets can be created by merging only the nodes whose link exceeds a minimum number of alternative denominations sets in which the nodes co-occur and/or a minimum number of patents in which the co-occurrence is found.

In Fig. 4 two-word co-occurrences have been merged. Notice that the edge A cannot be merged, since there is already a generalization edge connecting the synsets. In Fig. 5 the edge A between {coil, solenoid, core} and {solenoid coil, electromagnetic solenoid} has been disambiguated: the expert considers solenoid, solenoid coil, coil and electromagnetic solenoid to be synonyms within this invention set.

In Fig. 6 the final thesaurus is represented. Notice that the expert has chosen to separate {core} from the synset {coil, solenoid coil, solenoid, electromagnetic solenoid}, since it cannot be considered a synonym even within this invention set. Notice also that this error could easily have been avoided by choosing a higher threshold for merging: the edge connecting {core} to the other denominations had a weight of (1,1), meaning that these denominations co-occurred in only one patent.

Fig. 1 Edges represent co-occurrence of denominations in alternative denominations sets; notice the weight pair on each edge: the left number represents the number of alternative denominations sets in which the nodes co-occur, the right number represents the number of patents in which the co-occurrence happens. [Graph nodes: arc quenching core; core; coil; solenoid coil; electromagnetic solenoid; solenoid; movable core]

Fig. 2 Generalizations identification. Generalization relations have been identified and transformed

Fig. 3 Denominations merging step 1. Nodes composed of one word have been merged; notice that the edge between (coil, solenoid, core) and (solenoid coil) cannot be merged since there is already a generalization relationship

Fig. 4 Denominations merging step 2. Nodes (solenoid coil) and (electromagnetic solenoid) are merged. The user should disambiguate the remaining co-occurrence edge

Fig. 5 Disambiguation: the user has chosen to disambiguate the edge by eliminating the generalization relation; this leads to the merging of the two nodes

Fig. 6 The final thesaurus. Notice that the user has chosen to manually separate core from the synset (core, coil, solenoid coil, electromagnetic solenoid)

3.2 Thesauri Comparison and Field Parameters Identification

The algorithm described in the previous section makes it possible to build a thesaurus related to a given corpus of patents. Due to the adopted criteria, the robustness of the process increases with the uniformity of the corpus contents. In other words, the reliability of the thesaurus is higher if the analysis is limited to documents belonging to the same class, and even more so if it is focused on a specific sub-class or even a patent group.

Moreover, it is interesting to observe that in most cases the IPC classification, especially for well-established products and processes, is structured, even if not purposefully, according to a Function-Behavior-Structure hierarchy: top-level classes distinguish different functions or sets of functions within a given domain, while deeper branches such as groups and subgroups are more related to alternative behaviors and structures that deliver the same function. For example, the class D06F covers domestic or laundry devices for washing, rinsing and dry-cleaning textile articles (Function). Within this class the group D06F 23/00 is related to “Washing machines with receptacles, e.g. perforated, having a rotary movement, e.g. oscillatory movement, the receptacle serving both for washing and centrifugally draining” (Behavior). The sub-groups D06F 23/02, D06F 23/04 and D06F 23/06 distinguish between “rotating or oscillating about a horizontal/vertical/inclined axis” respectively (Structure).

The main idea of the present work for the identification of domain parameters, as defined in Sect. 3, is that a comparison between thesauri extracted from specific IPC subgroups belonging to the same patent class should highlight common terms that are mostly related to the main function of the technical system. Moreover, it is assumed that the most characterizing differences between thesauri extracted from complementary IPC subgroups are related to the way the function is delivered, i.e. to the behavior and the structure of the related inventions (Fig. 7). By analyzing these attributes in each thesaurus, and more specifically the hyponymy-hypernymy chains in order to extract adjectives and appositions from the hyponyms, it is possible to provide the user with a list of terms closely connected to the features governing the functioning of the system. In fact, as will be shown in the following section, it is easy to extract a set of relevant domain parameters from this list of terms.
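At the level of plain term sets, the comparison reduces to intersections and differences. The sketch below is a simplified reading of the idea: it flattens each thesaurus into a set of terms, ignores synset and hierarchy information, and uses hypothetical names throughout.

```python
def compare_thesauri(thesauri):
    """Split terms into those shared by all subgroups and those unique to one.

    thesauri: {ipc_subgroup: set of terms} (flattened thesauri, an assumption).
    Returns (common, specific) where specific maps each subgroup to the terms
    that appear in no other subgroup.
    """
    term_sets = list(thesauri.values())
    common = set.intersection(*term_sets) if term_sets else set()
    specific = {}
    for group, terms in thesauri.items():
        others = set().union(*(t for g, t in thesauri.items() if g != group))
        specific[group] = terms - others
    return common, specific
```

Under the assumption stated in the text, the `common` set would mostly reflect the shared function, while each `specific` set would point to the behavior and structure of one subgroup.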

At the current level of development, this last step is still left to the user; nevertheless, the speed of the process makes this task much more efficient than a traditional manual investigation of the relevant design parameters.

Since the thesauri extracted according to the algorithm described in Sect. 3.1 can be constituted by hundreds if not thousands of entries, it is suggested to prioritize the analysis by taking into account the terms containing the keywords that belong to the analyzed IPC classes; then, by browsing the thesaurus network through the hypernym/hyponym links, it is possible to build a list of adjectives and appositions from which the user can easily extract the relevant technical parameters of the field under study.

Fig. 7 Comparison of thesauri extracted from IPC groups and subgroups of a same patent class
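The extraction of adjectives and appositions from hyponyms can be approximated, for English multiword terms, by stripping the hypernym from the end of each hyponym. This is a sketch under that assumption only: it handles premodifiers of a head noun, does no part-of-speech tagging, and the function name is hypothetical.

```python
def hyponym_modifiers(hypernym, hyponyms):
    """Return the words that each hyponym adds in front of its hypernym."""
    head = hypernym.split()
    if not head:
        return []
    modifiers = []
    for hyponym in hyponyms:
        words = hyponym.split()
        # Keep only hyponyms that end with the hypernym, e.g. "acid reservoir".
        if len(words) > len(head) and words[-len(head):] == head:
            modifiers.append(" ".join(words[:-len(head)]))
    return modifiers
```

For the hypernym "reservoir", the hyponyms "acid reservoir" and "gray water reservoir" yield the modifiers "acid" and "gray water", which the analyst can then relate to a parameter of the contained liquid.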

At the present stage of the research, no robust criteria have been identified to distinguish, within the set of domain parameters, between Evaluation and Control Parameters; therefore this classification is left to the patent analyst.

4 Exemplary Application: Technologies for Water Purification

In order to clarify the proposed comparison procedure, this section describes an exemplary application in the field of water purification through different technologies.

The most relevant International Patent Class related to this function is C02F (Treatment of water, waste water, sewage, or sludge), which is subdivided into:

• C02F-1 (Treatment of water, waste water, or sewage);

• C02F-3 (Biological treatment of water, waste water, or

sewage);

• C02F-5 (Softening water; Preventing scale; Adding scale

preventatives or scale removers to water, e.g. adding

sequestering agents);

• C02F-7 (Aeration of stretches of water);

• C02F-9 (Multistep treatment of water, waste water or

sewage).

Fig. 8 Excerpt from the thesaurus graph automatically built by processing 150 patents belonging to the class C02F-1/02. [Graph nodes with their (A, B) counts: acid (1, 1); acid reservoir (1, 1); reservoir tank (1, 1); distilled water reservoir (2, 2); filtrate reservoir (2, 1); fog-laden water reservoir (1, 1); gray water reservoir (2, 1); tanks (94, 46); reservoir (19, 14). In every node avgC = maxC = A and avgD = maxD = B]

As mentioned in Sect. 2.1, each of these classes is further subdivided into full digit classes, related to alternative specific technologies (behaviors) to deliver the main function. For example, the treatment of water (C02F-1) can be performed by:

• Heating (C02F-1/02);

• Freezing (C02F-1/22);

• Flotation (C02F-1/24);

• Sorption (C02F-1/28);

• Irradiation (C02F-1/30);

• Centrifugal separation (C02F-1/38);

• and others...

A thesaurus has been automatically built for each of these

classes through the steps described in Sect. 3.1, by analyz-

ing all the patents granted by the United States Patent Ofﬁce

between 1971 and August 2009 and by the European Patent

Ofﬁce between 1980 and August 2009.

In this case study, the thresholds for automatic acceptance of the semantic relationships (synonymy, hypernymy/hyponymy) have been set to (3, 2), according to the weight definition given in Sect. 3.1. This means that only the semantic links appearing in at least 3 different components and in at least 2 different patents have been stored in the thesaurus. In order to demonstrate the efficiency of the proposed algorithms, neither manual disambiguation nor manual integration of semantic relationships has been applied. Clearly, a thesaurus improved through the contribution of a subject matter expert would provide a richer set of information.
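The (3, 2) acceptance rule can be illustrated with a minimal sketch. It assumes that each observed semantic link is recorded together with the patent and the component in which it was found; the record layout, the function name `accept_links` and the sample data are hypothetical and are not taken from the authors' implementation.

```python
from collections import defaultdict


def accept_links(observations, min_components=3, min_patents=2):
    """Keep the links seen in enough distinct components and patents.

    observations: iterable of (term, relation, related_term, patent_id,
    component_id) tuples; this layout is an assumption made for the sketch.
    """
    components = defaultdict(set)
    patents = defaultdict(set)
    for term, relation, related, patent_id, component_id in observations:
        link = (term, relation, related)
        # A component is identified by its patent plus its own id.
        components[link].add((patent_id, component_id))
        patents[link].add(patent_id)
    return {
        link
        for link in components
        if len(components[link]) >= min_components
        and len(patents[link]) >= min_patents
    }


# Invented sample data: one link clears both thresholds, the other does not.
observations = [
    ("reservoir", "hypernym_of", "acid reservoir", "US-1", "c1"),
    ("reservoir", "hypernym_of", "acid reservoir", "US-1", "c2"),
    ("reservoir", "hypernym_of", "acid reservoir", "EP-7", "c1"),
    ("tank", "synonym_of", "vessel", "US-1", "c1"),
    ("tank", "synonym_of", "vessel", "US-1", "c2"),
    ("tank", "synonym_of", "vessel", "US-1", "c3"),
]
accepted = accept_links(observations)
print(accepted)
```

In this toy data the second link occurs in three components but in a single patent, so it is discarded; lowering `min_patents` to 1 would accept it.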

An exemplary excerpt from the thesaurus graph related to the class C02F-1/02 (Treatment of water by Heating) is shown in Fig. 8. In this example, each synset consists of just one syntagm (single word or multiword). The parameters shown in the synset boxes are:

• A = number of components where the synset occurs;
• B = number of patents where the synset occurs;
• C = number of components where each alternative denomination of the synset occurs (average and max value);
• D = number of patents where each alternative denomination of the synset occurs (average and max value).
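As an illustration of how the four parameters relate to one another, the sketch below computes them for one synset. It assumes that every denomination of the synset is mapped to the set of (patent, component) pairs in which it occurs; this data layout, the function name `synset_parameters` and the sample values are hypothetical.

```python
from statistics import mean


def synset_parameters(occurrences):
    """Compute A, B, C and D for one synset.

    occurrences: non-empty dict mapping each denomination to the set of
    (patent_id, component_id) pairs where it occurs (assumed layout).
    """
    all_pairs = set().union(*occurrences.values())
    # Per-denomination counts of components (C) and of patents (D).
    c_counts = [len(pairs) for pairs in occurrences.values()]
    d_counts = [len({patent for patent, _ in pairs}) for pairs in occurrences.values()]
    return {
        "A": len(all_pairs),  # components where any denomination occurs
        "B": len({patent for patent, _ in all_pairs}),
        "avgC": mean(c_counts),
        "maxC": max(c_counts),
        "avgD": mean(d_counts),
        "maxD": max(d_counts),
    }


# Invented occurrences for a synset with two denominations.
occurrences = {
    "filter element": {("P1", "c1"), ("P1", "c2"), ("P2", "c1")},
    "filter media": {("P1", "c1"), ("P3", "c4")},
}
params = synset_parameters(occurrences)
print(params)
```

Under this reading, A can be smaller than the sum of the per-denomination counts, because a component containing two denominations is counted once. This appears consistent with the synset "filter element, filter material, filter media" of Fig. 9, where A: 8 is lower than three times avgC (about 13).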


As stated in the previous section, the hyponyms of a given term are characterized by attributes that can be associated with parameters qualifying that term. From the example in Fig. 8, the patent analyst can deduce with no effort (i.e. without reading any patent document) that reservoirs for water treatment by heating can be classified according to their Control Parameter “content”, which can assume the following values:

• acid;

• distilled water;

• ﬁltrate;

• fog-laden;

• gray water.
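The reading of hyponyms as attribute values can be sketched as follows, under the simplifying assumption that the attribute is whatever precedes the head noun in a multiword hyponym. The helper `attribute_of` is hypothetical; the hyponyms are those shown in Fig. 8.

```python
def attribute_of(hyponym, head):
    """Return the modifier preceding the head noun, or None if the
    hyponym does not end with the head noun (naive rule, assumed here)."""
    words = hyponym.split()
    if len(words) > 1 and words[-1] == head:
        return " ".join(words[:-1])
    return None


fig8_hyponyms = [
    "acid reservoir",
    "reservoir tank",
    "distilled water reservoir",
    "filtrate reservoir",
    "fog-laden water reservoir",
    "gray water reservoir",
]
values = [attribute_of(h, "reservoir") for h in fig8_hyponyms]
content_values = [v for v in values if v is not None]
print(content_values)
```

This naive rule returns "fog-laden water" where the list above gives "fog-laden", and it skips "reservoir tank" because the head noun is not in final position; such normalization remains a task for the analyst.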

It must be observed, however, that the attributes related to the same noun are not necessarily different values of the same parameter. For example, in the class C02F-1/28 the noun “reservoir” has the attribute “solvent”, which is indeed a possible value of the parameter “content”, but also the attribute “feed”, which can be interpreted as a value of the control parameter “function”. Therefore, the interpretation of the parameters must still be done by the patent analyst, as well as the association of the attributes with each parameter as its possible values.

Fig. 9 Excerpt from the thesaurus graph automatically built by processing 341 patents belonging to the class C02F-1/24. Synsets shown in the excerpt, with the parameters reported in each box:

• filter (A: 28; B: 21; avgC: 28.0, maxC: 28; avgD: 21.0, maxD: 21)
• activated carbon filter layer (A: 1; B: 1)
• unactivated carbon filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• activated carbon filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• sterilizing filter (A: 7; B: 7; avgC: 7.0, maxC: 7; avgD: 7.0, maxD: 7)
• pressure-sensitive filter (A: 1; B: 1)
• cartridge filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• filter/settling grid, lining, volatile grid (A: 5; B: 3; maxC: 2; maxD: 2)
• particulate filter (A: 4; B: 4; C: 4; D: 4)
• leukocyte reduction filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• aluminum filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• bubble-removing filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• hydrocarbon absorption filter, hydrocarbon adsorption filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• biofilter system (A: 1; B: 1; avgC: 1; avgD: 1)
• filter element, filter material, filter media (A: 8; B: 6; avgC: 4.3333335, maxC: 7; avgD: 3.6666667, maxD: 5)
• activated aluminum filter (A: 1; B: 1; avgC: 2.0, maxC: 2; avgD: 1.0, maxD: 2)
• ceramic filter material (A: 4; B: 4; avgC: 4.0, maxC: 4; avgD: 4.0, maxD: 4)
• backflushable biofilter system (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)

In addition, the authors are investigating the possibility of increasing the level of automation of the algorithm by connecting the analysis to a general purpose thesaurus, as also proposed in [21]. For example, “acid”, “water” and “filtrate” share the same direct and indirect hypernyms: “chemical, chemical substance”, “material, stuff”, “substance”, “matter”. Thus, the analysis of hypernymy chains might help to distinguish between values related to different parameters, and possibly also to identify the categories of the parameters themselves.
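The idea of separating attribute values through their hypernymy chains can be sketched with a toy example. The chains below are written by hand and are not the output of any real thesaurus: the first three follow the hypernyms quoted in the text, while the chain for “feed” is an invented placeholder, and a real general purpose thesaurus may return something different. The function name `group_by_shared_hypernyms` is also hypothetical.

```python
# Hand-written hypernym chains, NOT real thesaurus output. The first three
# follow the hypernyms quoted in the text; the chain for "feed" is invented.
TOY_CHAINS = {
    "acid": ["chemical", "material", "substance", "matter"],
    "water": ["chemical", "material", "substance", "matter"],
    "filtrate": ["chemical", "material", "substance", "matter"],
    "feed": ["role"],
}


def group_by_shared_hypernyms(terms, chains):
    """Greedily group terms whose chain overlaps that of a group's first term."""
    groups = []
    for term in terms:
        for group in groups:
            if set(chains[term]) & set(chains[group[0]]):
                group.append(term)
                break
        else:
            # No existing group shares a hypernym: start a new one.
            groups.append([term])
    return groups


groups = group_by_shared_hypernyms(["acid", "water", "filtrate", "feed"], TOY_CHAINS)
print(groups)
```

Each resulting group is a candidate set of values for one parameter (here “content” versus “function”); naming the parameter would still be up to the analyst.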

As claimed in the previous section, it is interesting to compare the attributes assigned to the same item in alternative technical systems, i.e. the hyponym sets extracted from complementary patent classes. In fact, the differences help reveal peculiarities and can be proposed to the patent analyst as triggers for identifying the most characteristic technical parameters.

For example, consider Figs. 9 and 10, which represent the direct hyponym sets of the item “filter” in the classes C02F-1/24 and C02F-1/28 (water treatment by flotation and by sorption). A technician, even without reading any patent from those classes, can identify with minimal effort the parameters and values reported in Tables 2 and 3. Not surprisingly, flotation systems explicitly cover a wider range of applications, as revealed by the evaluation parameters related to the object of the filtering action. Moreover, several action principles have been identified.

Sorption-based systems, in turn, are characterized by different geometries and by properties related to their operating conditions.
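The comparison between complementary classes can be sketched as a set difference over the attributes of the same item. The example uses a subset of the “filter” hyponyms visible in Figs. 9 and 10; the helper names `attributes` and `compare`, and the rule that the attribute is the text preceding the head noun, are assumptions made for illustration only.

```python
def attributes(hyponyms, head):
    """Return the modifiers that precede the head noun in each hyponym."""
    found = set()
    for hyponym in hyponyms:
        words = hyponym.split()
        if len(words) > 1 and words[-1] == head:
            found.add(" ".join(words[:-1]))
    return found


def compare(first, second):
    """Split two attribute sets into shared and class-specific attributes."""
    return {
        "common": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


# Subsets of the hyponyms of "filter" shown in Fig. 9 (flotation, C02F-1/24)
# and Fig. 10 (sorption, C02F-1/28).
flotation = attributes(
    ["activated carbon filter", "sterilizing filter",
     "particulate filter", "bubble-removing filter"],
    "filter",
)
sorption = attributes(
    ["activated carbon filter", "sand filter",
     "cylindrical filter", "immersible filter"],
    "filter",
)
result = compare(flotation, sorption)
for key, value in result.items():
    print(key, sorted(value))
```

The class-specific sets are the ones that would be proposed to the analyst as triggers: here the flotation side points to the object of the filtering action, the sorption side to shape and working environment.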

By navigating the hypernym/hyponym links starting from the keywords extracted from the titles of the IPC classes/subclasses, it is possible to collect a comprehensive set of parameters and values as a support for building a model of the domain under analysis.
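Navigating the links from a starting keyword amounts to a graph traversal. The sketch below performs a breadth-first visit of the hyponym links; the graph is a hand-built toy whose node labels come from Fig. 10, while the edges among them are assumed for illustration, since the excerpt does not report them. The function name `collect_hyponyms` is hypothetical.

```python
from collections import deque


def collect_hyponyms(graph, start):
    """Breadth-first visit of hyponym links, returning nodes in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Toy hyponym graph: labels from Fig. 10, edges assumed for the sketch.
graph = {
    "filter": ["carbon filter", "cylindrical filter", "sand filter"],
    "carbon filter": ["activated carbon filter", "carbon block filter"],
    "cylindrical filter": ["hollow cylindrical filter"],
}
visited = collect_hyponyms(graph, "filter")
print(visited)
```

The `seen` set guards against revisiting a synset reachable through more than one path, which can happen when a term has several hypernyms.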

Fig. 10 Excerpt from the thesaurus graph automatically built by processing 150 patents belonging to the class C02F-1/28. Synsets shown in the excerpt, with the parameters reported in each box:

• activated carbon filter (A: 2; B: 2; avgC: 2.0, maxC: 2; avgD: 2.0, maxD: 2)
• hollow cylindrical filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• carbon filter (A: 2; B: 2; avgC: 2.0, maxC: 2; avgD: 2.0, maxD: 2)
• carbon block filter (A: 2; B: 2; avgC: 2.0, maxC: 2; avgD: 2.0, maxD: 2)
• sediment filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• immersible filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• replaceable torroidal shaped filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• carbon block (A: 2; B: 2; avgC: 2.0, maxC: 2; avgD: 2.0, maxD: 2)
• water filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• sand filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• replaceable filter (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• filter part (A: 1; B: 1; avgC: 1.0, maxC: 1; avgD: 1.0, maxD: 1)
• cylindrical filter (A: 2; B: 2; avgC: 2.0, maxC: 2; avgD: 2.0, maxD: 2)
• filter (A: 25; B: 16; avgC: 25.0, maxC: 25; avgD: 16.0, maxD: 16)


Table 2 Exemplary parameters for the item “filter” and related values which can be manually extracted from the hyponym set of Fig. 9

Parameter                           Values
Action principle/Filter material    Activated Carbon (filter)
Action principle/Filter material    Unactivated Carbon (filter)
Action principle/Filter material    Aluminum (filter)
Action principle/Filter material    Activated aluminum (filter)
Action principle/Filter material    Ceramic (filter)
Action principle/Filter material    Biofilter
Action principle/Filter material    Absorption/adsorption
Object of filter action             Particulate
Object of filter action             Leukocyte
Object of filter action             Bacteria (Sterilizing)
Object of filter action             Bubble
Object of filter action             Hydrocarbon
Maintenance/Cleanability            Backflushable

Table 3 Exemplary parameters for the item “filter” and related values which can be manually extracted from the hyponym set of Fig. 10

Parameter                           Values
Action principle/Filter material    Carbon (filter)
Action principle/Filter material    Activated Carbon (filter)
Action principle/Filter material    Sand (filter)
Shape                               Cylindrical
Shape                               Hollow cylindrical
Shape                               Toroidal
Working Environment                 Immersible
Maintenance/Replaceability          Replaceable

5 Conclusions and Further Developments

The present paper addresses the goal of reducing the time and effort necessary to gather domain information from patent analysis. The specific objective is to speed up the identification of the domain technical parameters (Evaluation and Control Parameters) relevant for a given field of application. These sets of parameters can be used for creating a general purpose domain Knowledge Base, for mapping the key problems to be addressed in a given field of application [2], or for supporting evolutionary analyses of technical systems [3].

On the basis of their past experience in the field of patent text mining, the authors are studying the possibility of identifying domain technical parameters through the comparison of the thesauri extracted from complementary patent classes. At the current stage of development, the proposed approach provides the patent analyst with a set of attributes for each relevant element of the technical system, from which the extraction of evaluation and control parameters is a quite efficient task, in any case much faster than any approach based on questionnaires to experts or manual patent reading. Nevertheless, the applications performed so far do not allow the completeness of the domain coverage to be estimated: it is assumed that the attributes and qualifications reported in the patents belonging to a certain technical field cover all the domain parameters.

The identification of the technical parameters can be further automated by exploiting available information such as the semantic relationships in general purpose thesauri or the location in the patent text (e.g. parameters extracted from the claims are essentially related to design choices, i.e. control parameters). Moreover, the first attempts to identify the relationships between the parameters as well have revealed that further analyses are needed to recognize general patterns to be formalized in terms of algorithmic rules.
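The location-based hint mentioned above could be sketched as a simple tagging rule. This is only an illustration of the idea, not a method described by the authors: the function name `tag_candidates`, the section labels and the sample occurrences are all hypothetical, and the rule merely proposes candidates for the analyst to confirm.

```python
def tag_candidates(occurrences):
    """Tag each attribute according to the patent section where it was found.

    occurrences: iterable of (attribute, section) pairs; an attribute seen
    at least once in the claims is proposed as a control parameter candidate.
    """
    tags = {}
    for attribute, section in occurrences:
        if section == "claims":
            tags[attribute] = "control parameter (candidate)"
        else:
            # Do not overwrite a tag already assigned from the claims.
            tags.setdefault(attribute, "unclassified")
    return tags


# Invented occurrences: the section assignments are not taken from real patents.
sample = [
    ("replaceable", "description"),
    ("cylindrical", "claims"),
    ("cylindrical", "description"),
    ("sterilizing", "description"),
]
tags = tag_candidates(sample)
print(tags)
```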

The proposed semi-automatic approach to building a thesaurus for a specific patent class and extracting relevant domain parameters has been clarified by means of an example in the field of water treatment, where six alternative technologies have been analyzed. The promising results obtained so far suggest investigating, with further case studies, the validity of the proposed algorithms and the opportunities for further development.

Acknowledgments The authors would like to thank Niccolò Becattini

and Walter D’Anna from Politecnico di Milano for their contribution to

patents search and analysis.

References

1. Porter, A.L., et al. (2004) Technology futures analysis: Toward integration of the field and new methods. Technological Forecasting & Social Change, 71:287–303.

2. Cavallucci, D., Eltzer, T. (2007) Parameter network as a means for driving problem solving process. International Journal of Computer Applications in Technology, 30(1/2):125–136.

3. Cascini, G., Rotini, F., Russo, D. (2009) Functional mod-

eling for TRIZ-based evolutionary analyses. Proceedings of

the International Conference on Engineering Design, ICED09,

Stanford University, Stanford, CA, USA, 24-27 August.

4. Bregonje, M. (2005) Patents: A unique source for scientiﬁc tech-

nical information in chemistry related industry? World Patent

Information, 27(4):309–315.

5. Guide to the IPC (version 2009) Available at http://www.wipo.

int/classiﬁcations/ipc/en/general/.

6. Cascini, G., Neri, F. (2004) Natural language processing for

patents analysis and classiﬁcation. Proceedings of the 4th TRIZ

Future Conference, Florence, 3–5 November, published by Firenze

University Press, ISBN 88-8453-221-3.

7. Cascini, G., Russo, D., Zini, M. (2007) Computer-aided patent

analysis: Finding invention peculiarities. Proceedings of the

2nd IFIP Working Conference on Computer Aided Innovation,

Brighton, MI, USA, 8-9 October, published on “Trends in

Computer-Aided Innovation”, Springer, ISBN 978-0-387-75455-

0, pp. 167–178.

8. Cascini, G., Russo, D. (2007) Computer-aided analysis of patents

and search for TRIZ contradictions. International Journal of

Product Development, Special Issue: Creativity and Innovation

Employing TRIZ, 4:1–2.


9. Shapiro, S. (1992) Encyclopedia of Artiﬁcial Intelligence, vol. 2.

Wiley, New York, NY.

10. Aitchison, J., Gilchrist, A., Bawden, D. (2000) Thesaurus

Construction and Use: a Practical Manual. ASLIB, London.

11. Schneider, J.W. (2005) Veriﬁcation of bibliometric methods’

applicability for thesaurus construction. SIGIR Forum, 39(1):

63–64.

12. Schutze, H., Pedersen, J. (1997) A co-occurrence-based the-

saurus and two applications to information retrieval. Information

Processing and Management, 33(3):307–318.

13. Miller, U. (1997) Thesaurus construction: Problems and their

roots. Information Processing and Management, 33(4):481–493.

14. Curran, J., Moens, M. (2002) Improvements in automatic thesaurus extraction. Unsupervised Lexical Acquisition: Proceedings of the Workshop of the ACL Special Interest Group on the Lexicon (SIGLEX), Philadelphia, July 2002, pp. 59–66. Association for Computational Linguistics. Paper available at: http://www.aclweb.org/anthology/W/W02/W02-0908.pdf.

15. Curran, J. (2002) Ensemble methods for automatic thesaurus extraction. Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing (EMNLP ’02), vol. 10, pp. 222–229, doi: 10.3115/1118693.1118722.

16. Hearst, M. (1992) Automatic acquisition of hyponyms from

large text corpora. Proceedings of the 14th Conference

on Computational linguistics, vol. 2, Association for

Computational Linguistics, Morristown, NJ, USA, pp. 539–545.

17. Caraballo, S. (1999) Automatic construction of a hypernym-labeled noun hierarchy from text. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, Morristown, NJ, USA, pp. 120–126.

18. Shinzato, K., Torisawa, K. (2004) Acquiring hyponymy rela-

tions from web documents. Proceedings of the Human

Language Technology Conference of the North American

Chapter of the Association for Computational Linguistics:

HLT-NAACL 2004. Available at: http://acl.ldc.upenn.edu/hlt-

naacl2004/main/pdf/103_Paper.pdf

19. Consolidated Patent Rules: Title 37 – Code of Federal

Regulations – Patents, Trademarks, and Copyrights. Available at

http://www.uspto.gov/web/ofﬁces/pac/mpep/consolidated_rules.pdf

(last access October 2009).

20. Jurafsky, D., Martin, J.H. (2000) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall Series in Artificial Intelligence. Prentice Hall, Upper Saddle River, NJ, ISBN 0-13-095069-6.

21. Lee, S., Huh, S.Y., McNiel, R.D. (2008) Automatic genera-

tion of concept hierarchies using WordNet. Expert Systems with

Applications, 35(3):1132–1144.
