Figure 1 - uploaded by Mihaela Gordan
DSP Structure and functioning 

Source publication
Article
Full-text available
Hydro dams are very important economic and social structures that have a great impact on the population living in the surrounding area. Surveillance of dam status is a complex process which involves data acquisition and analysis techniques, implying both measurements from sensors and transducers placed in the dam body and its surroundings...

Contexts in source publication

Context 1
... The main objective of the Hidroeol project, contract number 21062/2006 (in the frame of National Program 2), is to contribute to the knowledge regarding the integration of wind resources into the existing energy system through: i) establishing the limit up to which wind energy can be added without deteriorating the quality of the energy system; ii) establishing conditions, including storage capacities, and computing the efficiency of integrating wind turbines and plants into the national energy system. The specific objective of the project consists in proving the superiority of integrated hybrid hydro-wind systems compared with the same capacities connected separately. The Romanian global energy structure in 2004 is shown in Figure 1, and Figure 2 presents the electrical energy structure of the same year, without taking into consideration group 2 from Cernavoda. As a consequence, the rigidity of the energy system is greater today than it was in 2004, meaning that the rigid part now exceeds the 81.5% recorded in 2004. 
Consequently, the capacity to integrate wind energy into the energy system is further reduced, because a rigid system, with over 80% of the total energy produced by steam power plants and nuclear plants, cannot absorb unpredictable production capacities, like wind generators, without deteriorating the system's general quality. Among the advantages of wind energy, the following have to be mentioned: it does not emit greenhouse gases (1 MWh produced through coal or Diesel burning produces 0.8 t of greenhouse gas, mainly CO2); it does not produce concentrated toxic or radioactive wastes; it does not consume fossil fuel; the cost is reduced to the investment amortization and maintenance (the production of 1 kWh through wind energy costs 0.05-0.10 Canadian $ or 0.03-0.07 US $, against the same kWh produced by Diesel burning, which costs 0.25-1.00 $); it is geographically widely spread; the terrain occupied by wind installations is insignificant (98% of a wind plant's area remains available for agriculture, animal breeding, etc.). On the other hand, wind energy has a series of advantages for which it is worth being promoted, if it fulfils certain conditions. The table below gives, comparatively, different investment, maintenance and other costs, to better place this relatively new type of energy among the other energy generation types. Wind plants can also boost rural economies through terrain renting and royalty fees. The disadvantages of wind energy are: relatively small energy concentration; energy concentration in a very small amount of time; unpredictability. These disadvantages mean that wind energy exceeding a certain percentage of the total produced energy can cause problems. 
Denmark covers over 20% of its total energy from wind, but growing this percentage leads to problems; Denmark being a relatively small country (approximately the size of three counties), wind unpredictability can make 20% of the total installed capacity suddenly appear or disappear, a fact that raises management problems. A study financed by the state of Minnesota, USA, estimates that up to 25% wind energy out of the total produced energy can be efficiently administered at a low cost of 0.0045 $/kWh [8]. Another American study, of the solar energy society, estimates that up to 20% of the consumed energy can be taken from unpredictable sources without big difficulties and extra costs. Obviously, this limit can be increased if the energy system has storage capacities. For all such cases, which arise from the mentioned disadvantages, two principal solutions are foreseen: a) interconnection with a greater energy system, so that the wind energy percentage of the total production falls under a certain safety threshold, and b) developing storage capacities for wind energy. Solution a) can be, and sometimes is, conceived as a diversity requirement of energy sources; a recent study of Kassel University proposed a hybrid wind-solar network spread over the whole of Germany. The wide spread of such a network would ensure the availability of a certain percentage of the total installed capacities. The purpose throughout the entire project, among others, is to demonstrate, based on: i) a pilot station which comprises a hydro and a wind generator, an upper basin, a stilling basin and a pump; ii) a wind data acquisition station, and iii) a SCADA system monitoring the two components, that the hydro-wind ensemble is more efficient than the generators working separately. 
The second important aim pursued in the project is to define the maximum limit of wind energy penetration in the specific system without deterioration or loss of the respective system's performance. A national energy system will be modeled based on the public data to which we have access; new wind energy capacities interconnected to the network will be proposed, after which the whole ensemble will be studied. The pilot station, together with its control and surveillance (SCADA) system, has the role of validating some analytically built mathematical models of a hydro-wind ensemble with storage pumping. By validating these models, we can pass by extrapolation to wind ensembles of different sizes, and demonstrate by mathematical modeling that hydro-wind structures with pumped storage are superior, from the efficiency point of view, to the same capacities without pumped storage. Beyond the thematic limit, we have the ambitious objective of estimating some problems of large-scale integration of wind energy in the national energy system. From experimental acquisition data by using the SCADA system through ...
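The claimed advantage of a hydro-wind ensemble with pumped storage can be illustrated with a toy energy balance. The sketch below is not the project's model: the grid absorption limit, round-trip efficiency, reservoir size and hourly wind profile are all hypothetical figures chosen only to show the mechanism by which storage raises the usable share of an unpredictable wind production.

```python
# Illustrative sketch (hypothetical figures, not the Hidroeol model):
# how pumped storage can raise the usable share of wind production.

ABSORB_LIMIT = 50.0    # MW the rigid grid can absorb from wind at any hour (assumed)
ROUND_TRIP_EFF = 0.75  # pump-turbine round-trip efficiency (assumed)
RESERVOIR_MWH = 200.0  # upper-basin storage capacity (assumed)

def delivered_energy(wind_mw, use_storage):
    """Total MWh delivered to the grid from an hourly wind profile."""
    stored = 0.0
    delivered = 0.0
    for w in wind_mw:
        direct = min(w, ABSORB_LIMIT)   # what the grid takes directly
        delivered += direct
        surplus = w - direct            # wind the rigid grid cannot absorb
        if use_storage:
            # pump the surplus uphill, losing the round-trip fraction
            stored = min(RESERVOIR_MWH, stored + surplus * ROUND_TRIP_EFF)
            # release stored energy while the grid still has headroom
            release = min(stored, ABSORB_LIMIT - direct)
            delivered += release
            stored -= release
    return delivered

profile = [20, 80, 120, 10, 0, 90, 30, 70]  # hypothetical hourly wind MW
print(delivered_energy(profile, use_storage=False))  # surplus hours are lost
print(delivered_energy(profile, use_storage=True))   # surplus is time-shifted
```

With these made-up numbers the ensemble with storage delivers noticeably more energy from the same wind, which is the qualitative point the project sets out to prove with validated models.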
Context 2
... In the CANSCREEN project we have designed and implemented a complex software solution based on web technologies in order to collect, analyse, and validate the cervical cancer screening data and to monitor the screening program actions. The project methodology is supported by the Oncology Institute of Cluj-Napoca screening program for cervical cancer, started in 2002. The solution's hardware and software architecture were defined in order to accomplish the project objectives. The analysis of the technical solutions and possible architectures indicated that the best solution satisfying the requirements of such a system is an n-tier web architecture (figure 1), with the following development levels: presentation level, management logic level, data access logic level and database level. The presentation level will be implemented for the family doctors' offices, the bleeding centers, cervical investigations, the cytological laboratories, HPV testing, and others. The management logic level is implemented on the web server, while the data access logic level and the database level are implemented on the application and database server and the file server. After designing the hardware architecture, it became possible to define the main modules composing the software application: the interface design module (MPI), the medical data generation and exploitation module (MGEBD), the data quantification and representation module (MRCD), the medical image acquisition module (MAI) (Pap smears), the image filtering module (MFI), the system quality module (MAC), and the economical analysis module (MAE). 
Data containing information related to the identification of the tested women, together with the registers of women at high risk, are stored in a medical database, the system's medical data generation and exploitation module (MGEBD), which offers the possibility of retrieving data according to certain criteria, after a previous stage in which the data was encoded by the quantification and representation module (MRCD). The relational model of the database ensures the integrity of the data and relationships, as well as the connection with the various programming languages which generate the interfaces. Besides the patient identification data, the informational system comprises an image acquisition module (MAI). The acquired images are Pap smears, which are stored in the image database. Using the filtering module (MFI), it is intended to optimize the interpretation procedures for the cytological smears by elaborating recognition algorithms for extracting quantifiable features. The system quality assurance is implemented by the quality module (MAC), which randomly selects a percentage of the analyzed images; these are afterwards re-read and transferred to a cytology expert. The economical analysis module (MAE) decides the screening strategy in accordance with the cost-efficiency ratio, by using Markov models applied to the existing data in the database. User access is accomplished through the interface design module (MPI) at the level of the medical centers involved in the cervical cancer screening program. Figure 2 presents the system's software architecture and the connections between the system's software modules. Adequate evaluation of the recorded data, in cervical cancer or other maladies of this kind, is a key factor in taking decisions regarding personal medical care and health policies. Accurate interpretation of medical parameters involves knowing the meaning of the associated statistical indicators and the clinical aspects of the pathology. 
Medical parameters are represented by quantitative, qualitative or survival-type data. What matters is the type and scale of the parameters, because these characteristics determine the type of tables, graphics or summary tables which represent the data with the best precision and manage to convey the observations to those interested. Choosing the most appropriate method of analyzing an issue depends on the way the comparison is meant to be done and on the selected data. The analysis is influenced by the type of the data, the size of the compared samples, the normality of their distribution, the equality of variances, and the observed frequencies. Choosing the most appropriate statistical method adapted to the current situation can be done only after collecting all the data. A system covering the whole target population has to be created for the continuous monitoring of the screening program. An appropriate legal frame is necessary for recording individual data and for linking the whole-population database, the screening database, and the cancer and mortality registers. Such a system is an essential instrument for screening program management, for the calculation of participation, comprehension, quality and impact indicators, and for feedback to the medical staff and to the decision and authority factors in the public health department. An experimental design has been chosen because it is appropriate for evaluating the new strategies for organized programs. The right way to operate is by using epidemiologic regulations, whose purpose is to define the basic structural architecture of screening programs and to recommend a common methodology for organization, evaluation and reporting. These regulations are relevant mainly for planning new screening programs in Europe. The result of a screening program depends mainly on the quality of the pathologist's work. Using samples collected from the women participating in the screening program, the pathologist offers particular information regarding each examined woman's condition. 
The therapeutic decision depends on the quality of the pathologic exam, on its precision and on its forecast/prediction indicators. In order to achieve the screening purposes, a firm set of information regarding each patient is required, as well as an identical methodology and terminology for formulating the diagnosis. CANSCREEN medical data are presented in an encoded format, according to the screening program guidelines existing at the European level. At the base of a patient's registration in the cervical cancer screening program are the personal data characterizing the patient, the data specific to a certain screening phase, and the evaluation and treatment data necessary for the patient's surveillance during the whole screening period. The database design for the CANSCREEN project consisted in defining its structure. The database being relational, it was necessary to design it before implementing it, following a database design methodology which consists in: • developing the logical model; • developing the physical model. Besides identifying the medical data necessary for the system, the existing relationships were emphasized and the restrictions were imposed. The design was conducted in two phases: • the development stage of the conceptual model, described by the entity- relationship model ...
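The passage above describes mapping a conceptual entity-relationship model to a relational physical model. A minimal sketch of that mapping is shown below; the table and column names are hypothetical simplifications, not the actual CANSCREEN schema, and sqlite3 stands in for the MS SQL Server used by the project.

```python
# Hypothetical, simplified sketch of an entity-relationship model mapped
# to relational tables (NOT the actual CANSCREEN schema); sqlite3 stands
# in for MS SQL Server.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (                -- personal data characterizing the patient
    patient_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    birth_date   TEXT
);
CREATE TABLE screening_phase (        -- data specific to one screening phase
    phase_id     INTEGER PRIMARY KEY,
    patient_id   INTEGER NOT NULL REFERENCES patient(patient_id),
    phase_date   TEXT,
    result_code  TEXT                 -- encoded per the screening guidelines
);
CREATE TABLE pap_image (              -- images acquired by the MAI module
    image_id     INTEGER PRIMARY KEY,
    phase_id     INTEGER NOT NULL REFERENCES screening_phase(phase_id),
    file_path    TEXT
);
""")

# The relational model keeps data/relationship integrity: a phase row
# must reference an existing patient row.
conn.execute("INSERT INTO patient VALUES (1, 'Jane Doe', '1970-01-01')")
conn.execute("INSERT INTO screening_phase VALUES (1, 1, '2002-05-10', 'NEG')")
rows = conn.execute(
    "SELECT p.name, s.result_code FROM patient p "
    "JOIN screening_phase s ON s.patient_id = p.patient_id").fetchall()
print(rows)
```

The foreign keys are the physical counterpart of the relationships emphasized in the logical model; the "restrictions imposed" in the text correspond to the NOT NULL and REFERENCES constraints.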
Context 3
... needs for water pollution control can only be defined from within the overall context of water resources management. By considering the various influences and aspects involved in water resources management today, it is possible to identify some fundamental information needs. Figure 1 illustrates various functions and uses of water bodies, in relation to human activities or ecological functioning, identified from existing policy frameworks, international and regional conventions and strategic action plans for river basins and seas. There are two approaches to water pollution control: the emission-based approach and the water quality-based approach. The differences between these approaches result from the systems applied for limiting discharge and from the charging mechanisms. However, these differences are also reflected in the strategies taken for hazard assessment and the monitoring of discharges to water, i.e. whether it is focused on the effluents or on the receiving water; both have their advantages and disadvantages, and a combined approach can make optimal use of the advantages. Information needs are focused on the three core elements in water management and water pollution control, namely the functions and use of water bodies, the actual problems and threats for future functioning, and the measures undertaken (with their intended responses) to benefit the functions and uses. Monitoring is the principal activity that meets information needs for water pollution control. Models and decision support systems, which are often used in combination with monitoring, are also useful information tools to support decision-making. The monitoring and information system can be generally considered as a chain of activities. 
Essentially, the chain is closed with the management and control action of the decision-maker, whereas past schemes have shown a more top-down sequence of a restricted number of activities, starting with a sampling network chosen arbitrarily and ending up with the production of a set of data. Building an accountable information system requires that the activities in the chain are sequentially designed, starting from the specified information needs. While monitoring is continuing, information needs are also evolving. The objective of an information system for water pollution control is to provide and to disseminate information about water quality conditions and pollution loads in order to fulfil the user-defined information needs. Information systems can be based either on paper reports circulated in defined pathways, or on a purely computerized form in which all information and data are stored and retrieved electronically. In practice, most information systems are a combination of these. The main types of data to be processed in an information system are: • Data on the nature of the water bodies (size and availability of water resources, water quality and function, and structure of the ecosystem); • Data on human activities polluting the water bodies (primarily domestic wastewater and solid waste, industrial activities, agriculture and transport); • Data on the physical environment (e.g. topography, geology, climate, hydrology). 
The large surface covered by a hydrographic basin and the multitude of main river tributary streams make it impossible to equip the water basin with local stations for data acquisition over the entire area. Also, other parameters which vary over the monitored surface (level differences along the streamline, different flow rates in different periods of the year – winter/summer – different widths from springs to outflow) impose the usage of monitoring methods based on mathematical models which permit tracking, by simulation, the propagation of pollution agents and the attenuation of their concentration as they move through the water basin. The development of an intelligent system for tracking and monitoring water quality and pollution agent propagation in the surface waters of a hydrographic basin imposes the following: • Developing a hardware system (local stations) located at predefined points in the water basin, equipped with intelligent devices for real-time acquisition, processing, storage and long-distance transmission of the river data. • Developing, at the central level, an acquisition, storage and processing system on a data server, with a relational MS SQL database supporting a large number of application programming interfaces (APIs), to be developed under the Microsoft Visual C++, Microsoft J++ and Microsoft Visual FoxPro programming environments, supporting Microsoft Windows and Web applications. • Developing and validating mathematical models for the propagation of pollution agents, through simulation on the data server for different values of the input parameters, and ...
Context 4
... A telemedicine-supported process to monitor hepatocellular carcinoma (HCC) is both practical and advantageous. The objective of this paper is to describe the development of the tele-screening system and the implementation of a Web application that allows efficient communication between general practitioners and ultrasound specialists, in order to support a telescreening programme for patients with HCC. The screening process is performed according to the following steps: 1. The patients are called to the general practitioner and undergo a specific medical consultation comprising, among others, an ultrasound examination. 2. The ultrasound images, acquired according to a specific screening protocol, are sent, together with the other medical data regarding the patient, to an expert diagnosis center. 3. At this level, the specialists analyze the images and the additional information and establish the existence of cancerous formations. 4. Based on the specialist evaluation, transposed into an electronic form, the system establishes a new date for the patient to present to the general practitioner, in order to undergo a new examination. 5. The new examination date appears to the general practitioner when the patient's records are accessed. The application provides dedicated functionality for each user group identified: general practitioners, imaging specialists, and database administrators. The most important risk factors for HCC are hepatitis B virus (HBV), hepatitis C virus (HCV) and alcohol. Screening means testing a large number of individuals in order to identify those with a particular genetic trait, characteristic, or biological condition; it has been proven to be the most efficient, as well as cost-effective, method in the diagnosis of cancers. 
HCC meets the requirements to become the target of surveillance: a sensitive tool, ultrasonography, is easily acceptable and can screen the population, and effective treatment options are available. US is preferred for HCC screening mainly because it is an available, repeatable, non-invasive, non-irradiating and relatively low-cost imaging technique. The target population groups for screening are: o males and females with HBV liver cirrhosis, especially if the viral replication markers are positive; o males and females with HCV liver cirrhosis; o males and females with liver cirrhosis induced by hereditary hemochromatosis; o abstinent male patients with alcoholic liver cirrhosis; o males with primary biliary cirrhosis. The categories of patients that must be included in the telescreening programme are described in the first column of table 1. The second column holds the information regarding the presence or absence of nodules in the patient's ultrasound images. The presence of nodules is a decisive factor in planning the next term of ultrasound investigation; thus, the patient is rescheduled to a new medical consultation in the telescreening programme. The imaging specialist provides the information regarding the presence or absence of nodules after examining the patient's US images on the server. The third column shows the time interval (in months) in which the patient should be scheduled for a new US investigation (before the semicolon - US), respectively the interval after which a new tumor marker investigation should be performed. The aim of the US examination in HCC screening is to characterize the hepatic parenchyma both regarding the base disease and the nodule presence. The US images should be of acceptable quality in order to detect nodules with diameters between 2 and 3 cm, when we can still speak about early HCC, increasing the chances of a favorable response to treatment. 
This is why the liver examination should offer information about all eight segments of the liver, from two planes, and about the portal venous system. In order not to miss valuable information available during the real-time US examination, we included in the screening protocol two cine-loop sequences, one for each liver lobe. Patient's record form for ultrasound screening - The form is available to general practitioners in electronic format. They can access the form in a secure way (user name, license number and password) and, at the first visit of the patient, they will complete all the required fields. The addition of new data for a patient is possible as they occur, so the form we developed will represent the basis for an electronic medical record. Because it is available on-line and is accessed by the general practitioner and by several medical experts in liver diseases, this record represents a valuable telemedicine resource. In the meantime, based on these medical records, a collaborative database will be developed, which may represent in the future an on-line registry for hepatocellular carcinoma (and other chronic liver diseases). In elaborating the patients' medical record for the ultrasound screening, all the epidemiological and diagnostic knowledge regarding HCC was taken into account. 
This record contains five distinct fields: • Patient's identification data, • Patient's medical history - information about viral hepatitis infection (type, duration, route of transmission), alcohol intake, drugs and medication, smoking, family history of liver (or other) cancer, • Several clinical data about the patient (height, weight, BMI), • Data for patients diagnosed with liver cirrhosis, offering information about the type and stage of cirrhosis, the basis of the diagnostic, previous treatments for liver cirrhosis, previous detection of liver nodules, values of several tumor markers (alpha-fetoprotein, des-carboxy prothrombin, if available), and • Data for patients diagnosed with liver cancer, like the number and localization of nodules, the morphology of tumors, and inclusion in TNM classes. Specialist's structured answer form - The HCC telescreening principle implies the analysis, by imaging specialists, of the US images previously transmitted by the general practitioner. An expert system is able to automatically analyze and classify the images, offering a quality assurance mechanism by rejecting images that do not correspond to some predefined criteria. The specialists analyze the entire image set of the patient, following the criteria for HCC classification. After that, they provide their decision to the system by filling in a predefined structured answer form. This form contains information about: • The analysis of the entire liver parenchyma, taking into account the echogenicity, the homogeneity and the contour of the capsule, • The analysis of the focal liver nodules (if present), following the number and localization and, for each nodule, the echogenicity, homogeneity, the presence of the halo, and the dimension, • The analysis of the invasion of the portal venous system (presence and localization). 
The telescreening network architecture, presented in Figure 1, consists of: o several terminals for the acquisition and transmission of specific US images and other medical data, placed at the general practitioners' offices, o a digital communication network, and o a core, placed at an imaging expert centre, able to handle, store, process, analyze, and classify the images and biomedical data. The system allows bi-directional communication between general practitioners and imaging experts based on wired Internet support. The images and biomedical data are transmitted over the network and integrated into the medical database at the core level. In order to manage all the data flow in the system, and the interaction between the different medical entities scattered across great distances, it was clear that a Web-based application would be needed, connecting to the central database that stores the patients' medical data. For the Web application development we chose the Microsoft .NET solution, because of the large set of services and features it offers and its higher productivity in terms of development time. The Web-based application is developed on top of ASP.NET, and the medical data is stored and managed in Microsoft SQL Server 2005. The patient is allowed to choose this screening program as an alternative, his or her inclusion in the system being conditioned by his or her consent. Moreover, the patient involved in the telescreening program can be excluded in certain cases: if he/she is hospitalized in order to receive special treatment, or if the patient refuses to present to the next scheduled consultation. The idea is to implement a system that is rigorous but also open towards the patient's wishes. 
The database contains the “system tables” that manage the links between the patient participating in the screening programme, the data acquired by the general practitioner (medical data and ultrasound images), the interpretative data generated by the imaging specialist, and the alerts regarding the scheduling of the patient for the next consultation, generated automatically by the software application. The generation of this alert is made according to the medical specifications (see Table 1), as follows: 1) the patient category (comprised in the medical data provided by the general practitioner) is read from the database tables holding the patient medical information; 2) as the imaging specialist issues a structured answer, the presence or absence of the nodules is read from the structured answer tables; 3) the result, i.e. the next scheduled date for the patient consultation, is given according to column 3 of Table ...
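The three-step alert generation can be sketched as a simple lookup. Since Table 1 itself is not reproduced in this excerpt, the category names and the month intervals below are hypothetical placeholders standing in for its rows, not the actual medical specifications.

```python
# Sketch of the automatic alert generation step. The (category, nodules)
# keys and the month values are HYPOTHETICAL placeholders for Table 1,
# which is not reproduced in the source excerpt.
from datetime import date, timedelta

NEXT_US_MONTHS = {                      # hypothetical values, NOT Table 1
    ("cirrhosis_HBV", False): 6,
    ("cirrhosis_HBV", True):  3,
    ("cirrhosis_HCV", False): 6,
    ("cirrhosis_HCV", True):  3,
}

def next_consultation(category, nodules_present, last_visit):
    """Step 1: read the patient category; step 2: read nodule presence
    from the specialist's structured answer; step 3: return the next
    scheduled US date (coarse 30-day month arithmetic)."""
    months = NEXT_US_MONTHS[(category, nodules_present)]
    return last_visit + timedelta(days=30 * months)

print(next_consultation("cirrhosis_HBV", True, date(2007, 1, 1)))
```

In the real application this result is written back as an alert row and surfaced to the general practitioner when the patient's record is accessed.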
Context 5
... concrete hydro-dam wall, with the goal of developing a visual identification scheme able to provide maximum accuracy despite the variability of appearance of the calcite deposits and the variable lighting conditions on the portion of the wall (which is actually very significant, as shown in the Implementation and Results Section), without knowing in advance whether calcite is present in the current image and, if yes, in what amount. These aspects make calcite identification and assessment a rather difficult image analysis problem: the significant variability of the calcite appearance makes the derivation of a calcite appearance model to be used in the identification almost impossible; model-free approaches seem more suitable, trying to identify natural groupings of the pixel data, provided that an interpretation of these groupings is done afterwards to identify whether any of them represents calcite. The latter approach is a principled description of the method we propose here. The calcite identification on the concrete hydro-dam wall can be treated as a pixel classification problem and mathematically described as such. Since no clear a-priori considerations can be made with respect to the shape of the calcite deposits, it is appropriate to consider as the only significant features in the classification task the colors of the pixels in the image. Then again, since only the color is important, and since on the hydro-dam wall the area of the calcite deposits is significantly smaller than the area without calcite (see Fig. 1 as an example), considering as classification data all the pixels in the currently analyzed image would always result in a data set that is very unbalanced among the classes of interest. This is usually not a favorable situation in unsupervised classification (as well as in the training of supervised classifiers), being prone to more errors in the poorly represented class. 
Leaving the other types of hydro-dam deposits as a future work concern, we focus here on the problem of identifying the calcite formations. Therefore we prefer to define the classification data as the set of color appearances in the concrete dam wall image, each color being included only once (regardless of the number of pixels with that particular color). As color space, although many choices are possible, we prefer here the natural RGB representation, as it is as suitable as any other for Euclidean distance based classifiers. Let us denote a generic data point in this 3-D feature space by the vector x = [R G B]T. The current image to be analyzed for the detection and localization of calcite deposits (the input to our module) is a sub-plot image of a dam wall, obtained from a database of images specific to the hydro-dam after its clipping (manual or semi-automatic) from a larger hydro-dam wall image, as shown in Fig. 1; thus the data set to be classified includes all the colors in the currently analyzed sub-plot color image. Let the number of unique RGB triples in this image be N C and the data set to be classified X C = { x i | i = 1, 2, ..., N C }. The goal is then to classify/cluster the data in X C into one of two classes of interest: calcite deposit, denoted herein by C c, and anything else but calcite, denoted by its complement C c. According to the above formulation, our goal is to solve a binary classification problem, for which many solutions exist in the literature; but, as explained before, learning based approaches or simple unsupervised data clustering algorithms are not feasible for a minimal error classification in this particular task. Unsupervised color clustering into only two classes risks being unable to properly group the colors corresponding to the class "anything else but calcite", since these do not represent one unique color. Therefore a number of classes larger than two is needed in the initial clustering, one per dominant color. An examination of the sub-plots in Fig. 
1 below (considering that in the analysis strictly portions of walls without any additional elements are used, i.e. an element like the one in the left corner of Fig. 1 is not taken into account) shows that generally the complement class of C c can be considered composed of at most two dominant color clusters: the grayish color corresponding to the concrete and the brown-black color possibly corresponding to organic deposits. Thus, instead of a 2-class clustering of the set X C, a 3-class clustering should be performed, the three classes being C c, C g (clean concrete surface) and C b (organic deposits). An efficient color clustering algorithm when the number of classes is known a-priori, which has been used rather extensively in image segmentation applications, is fuzzy c-means clustering [10]. As with all unsupervised data clustering methods, this algorithm aims to find natural groupings of the data according to their similarity with respect to a selected distance metric in the feature space. At the end of an iterative objective function minimization process, the optimal class centers and the membership degrees of the data in the data set X C are found, with optimality defined as the minimization of the classification uncertainty among the data in the three classes. The resulting classes always form a fuzzy partition of X C [10]. The drawback of using the standard fuzzy c-means clustering in the application addressed here is that, in the case of a severely unbalanced number of samples among the classes, the fuzzy centroid obtained for the class with the fewest data can be rather different from the centroid of the expected grouping. This can be mainly accounted for by the fact that, although the distances between these data and the resulting class center are large, thus contributing a large cost to the objective function, if the number of such terms is negligible in comparison with all the data to be classified, the classification error will still remain under the convergence error. 
To overcome this drawback (which is truly unacceptable for the calcite detection problem, because in our case the class C c is always expected to contain fewer data than the other classes, even if we count just the colors in the sub-plot image regardless of the number of representatives per color), we propose here to apply a modified fuzzy c-means algorithm; the modification consists in changing the objective function to include a higher penalty for the misclassification of the expected calcite pixel colors, that is, of the lighter colors in the data set X C. We should mention that, although the number of pixel colors corresponding to the organic deposits (brown-black, i.e. the darkest) is also much smaller than that of the grayish pixels, which means their distances to the class center would also require a higher weight in the objective function, this was not considered necessary here, since in the last step of the classification we merge the classes C b and C g anyway, obtaining the complement of C c as the union C g ∪ C b; since in any case the color of a brown-dark pixel is closer to a grayish pixel than to a calcite one, the misclassified data for the organic deposits can only appear in the class C g, thus not affecting the class C c. The last issue not yet discussed refers to the case of "clean" sub-plots, containing neither calcite nor organic deposits. In that case, as with all clustering algorithms with the number of classes specified a-priori, the clustering result will still give three classes of pixel colors. 
With no post-classification verification of the classes, the pixels in the class having the lighter class center are assumed by default to be in the category C that class where most likely these data belong. In other words, the “natural groupings” formed might be different than to check the distance between the lighter and medium (grayish) class centers, and the possibility of indeed observing calcite for the colors most likely assigned to the class C C is proportional to this distance. The proposed fuzzy c-means version employed in the calcite identification task is described in Section 3; then in Section 4, the fuzzy rules for post-classification verification are given. In its common form, with the mathematical notations introduced in the previous section, the fuzzy c-means clustering is described as follows. Let us denote by C – the number of classes to which the total N C three dimensional data x from the set X C are to be assigned in some membership degree by the algorithm. In our case, C =3. Then a membership matrix can be built, U [ C × N C ] , with the u ji element, j=1,...,C and i=1,...,N C , representing the membership degree of the vector x i to the class j . Each line in U is the discrete representation of the fuzzy set corresponding to a data class. The C fuzzy sets are constrained to form a fuzzy partition of the data set X C (the universe of discourse). Starting from any initial fuzzy partition of the data set to be fuzzy classified X C , the algorithm aims to optimize the partition in the sense of minimizing the uncertainty regarding the membership of every data x i , i=1,...,N C , to each of the classes. This goal is achieved through the minimization of the objective ...
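As an illustration of the clustering scheme described above, the following is a minimal NumPy sketch of a fuzzy c-means variant with a per-datum weight w_i in the objective, J = Σ_j Σ_i w_i u_ji^m ||x_i − v_j||². The weighting scheme for the lighter colors, the seed selection and the synthetic color data are our own assumptions for demonstration, not the paper's exact formulation.

```python
import numpy as np

def weighted_fcm(X, V0, m=2.0, w=None, iters=100, tol=1e-5):
    """Fuzzy c-means minimizing J = sum_j sum_i w_i * u_ji**m * ||x_i - v_j||**2,
    i.e. the standard FCM objective with a per-datum weight w_i that penalizes
    misclassifying selected data more heavily (a sketch of the modification
    described in the text; the paper's exact penalty is not given here).
    Returns memberships U [C x N] and class centers V [C x 3]."""
    X, V = np.asarray(X, float), np.asarray(V0, float)
    N = len(X)
    w = np.ones(N) if w is None else np.asarray(w, float)
    U = np.full((len(V), N), 1.0 / len(V))          # uniform initial partition
    for _ in range(iters):
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1)   # squared distances
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)               # membership update; each column sums to 1
        um = (U_new ** m) * w                       # weighted, fuzzified memberships
        V = (um @ X) / um.sum(axis=1, keepdims=True)  # weighted center update
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, V

# Synthetic X_C: a few light "calcite-like" colors, many grayish concrete
# colors, a few dark "organic-deposit" colors (all values are made up).
rng = np.random.default_rng(1)
X = np.vstack([230 + rng.normal(0, 3, (5, 3)),
               120 + rng.normal(0, 5, (60, 3)),
                30 + rng.normal(0, 3, (5, 3))])
lum = X.mean(axis=1)
w = 1.0 + 4.0 * lum / 255.0                         # heavier weight for lighter colors
order = np.argsort(lum)
V0 = X[order[[0, len(X) // 2, -1]]]                 # dark / medium / light seeds
U, V = weighted_fcm(X, V0, w=w)
print(np.sort(V.mean(axis=1)))                      # three class centers, dark to light
```

Note that the per-datum weight only enters the center update: for a fixed set of centers, the membership update is unchanged, since w_i factors out of the per-datum terms of the objective.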
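Building on the three-class clustering just described, the post-classification step can be sketched as follows: defuzzify the memberships, label the class with the lightest center as the calcite candidate C_c, merge the other two classes into its complement, and accept C_c only if the lightest center is sufficiently far from the grayish one. The crisp threshold `min_gap` is our own assumption; the paper uses fuzzy rules for this verification instead.

```python
import numpy as np

def postclassify(U, V, min_gap=40.0):
    """Hard-assign each color to its maximum-membership class, call the class
    with the lightest center the calcite candidate C_c, implicitly merge the
    other two classes into the complement, and reject C_c when the lightest
    and the grayish (medium) centers are closer than min_gap (a hypothetical
    crisp threshold standing in for the paper's fuzzy rules)."""
    lum = V.mean(axis=1)                 # per-class center luminance
    order = np.argsort(lum)              # indices: dark, grayish, light
    labels = U.argmax(axis=0)            # defuzzified class of each color
    is_calcite = labels == order[2]      # lightest class -> candidate C_c
    gap = lum[order[2]] - lum[order[1]]  # light vs. grayish center distance
    if gap < min_gap:                    # "clean" sub-plot: no calcite found
        is_calcite[:] = False
    return is_calcite, gap

# Toy memberships U for 4 colors over 3 classes, and the 3 class centers V.
V = np.array([[30., 28., 25.], [120., 121., 119.], [231., 229., 224.]])
U = np.array([[.90, .05, .02, .03],
              [.08, .90, .08, .07],
              [.02, .05, .90, .90]])
mask, gap = postclassify(U, V)
print(mask.sum(), round(gap, 1))         # → 2 108.0
```

Here two of the four colors end up in the calcite candidate class, and the light-to-grayish center gap (108) is large enough for the detection to be accepted.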
