April 2013
·
44 Reads
January 2013
·
863 Reads
·
709 Citations
Nature Methods
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.
Supplementary information: The online version of this article (doi:10.1038/nmeth.2340) contains supplementary material, which is available to authorized users.
January 2013
·
223 Reads
·
40 Citations
New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs (homologs separated by a speciation and a duplication event, respectively) provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ~400,000 specific annotations with the estimated Precision of 90%, ~19,000 of which are highly specific (e.g. "penicillin binding," "tRNA aminoacylation for protein translation," or "pathogenesis"), and are freely available at http://gorbi.irb.hr/.
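The validation figures above can be checked with a short calculation: 25 confirmed predictions out of 38 give an observed precision of about 0.658, against the reported 60%. The sketch below also asks how likely 25 or more confirmations would be if the true precision were exactly 60%. Treating the 38 predictions as independent binomial trials is an assumption made here for illustration; it is not described as part of the authors' analysis, and the function names are hypothetical.

```python
from math import comb


def observed_precision(confirmed, total):
    """Fraction of tested predictions that were experimentally confirmed."""
    return confirmed / total


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    ASSUMPTION: predictions are modelled as independent trials that each
    succeed with probability p. This is an illustrative model only.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures taken from the abstract: 38 predictions, 25 confirmed, Precision 60%.
confirmed, total, reported = 25, 38, 0.60

print(f"observed precision: {observed_precision(confirmed, total):.3f}")
print(
    f"P(at least {confirmed} confirmed | precision = {reported}): "
    f"{binomial_tail(confirmed, total, reported):.3f}"
)
```

A tail probability that is not small would suggest the experimental outcome is compatible with the reported precision under this simple model.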
January 2013
·
6 Reads
The settings files as given to the Clus-HMC-Ens algorithm. (ZIP)
January 2013
·
8 Reads
Results of the experimental assays. (XLSX)
January 2013
·
37 Reads
January 2012
·
748 Reads
·
3 Citations
Personalized recommender systems rely on the personal usage data of each user in the system. However, privacy policies protecting users' rights prevent these data from being made publicly available to a wider research audience. In this work, we propose a memory biased random walk model (MBRW) based on real clickstream graphs, as a generator of synthetic clickstreams that conform to the statistical properties of the real clickstream data while adhering to privacy protection policies. We show that synthetic clickstreams can be used to learn recommender system models that achieve high recommender performance on real data, while at the same time ensuring that strong de-minimization guarantees are provided.
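The abstract does not say how the memory bias enters the walk, so the following is only a rough sketch of the general idea: build a weighted item-to-item graph from real sessions, then generate synthetic sessions by walking it. The specific rule used here (with probability `memory_bias`, take the next step from a randomly chosen recently visited item instead of the current one) is an assumption for illustration and should not be read as the MBRW model itself. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict


def build_click_graph(sessions):
    """Count item-to-item transitions observed in real sessions."""
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            graph[src][dst] += 1
    return graph


def synthetic_session(graph, start, length, memory_size=2, memory_bias=0.3, rng=None):
    """Generate one synthetic session of at most `length` items.

    ASSUMPTION: the memory bias is modelled as a chance of stepping from one
    of the last `memory_size` visited items rather than from the current item.
    """
    rng = rng or random.Random()
    session = [start]
    while len(session) < length:
        memory = session[-memory_size:]
        source = rng.choice(memory) if rng.random() < memory_bias else session[-1]
        neighbours = graph.get(source)
        if not neighbours:
            break  # dead end: no observed transition out of this item
        items = list(neighbours)
        weights = [neighbours[item] for item in items]
        # Step to a neighbour with probability proportional to its count.
        session.append(rng.choices(items, weights=weights, k=1)[0])
    return session


# Toy sessions with made-up item IDs.
real_sessions = [["a", "b", "c"], ["a", "c", "d"], ["b", "c", "d"]]
click_graph = build_click_graph(real_sessions)
print(synthetic_session(click_graph, "a", 5, rng=random.Random(0)))
```

Only aggregated transition counts, not individual sessions, feed the generator in this sketch; whether that is enough to meet a given privacy policy is a separate question that the code does not address.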
July 2011
·
8,707 Reads
·
5,302 Citations
Outcomes of high-throughput biological experiments are typically interpreted by statistical testing for enriched gene functional categories defined by the Gene Ontology (GO). The resulting lists of GO terms may be large and highly redundant, and thus difficult to interpret. REVIGO is a Web server that summarizes long, unintelligible lists of GO terms by finding a representative subset of the terms using a simple clustering algorithm that relies on semantic similarity measures. Furthermore, REVIGO visualizes this non-redundant GO term set in multiple ways to assist in interpretation: multidimensional scaling and graph-based visualizations accurately render the subdivisions and the semantic relationships in the data, while treemaps and tag clouds are also offered as alternative views. REVIGO is freely available at http://revigo.irb.hr/.
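To illustrate what picking a representative subset by semantic similarity can look like, here is a minimal greedy sketch. It is not REVIGO's actual procedure: the similarity measure (Jaccard overlap of ancestor sets), the 0.7 threshold, the term labels, and the function names are all assumptions chosen to keep the example self-contained.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; a stand-in for a semantic similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def representative_terms(pvalues, ancestors, threshold=0.7):
    """Greedily keep terms that are not too similar to any term already kept.

    Terms are visited from most to least significant (smallest p-value first),
    so each group of similar terms is represented by its most significant member.
    """
    kept = []
    for term in sorted(pvalues, key=pvalues.get):
        if all(jaccard(ancestors[term], ancestors[rep]) < threshold for rep in kept):
            kept.append(term)
    return kept


# Hypothetical terms and ancestor sets (not real GO identifiers).
pvalues = {"T1": 0.001, "T2": 0.01, "T3": 0.02}
ancestors = {
    "T1": {"root", "A", "B"},
    "T2": {"root", "A", "B", "C"},  # overlaps heavily with T1
    "T3": {"root", "X"},
}
print(representative_terms(pvalues, ancestors))  # ['T1', 'T3']
```

Here T2 is dropped because its similarity to T1 is 3/4 = 0.75, which exceeds the threshold, while T3 shares only the root with T1 and is kept.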
January 2011
·
28 Reads
·
4 Citations
This year's Discovery Challenge was dedicated to solving video lecture recommendation problems, based on data collected at the VideoLectures.Net site. The challenge had two tasks: task 1, which simulated the new-user/new-item recommendation problem, and task 2, which simulated clickstream-based recommendation. In this overview we present the challenge datasets, tasks, and evaluation measure, and we analyze the solutions and results.
June 2010
·
203 Reads
·
29 Citations
The RSCTC’2010 Discovery Challenge was a special event of the Rough Sets and Current Trends in Computing conference. The challenge was organized as an interactive online competition on the TunedIT.org platform, held between Dec 1, 2009 and Feb 28, 2010. The task was related to feature selection in the analysis of DNA microarray data and the classification of samples for the purpose of medical diagnosis or treatment. Prizes were awarded to the best solutions. This paper describes the organization of the competition and the winning solutions.
... Our novel features provide a much larger coverage than existing methods while maintaining a high accuracy. 2) Preliminary experiments on standard WEBSPAM-UK2007 [5], ClueWeb-2009 [6], and ECML-PKDD-2011 [7] benchmark datasets demonstrate the effectiveness of the novel features on learning the classifier for detecting web spam. The rest of the paper is formed as follows: We review the previous research work in Section 2. In section 3, we describe the proposed groups of novel web spam features. ...
Reference:
Novel Features for Web Spam Detection
January 2011
... RapidMiner is often successfully used in the application of classification algorithms [7]. Furthermore, it provides support for meta-learning for classification [8] and for constructing recommender system workflow templates [9]. In this paper, we focus on building a recommender system for higher education students. ...
... In recent years there has been a significant expansion in the use of telemedicine to improve safety and efficacy of treatment as well as to improve patient education [10]. ...
January 2009
Bio-Algorithms and Med-Systems
... Accurately predicting protein function is a cornerstone in molecular biology, with extensive applications in drug design, drug discovery and disease modeling (Rezaei et al., 2020). However, the complexity and variability of proteins pose significant challenges for computational prediction models (Radivojac et al., 2013; Schauperl & Denny, 2022). The functionality of a protein is affected by its three-dimensional structure, often dictating its interactions with other molecules (Ivanisenko et al., 2005). ...
January 2013
Nature Methods
... In the original study, orthologs for a protein of interest were identified as those matching the query sequence with a score above an alignment threshold relative to the size of the searched database [24]. Since then, profile elements have been identified using bit-score thresholds, protein domains, membership in Clusters of Orthologous Groups of proteins (COGS), and methods for distinguishing between orthologs and paralogs [25,37,38,39,34,40]. ...
January 2013
... Four datasets were used as a case study for the feature selection algorithms. Two sets are microarray datasets that are used for research on psoriasis [31,32,33,34] and cancer [35]. The two other sets are mass spectrometry datasets, used for research on cancer [36] and microorganisms [37]. ...
June 2010
... irb. hr/) [26]. KEGG diagrams (Kyoto Encyclopedia of Genes and Genomes; https:// www. ...
July 2011
... Medical knowledge is a cognitive and technical component, i.e., it comprises the individual's perspectives, beliefs, talents, and expertise. Challenging aspects of medical plans and medical knowledge itself are (i) time, data gathering may last years, while the answer can require only a few seconds; (ii) space, because data may arrive from many different health care units, in distinct formats; and (iii) medicine's inherent complexity, the depth of knowledge that each medical specialty offers [Jovic et al., 2007c] [Gamberger et al., 2008]. ...
July 2008