Ron Sacks-Davis's research while affiliated with RMIT University and other places


Publications (70)


Text Databases
  • Chapter

July 2011 · 49 Reads · 1 Citation · Beng Chin Ooi · Ron Sacks-Davis · [...]

Text databases provide rapid access to collections of digital documents. Such databases have become ubiquitous: text search engines underlie the online text repositories accessible via the Web and are central to digital libraries and online corporate document management.


Object-Oriented Databases

July 2011 · 75 Reads · 1 Citation

There has been growing acceptance of the object-oriented data model as the basis of next-generation database management systems (DBMSs). Both pure object-oriented DBMSs (OODBMSs) and object-relational DBMSs (ORDBMSs) have been developed based on object-oriented concepts. Object-relational DBMSs, in particular, extend the SQL language by incorporating all the concepts of the object-oriented data model. A large number of products in both categories are available today. In particular, all major vendors of relational DBMSs are turning their products into ORDBMSs [Nori, 1996].


Efficient Passage Ranking for Document Databases

August 2002 · 61 Reads · 89 Citations

ACM Transactions on Information Systems

Queries to text collections are resolved by ranking the documents in the collection and returning the highest-scoring documents to the user. An alternative retrieval method is to rank passages, that is, short fragments of documents, a strategy that can improve effectiveness and identify relevant material in documents that are too large for users to consider as a whole. However, ranking of passages can considerably increase retrieval costs. In this paper we explore alternative query evaluation techniques, and develop new techniques for evaluating queries on passages. We show experimentally that, appropriately implemented, effective passage retrieval is practical in limited memory on a desktop machine. Compared to passage ranking with adaptations of current document ranking algorithms, our new "DO-TOS" passage ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.


Filtered Document Retrieval with Frequency-Sorted Indexes

July 2002 · 57 Reads · 205 Citations

Journal of the American Society for Information Science

Ranking techniques are effective at finding answers in document collections but can be expensive to evaluate. We propose an evaluation technique that uses early recognition of which documents are likely to be highly ranked to reduce costs; for our test data, queries are evaluated in 2% of the memory of the standard implementation without degradation in retrieval effectiveness. CPU time and disk traffic can also be dramatically reduced by designing inverted indexes explicitly to support the technique. The principle of the index design is that inverted lists are sorted by decreasing within-document frequency rather than by document number, and this method experimentally reduces CPU time and disk traffic to around one third of the original requirement. We also show that frequency sorting can lead to a net reduction in index size, regardless of whether the index is compressed.
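As an illustration of the index design described in this abstract, the following is a minimal sketch and not the paper's implementation. It assumes whitespace tokenisation, a simple log-based term weight, and a hypothetical cut-off parameter `min_freq`; the function names are invented for this example. Each inverted list is sorted by decreasing within-document frequency, so evaluation can stop reading a list as soon as the frequency falls below the cut-off.

```python
from collections import Counter, defaultdict
import math


def build_frequency_sorted_index(docs):
    """Map each term to postings (doc_id, f_dt) sorted by decreasing f_dt."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term, freq in Counter(text.lower().split()).items():
            index[term].append((doc_id, freq))
    for postings in index.values():
        # Highest within-document frequency first; ties broken by doc id.
        postings.sort(key=lambda p: (-p[1], p[0]))
    return dict(index)


def rank(index, num_docs, query, min_freq=1, top_k=3):
    """Score documents, reading each list only while f_dt >= min_freq.

    The weighting (1 + log f_dt) * log(1 + N / f_t) is an assumption chosen
    for illustration; min_freq is a hypothetical fixed cut-off.
    """
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, [])
        if not postings:
            continue
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, freq in postings:
            if freq < min_freq:
                break  # the rest of the list has equal or lower frequencies
            scores[doc_id] += (1 + math.log(freq)) * idf
    return sorted(scores.items(), key=lambda s: (-s[1], s[0]))[:top_k]


docs = [
    "index index index compression",
    "index ranking",
    "ranking ranking query",
]
index = build_frequency_sorted_index(docs)
top_docs = [doc_id for doc_id, _ in rank(index, len(docs), "index", min_freq=2)]
```

With `min_freq=2`, only the first posting of the list for "index" is read, because every later posting has a lower frequency. A list sorted by document number would have to be scanned in full to find the same high-frequency postings.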


Storage Management for Files of Dynamic Records

July 2002 · 22 Reads · 13 Citations

We propose a new scheme for managing files of variable-length dynamic records, based on storing the records in large, fixed-length blocks. We demonstrate the effectiveness of this scheme for text indexing, showing that it achieves space utilisation of over 95%. With an appropriate block size and caching strategy, our scheme requires on average around two disk accesses (one read and one write) for insertion of or change to a record.


Indexing Documents for Queries on Structure, Content and Attributes

July 2002 · 58 Reads · 35 Citations

Indexing and retrieval techniques for large text databases are well developed, but most of the techniques developed to date assume that the text to be indexed has little or no structure. With the growth in the use of sophisticated markup languages for text, a database system for structured documents should use not just document content but also structural information and attributes, and should support queries on content, structure and attributes. In this paper we review and compare two recent approaches for accessing document collections. For one of the approaches, position-based indexing, queries are resolved by manipulating ranges of word offsets, while for the other, based on a path model, the position of a word is represented in terms of the structural components that enclose it. The former allows slightly smaller indexes; the latter allows more efficient query evaluation.



Figure 15: Components, interfaces, and main implementation languages of the SIM XML-direct architecture  
System Architectures for Structured Document Data
  • Article
  • Full-text available

November 2000 · 81 Reads · 9 Citations

Acoustics, Speech, and Signal Processing Newsletter, IEEE

Semi-structured data, including but not limited to structured documents, has specific characteristics and is used in ways different to tabular data. SGML and XML are widely used to represent information of this type. The demands on systems that manage semi-structured data vary from those on traditional relational systems. This paper reviews the nature and characteristics of semi-structured data, and the functional needs of those applications, including query requirements, document description, manipulation, and document management needs. It examines alternative physical models for semi-structured data, and evaluates and compares alternative system architectures.



Retrieval of Partial Documents

October 2000 · 33 Reads · 42 Citations

Introduction: Provision of answers to informally phrased questions is a central part of information retrieval. These answers traditionally take the form of documents retrieved from a text database, but documents will often be unsatisfactory as answers. They may be large and unwieldy; the answer they represent may be diffuse, and therefore hard for the user to extract; and word-based retrieval systems may be misled by the breadth of vocabulary of a long document into believing it to be relevant. Indexing and returning parts of documents addresses these problems. We have approached the problem of partial documents in two ways. The first approach is to regard documents as an unstructured series of "pages" of text of similar length, each of which can be returned as an answer to a query. We would expect, under this approach, that any bias in the retrieval mechanism towards documents of a particular length should be eliminated. By regarding an answer to be the document from which an


Citations (46)


... ore complex and abstract it tends to be." It is therefore good that SGML allows one to create a custom DTD adapted to each unique need. Because SGML is so comprehensive, it is relatively easy to convert marked-up text to, for example, XML or HTML. This adaptability is a strength in many contexts. As shown in Table 2 above, Wilkinson et al. (1998) judge that PostScript and PDF require more space and are less flexible than SGML, HTML and XML. XML is considered the best in terms of presentation flexibility, while PostScript and PDF have the highest presentation quality. ...

Reference:

Elektronisk publicering: vetenskapliga dokument med åtkomst via webben
Document Computing
  • Citing Book
  • January 1998

... During this process, the evaluation and adaptation of query languages for retrieving geometries (Frank 1982) and several proposals for indexing spatial data structures (e.g., Stonebraker et al. 1983, Guttman 1984) were also significant milestones. These works evolved into the Dual (Schilcher 1985, Ooi et al. 1989, Aref and Samet 1991) and Integrated architectures (Dayal et al. 1987). The latter represented a crucial instant in the development of spatial database architectures and resulted in several Spatial Database Management Systems (SDMS) such as PROBE (Orenstein 1986, Orenstein and Manola 1988) and POSTGRES (Stonebraker and Rowe 1986). ...

Extending a DBMS for Geographic Applications
  • Citing Conference Paper
  • February 1989

... Address translation hardware for virtual memory implementation is a widely used application of hashing. Ramamohanarao and Sacks-Davis gave a summary of the hardware implementation of the page tables using hashing [7]. The one-level scheme was used in the IBM System/38 [8], [9] and in the IBM RT PC [10] with bit extraction and XOR hashing functions. ...

Hardware address translation for machines with a large virtual memory
  • Citing Article
  • October 1981

Information Processing Letters

... [7]), a modified Newton-Raphson iteration is used. For this class of problem it is considered by some ([11,13,14]) that ‖J‖ is a more suitable parameter than p for selecting algorithms. It can also be useful in estimating the condition number (‖B‖ ‖B⁻¹‖) of the characteristic matrix [I − hbJ] of Rosenbrock methods [15]. ...

A type-insensitive ODE code based on second derivative formulas
  • Citing Article
  • December 1981

Computers & Mathematics with Applications

... If they are stored in memory, a typical query of 25 terms will require that 25 inverted file entries be accessed from disk before the query can be processed. If, to conserve memory space, the vocabulary and associated information are merged on disk with the inverted file entries and some form of hashing (such as extensible hashing [14]) is used, the expected number of disk accesses can be kept to about 1.2 on average per query term, but at the cost of a 20%–30% expansion in the size of the inverted file. A more practical solution is to allow two accesses per query: the first into the index file containing the vocabulary and term information, including the inverted file entry address, and the second to actually retrieve the inverted file entry. ...

Recursive Linear Hashing.
  • Citing Article
  • September 1984

ACM Transactions on Database Systems

... The initial species concentrations are indicated by the column vector y₀. Numerical solutions for stiff ODE systems defined by Equation (1) can be obtained using explicit or implicit ODE integrators [4][5][6][7][8]. Many ODEs have been used for chemical kinetic models; however, they are stiff [9], y′ = f(t, y), 0 ≤ t ≤ T (1) ...

Fixed Leading Coefficient Implementation of SD-Formulas for Stiff ODEs
  • Citing Article
  • December 1980

ACM Transactions on Mathematical Software

... While some work into curating collections for use in evaluating near duplication detection exists [5][6][7][18], such corpora have primarily focused on document collections that do not reflect our problem domain. Finally, we are not concerned in this work with retrieval of documents using their signatures and leave investigating the applicability of such methods [2,3,12,14] to future work. ...

A signature file scheme based on multiple organizations for indexing very large text databases
  • Citing Article
  • October 1990

Journal of the American Society for Information Science

... It is recognized that both navigational and associative access to the database are important. This complies with experience from systems where large complex databases are to be handled, e.g., [15]. Our interface provides very flexible functionality to navigate through relationships, returning individual relationships or (sets of) objects which are related to a specific object. ...

Querying in a Large Hyperbase
  • Citing Conference Paper
  • January 1991

... The way to store the structure of documents in the legal domain has evolved with the appearance of new standards applicable to structured documents. Those approaches where the structure is kept separately from the document content [38,3], have given way to those where the structure forms part of the text of the document [56,12,65,55]. XML allows a document to be tagged according to its semantic structure, and provides additional standards and utilities to access (XPath) and manipulate document components in XML documents (XSLT). ...

Managing a Digital Library of Legislation.

... Niemi and Järvelin & Niemi 1999), like several other authors (e.g. Sacks-Davis et al., 1995; Zobel et al., 1991; Lambrix & Padgham, 2000), have proposed complex entities for representing and manipulating hierarchical documents. Järvelin and others (2000) have shown that complex entities are natural structures for informetrics. ...

Efficiency of Nested Relational Document Database Systems.