FrontBase logo

FrontBase Documentation

FrontBase logo

Backtrack:
  Welcome!
    4. Administering a FrontBase Server
Updated: 20-Nov-2000
prevnext
Table of Contents

4.13. Indexing

FrontBase offers two strategies for maintaining indices:

PRESERVE SPACE

The default strategy, called PRESERVE SPACE, is very space efficient and works well with tables up to a few hundred thousand rows. An index on a table, no matter the number of colums, costs less than 5 bytes per row. If you have a table with 100,000 rows, creating an index on this table will thus increase the disk footprint by less than 500 KB. Memory efficeincy is attained since column values are not stored together with the index information (an optimized B-tree). The downside is that rows from the table must be loaded to get column values when using the index. This index mode works well in most cases. Users like the low disk space footprint of a database.

PRESERVE TIME

The alternative strategy, called PRESERVE TIME is fast when searching through millions of rows at the expense of higher disk space consumption. The mode scales very well and can easily handle tables with many millions of rows. Column values are copied into the B-tree, which increases the disk space footprint, but speeds up lookups considerably. The actual rows are loaded only when needed. If a given SELECT only fetches column values that are part of an index, the actual rows are not loaded at all. Such a SELECT is very fast.

Example:

Consider a typical indexing setup with a WORD table, a DOCUMENT table and a RELATION table:

CREATE TABLE WORD(
    WORDPK INT PRIMARY KEY,   -- Implies an index
    WORD VARCHAR(64));
CREATE INDEX ON WORD(WORD);
CREATE TABLE DOCUMENT(
    DOCUMENTPK INT PRIMARY KEY,   -- Implies an index
    DOCUMENT CLOB);
CREATE TABLE RELATION(
    WORDFK INT,
    DOCUMENTFK INT,
    PRIMARY KEY(WORDFK, DOCUMENTFK));   -- Implies an index
CREATE INDEX ON RELATION(DOCUMENTFK, WORDFK);
COMMIT;

To get a list of DOCUMENTFKs identifying the documents in which a given word is found:

SELECT DOCUMENTFK
    FROM RELATION, WORD
    WHERE RELATION.WORDFK = WORD.WORDPK AND WORD.WORD = '<some-word>';

To get a list of WORDFKs identifying the words found in a given document:

SELECT WORDFK
    FROM RELATION
    WHERE DOCUMENTFK = <some-document-pk>;

To find a list of DOCUMENTFKs identifying the documents in which two given words are found:

SELECT DOCUMENTFK
    FROM RELATION, WORD
    WHERE RELATION.WORDFK = WORD.WORDPK AND WORD.WORD = '<word_1>'
    INTERSECT
    SELECT DOCUMENTFK
    FROM RELATION, WORD
    WHERE RELATION.WORDFK = WORD.WORDPK AND WORD.WORD = '<word_2>'
   

This could/should be wrapped into a VIEW.

In a reasonable setup with 100,000 documents and 100 words on average per document, the RELATION table holds 10,000,000 rows and is a perfect candidate for PRESERVE TIME. In all of the above SELECT statements, the actual rows would not be loaded if the indices were set to PRESERVE TIME. The indices would hold the actual column values.

The WORD table is a less likely candidate for PRESERVE TIME. It will probably make more sense to use PRESERVE SPACE combined with a proper sized cache for this table.


If you have feedback or questions on this document, please send e-mail to doc-feedback@frontbase.com. Please reference the section number and topic. Thanks!!

©2000 FrontBase, Inc. All rights reserved.