Logical Structure of a Database

The purpose of a database is to record information on disk and make it accessible for use through programs. In order to facilitate use, the data is organized into tables, columns, and rows. A table is made up of a variable number of rows, all of which have the same column structure. Rows differ from one another only in terms of their content. Thus, the terms "structure of a table," "row structure," and "column structure" are all essentially synonymous.

Columns

We also use the term "database element" when referring to a class of data distinct from others (constant, variable, etc.). This term refers to a value, or a column of a table, for a particular row. The particular row referred to is determined by program action at run time, but programs normally act on database elements one row at a time, rather than generally on the column value of all rows in a table. It is possible, of course, to change the value of a particular column in all records with a program that loops through all the rows of a table.

In programs, database elements are referred to in a standard format composed of the programming name of the table, followed by a period, followed by the column name, as TableName.ColumnName. The column lastName in the table person, for example, appears as person.lastName every time it is referred to in a program.

This binomial nomenclature of database elements is a characteristic of DataFlex's application independence of data. Application independence of data refers to the fact that the data can exist independently of any given program. Databases are not generally defined, nor modified, in applications, although for specific situations, there are means of doing so.

Rather, database elements are referred to in programs by their full names in normal usage. The name of any given database element does not vary from one program to another, nor does its type or length.

When using the Embedded Database, column names can be any combination of characters, up to 32 characters long, that starts with a letter and contains no punctuation other than underscore (_). Columns are typed as String, Number, or Date, plus two types not found in other classes: Text and Binary. The first three types are fixed-length types, and the last two are variable-length. For more information, see Variable-Length Columns.

Data Types in Database Elements

When data is moved to a database element, validity checking for the type of the column is applied to the data, triggering an error when the data violates the type format. Examples would be moving abc to a Number field, or 13/13/13 to a Date. String and Text type columns will reject any characters outside the printable range, such as control sequences. Binary columns reject nothing.

The storage formats of the data types on disk also vary somewhat. Numbers and Dates are stored as fixed-point values in Binary Coded Decimal (BCD) form. Dates are Julian integers and are three bytes in length for all values.

BCD format permits the storage of two digits in one byte of disk space (unlike String, which uses one byte per character). A three-byte Number field can store values up to ±999,999 or, if a decimal point is involved, ±99.9999, for example.

Variable-Length Columns

The DataFlex Text and Binary data formats apply exclusively to database columns intended to hold ASCII and binary data whose length exceeds the 255-byte maximum that applies to String columns. For this reason, such columns cannot be manipulated with the commands and functions used for manipulating ordinary columns such as move, uppercase, trim, calc, and left. Such columns must be opened for input or output with source and destination commands, and their contents manipulated like those of any actual sequential file.

The facility for addressing Text and Binary (variable-length) columns as sequential files is the dbms: driver and the dbms option to the source and destination commands. When the dbms: driver (with the colon) is used, the table and column must be addressed by their numbers. When the command option (without the colon) is used, on the other hand, the table and column can be specified in the usual fileName.fieldName format.

The following program would append the entire contents of Sequential File WITNESS.EYE onto column account of the table newsEvent in the row whose name column had the value of Hindenburg:

string sTransvar
open newsevent
direct_input "width: 250: witness.eye"
append_output dbms newsEvent.account
move "Hindenburg" to newsEvent.name
find eq newsEvent.name
if not (seqeof) repeat
    readln sTransvar
    writeln sTransvar
if not (seqeof) loop
close_input
save newsevent

Since the append_output command was used, the imported material would be appended to any existing contents of Field account. The width: format option is set at 250 to provide end-of-line characters for the readln command that imports the sequential data.

Output into files, devices, or fields opened with append_output can be positioned within the existing content with the set_channel_position command. Where a width: greater than the default argument size is required, the set_argument_size command may be used to increase available capacity. Entire files and fields up to the size set can be moved directly from one to another in a single command without use of an intermediate variable.

For binary files, the read_block command is most suitable for moving data. The following program would export the value of Field 43 in the record whose name field had the value of Hyde of Database File employee (number 16) to binary Graphic File SUSPECT.CUT:

string sTransvar
direct_output "suspect.bmp"
open employee
move "Hyde" to employee.name
find eq employee.name
[found] direct_input "dbms: binary: 16, 43"
if not (seqeof) repeat
    read_block sTransvar 250
    write sTransvar
if not (seqeof) loop
abort

Since File SUSPECT.BMP is opened with a direct_output command, any existing file by that name is deleted before output begins. The numerical reference (16, 43) to the file and field in the direct_input command is the format required where a driver (binary in this case) is used. Dbms is also used in the form of a driver (with a colon, inside the quotation marks) for the same reason.

The Text and Binary formats are provided to support fields larger than the 255-byte limit that applies to String fields. These types are sometimes called "variable-length," referring to the length of what is returned from one of these fields when it is retrieved into, say, a separate disk file, or a device such as a printer. Unlike String type fields, they do not output trailing spaces to pad their contents out to their full length. They accomplish this through the maintenance, when compressed, of a two-byte header at the beginning of each such field that indicates the length of the data in bytes. Due to this header, the amount of data which a Binary or Text field can hold is two bytes less than its declared size. For example, a Binary field declared with a size of 1024 will be able to hold no more than 1022 bytes of data.

On the disk, however, Text and Binary fields do occupy their entire declared length, unless you use data compression for the file. For this reason, using compression is practically mandatory for files containing such fields. Furthermore, when a file containing a field of either of these types is opened, the record buffer for the file will actually consume memory for the full declared length of such fields, a serious factor under circumstances of limited memory and many large fields.

Logical Structure of a Database

Columns

Data Types in Database Elements

Variable-Length Columns

See Also