The Parallel Loader
Data Loader Overview
Transferring data into a database is generally referred to as loading or importing data; transferring data from a database into a host file is referred to as exporting data. When importing or exporting data, ndlm uses or creates the following items:
A running database instance must always be specified as either the destination for imported data or the source of data to be exported.
An ASCII text file serves as either the import data source or the export data destination.
load specification file (spec-file)
The user creates a load specification file that maps database columns to the fields in an ASCII text file. If IMPORT, ODBCIMPORT, SCTIMPORT, or EXECSQL is the first word in this file, ndlm loads data from the input source into database tables. If EXPORT or SCTEXPORT is the first word in the file, ndlm exports data from the database to an ASCII file or an SCT File, respectively.
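The first-word rule above can be sketched as a small classifier. This is only an illustration of the rule as described, not part of ndlm: the function name spec_direction is hypothetical, and the sketch assumes the leading keyword is delimited by whitespace and matched without regard to case, neither of which is stated in this section.

```python
# Keywords listed in the text above; everything else here is illustrative.
IMPORT_KEYWORDS = {"IMPORT", "ODBCIMPORT", "SCTIMPORT", "EXECSQL"}
EXPORT_KEYWORDS = {"EXPORT", "SCTEXPORT"}


def spec_direction(spec_text: str) -> str:
    """Return 'import' or 'export' based on the first word of a spec file's text."""
    words = spec_text.split()
    if not words:
        raise ValueError("empty load specification")
    # Assumption: keyword matching is case-insensitive.
    first = words[0].upper()
    if first in IMPORT_KEYWORDS:
        return "import"
    if first in EXPORT_KEYWORDS:
        return "export"
    raise ValueError(f"unrecognized first keyword: {words[0]!r}")
```

A script that drives several loads could use such a check to confirm a spec file's direction before invoking the loader.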
During an import, ndlm creates a rejected record file (reject file) and writes to it any records that cannot be loaded into the target table because of an error condition. The reject file is called input-file.BAD, where input-file is the name of the ASCII file containing the data being imported. If data is imported from multiple files, the reject file is named after the first file processed, and rejected records from subsequent input files are written to the same reject file. Data type mismatches and violations of NOT NULL constraints are common examples of error conditions that can cause a record to be rejected. The reject file is created in the current directory.
Skip files are created during an import or export only when the SKIPIF option is included in the load specification file. Records containing fields that satisfy a specified condition or conditions are placed in a "skip file" instead of being imported or exported.
If the user does not define a name for the skip file, SAND CDBMS assigns it a default name consisting of the name of the load specification file with the .skp extension. Note that when more than one SKIPIF clause is used, a file name must be included for each clause, except for the last SKIPIF clause in a series, which writes by default to the spec-file.skp file if no other file name is specified.
The NUCLEUS environment variable / nucleus.ini file
The following optional parameters can be used with the Parallel Loader. These can be set in either the NUCLEUS environment variable or in the [CLIENT] section of the nucleus.ini configuration file.
The TEMPDRIVE parameter sets the drive(s) where the temporary loader files will be written. Multiple drives can be listed, separated by semicolons (;).
Default: . (current directory)
Note: For best results, ensure that the temporary drive is not located on the same disk as the database or, especially, the flat file. In addition, make sure that the free disk space on the temporary drive is at least three times the size of the flat file.
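The free-space guideline in the note can be checked before starting a load. The following is a minimal sketch using only the Python standard library. The function names are hypothetical. It checks a single temporary location, because this section does not say how the requirement applies when several TEMPDRIVE locations are listed.

```python
import os
import shutil


def temp_space_ok(flat_file_bytes: int, temp_free_bytes: int, factor: int = 3) -> bool:
    """True if free space is at least `factor` times the flat file size."""
    return temp_free_bytes >= factor * flat_file_bytes


def check_temp_drive(flat_file: str, temp_dir: str = ".") -> bool:
    """Apply the guideline to a real flat file and temporary directory.

    temp_dir defaults to the current directory, matching the TEMPDRIVE default.
    """
    size = os.path.getsize(flat_file)
    free = shutil.disk_usage(temp_dir).free
    return temp_space_ok(size, free)
```

For example, a 100 MB flat file would call for at least 300 MB free on the temporary drive under this guideline.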
The TEMPCACHE parameter allocates buffer space in memory (in megabytes) to hold Virtual File System (VFS) data during Parallel Loader operations. By default, VFS data is not cached. Enabling this parameter (by setting it to a positive value) can speed up load operations and SCT File creation.
The TEMPPAGE parameter sets the Virtual File System (VFS) page size (in kilobytes). Data is written to the VFS in clusters of this size. If multiple locations for the VFS are specified by the TEMPDRIVE parameter, one page of data at a time will be written to each location in "round robin" fashion.
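To make the "round robin" behavior concrete, the sketch below models which VFS location would receive each page. It is a simplified model of the description above, not the loader's actual implementation. The function name assign_pages is hypothetical, and the sketch assumes the listed locations are distinct.

```python
def assign_pages(data_kb: int, page_kb: int, locations: list[str]) -> dict[str, list[int]]:
    """Map each location to the page indexes it would receive, one page at a time."""
    if page_kb <= 0 or not locations:
        raise ValueError("page size must be positive and at least one location is required")
    page_count = -(-data_kb // page_kb)  # ceiling division: a partial page still needs a page
    layout: dict[str, list[int]] = {loc: [] for loc in locations}
    for page in range(page_count):
        # Cycle through the locations in the order they are listed.
        layout[locations[page % len(locations)]].append(page)
    return layout
```

With a 4 KB page size and three locations, 20 KB of data forms five pages: pages 0 and 3 go to the first location, pages 1 and 4 to the second, and page 2 to the third.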
Example (UNIX):
NUCLEUS='SUPPORT:/usr/sand TEMPDRIVE:/usr/sand/tmp;/drv1/tmp;/drv2/tmp TEMPCACHE:32 TEMPPAGE:4'
Example (Windows):
SET NUCLEUS=SUPPORT:C:\PROGRA~1\SAND TEMPDRIVE:D:\sand\tmp;E:\sand\tmp TEMPCACHE:8 TEMPPAGE:1024
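The layout of these settings can be illustrated with a small parser. This sketch infers the format from the two examples only: parameters separated by whitespace, each written as NAME:value, with TEMPDRIVE locations separated by semicolons. It assumes values contain no spaces (which is consistent with the short path name in the Windows example). Both function names are hypothetical.

```python
def parse_nucleus(value: str) -> dict[str, str]:
    """Split a NUCLEUS-style string into a {NAME: value} dictionary."""
    params: dict[str, str] = {}
    for token in value.split():
        # Split on the first colon only, so Windows paths such as C:\... stay intact.
        name, sep, val = token.partition(":")
        if not sep:
            raise ValueError(f"expected NAME:value, got {token!r}")
        params[name.upper()] = val
    return params


def temp_drives(params: dict[str, str]) -> list[str]:
    """Return the TEMPDRIVE locations, defaulting to the current directory."""
    return [d for d in params.get("TEMPDRIVE", ".").split(";") if d]
```

Applied to the UNIX example value, temp_drives would return the three listed directories in order, and TEMPCACHE and TEMPPAGE would be available as the strings "32" and "4".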