SQL* LOADER
11 Overview of SQL * Loader
Definitions:
SQL*
LOADER (sqlldr) is a utility which is
used to load the data from flat files to table. We have to feed control file to
the sql loader this control file contains the information about the flat file
and which table needs to be loaded.
12 Architecture
13 Types of Files Used
The following type of files
used
1.
Input File :
1.1
Data file or external file :
Collection for physical records
1.2
Control file : Instruct SQL * Loader where to find the data
, how to parse and interpret the data and where to insert the data
2.
Output File:
2.1
Bad File: Contains all records which are rejected either
due to format error (data type mismatch, size mismatch) or oracle error.
2.2
Discard File: it will be created whenever there
are records which are not satisfying the filtering criteria specified in the
when clause of the control file.
2.3
Log file: contains state of the load, error occurred,
how many are loaded and how many are failed etc.
Data file or external file structure:
FIXED LEGTH
RECORD FORMAT
|
VARIABLE
LEGTH RECORD FORMAT
|
All record in a external
file are of the same length
Example:
EMP# EMP_NAME SALARY DOB
10002 Prasad B 30000 12-05-82
10003 Ranjit P 70000 14-11-81
|
Record fields are separated
by a delimiters like, | \. In this case we specify a delimiter in the control
file
Example:
EMP# |EMP_NAME|SALARY |DOB|
10002|Prasad B| 30000 |12-05-82|
10003|Ranjit P|70000|14-11-81|
|
Ø
Control files overall structure
[OPTIONS
(Opt_type)] --Options Clause
LOAD
DATA
INFILE <file_ name> [RECSIZE <integer>] [BUFFERS <integer>]
[APPEND/INSERT/REPLACE/TRUNCATE] ---modes
INTO TABLE “schema_ name ". " table_name "
FIELDS TERMINATED BY <delimiter> --for variable length records in External file
[OPTIONALLY
ENCLOSED BY <<enclosing character> >]
[TRAILING
NULLCOLS] --
Columns that are not present in the record treated as null
[WHEN
<condition>] --filter
condition
(
<column_name> [<data_type>] [ <sql_function>] , --for
variable length records
<column_name>
[<data_type>] [<sql_function> ] ,
<column_name> [<data_type>] [<sql _function>]
)
Ø Control file structure for FIXED – LENGTH RECORD
OPTIONS ( ERRORS = 55)
LOAD DATA
INFILE <File Name>
INTO TABLE <table name>
(<column-name> POSITION
(<integer>:<integer>) <data_type>,
<column-name>POSITION
(<integer>:<integer>) <data_type>
)
Ø Control file structure for VARIABLE – LENGTH RECORD
OPTIONS ( rows= 5)
LOAD DATA
INFILE <File Name>
INTO TABLE <table name>
FIELDS TERMINATED
BY ‘|’
(<column-name> <data_type>,
<column-name> <data_type>)
Syntax
explanation
OPTIONS
Clause
1. BAD : Name of the bad file
2. CONTROL : The name of the control file
3. DIRECT
: [FALSE] Specifies whether or not to use a direct path load(dpl) or
conventional
4. DISCARD
: Name of the discard file
5. DISCARDMAX
: [ALL] Max no. of records to discard.
6. ERRORS
: [50] No. of allowed errors.
7. LOAD :
[ALL] No. of logical rows to load.
8. LOG :
Name of the log file.
9. PARALLEL: [FALSE] Perform parallel load. (increases performance)
10. READSIZE:
[1 MB] Size of the read buffer.
11. ROWS : [64] No. of rows to be
committed at a time. [ALL] for dpl
12. SILENT:
Suppress load message.
13. SKIP :
[0] Allow us to skip n number of logical records.
14. USERID
: The Oracle username and password
LOAD DATA
The LOAD DATA statement tells SQL*Loader
that this is the beginning of a new data load.
INFILE
The
INFILE
clause specifies the name of a datafile containing data that you want to load.
MODE
INSERT: INSERT is SQL*Loader's default
method. It requires the table to be empty before loading.
APPEND: If data
already exists in the table, SQL*Loader appends the new rows to it. If data
doesn't already exist, the new rows are simply loaded.
REPLACE:
All rows
in the table are deleted and the new data is loaded
The row deletes cause any delete
triggers defined on the table to fire
It works as
per below :
·
Load your data into a work table.
·
Use the SQL language UPDATE statement with
correlated subqueries.
·
Drop the work table.
TRUNCATE
Using this method, SQL*Loader uses the SQL
TRUNCATE command to achieve the best possible performance. For the TRUNCATE
command to operate, the table's referential integrity constraints must first be
disabled. If they have not been disabled, SQL*Loader returns an error.
Data File :
OTS_EMP.dat
101|Prasad B|24000|D01|
102|Ranjit P|25000|D02|
TEMP Table:
CREATE TABLE XXOTS_EMP (emp_no NUMBER(4),
emp_name VARCHAR2(10),
sal NUMBER(5),
dept_no VARCHAR2(5));
Control File :
LOAD DATA
INFILE “/ORATECH/INBOUND/data/OTS_EMP.dat”
INSERT
INTO TABLE XXOTS_EMP
FIELDS TERMINATED BY '|'
(emp_no ,
emp_name “UPPER(:emp_name)”, - -
function
sal number,
dept_no
)
Executed as
$sqlldr userid = apps/apps control=OTS_EMP.ctl(we should navigate to the
path where the control file present.
Data Loading Methods
SQL * Loader provide two
methods of data loading
Ø Conventional path load
Ø Direct path load
By default we will be using conversional path load.
A conventional path load executes SQL
INSERT
statements to populate tables in an Oracle database.
Direct path load uses the direct path API to pass the data
to be loaded to the load engine in the server.
When to use
conventional path?
1. When
applying inserts or updates to a nonindexed table concurrently with the load .
2. When
loading data into a clustered table
3.
When loading a relatively small number of rows
4.
When loading a relatively small number of rows
into a large table with referential and column-check integrity constraints
When to use direct
path?
1. You
have a large amount of data to load quickly. A direct path load can quickly
load and index large amounts of data. It can also load data into either an
empty or nonempty table.
2. To start SQL*Loader in direct path load mode, set the DIRECT
parameter to true on the command line or in the parameter file, if used, in
the format:DIRECT=true or in
options(direct=true)
Points to be noted:
In direct path
load the following constraints and triggers will be disabled
Triggers
Table
insert triggers are also disabled when a direct path load begins
Constraints
Parallel Data Loading
If triggers or integrity constraints
pose a problem, but you want faster loading, you should consider using
concurrent conventional path loads. That is, use multiple load sessions
executing concurrently on a multiple-CPU system. (Can be used for conversional
path/Direct path loading)
Restrictions
to use parallel load
1 Referential
integrity and check constraints must be disabled
2 Triggers must
be disabled
3 Rows can only
be appended. This is due the individual loads not been coordinated.
Lab Sessions:
sqlldr
username@server/password control=loader.ctl
sqlldr
username/password@server control=loader.ctl
CASE:1
load
data
infile
'c:\data\mydata.csv'
into
table emp
fields
terminated by "," optionally enclosed by '"'
(
empno, empname, sal, deptno )
10001,"Scott
Tiger", 1000, 40
10002,"Frank
Naude", 500, 20
CASE:2
load
data
infile
*
replace
into
table departments
( dept
position (02:05) char(4),
deptname position (08:27) char(20)
)
begindata
COSC COMPUTER SCIENCE
ENGL ENGLISH LITERATURE
MATH MATHEMATICS
POLY POLITICAL SCIENCE
CASE:3
LOAD
DATA
INFILE
*
INTO
TABLE load_delimited_data
FIELDS
TERMINATED BY ","
OPTIONALLY
ENCLOSED BY '"'
TRAILING
NULLCOLS
( data1,
data2
)
BEGINDATA
11111,AAAAAAAAAA
22222,"A,B,C,D,"
CASE:4
LOAD
DATA
INFILE *
INTO TABLE load_positional_data
(
data1 POSITION(1:5),
data2 POSITION(6:15)
)
BEGINDATA
11111AAAAAAAAAA
22222BBBBBBBBBB
CASE:5
OPTIONS
(SKIP=5)
LOAD
DATA
INFILE
*
INTO
TABLE load_positional_data
( data1 POSITION(1:5),
data2 POSITION(6:15)
)
BEGINDATA
11111AAAAAAAAAA
22222BBBBBBBBBB
...
sqlldr
userid=ora_id/ora_passwd control=control_file_name.ctl skip=4
CASE:6
LOAD
DATA
INFILE *
INTO TABLE modified_data
(
rec_no
"my_db_sequence.nextval",
region CONSTANT '31',
time_loaded "to_char(SYSDATE,
'HH24:MI')",
data1 POSITION(1:5) ":data1/100",
data2 POSITION(6:15)
"upper(:data2)",
data3 POSITION(16:22)"to_date(:data3,
'YYMMDD')"
)
BEGINDATA
11111AAAAAAAAAA991201
22222BBBBBBBBBB990112
CASE:7
LOAD
DATA
INFILE 'mail_orders.txt'
BADFILE 'bad_orders.txt'
APPEND
INTO TABLE mailing_list
FIELDS TERMINATED BY ","
(
addr,
city,
state,
zipcode,
mailing_addr "decode(:mailing_addr, null, :addr,
:mailing_addr)",
mailing_city "decode(:mailing_city, null, :city,
:mailing_city)",
mailing_state
)
CASE:8
Can
one load data from multiple files/ into multiple tables at once?
LOAD
DATA
INFILE file1.dat
INFILE file2.dat
INFILE file3.dat
APPEND
INTO TABLE emp
( empno
POSITION(1:4) INTEGER EXTERNAL,
ename
POSITION(6:15) CHAR,
deptno POSITION(17:18) CHAR,
mgr
POSITION(20:23) INTEGER EXTERNAL
)
CASE:9
LOAD
DATA
INFILE *
INTO TABLE tab1 WHEN tab = 'tab1'
( tab
FILLER CHAR(4),
col1 INTEGER
)
INTO TABLE tab2 WHEN tab = 'tab2'
( tab
FILLER POSITION(1:4),
col1 INTEGER
)
BEGINDATA
tab1|1
tab1|2
tab2|2
tab3|3
CASE:10
LOAD
DATA
INFILE 'mydata.dat'
REPLACE
INTO TABLE emp
WHEN empno != ' '
( empno
POSITION(1:4) INTEGER EXTERNAL,
ename
POSITION(6:15) CHAR,
deptno POSITION(17:18) CHAR,
mgr
POSITION(20:23) INTEGER EXTERNAL
)
INTO TABLE proj
WHEN projno != ' '
(
projno POSITION(25:27) INTEGER EXTERNAL,
empno
POSITION(1:4) INTEGER EXTERNAL
)
CASE:11
Can
one selectively load only the records that one needs?
Look
at this example, (01) is the first character, (30:37) are characters 30 to 37:
LOAD
DATA
INFILE
'mydata.dat' BADFILE 'mydata.bad'
DISCARDFILE 'mydata.dis'
APPEND
INTO TABLE my_selective_table
WHEN (01) <> 'H' and (01) <> 'T'
and (30:37) = '20031217'
(
region CONSTANT '31',
service_key POSITION(01:11) INTEGER EXTERNAL,
call_b_no POSITION(12:29) CHAR
)
CASE:12
NOTE:
SQL*Loader does not allow the use of OR in the WHEN clause. You can only use
AND as in the example above! To workaround this problem, code multiple "INTO
TABLE ... WHEN" clauses. Here is an example:
LOAD
DATA
INFILE
'mydata.dat' BADFILE 'mydata.bad'
DISCARDFILE 'mydata.dis'
APPEND
INTO TABLE my_selective_table
WHEN (01) <> 'H' and (01) <> 'T'
(
region CONSTANT '31',
service_key POSITION(01:11) INTEGER EXTERNAL,
call_b_no POSITION(12:29) CHAR
)
INTO TABLE my_selective_table
WHEN (30:37) = '20031217'
(
region CONSTANT '31',
service_key POSITION(01:11) INTEGER EXTERNAL,
call_b_no POSITION(12:29) CHAR)
CASE:13
Can
one skip certain columns while loading data?
LOAD
DATA
TRUNCATE
INTO TABLE T1
FIELDS
TERMINATED BY ','
(
field1,
field2 FILLER,
field3
)
CASE:14
--
Variable-length, delimited and enclosed data format
LOAD
DATA
INFILE
*
APPEND
INTO
TABLE emp
FIELDS
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
(empno,
ename,
job,
mgr,
hiredate DATE(20) "DD-Month-YYYY",
sal,
comm,
deptno CHAR TERMINATED BY ':',
projno,
loadseq
SEQUENCE(MAX,1))
BEGINDATA
7782, "Clark",
"Manager", 7839, 09-June-1981, 2572.50,, 10:101
7839, "King",
"President", , 17-November-1981,5500.00,,10:102
7934, "Miller", "Clerk",
7782, 23-January-1982, 920.00,, 10:102
7566, "Jones",
"Manager", 7839, 02-April-1981, 3123.75,, 20:101
7499, "Allen",
"Salesman", 7698, 20-February-1981, 1600.00,
(same line continued) 300.00, 30:103
7654, "Martin",
"Salesman", 7698, 28-September-1981, 1312.50,
(same line continued) 1400.00, 3:103
7658, "Chan", "Analyst",
7566, 03-May-1982, 3450,, 20:101
CASE:15
Example
on direct load
1.
Create the following table
CREATE TABLE tab1
(EMPNO number(3) primary key ,
ename VARCHAR2(10) UNIQUE
);
2.
select * from TAB1
no rows selected
3.
SQL Loader Program
options(direct=true)
load data
infile *
insert into table tab1
fields terminated by ','
(empno,
ename)
begindata
101,Ranjit,
101,Ranjit,
4.
select * from TAB1
Please note empno is Primary
Key and ename is unique key though data is load due to direct path data load.
Constraints violated here
5.
Delete from TAB1
*Cause: An attempt has been made to access an index
or index partition
that has been marked unusable by a direct
load or by a DDL
operation
6.
User table TAB1 truncated.
Interview Questions
1. What is the SQL * Loader and what is it used for?
SQL * Loader is
bulk loader utility used for moving data from external files into the oracle
database
2. Can one skip certain columns while loading data?
Yes we can skip
the columns by using filler
3. What it the syntax of the control file?
Just tell Basic
syntax
LOAD DATA
INFILE
<file_path>
INSERT INTO
TABLE <name of the table>
FIELDS
TERMINATED BY ‘|’
( Col - 1,
Col - 2,
.
.
Col-n
);
4. How will you give positions in fixed files?
Column Name POSITION(n:m)
5. How will you give constant value to SQL * Loader
Column Name
CONSTANT ‘INR’
6. What is the syntax for executing SQL * Loader?
Sqlldr
userid = <username/password@databasename> Control=<Path of Control
File>
7.
How
to Load single file data into multiple tables ?
Yes we can load by
using WHEN clause
8. Is it possible to load data from multiple tables to single table?
We need to provide multiple in files of the
same format.
Ex : LOAD DATA
INFILE file1.dat
INFILE file2.dat
INFILE file3.dat
APPEND
INTO TABLE emp
(empno ,
ename ,
deptno ,
mgr )
9. Whats the Common Error you faced during
loading data from flat file to Temporary tables?
1 SQL*Loader-601: For INSERT option, table must
be empty.
2 Errors Due to wrong syntax
3
Errors due data type mismatch or
size mismatch or due wrong formats (mostly date format)
10.
Could
you please explain INSERT/APPEND/REPLACE/TRUNCATE ?
INSERT:
Table must be empty
APPEND: If data already exists
in the table, SQL*Loader appends the new rows to it. If data doesn't already exist, the new rows are
simply loaded.
REPLACE: All rows in the table are deleted and the new
data is loaded
The row deletes cause any delete
triggers defined on the table to fire
It uses correlated update procedure internally
TRUNCATE: Uses the standard
SQL truncate command
Note: TRUNCATE and REPLACE functionally
same but the way it works different TRUNCATE is faster than replace.
If you use TRUNCATE we must disable integrity constraints.
11.
Difference
between Conventional path load and direct path load?
By default we will be using
conversional path load.
A conventional path load executes
SQL
INSERT
statements to populate tables in an Oracle
database.
Direct path load uses the direct path API to
pass the data to be loaded to the load engine in the server. Hence it faster,
constraints will be over looked.
12. What are the file involved in SQL * Loader ?
-
Data file ( can be .csv, .dat, .txt)
-
Control file
-
Bad file
-
Discard file
-
Log file
13. What is the difference between bad file, discard file and log
file?
Bad
File
|
Discard
File
|
Log
File
|
Contains all records which are rejected either due
to
Data type mismatch
More data than columns size
Format errors (Mostly date)
|
Contains all records which are not satisfied by
when clause
|
Contains state of the load, error occurred, how
many are loaded successfully , how many are moved to bad file , how many
moved discard file.
Reason for failure
|
14. Is there a
SQL*Unloader to download data to a flat file?
Oracle does not supply any data unload utilities. However, you can use SQL*Plus
to select and format your data and then spool it to a file:
spool oratechdata.txt
select col1 || ',' || col2 || ',' || col3 from tab1 where
col2 = 'XYZ';
spool off
Alternatively use the UTL_FILE PL/SQL package:
14.
How
can get SQL*Loader to COMMIT only at the end of the load file?
One cannot, but by setting the ROWS= parameter to a
large value, committing can be reduced.
By default every 64 Records commit will be issued
15.
How
will you register SQL*LOADER AS a Concurrent Programs?
Ø
Define executable by giving Execution Method as
SQL*Loader.
Ø
Define Concurrent Programs
Ø
Attached Concurrent Program to the Request Group
of appropriate Responsibility.
16. Where
you placed Control file on server?
In
Appropriate top bin Folder. For example if Control File is of Oracle Purchasing
module then Control file should be placed in $PO_TOP/bin Folder.
17. From
where you will get the data file?
From functional
consultant or team lead it is depends on the project sometime you get it from
customer also directly.
In case of issue in the data file you have revert back
with all the issue to functional / team lead / customer depends on who is the
data file provider in the project.
18.
I
want skip the data in the file randomly let us take an example
I have 100
records I want to skip 51 , 67 , 71 , 83 ,99 how to do this?
It is not possible to do this until unless there is
condition provided, if some condition/pattern is there then we can avoid
loading with when clause.
19.
Ho w
to generate the sequence in SQL * Loader?
There are two way to generate the
sequence in sql * loader
Method I
Col1 “DB_SQ.nextval”
Method II
Col1 SEQUENCE(MAX ,1 )
20.
Can I use SQL functions in SQL * Loader?
Yes you can use
Syntax : Col1 “UPPER(:Col1)”, - - function
If you face any other
Interview questions keep adding it, and let us all of us know about it.
Happy Learning J, Wish you
very all the best
super article . thank you for sharing
ReplyDelete