Wednesday, 28 January 2015

SQL* LOADER Quick Reference With Interview Questions


SQL* LOADER



11    Overview of SQL * Loader


Definitions:  SQL* LOADER (sqlldr)  is a utility which is used to load the data from flat files to table. We have to feed control file to the sql loader this control file contains the information about the flat file and which table needs to be loaded.

12    Architecture


13   Types of Files Used

The following type of files used

1.      Input File  :

1.1   Data file or external file : Collection for physical records

1.2   Control file :  Instruct SQL * Loader where to find the data , how to parse and interpret the data and where to insert the data

2.      Output File:

2.1   Bad File:  Contains all records which are rejected either due to format error (data type mismatch, size mismatch) or oracle error.

2.2   Discard File: it will be created whenever there are records which are not satisfying the filtering criteria specified in the when clause of the control file.

2.3   Log file: contains state of the load, error occurred, how many are loaded and how many are failed etc.

Data file or external file structure:


FIXED LEGTH RECORD FORMAT
VARIABLE LEGTH RECORD FORMAT

All record in a external file are of the same length


Example:

EMP#          EMP_NAME              SALARY                 DOB
10002         Prasad B                    30000            12-05-82
10003         Ranjit P                    70000            14-11-81

Record fields are separated by a delimiters like, | \. In this case we specify a delimiter in the control file

Example:

EMP# |EMP_NAME|SALARY |DOB|
10002|Prasad B| 30000 |12-05-82|
10003|Ranjit P|70000|14-11-81|


Ø  Control files overall structure
[OPTIONS (Opt_type)]                                 --Options Clause
        LOAD DATA
INFILE <file_ name> [RECSIZE <integer>] [BUFFERS <integer>]
[APPEND/INSERT/REPLACE/TRUNCATE] ---modes
INTO TABLE “schema_ name ". " table_name "
FIELDS TERMINATED BY <delimiter>        --for variable length records in External file
[OPTIONALLY ENCLOSED BY <<enclosing character> >]
[TRAILING NULLCOLS]     -- Columns that are not present in the record treated as null
[WHEN <condition>]             --filter condition
( <column_name> [<data_type>] [ <sql_function>] , --for variable length records
  <column_name> [<data_type>]  [<sql_function> ] ,
  <column_name> [<data_type>]  [<sql _function>]
)

Ø    Control file structure for FIXED – LENGTH RECORD
             OPTIONS ( ERRORS = 55)
             LOAD DATA
             INFILE <File Name>
             INTO TABLE <table name>
              (<column-name> POSITION (<integer>:<integer>) <data_type>,
                <column-name>POSITION (<integer>:<integer>) <data_type>
               )

Ø    Control file structure for VARIABLE – LENGTH RECORD
OPTIONS ( rows= 5)
 LOAD DATA
INFILE <File Name>
INTO TABLE <table name>
FIELDS TERMINATED BY ‘|’
 (<column-name> <data_type>,
  <column-name> <data_type>)

Syntax explanation
OPTIONS Clause
1.      BAD : Name of the bad file
2.      CONTROL : The name of the control file
3.       DIRECT : [FALSE] Specifies whether or not to use a direct path load(dpl) or conventional
4.       DISCARD : Name of the discard file
5.       DISCARDMAX : [ALL] Max no. of records to discard.
6.       ERRORS : [50] No. of allowed errors.
7.       LOAD : [ALL] No. of logical rows to load.
8.       LOG : Name of the log file.
9.      PARALLEL: [FALSE] Perform parallel load. (increases performance)
10.    READSIZE: [1 MB] Size of the read buffer.
11.    ROWS : [64] No. of rows to be committed at a time. [ALL] for dpl
12.    SILENT: Suppress load message.
13.    SKIP : [0] Allow us to skip n number of logical records.
14.    USERID : The Oracle username and password
LOAD DATA
    The LOAD DATA statement tells SQL*Loader that this is the beginning of a new data load.
INFILE
    The INFILE clause specifies the name of a datafile containing data that you want to load.
MODE
INSERT:  INSERT is SQL*Loader's default method. It requires the table to be empty before loading.
APPEND: If data already exists in the table, SQL*Loader appends the new rows to it. If data doesn't already exist, the new rows are simply loaded.
REPLACE:
   All rows in the table are deleted and the new data is loaded
  The row deletes cause any delete triggers defined on the table to fire
  It works as per below :
·        Load your data into a work table.
·        Use the SQL language UPDATE statement with correlated subqueries.
·        Drop the work table.

TRUNCATE
Using this method, SQL*Loader uses the SQL TRUNCATE command to achieve the best possible performance. For the TRUNCATE command to operate, the table's referential integrity constraints must first be disabled. If they have not been disabled, SQL*Loader returns an error.

Small Example:
Data File :
OTS_EMP.dat
101|Prasad B|24000|D01|
102|Ranjit  P|25000|D02|
TEMP Table:
CREATE TABLE XXOTS_EMP (emp_no NUMBER(4),
                                                    emp_name VARCHAR2(10),
                                                   sal NUMBER(5),
                                                   dept_no VARCHAR2(5));
Control File  :
LOAD DATA
INFILE “/ORATECH/INBOUND/data/OTS_EMP.dat”
INSERT
INTO TABLE XXOTS_EMP
FIELDS TERMINATED BY '|'
(emp_no ,
 emp_name “UPPER(:emp_name)”, - - function
 sal number,
 dept_no
)
Executed as        
$sqlldr userid = apps/apps control=OTS_EMP.ctl(we should navigate to the path where the control file present.
Data Loading Methods
SQL * Loader provide two methods of data loading
Ø  Conventional path load
Ø  Direct path load
By default we will be using conversional path load.
A conventional path load executes SQL INSERT statements to populate tables in an Oracle database.
Direct path load uses the direct path API to pass the data to be loaded to the load engine in the server.
When to use conventional path?
1.      When applying inserts or updates to a nonindexed table concurrently with the load .
2.      When loading data into a clustered table
3.       When loading a relatively small number of rows
4.       When loading a relatively small number of rows into a large table with referential and column-check integrity constraints
When to use direct path?
1.      You have a large amount of data to load quickly. A direct path load can quickly load and index large amounts of data. It can also load data into either an empty or nonempty table.
2.  To start SQL*Loader in direct path load mode, set the DIRECT parameter to true on the command line or in the parameter file, if used, in the format:DIRECT=true or in options(direct=true)
   Points to be noted:
        In direct path load the following constraints and triggers will be disabled
     Triggers
           Table insert triggers are also disabled when a direct path load begins
     Constraints
              PRIMARY KEY,  UNIQUE,NOT NULL, Check Referential constraints
Parallel Data Loading

If triggers or integrity constraints pose a problem, but you want faster loading, you should consider using concurrent conventional path loads. That is, use multiple load sessions executing concurrently on a multiple-CPU system. (Can be used for conversional path/Direct path loading)




Restrictions to use parallel load
1 Referential integrity and check constraints must be disabled
2 Triggers must be disabled
3 Rows can only be appended. This is due the individual loads not been coordinated.

Lab Sessions:
sqlldr username@server/password control=loader.ctl
sqlldr username/password@server control=loader.ctl
CASE:1

load data
infile 'c:\data\mydata.csv'
into table emp
fields terminated by "," optionally enclosed by '"'                      
( empno, empname, sal, deptno )
10001,"Scott Tiger", 1000, 40
10002,"Frank Naude", 500, 20

CASE:2

load data
infile *
replace
into table departments
(  dept     position (02:05) char(4),
    deptname position (08:27) char(20)
)
begindata
COSC  COMPUTER SCIENCE
ENGL  ENGLISH LITERATURE
MATH  MATHEMATICS
POLY  POLITICAL SCIENCE

CASE:3

LOAD DATA
INFILE *
INTO TABLE load_delimited_data
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(  data1,
   data2
)
BEGINDATA
11111,AAAAAAAAAA
22222,"A,B,C,D,"


CASE:4

LOAD DATA
  INFILE *
  INTO TABLE load_positional_data
  (  data1 POSITION(1:5),
     data2 POSITION(6:15)
  )
BEGINDATA
11111AAAAAAAAAA
22222BBBBBBBBBB

CASE:5

OPTIONS (SKIP=5)
LOAD DATA
INFILE *
INTO TABLE load_positional_data
(  data1 POSITION(1:5),
   data2 POSITION(6:15)
)
BEGINDATA
11111AAAAAAAAAA
22222BBBBBBBBBB
...
sqlldr userid=ora_id/ora_passwd control=control_file_name.ctl skip=4

CASE:6

LOAD DATA
  INFILE *
  INTO TABLE modified_data
  (  rec_no                      "my_db_sequence.nextval",
     region                      CONSTANT '31',
     time_loaded                 "to_char(SYSDATE, 'HH24:MI')",
     data1        POSITION(1:5)  ":data1/100",
     data2        POSITION(6:15) "upper(:data2)",
     data3        POSITION(16:22)"to_date(:data3, 'YYMMDD')"
  )
BEGINDATA
11111AAAAAAAAAA991201
22222BBBBBBBBBB990112

CASE:7

LOAD DATA
  INFILE 'mail_orders.txt'
  BADFILE 'bad_orders.txt'
  APPEND
  INTO TABLE mailing_list
  FIELDS TERMINATED BY ","
  (  addr,
     city,
     state,
     zipcode,
     mailing_addr   "decode(:mailing_addr, null, :addr, :mailing_addr)",
     mailing_city   "decode(:mailing_city, null, :city, :mailing_city)",
     mailing_state
  )

CASE:8

Can one load data from multiple files/ into multiple tables at once?
LOAD DATA
  INFILE file1.dat
  INFILE file2.dat
  INFILE file3.dat
  APPEND
  INTO TABLE emp
  ( empno  POSITION(1:4)   INTEGER EXTERNAL,
    ename  POSITION(6:15)  CHAR,
    deptno POSITION(17:18) CHAR,
    mgr    POSITION(20:23) INTEGER EXTERNAL
  )

CASE:9

LOAD DATA
 INFILE *
 INTO TABLE tab1 WHEN tab = 'tab1'
   ( tab  FILLER CHAR(4),
     col1 INTEGER
   )
 INTO TABLE tab2 WHEN tab = 'tab2'
   ( tab  FILLER POSITION(1:4),
     col1 INTEGER
   )
BEGINDATA
tab1|1
tab1|2
tab2|2
tab3|3
CASE:10

LOAD DATA
 INFILE 'mydata.dat'
 REPLACE
 INTO TABLE emp
      WHEN empno != ' '
 ( empno  POSITION(1:4)   INTEGER EXTERNAL,
   ename  POSITION(6:15)  CHAR,
   deptno POSITION(17:18) CHAR,
   mgr    POSITION(20:23) INTEGER EXTERNAL
 )
 INTO TABLE proj
      WHEN projno != ' '
 (  projno POSITION(25:27) INTEGER EXTERNAL,
    empno  POSITION(1:4)   INTEGER EXTERNAL
 )
CASE:11

Can one selectively load only the records that one needs?
Look at this example, (01) is the first character, (30:37) are characters 30 to 37:
LOAD DATA
  INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
  APPEND
  INTO TABLE my_selective_table
  WHEN (01) <> 'H' and (01) <> 'T' and (30:37) = '20031217'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR
  )
CASE:12

NOTE: SQL*Loader does not allow the use of OR in the WHEN clause. You can only use AND as in the example above! To workaround this problem, code multiple "INTO TABLE ... WHEN" clauses. Here is an example:
LOAD DATA
  INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
  APPEND
  INTO TABLE my_selective_table
  WHEN (01) <> 'H' and (01) <> 'T'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR
  )
  INTO TABLE my_selective_table
  WHEN (30:37) = '20031217'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR)

CASE:13

Can one skip certain columns while loading data?

LOAD DATA
TRUNCATE INTO TABLE T1
FIELDS TERMINATED BY ','
( field1,
  field2 FILLER,
  field3
)


CASE:14

-- Variable-length, delimited and enclosed data format
LOAD DATA
INFILE *
APPEND
INTO TABLE emp
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
(empno,
 ename,
 job,
 mgr,
 hiredate DATE(20) "DD-Month-YYYY",
 sal,
 comm,
 deptno CHAR TERMINATED BY ':',
 projno,
 loadseq  SEQUENCE(MAX,1))
BEGINDATA
   7782, "Clark", "Manager", 7839, 09-June-1981, 2572.50,,  10:101
   7839, "King", "President", , 17-November-1981,5500.00,,10:102
   7934, "Miller", "Clerk", 7782, 23-January-1982, 920.00,, 10:102
   7566, "Jones", "Manager", 7839, 02-April-1981, 3123.75,, 20:101
   7499, "Allen", "Salesman", 7698, 20-February-1981, 1600.00,
   (same line continued)                 300.00, 30:103
   7654, "Martin", "Salesman", 7698, 28-September-1981, 1312.50,
   (same line continued)                1400.00, 3:103
   7658, "Chan", "Analyst", 7566, 03-May-1982, 3450,,  20:101



CASE:15

Example on direct load
1.      Create the following table
  CREATE TABLE tab1
  (EMPNO number(3) primary key ,
   ename VARCHAR2(10) UNIQUE
  );
2.      select * from TAB1
          no rows selected
3.      SQL Loader Program
options(direct=true)
load data
infile *
insert into table tab1
fields terminated by ','
(empno,
 ename)
begindata
101,Ranjit,
101,Ranjit,
4.      select * from TAB1
  
  
 Please note empno is Primary Key and ename is unique key though data is load due to direct  path data load.
Constraints violated here
5.      Delete from TAB1
              *Cause:    An attempt has been made to access an index or index partition
              that has been marked unusable by a direct load or by a DDL
             operation
      6.    User table TAB1 truncated.


Interview Questions
1.      What is the SQL * Loader and what is it used for?
 SQL * Loader is bulk loader utility used for moving data from external files into the oracle database
2.       Can one skip certain columns while loading data?
      Yes we can skip the columns by using filler
3.      What it the syntax of the control file?
        Just tell Basic syntax
           LOAD DATA
           INFILE <file_path>
           INSERT INTO TABLE <name of the table>
           FIELDS TERMINATED BY ‘|’
            (  Col - 1,
               Col - 2,
               .
               .
               Col-n
            );
4.      How will you give positions in fixed files?
      Column Name POSITION(n:m)
5.      How will you give constant value to SQL * Loader
       Column Name  CONSTANT ‘INR’
6.      What is the syntax for executing SQL * Loader?
  Sqlldr userid = <username/password@databasename> Control=<Path of Control File> 
7.      How to Load single file data into multiple tables ?
    Yes we can load by using WHEN clause
8.      Is it possible to load data from multiple tables to single table?
         We need to provide multiple in files of the same format.
Ex :  LOAD DATA
  INFILE file1.dat
  INFILE file2.dat
  INFILE file3.dat
  APPEND
  INTO TABLE emp
  (empno  ,
    ename  ,
    deptno ,
    mgr    )

9.      Whats the Common Error you faced during loading data from flat file to Temporary tables?

      1  SQL*Loader-601: For INSERT option, table must be empty.
      2  Errors Due to wrong syntax
             3   Errors due data type mismatch or size mismatch or due wrong formats (mostly date  format)

10.   Could you please explain INSERT/APPEND/REPLACE/TRUNCATE ?
            INSERT: Table must be empty
 APPEND: If data already exists in the table, SQL*Loader appends the new rows to it. If data     doesn't already exist, the new rows are simply loaded.
                     REPLACE: All rows in the table are deleted and the new data is loaded
  The row deletes cause any delete triggers defined on the table to fire
   It uses correlated update procedure internally
  TRUNCATE: Uses the standard SQL truncate command
  Note: TRUNCATE and REPLACE functionally same but the way it works different TRUNCATE is faster than replace.
  If you use TRUNCATE we must disable integrity constraints.
11.   Difference between Conventional path load and direct path load?
 By default we will be using conversional path load.
A conventional path load executes SQL INSERT statements to populate tables in an Oracle database.
Direct path load uses the direct path API to pass the data to be loaded to the load engine in the server. Hence it faster, constraints will be over looked.

12.   What are the file involved in SQL * Loader ?
-        Data file ( can be  .csv, .dat, .txt)
-        Control file
-        Bad file
-        Discard file
-        Log file

13.   What is the difference between bad file, discard file and log file?

Bad File
Discard File
Log File
Contains all records which are rejected either due to
Data type mismatch
More data than columns size
Format errors (Mostly date)
Contains all records which are not satisfied by when clause
Contains state of the load, error occurred, how many are loaded successfully , how many are moved to bad file , how many moved discard file.
Reason for failure

       14.  Is there a SQL*Unloader to download data to a flat file?
            Oracle does not supply any data unload utilities. However, you can use SQL*Plus to select and format your data and then spool it to a file:
spool oratechdata.txt
select col1 || ',' || col2 || ',' || col3 from tab1 where  col2 = 'XYZ';
spool off

Alternatively use the UTL_FILE PL/SQL package:

14.   How can get SQL*Loader to COMMIT only at the end of the load file?

One cannot, but by setting the ROWS= parameter to a large value, committing can be reduced.
By default every 64 Records commit will be issued

15.   How will you register SQL*LOADER AS a Concurrent Programs?

Ø  Define executable by giving Execution Method as SQL*Loader.
Ø  Define Concurrent Programs
Ø  Attached Concurrent Program to the Request Group of appropriate Responsibility.
16.  Where you placed Control file on server?
  In Appropriate top bin Folder. For example if Control File is of Oracle Purchasing module then Control file should be placed in $PO_TOP/bin Folder.


17.   From where you will get the data file?
 From functional consultant or team lead it is depends on the project sometime you get it from customer also directly.

In case of issue in the data file you have revert back with all the issue to functional / team lead / customer depends on who is the data file provider in the project.
   
18.   I want skip the data in the file randomly let us take an example
I have 100 records I want to skip 51 , 67 , 71 , 83 ,99  how to do this?

It is not possible to do this until unless there is condition provided, if some condition/pattern is there then we can avoid loading with when clause.

19.   Ho w to generate the sequence in SQL * Loader?
There are two way to generate the sequence in sql * loader     
Method I
  Col1   “DB_SQ.nextval”
Method II
 Col1 SEQUENCE(MAX ,1 )
20.      Can I use SQL functions in SQL * Loader?
Yes you can use
               Syntax :         Col1 “UPPER(:Col1)”, - - function



      If you face any other Interview questions keep adding it, and let us all of us know about it.



Happy Learning J, Wish you very all the best






1 comment: