Sunday, April 24, 2016

Difference - View vs Materialized view

The below lists out the differences between View and Materialized view in Oracle.

1.
View acts as an image of a table through which a table contents can be accessed but the result is not stored anywhere on the disk. Hence it is a virtual table.
A materialized view is also similar in this aspect except that the underlying query results are saved to a table. Hence it is physically stored on the database.
2.
A view's results are always the latest as its auto-refreshed each time it is accessed.
A MV needs to be manually refreshed either using the DBMS_MVIEW.REFRESH or periodically refreshed using Oracle scheduler.
3.
Only simple views can be used for DML operations. If the view consists of joins or aggregate functions, its' a complex view. Such views need INSTEAD OF triggers for DML operations.
MV is better suited for DML operations as the main application is data replication.
4.
When we create a view on a table, rowid of view is same as the base table.
Rowid is different.
5.
Lower performance than MV.
Better performance as the data's stored in a table and the table maybe indexed.also joining is done at the time of refresh so no need to fire join statement everytime as in case of view.
6.
Mostly used in applications for data retrieval as no extra space required for storage and simple to maintain.
When performance is key, MVS are used. Hence mainly used in Datawarehousing and OLAP environments.



Materialized Views and MV Logs

This is a very important topic for Interviews. So don't skip this!

A materialized view, or Snapshot as they were previously known, is a table segment whose contents are periodically refreshed based on a query, either against a local or remote table. Using materialized views against remote tables is the simplest way to achieve replication of data between sites.


Basic syntax :



CREATE MATERIALIZED VIEW view-name
BUILD [IMMEDIATE | DEFERRED]
REFRESH [FAST | COMPLETE | FORCE ]
ON [COMMIT | DEMAND ]
[[ENABLE | DISABLE] QUERY REWRITE]
[ON PREBUILT TABLE]
AS
SELECT ...;
The BUILD clause options are shown below.
  • IMMEDIATE : The materialized view is populated immediately.
  • DEFERRED : The materialized view is populated on the first requested refresh.
The following refresh types are available.
  • FAST : A fast refresh is attempted. If materialized view logs are not present against the source tables in advance, the creation fails.
  • COMPLETE : The table segment supporting the materialized view is truncated and repopulated completely using the associated query.
  • FORCE : A fast refresh is attempted. If one is not possible a complete refresh is performed.
A refresh can be triggered in one of two ways.
  • ON COMMIT : The refresh is triggered by a committed data change in one of the dependent tables.
  • ON DEMAND : The refresh is initiated by a manual request or a scheduled task.
The QUERY REWRITE clause tells the optimizer if the materialized view should be consider for query rewrite operations. An example of the query rewrite functionality is shown below.
The ON PREBUILT TABLE clause tells the database to use an existing table segment, which must have the same name as the materialized view and support the same column structure as the query.

Basic things to check : 



  • User has CREATE MATERIALIZED VIEW privilege
  • If creating MV on a remote database, does user have CREATE DATABASE LINK privilege ? 



Connect to the materialized view owner and create the database link and the materialized view itself.
CONNECT scott/tiger@db2

CREATE DATABASE LINK DB1.WORLD CONNECT TO scott IDENTIFIED BY tiger USING 'DB1.WORLD';

CREATE MATERIALIZED VIEW emp_mv
BUILD IMMEDIATE 
REFRESH FORCE
ON DEMAND
AS
SELECT * FROM emp@db1.world;

Remember to gather stats after building the materialized view.
BEGIN
  DBMS_STATS.gather_table_stats(
    ownname => 'SCOTT',
    tabname => 'EMP_MV');
END;
/


Create Materialized View Logs

Since a complete refresh involves truncating the materialized view segment and re-populating it using the related query, it can be quite time consuming and involve a considerable amount of network traffic when performed against a remote table. To reduce the replication costs, materialized view logs can be created to capture all changes to the base table since the last refresh. This information allows a fast refresh, which only needs to apply the changes rather than a complete refresh of the materialized view.
To take advantage of the of the fast refresh, connect to the master instance and create the materialized view log.
CONNECT scott/tiger@db1

CREATE MATERIALIZED VIEW LOG ON scott.emp
TABLESPACE users
WITH PRIMARY KEY
INCLUDING NEW VALUES;


Refresh 

A materialized view can be manually refreshed using the DBMS_MVIEW package.
EXEC DBMS_MVIEW.refresh('EMP_MV');
Rather than using a refresh group, you can schedule DBMS_MVIEW.REFRESH called using the Oracle Scheduler

Application :


Materialized views can be used to improve the performance of a variety of queries, including those performing aggregations and transformations of the data. This allows the work to be done once and used repeatedly by multiple sessions, reducing the total load on the server.



Create a materialized view to perform the aggregation in advance, making sure you specify the ENABLE QUERY REWRITE clause.
CREATE MATERIALIZED VIEW emp_aggr_mv
BUILD IMMEDIATE 
REFRESH FORCE
ON DEMAND
ENABLE QUERY REWRITE 
AS
SELECT deptno, SUM(sal) AS sal_by_dept
FROM   emp
GROUP BY deptno;

EXEC DBMS_STATS.gather_table_stats(USER, 'EMP_AGGR_MV');

Types of Exceptions


There are 3 types of Exceptions :

a) Named System Exceptions
b) Unnamed System Exceptions
c) User-defined Exceptions

Two ways to handle unnamed system exceptions:


1. By using the WHEN OTHERS exception handler, or
2. By associating the exception code to a name and using it as a named exception.

Using User defined exceptions : 


RAISE_APPLICATION_ERROR is used for User defined exceptions in application.


1. to replace generic Oracle exception messages with our own, more meaningful messages.

2. The second is to create exception conditions of our own(range is in between -20000 and -20999)

To handle the exception explicitly, they must be declared using Pragma EXCEPTION_INIT


EXCEPTION_INIT will associate a predefined Oracle error number to a programmer_defined exception name.


To handle the exception explicitly, they must be declared using Pragma EXCEPTION_INIT as given below and handled referencing the user-defined exception name in the exception section.


DECLARE


   exception_name EXCEPTION;


   PRAGMA EXCEPTION_INIT (exception_name, Err_code);


BEGIN


Execution section


EXCEPTION


  WHEN exception_name THEN


     handle the exception


END;



Eg :



create or replace procedure new_emp
    ( p_name in emp.ename%type
      , p_sal in emp.sal%type
      , p_job in emp.job%type
      , p_dept in emp.deptno%type
      , p_mgr in emp.mgr%type 
      , p_hired in emp.hiredate%type := sysdate )
is
    invalid_manager exception;
    PRAGMA EXCEPTION_INIT(invalid_manager, -2291);
    dummy varchar2(1);
begin
    -- check hiredate is valid
    if trunc(p_hired) > trunc(sysdate) 
    then
        raise_application_error
            (-20000
             , 'NEW_EMP::hiredate cannot be in the future'); 
    end if;

    insert into emp
        ( ename
          , sal
          , job
          , deptno
          , mgr 
          , hiredate )
    values      
        ( p_name
          , p_sal
          , p_job
          , p_dept
          , p_mgr 
          , trunc(p_hired) );
exception
    when dup_val_on_index then
        raise_application_error
            (-20001
             , 'NEW_EMP::employee called '||p_name||' already exists'
             , true); 
    when invalid_manager then
        raise_application_error
            (-20002
             , 'NEW_EMP::'||p_mgr ||' is not a valid manager'); 

end;
/

Transactions in Oracle



  A transaction is a set of SQL statements which Oracle treats as a Single Unit. Either all the statements should execute successfully or none. ( No partial commits )


TCL (Transaction control language) commands are used for controlling the state of the transaction they are fired from. 



TCL Statements available in Oracle are:

COMMIT :  Makes changes done in  a transaction permanent.

Example :
insert into emp (empno,ename,sal) values (101,’Abid’,2300);
commit;

ROLLBACK : used to rollback the changes and restore the state of database to the last commit point.

Example :

delete from emp;
rollback;          /* undo the changes */

SAVEPOINT

Used to specify a point in the transaction to which later you can rollback. This is used for partial rollback.

Example :

insert into emp (empno,ename,sal) values (109,’Sami’,3000);

savepoint a;

insert into dept values (10,’Sales’,’TRV’);

savepoint b;

insert into salgrade values (‘III’,9000,12000);

Now if you give

rollback to a;



Then  rows from salgrade table and dept will be roll backed. Now you can commit the row inserted into emp table or rollback the transaction.

If you give
rollback to b;
Then row inserted into salgrade table will be roll backed. Now you can commit the row inserted into dept table and emp table or rollback to savepoint a or completely roll backed the transaction.
If you give
rollback; 
Then the whole transactions is roll backed. 

Note : You can rollback only to savepoints marked in the current txn.

If you give

commit;



Then the whole transaction is committed and all savepoints are removed.

Oracle MERGE


The MERGE statement was introduced in Oracle 9i to conditionally insert or update data depending on its presence, a process also known as an "upsert". This is typically the case when you have to synchronize a table periodically with data from another source (table/view/query). In place of 3 separate unwieldy INSERT, UPDATE and DELETE statements with conditional sub-queries, the all-in-one MERGE does the job in one shot.
The MERGE statement reduces table scans and can perform the operation in parallel if required.  


MERGE INTO employees e
    USING (SELECT * FROM hr_records WHERE start_date > ADD_MONTHS(SYSDATE, -1)) h
    ON (e.id = h.emp_id)
  WHEN MATCHED THEN
    UPDATE SET e.address = h.address
  WHEN NOT MATCHED THEN
    INSERT (id, address)
    VALUES (h.emp_id, h.address);

The MERGE statement is optimized for merging sets of data, rather than single rows, as shown in the example below. It is the fastest method also simple to understand and code.


MERGE Restrictions : 


1. cannot update any of the columns you are merging on

2. MERGE is a deterministic statement – that is, you cannot update a row of the target table multiple times in the same MERGE statement. 
3. MERGE tells you the total number of rows processed in the upsert, it does not tell you the individual counts of rows inserted/updated/deleted. 

PL/SQL Collections




Associative Arrays
Nested tables
V-Arrays(Variable sized arrays)
An associative array (formerly called PL/SQL table or index-by table) is a set of key-value pairs. Each key is a unique index, used to locate the associated value with the syntax variable_name(index).
The data type of index can be either a string type or PLS_INTEGER. Indexes are stored in sort order, not creation order.



Like a database table, an associative array:

  • Is empty (but not null) until you populate it
  • Unbounded, grow dynamically as elements are added.



Unlike a database table, an associative array:

  • Does not need disk space or network operations
  • Cannot be manipulated with DML statements
Used for :
  • intended for temporary data storage such as a small lookup table which can be constructed in memory each time you invoke the subprogram or initialize the package that declares it 
  • Passing collections to and from the database server 
In the database, a nested table is a column type that stores an unspecified number of rows in no particular order.
A nested table definition does not allocate space. It defines a type, which you can use to declare:
  • The datatype of a column of a relational table
  • An object type attribute
  • A PL/SQL variable, parameter, or function return type
A nested table is appropriate when:
  • The number of elements is not set.
  • Index values are not consecutive.
  • You must delete or update some elements, but not all elements simultaneously.
    Nested table data is stored in a separate store table, a system-generated database table. When you access a nested table, the database joins the nested table with its store table. This makes nested tables suitable for queries and updates that affect only some elements of the collection.
  • You would create a separate lookup table, with multiple entries for each row of the main table, and access it through join queries.

An array is an ordered set of homogeneous data elements. Oracle arrays are of variable size, which is why they are called varrays.
Bounded, Maximum size must be defined at the time of creation.
When you declare a varray, it does not allocate space. It defines a type, which you can use as:
  • The datatype of a column of a relational table
  • An object type attribute
  • A PL/SQL variable, parameter, or function return type
A varray is appropriate when:
  • You know the maximum number of elements.
  • You usually access the elements sequentially.
Because you must store or retrieve all elements at the same time, a varray might be impractical for large numbers of elements.


2. Indexed by binary integer or varchar2
--same as for Associative arrays--
--same as for Associative arrays--
3. Individual elements can be deleted, hence it can be a sparse collection. The elements need not be consecutive at creation.A nested array is dense initially, but it can become sparse, because you can delete elements from it.is always dense,individual elements cannot be deleted.
4. Cannot be stored in a table column.These can be stored in a table column.
--same as for Nested tables --
                                                                    

Difference IN vs EXISTS


This is a repeat question in almost every PL/SQL Interview. 

Well, both work very differently. 

Select * from T1 where x in ( select y from T2 )

is typically processed as:

select * 
from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;

The subquery is processed first and then joined to the original table -- typically.

As opposed to 

select * from t1 where exists ( select id from t2 where y = x ).

Outer query is processed 1st, for every row returned by t1, the inner query runs once and is evaluated with the outer row.

That is processed more like:

for x in ( select * from t1 )
loop
if ( exists ( select null from t2 where y = x.x )
then 
OUTPUT THE RECORD
end if
end loop

It always results in a full scan of T1 whereas the first query can make use of an index on T1 (x).

So, when is EXISTS appropriate and IN appropriate?

Lets say the result of the subquery
( select y from T2 )

is "huge" and takes a long time. But the outer table T1 is relatively small and executing the subquery is faster due to nice indexes joining T1 and T2. (x and y in the example above) Then exists will be faster as the time to do a  full scan on T1 and do the index probe into T2 could be less then the time to simply full scan T2 to build the subquery.

Lets say the result of the subquery is small -- then IN is typicaly more appropriate.

If both the subquery and the outer table are huge -- either might work as well as the other -- depends on the indexes and other factors. [AskTom]