Life, the Universe and Everything Oracle: Using PL/SQL tables as IN- and OUTPUT parameters can be bad for performance

Sometimes, when you start a new project, you want to go that extra mile and "do thing right this time". That is just what one of our programmers thought when setting up a new package to manipulate large amounts of data. Instead of repeating the code to merge two PL/SQL tables, he wrote a procedure to do that for him. Nice and clean, modular programming, etc, etc.

First day of testing: everything works fine.
Second day: still going strong.
Third day: the load test. With amounts of data not even close to the amounts to be processed in production, the system slows to a crawl. Processing seems to take forever and memory usage is going through the roof.

In this case, the separate procedure for merging two PL/SQL tables was the problem. When creating this procedure, the developer created something like this, stripped of any irrelevant code (Seven of Nine: Code is irrelevant...)

procedure concat_plsql_table
          ( p_tab1  IN      t_test_tab
          , p_tab2  IN OUT  t_test_tab
          )
IS
BEGIN
  --
  -- LOOP through all records of the first table
  -- and place them in the output table
  --
  FOR i IN 1..p_tab1.count LOOP
    p_tab2(p_tab2.count+1) := p_tab1(i);
  END LOOP;
END;

The problem with this, is that the parameters are referenced by value. This means that the compiler creates a copy of the variable, starts working on that variable, and returns a copy to the calling procedure when finished. It is all this copying that wreaks havoc on your performance and memory.

The solution can be very simple:
* either do not use a separate procedure
* or use the NOCOPY compiler hint in your code

procedure concat_plsql_table
          ( p_tab1  IN            t_test_tab
          , p_tab2  IN OUT NOCOPY t_test_tab
          )
IS
BEGIN
  --
  -- LOOP through all records of the first table
  -- and place them in the output table
  --
  FOR i IN 1..p_tab1.count LOOP
    p_tab2(p_tab2.count+1) := p_tab1(i);
  END LOOP;
END;

With the NOCOPY compiler hint, the compiler is instructed not to copy the variable, but to use a reference to the original variable and work from there. There are some limitations on the usage (see the Oracle Documentation), so be careful on how you use this.

In this case, the performance went up by about 200 times, from minutes to seconds prcessing time. For any procedure using large amounts of data in arrays, it is worthwhile to consider this option.

For a more complete example, see the code below:

create or replace package test_plsql_table
as
  --
  -- Main procedure
  --
  procedure test_response;
  --
end;
/

create or replace package body test_plsql_table
as
  --
  -- pl/sql table
  --
  type t_test_rec IS RECORD ( id       number
                            , desc_col varchar2(100)
                            );
  type t_test_tab IS TABLE of t_test_rec
       index by binary_integer;
  --
  -- Concatenate two pl/sql tables into a third
  --
  procedure concat_plsql_table
            ( p_tab1  IN     t_test_tab
            , p_tab2  IN OUT t_test_tab
            )
  IS
  BEGIN
    --
    -- LOOP through all records of the first table
    -- and place them in the output table
    --
    FOR i IN 1..p_tab1.count LOOP
      p_tab2(p_tab2.count+1) := p_tab1(i);
    END LOOP;
  END;
  --
  -- Main procedure
  --
  procedure test_response
  IS
    l_tab1  t_test_tab;
    l_tab2  t_test_tab;
    l_tab3  t_test_tab;
  BEGIN
    --
    -- Display the start time
    --
    dbms_output.put_line('Procedure started at '||to_char(systimestamp,'hh24:mi:ss'));
    --
    -- Initial content for the first table: 50 records
    --
    FOR i IN 1..500 LOOP
      l_tab1(i).id       := i;
      l_tab1(i).desc_col := 'Record '||i||' in initial table';
    END LOOP;
    --
    -- LOOP over records and in each LOOP, concatenate the initial table with a second one
    --
    FOR i IN 1..1000 LOOP
      concat_plsql_table( l_tab1
                        , l_tab2);
      IF mod(i,100) = 0 THEN
        dbms_output.put_line('.. '||lpad(i*500,6,' ')||' records at '||to_char(systimestamp,'hh24:mi:ss'));
      END IF;
    END LOOP;
    --
    -- tab2 should now contain a lot of records
    --
    dbms_output.put_line('Number of records created in result table 2 = '||l_tab2.count);
    --
    -- Display the end time
    --
    dbms_output.put_line('.');
    dbms_output.put_line('Procedure started at '||to_char(systimestamp,'hh24:mi:ss'));
    --
    -- LOOP over records and in each LOOP, concatenate the initial table with a second one
    --
    FOR i IN 1..1000 LOOP
      FOR i IN 1..l_tab1.count LOOP
        l_tab3(l_tab3.count+1) := l_tab1(i);
      END LOOP;
      IF mod(i,100) = 0 THEN
        dbms_output.put_line('.. '||lpad(i*500,6,' ')||' records at '||to_char(systimestamp,'hh24:mi:ss'));
      END IF;
    END LOOP;
    --
    -- tab3 should now contain a lot of records (same amount as tab2)
    --
    dbms_output.put_line('Number of records created in result table 3 = '||l_tab3.count);
    --
    -- Display the end time
    --
    dbms_output.put_line('.');
    dbms_output.put_line('Procedure finished at '||to_char(systimestamp,'hh24:mi:ss'));
  END;
  --
end;
/

Life, the Universe and Everything Oracle

Friday, November 5, 2010

Using PL/SQL tables as IN- and OUTPUT parameters can be bad for performance

No comments:

Post a Comment