Usually, the report types (destination types) provided by Oracle Reports are sufficient. You can write your report to a file, keep it in the cache, send it to a printer, and so on. See the list of standard types in the Oracle documentation.
It is also possible to create your own destination, but the documentation is a bit vague about this, to put it mildly. Once you know where to look though, it is quite easy to accomplish.
First of all, you will need to modify the Reports Server configuration file. You can do this through Enterprise Manager, or edit the file directly ($ORACLE_HOME/reports/conf/<repserv>.conf). You must add a destination tag, alongside the ones already present in this file. Something like this is sufficient:
<destination destype="myDest" class="myDestClass"/>
And now, your Reports Server will not start, or will at least give you an error that the specified class does not exist. This is the hard part. You must create a Java class that extends the Destination class. So, you find yourself a Java programmer (shouldn't be that hard to find, these days) and have them create your own class. In this class, you can do (almost) anything you can imagine. You will have the XML provided by the reports engine available, and you will need some imagination as to what you want to do with it.
After the class is created, compiled and packaged, you add it to the classpath of the Reports Server. The easiest way to do this is through Enterprise Manager. After that, the Reports Server will be able to find the new class and start the engine.
You could, for example, create an RTF document from the standard XML using a stylesheet and then mail it to one or more recipients. You could even send it to a printer and e-mail it from the same class. Any requirement you might have for your reports can now be met; it will require some effort to implement, though.
This is a good way to enhance the output options of reports, it is just too bad that it is not so very well documented...
When you start working with this, there will undoubtedly be some other obstacles, but if you have any questions, you know where to find me (i.e. right here ;-)
My blog on Oracle (and to a certain degree Life, the Universe and Everything)
Monday, December 13, 2010
Wednesday, December 1, 2010
Using too many indexes
When writing about the use of indexes (or rather, the lack of it), it soon occurred to me that when you tell developers to use indexes, some of them take that too far. I once saw an 18-column table with a total of 20 indexes on it. That might be useful in a certain DWH database, but this was an OLTP database.
But why is it not good to have a lot of indexes? Is there such a thing as too many indexes? From the optimizer's point of view, it just means a few extra access paths to consider; that should not be too difficult. From a DML perspective, however, when you insert a record or update certain columns, all related indexes must be modified as well. So, instead of inserting one row (one I/O operation), you could end up inserting one row plus 20 index entries (21 I/O operations!).
So yes, you can certainly have too many indexes.
What to do? Well, the obvious thing to do is to get rid of all the unnecessary indexes. But not knowing which indexes are useful, you might accidentally drop the wrong ones and end up with terrible performance for your application.
The best course of action would be to let the database determine what indexes are being used and which indexes are candidates for deletion. You do this with the following statement:
ALTER INDEX index MONITORING USAGE;
This starts monitoring whether the index is used by any statement in the database. You can let this monitoring run as long as you like. I would advise using a representative period of time, so you can be sure that each and every part of your application has been exercised (every statement out there should have run at least once). Also consider special cases, like an end-of-month report: it would be a shame to drop an index and then discover that this crucial report now takes two days to complete…
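If you want to switch monitoring on for every index in a schema at once, you can generate the statements from the data dictionary. A sketch (the spool file name is just an example; run it as the schema owner):

```sql
-- Generate a MONITORING USAGE statement for each index
-- owned by the current user, then run the generated script.
SET PAGESIZE 0 FEEDBACK OFF
SPOOL start_monitoring.sql
SELECT 'ALTER INDEX ' || index_name || ' MONITORING USAGE;'
FROM   user_indexes;
SPOOL OFF
@start_monitoring.sql
```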
After you have monitored long enough, check the V$OBJECT_USAGE view. It contains a row for each monitored index, including a column USED. If this is set to YES, the index was used during the monitoring period; if not, the value will be NO.
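To list the candidates in one go, you can query the view directly; a minimal sketch:

```sql
-- Indexes that were monitored but never used.
-- Note: V$OBJECT_USAGE only shows indexes of the current schema.
SELECT index_name, table_name, start_monitoring, end_monitoring
FROM   v$object_usage
WHERE  used = 'NO';
```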
Now you can determine whether there are unused indexes in your data model. These are the candidates for deletion. Remember that index usage is not restricted to query performance: do not drop primary key or unique key indexes just because they were not used in your monitored workload. They have a different (but important) reason for being there.
After you complete your analysis, you can disable monitoring:
ALTER INDEX index NOMONITORING USAGE;
This way, you keep the minimum number of indexes while retaining optimal performance. You (or rather, the database) will save a lot of effort updating indexes that will most likely never be used, and thus optimize your performance just that bit more.
Also see the Oracle documentation for some more details…
Sunday, November 14, 2010
Why won't developers use indexes
It keeps turning up again and again. The application is released to the customer and after a few weeks, or even a few days, the first issues arrive on the slow performance.
Time and again, this is due to the application queries not using the right indexes, or not using any indexes at all. The queries may be well written, as I keep repeating to the developers how important that is. I also keep telling everybody about the usefulness of indexes, but somehow that doesn't seem to stick...
So, let's try this one more time ;-)
If you write a query (any query), it is important to ask yourself a few questions:
- What data will this query access?
- How are the tables in this query joined? Am I using the right join conditions (comparing the right attributes)?
- Am I using the right filter conditions on the data?
These questions are of a more logical nature; they have everything to do with constructing a good query. The "hard" part of the puzzle is technical:
- How will the database access my data?
- Are there enough fast access paths (at least one) to choose from?
- Will this query still perform when retrieving the required amount of data?
So, to all developers out there, please consider your queries carefully. After creating the query, try to anticipate how it will function. If there is any doubt if the query will perform adequately, at least test it with a representative amount and distribution of data.
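A quick way to anticipate how a query will function is to look at its execution plan before shipping it. A sketch, using a hypothetical EMP table:

```sql
-- Show the access path the optimizer chooses for a query.
EXPLAIN PLAN FOR
  SELECT ename
  FROM   emp
  WHERE  empno = 7839;

-- Display the plan just explained (full table scan? index range scan?).
SELECT * FROM TABLE(dbms_xplan.display);
```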
Of course, I am fully aware of the many types of queries out there, not to mention the many types of indexes you can use. It usually pays off to invest some time reading up on the use of indexes. There are a great number of resources out there on the internet, so I won't even try to begin explaining them here. As a developer, do yourself a favor and spend some time getting to know the workings of the most common index types.
And if all else fails: please contact your DBA...
Friday, November 5, 2010
Using PL/SQL tables as IN- and OUTPUT parameters can be bad for performance
Sometimes, when you start a new project, you want to go the extra mile and "do things right this time". That is just what one of our programmers thought when setting up a new package to manipulate large amounts of data. Instead of repeating the code to merge two PL/SQL tables, he wrote a procedure to do it for him. Nice and clean, modular programming, etc.
First day of testing: everything works fine.
Second day: still going strong.
Third day: the load test. With amounts of data not even close to the amounts to be processed in production, the system slows to a crawl. Processing seems to take forever and memory usage is going through the roof.
In this case, the separate procedure for merging two PL/SQL tables was the problem. The developer had created something like this, stripped of any irrelevant code (Seven of Nine: Code is irrelevant...):
procedure concat_plsql_table
( p_tab1 IN t_test_tab
, p_tab2 IN OUT t_test_tab
)
IS
BEGIN
--
-- LOOP through all records of the first table
-- and place them in the output table
--
FOR i IN 1..p_tab1.count LOOP
p_tab2(p_tab2.count+1) := p_tab1(i);
END LOOP;
END;
The problem with this is that the parameters are passed by value. This means that the runtime engine creates a copy of the variable, works on that copy, and copies it back to the calling procedure when finished. It is all this copying that wreaks havoc on your performance and memory.
The solution can be very simple:
* either do not use a separate procedure
* or use the NOCOPY compiler hint in your code
procedure concat_plsql_table
( p_tab1 IN t_test_tab
, p_tab2 IN OUT NOCOPY t_test_tab
)
IS
BEGIN
--
-- LOOP through all records of the first table
-- and place them in the output table
--
FOR i IN 1..p_tab1.count LOOP
p_tab2(p_tab2.count+1) := p_tab1(i);
END LOOP;
END;
With the NOCOPY compiler hint, the compiler is instructed not to copy the variable, but to pass a reference to the original variable and work from there. Note that NOCOPY is only a hint: in some situations the compiler will silently ignore it. There are also some limitations on its usage (see the Oracle documentation), so be careful how you use it.
In this case, performance improved by a factor of about 200, from minutes to seconds of processing time. For any procedure passing large amounts of data in arrays, this option is worth considering.
For a more complete example, see the code below:
create or replace package test_plsql_table
as
--
-- Main procedure
--
procedure test_response;
--
end;
/
create or replace package body test_plsql_table
as
--
-- pl/sql table
--
type t_test_rec IS RECORD ( id number
, desc_col varchar2(100)
);
type t_test_tab IS TABLE of t_test_rec
index by binary_integer;
--
-- Concatenate two pl/sql tables into a third
--
procedure concat_plsql_table
( p_tab1 IN t_test_tab
, p_tab2 IN OUT t_test_tab
)
IS
BEGIN
--
-- LOOP through all records of the first table
-- and place them in the output table
--
FOR i IN 1..p_tab1.count LOOP
p_tab2(p_tab2.count+1) := p_tab1(i);
END LOOP;
END;
--
-- Main procedure
--
procedure test_response
IS
l_tab1 t_test_tab;
l_tab2 t_test_tab;
l_tab3 t_test_tab;
BEGIN
--
-- Display the start time
--
dbms_output.put_line('Procedure started at '||to_char(systimestamp,'hh24:mi:ss'));
--
-- Initial content for the first table: 500 records
--
FOR i IN 1..500 LOOP
l_tab1(i).id := i;
l_tab1(i).desc_col := 'Record '||i||' in initial table';
END LOOP;
--
-- LOOP over records and in each LOOP, concatenate the initial table with a second one
--
FOR i IN 1..1000 LOOP
concat_plsql_table( l_tab1
, l_tab2);
IF mod(i,100) = 0 THEN
dbms_output.put_line('.. '||lpad(i*500,6,' ')||' records at '||to_char(systimestamp,'hh24:mi:ss'));
END IF;
END LOOP;
--
-- tab2 should now contain a lot of records
--
dbms_output.put_line('Number of records created in result table 2 = '||l_tab2.count);
--
-- Display the intermediate time: the inlined version starts here
--
dbms_output.put_line('.');
dbms_output.put_line('Inlined version started at '||to_char(systimestamp,'hh24:mi:ss'));
--
-- LOOP over records again; this time the concatenation is inlined (no procedure call)
--
FOR i IN 1..1000 LOOP
FOR j IN 1..l_tab1.count LOOP
l_tab3(l_tab3.count+1) := l_tab1(j);
END LOOP;
IF mod(i,100) = 0 THEN
dbms_output.put_line('.. '||lpad(i*500,6,' ')||' records at '||to_char(systimestamp,'hh24:mi:ss'));
END IF;
END LOOP;
--
-- tab3 should now contain a lot of records (same amount as tab2)
--
dbms_output.put_line('Number of records created in result table 3 = '||l_tab3.count);
--
-- Display the end time
--
dbms_output.put_line('.');
dbms_output.put_line('Procedure finished at '||to_char(systimestamp,'hh24:mi:ss'));
END;
--
end;
/
Monday, October 25, 2010
Modifying Standard Apex Javascript
Last week, I ran into trouble having to adjust a standard APEX JavaScript function. I wanted to modify some of the functionality of the doSubmit() function. Normally, that would be no problem, because you can write your own JavaScript functions and use the standard functions (like doSubmit) inside them.
It turned out that when using the option "Create a button displayed among this region's items", you have less control over what the button does: APEX generates a button with an onClick=doSubmit() attribute. There is no way of overriding this (at least I haven't found one), so you can't change the JavaScript behaviour of this button*.
So, what I really wanted is to write my own doSubmit, and from this function, call the original doSubmit, so I wouldn't have to worry about missing any standard code after my own validations and checks.
Luckily, you can replace the standard functionality AND use the standard functions at the same time. All you have to do is store the original function in a variable and create your own version. In your own version, you can reuse the original function by invoking the variable you created. This code is to be placed in the "HTML Header" section of the "Edit page attributes" page:
<script language="JavaScript" type="text/javascript">
<!--
var org_doSubmit = window.doSubmit;
//
window.doSubmit = function(request){
if (request == 'P1_SAVE') {
if (confirm("Are you sure you want to save?")) {
org_doSubmit('P1_SAVE');
}
else {
org_doSubmit('P1_CANCEL');
}
}
else {
org_doSubmit(request);
}
}
//-->
</script>
In this example, I created the variable org_doSubmit, storing the original doSubmit function from the standard APEX library. After that, I redefined the doSubmit function (one input parameter, which I named request). When the user chooses Save, I ask for confirmation; if he decides not to save, I reroute to the Cancel functionality. This assumes a page that has at least a P1_SAVE and a P1_CANCEL button, with corresponding branches.
This way, you can override and at the same time reuse the standard Apex Javascript features, should the need arise.
* For the other type of button ("Create a button in a region position") this is rather straightforward: you set the action to "target is a URL" and invoke your named function in the URL section. This option is simply not there for buttons among region items...
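The store-and-wrap trick itself is plain JavaScript and has nothing APEX-specific about it, so you can try it standalone. A minimal sketch (the greet function and all names in it are made up for illustration):

```javascript
// A stand-in for a "library" function such as doSubmit.
var greet = function (name) {
  return 'Hello, ' + name;
};

// Keep a reference to the original before redefining it.
var orgGreet = greet;

// Redefine the function, delegating to the original inside.
greet = function (name) {
  if (name === 'nobody') {
    // Intercept one specific request, like the P1_SAVE case above.
    return orgGreet('anonymous');
  }
  return orgGreet(name);
};
```

Every call to greet now goes through the wrapper and still ends up in the original function, just as the doSubmit replacement routes back into org_doSubmit.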
Monday, September 6, 2010
"NVL" for SQL*plus commandline parameters
Recently, I had to create an installer SQL*Plus script which could accept input parameters. "So, big deal?" I hear you think. And at first, the task seemed simple enough. Until I started testing it (which is always a good idea in my opinion…)
I created a script "installer.sql" and referred to the input parameters as &1, &2 and &3. I needed these for the username, password and database (a TNSNames entry, to be precise). For the sake of simplicity, let’s say this was the entire script:
PROMPT connecting as &1
CONNECT &1/&2@&3
SELECT username from user_users;
and then run the script:
$ sqlplus /nolog @installer.sql test test test
Simple enough, right? The script does exactly what you'd expect: it prompts that it will connect as user "test", then connects (provided you actually have a user test with password test on database test...) and shows the current user (hopefully "test").
I then tried to use further parameters (&4) to do more than just connecting. That works fine, of course, but I ran into trouble when trying to leave the fourth parameter empty. I wanted this parameter to be optional, and to give it a value based on the contents of the database.
What happens is that the fourth parameter is not set to NULL; the parameter is not recognised at all. The first thing SQL*Plus does is ask you for it:
Enter value for 4:
So, when parameters are not entered on the command line, SQL*Plus will not recognise them. After a lot of experimenting, I found that you can define your own "parameter 4" using the COLUMN syntax. This gives you a sort of NVL function for the input parameters. I haven't exactly figured out why this works, but so far, all tests are positive:
COLUMN inputpar01 NEW_VALUE 1 NOPRINT
select 1 inputpar01
from dual
where 1=2;
This fools SQL*Plus into creating parameter 1 explicitly (instead of implicitly), but the nice part is that it retains the value of the implicit parameter, if one was given.
[Edit]: Please note that for sqlplus 11.x and above, you must be connected to a database for this part to work. So you will have to replace the "/nolog" with a valid login. This can be for any database, though.[/Edit]
I then modified the script to, basically, the following:
COLUMN inputpar01 NEW_VALUE 1 NOPRINT
COLUMN inputpar02 NEW_VALUE 2 NOPRINT
COLUMN inputpar03 NEW_VALUE 3 NOPRINT
COLUMN inputpar04 NEW_VALUE 4 NOPRINT
select 1 inputpar01
, 2 inputpar02
, 3 inputpar03
, 4 inputpar04
from dual
where 1=2;
--
PROMPT connecting as &1
CONNECT &1/&2@&3
SELECT username from user_users;
PROMPT value for parameter 4 = &4
You can modify this with extra checks for mandatory parameters, allowed values, etc, etc. This way, you actually have an option to use NVL-like functionality on your input parameters.
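As a sketch of such a check (the error message and the meaning of parameter 1 are examples): once the parameters are explicitly defined, an empty mandatory value can be caught in a PL/SQL block and abort the script:

```sql
-- Abort the installer when the username (&1) was not supplied.
WHENEVER SQLERROR EXIT SQL.SQLCODE
BEGIN
  IF '&1' IS NULL THEN
    raise_application_error(-20001,
      'Parameter 1 (username) is mandatory');
  END IF;
END;
/
```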
Wednesday, August 4, 2010
The wrong way to build a CLOB
Recently, I came across one of the old CLOB-related problems again. A developer on the team had to create a CLOB, and started out creating the pieces for it (there was some extensive logic behind the various pieces). After that, he inserted a row into the table, retrieved the CLOB locator and started to piece the CLOB together.
So far, nothing is wrong functionally. He just broke the work down and created (a lot of) pieces for the final CLOB. The mistake was inserting an empty CLOB and then appending pieces to it many times. To elaborate, I'll give you an example.
First, we create a simple table:
create table lob_test
( id number
, text clob
)
lob (text) store as (disable storage in row);
And then, we insert 100 CLOBs by piecewise modification of a pre-inserted empty CLOB:
DECLARE
l_text CLOB;
l_piece char(1024) := 'X';
BEGIN
-- create 100 CLOB's
FOR l_id IN 1..100 LOOP
-- First, create the CLOB in the table
INSERT INTO lob_test (id, text)
VALUES (l_id, EMPTY_CLOB())
RETURNING text INTO l_text;
-- Then, with the LOB locator,
-- build your CLOB from pieces
FOR l_loop IN 1..100 LOOP
-- 100 pieces of 1k
dbms_lob.append(l_text, l_piece);
END LOOP;
END LOOP;
--
COMMIT;
--
END;
This took about 22 seconds on my database. Remember this for future reference...
Now, we take a different approach. Instead of inserting an empty CLOB and modifying it, we create a temporary CLOB, modify that, and postpone the insert until the CLOB is fully built:
DECLARE
l_text CLOB;
l_piece char(1024) := 'X';
BEGIN
-- create 100 CLOB's
FOR l_id IN 101..200 LOOP
-- First, create a temporary CLOB
dbms_lob.createtemporary(l_text,TRUE);
-- Then, with the LOB locator,
-- build your CLOB from pieces
FOR l_loop IN 1..100 LOOP
-- 100 pieces of 1k
dbms_lob.append(l_text, l_piece);
END LOOP;
-- Finally, insert the CLOB
INSERT INTO lob_test (id, text)
VALUES (l_id, l_text);
-- Release the temporary CLOB before the next iteration
dbms_lob.freetemporary(l_text);
END LOOP;
--
COMMIT;
--
END;
This last piece of code does exactly the same (functionally), but it executes in under 2 seconds this time.
The reason the first method takes so much longer is that every call to dbms_lob.append updates the on-disk blocks of the CLOB, generating UNDO and REDO. The second method holds the CLOB in memory, so there is no table block to update and no UNDO and REDO are created. This not only speeds things up, but also has a big impact on the amount of redo logging generated on your production database.
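If you want to verify this on your own system, you can check the redo generated by your session before and after each method; a sketch (the statistic name 'redo size' is the actual Oracle name, the values will vary):

```sql
-- Redo bytes generated by the current session so far.
SELECT s.value AS redo_bytes
FROM   v$mystat   s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name = 'redo size';
```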
So, overall, try to keep the updating of a CLOB column to a minimum.