Wednesday, August 4, 2010

The wrong way to build a CLOB

Recently, I came across one of the old CLOB-related problems again. A developer on the team had to create a CLOB and started out to create the pieces for the CLOB (there was some extensive logic applicable to the various pieces). After that, he inserted a row into the table, retrieving the CLOB handle and started to piece together the CLOB.

So far, there is nothing wrong functionally. He just broke down the work and created (a lot of) pieces for the final CLOB. The mistake was in inserting an empty CLOB and adding pieces to it a lot of times. To elaborate on that, I'll give you an example.

First, we create a simple table:

  create table lob_test
  ( id number
  , text clob
  )
  lob (text) store as (disable storage in row);


And then, we insert 100 CLOB's using the piecewise modification of a pre-inserted empty CLOB.

  DECLARE
    l_text CLOB;
    l_piece char(1024) := 'X';
  BEGIN
    -- create 100 CLOB's
    FOR l_id IN 1..100 LOOP
      -- First, create the CLOB in the table
      INSERT INTO lob_test (id, text)
        VALUES (l_id, EMPTY_CLOB())
        RETURNING text INTO l_text;
      -- Then, with the LOB locator,
      -- build your CLOB from pieces
      FOR l_loop IN 1..100 LOOP
        -- 100 pieces of 1k
        dbms_lob.append(l_text, l_piece);
      END LOOP;
    END LOOP;
    --
    COMMIT;
    --
  END;

This took about 22 seconds on my database. Remember this for future reference...

Now, we use a different approach. Instead of inserting an empty CLOB and modifying it, we now create a temporary CLOB, modify that and postpone the insert until we have a fully created CLOB:

  DECLARE
    l_text CLOB;
    l_piece char(1024) := 'X';
  BEGIN
    -- create 100 CLOB's
    FOR l_id IN 101..200 LOOP
      -- First, create a temporary CLOB
      dbms_lob.createtemporary(l_text,TRUE);
      -- Then, with the LOB locator,
      -- build your CLOB from pieces
      FOR l_loop IN 1..100 LOOP
        -- 100 pieces of 1k
        dbms_lob.append(l_text, l_piece);
      END LOOP;
      -- Finally, insert the CLOB
      INSERT INTO lob_test (id, text)
        VALUES (l_id, l_text);
    END LOOP;
    --
    COMMIT;
    --
  END;

This last piece of code does exactly the same (functionally), but it executes in under 2 seconds this time.

The reason that the first method takes so much longer, is that every call to dbms_lob.append will update the block for the CLOB. It generates UNDO and REDO. The second method holds the CLOB in memory only and has no block to update, so no UNDO and REDO is created. This will not only speed thing up, but will also have a lot of impact on the amount of logging created on your production database.

So, overall, try to keep the updating of a CLOB column to a minimum.