Pravin Satav

Reputation: 702

Rebuild Informatica persistent cache only when data is updated in the table

I have a requirement to build a persistent cache in Informatica for a large table, due to a performance issue. The cache should be rebuilt only when there is some change in the table (the database is Oracle). In my case, the table is updated weekly/monthly.

I am trying to figure out how Informatica will know that the table has changed and that it needs to rebuild the cache. Is there an elegant approach to achieve this?

One way I can think of is to create a flat file with the lookup table data every time the lookup table is updated, and then use the flat file as the lookup. This is possible in my case because the lookup table is loaded through Informatica. But what if it is updated outside Informatica?

Thank you

Upvotes: 2

Views: 1222

Answers (1)

Utsav

Reputation: 8103

In this example, I am using the source table emp_location in the queries as your sample table. The cache files also start with emp_location*.

I created 2 dummy sessions before your main session.

  1. Create a run table and save the last session run time in it.

     select * 
     from run_history
     where process_name='EMP_DIM'
    
    +--------------+----------------------+
    | PROCESS_NAME |    LAST_READ_TIME    |
    +--------------+----------------------+
    | EMP_DIM      | 15-MAY-2016 12:00:07 |
    +--------------+----------------------+
    

    In the pre-session task of your main session, use something like this.

    update run_history 
    set last_read_time = sysdate
    where process_name='EMP_DIM';
    
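A minimal sketch of that run table, assuming Oracle and one row per process (the table and column names are only illustrative, chosen to match the sample output above):

```sql
-- Sketch only: names match the sample above; adjust to your standards.
create table run_history (
    process_name   varchar2(30) primary key,
    last_read_time date
);

-- Seed one row per process so the pre-session UPDATE has a row to update.
insert into run_history (process_name, last_read_time)
values ('EMP_DIM', sysdate);
commit;
```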

Now find the time your table was last updated. If the table doesn't have an update-time column, use this query to get the latest update time of the table.

select scn_to_timestamp(max(ora_rowscn)) 
from emp_location;

Now, in the first dummy session, use this query. It will return 1 row if something was updated in the source table after last_read_time; if not, it will return 0 rows.

select 1 from dual
where 
    (select scn_to_timestamp(max(ora_rowscn)) 
      from emp_location)
>
    (select last_read_time 
       from run_history
       where process_name='EMP_DIM')
  2. In the link task, put the condition $s_check_last_update.TgtSuccessRows=1.

    So the next session will run only when there was a real change. In its post-session task, run a command to clear the cache files.

    Windows: del $PMCacheDir\emp_location*
    Unix: rm -rf $PMCacheDir/emp_location*

  3. This link task will have a condition like IIF($s_check_last_update.TgtSuccessRows=1,0,1).

  4. In the main session, open the General tab and select Treat Input Links as OR.

    Now the Integration Service will recreate fresh cache files if they are deleted.

Alternatively, you can achieve the same thing with a shell/bash script that connects to Oracle, checks whether the table was updated after the last read time, and, if so, removes the cache files.
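That script approach can be sketched roughly as below. The cache-clearing step is factored into a small function; the sqlplus invocation and connection string shown in the comment are assumptions to adapt to your environment:

```shell
#!/bin/sh
# Sketch, not a drop-in script. Remove the persistent cache files only when
# the database check reported a change ($1 = check result, $2 = cache dir).
clear_cache_if_changed() {
    if [ "$1" = "1" ]; then
        rm -f "$2"/emp_location*
    fi
}

# In the real script the check result would come from something like this
# (connection details are placeholders):
#
#   changed=$(sqlplus -s "$ORA_CONN" <<'EOF'
#   set heading off feedback off
#   select 1 from dual
#   where (select scn_to_timestamp(max(ora_rowscn)) from emp_location)
#       > (select last_read_time from run_history
#          where process_name = 'EMP_DIM');
#   EOF
#   )
#
#   clear_cache_if_changed "$changed" "$PMCacheDir"
```

The Integration Service then rebuilds the cache on the next run only when the files are gone, which is the same effect as the dummy-session approach.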

Upvotes: 2
