Oracle DBA Blog: Entries tagged as Audit Vault

OCT 13: Adding a new Oracle host to Audit Vault

[AV Server] is your Audit Vault Server host - one of these.
[Agent] is the server where the databases to be audited are hosted - probably quite a few of these.
[Source Database] is one of the many database instances running on the Agent server(s) that you want to collect audit information from.

Where you need to supply your own values, these are shown in italics. I've assumed here that I want to audit a PROD1 database on the prod host.

1. [AV Server] Add the new agent.

avca add_agent -agentname agent_prod -agenthost prod

AVCA started
Adding agent...
Enter agent user name: agent_prod
Enter agent user password: GHt5_5dhY
Re-enter agent user password: GHt5_5dhY
Agent added successfully.

2. [Agent] Install the Agent software. You'll need the agent details that you added to the AV Server in step 1.

3. [Source Database] Create an account in the database that you want to collect audit data from and then assign the correct privileges to it. (Note that you only need to run the last command if you'll be using the Redo Collector.)

create user avsrc_prod identified by avpwd1;
@/oem/oracle/product/av/scripts/streams/source/zarsspriv.sql avsrc_prod setup;
@/oem/oracle/product/av/scripts/streams/source/zarsspriv.sql avsrc_prod redo_coll;

4. [AV Server] Verify that source database configuration is ok.

avorcldb verify -src prod:1521:PROD1.WORLD -colltype ALL

5. [AV Server] Add Source

avorcldb add_source -src prod:1521:PROD1.WORLD -desc PROD1 -agentname agent_prod

6. [AV Server] Add Collectors. There are three collector types you could add. Here I'm just adding the DBAUD and OSAUD collectors.

avorcldb add_collector -srcname PROD1.WORLD -agentname agent_prod -colltype OSAUD \
         -orclhome /oem/oracle/product/10.2.0.4 -collname PROD1_OSAUD

avorcldb add_collector -srcname PROD1.WORLD -agentname agent_prod -colltype DBAUD \
         -collname PROD1_DBAUD

7. [Agent] Complete Source configuration. This sets up tnsnames.ora and Wallet entries on the Agent machine.

avorcldb setup -srcname PROD1.WORLD -srcusr -wpwd GHt5_5dhY

8. [AV Server] Start Agent

avctl start_agent -agentname agent_prod

9. [AV Server] Start Collectors

avctl start_collector -collname PROD1_OSAUD -srcname PROD1.WORLD
avctl start_collector -collname PROD1_DBAUD -srcname PROD1.WORLD
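
If you have a lot of source databases to add, the AV Server side of the steps above (4, 5, 6, 8 and 9) is easy to template. This is only a sketch that prints the commands rather than running them, because step 7 has to happen on the Agent in between. The function name and argument order are my own invention, and it assumes the collector names and -desc follow the PROD1 pattern used here.

```shell
# Print the AV Server commands for one source database (dry run only).
# Arguments: host, listener port, service name, agent name, source ORACLE_HOME.
av_server_commands() {
    host="$1"; port="$2"; svc="$3"; agent="$4"; orclhome="$5"
    sid="${svc%%.*}"                 # PROD1.WORLD -> PROD1 (assumed naming pattern)
    src="${host}:${port}:${svc}"
    echo "avorcldb verify -src ${src} -colltype ALL"
    echo "avorcldb add_source -src ${src} -desc ${sid} -agentname ${agent}"
    echo "avorcldb add_collector -srcname ${svc} -agentname ${agent} -colltype OSAUD -orclhome ${orclhome} -collname ${sid}_OSAUD"
    echo "avorcldb add_collector -srcname ${svc} -agentname ${agent} -colltype DBAUD -collname ${sid}_DBAUD"
    echo "avctl start_agent -agentname ${agent}"
    echo "avctl start_collector -collname ${sid}_OSAUD -srcname ${svc}"
    echo "avctl start_collector -collname ${sid}_DBAUD -srcname ${svc}"
}

av_server_commands prod 1521 PROD1.WORLD agent_prod /oem/oracle/product/10.2.0.4
```

Review the printed commands and run them by hand in order, slotting the Agent-side setup in before the start commands.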

Thanks again to Tammy for her help.

OCT 13: Hanging Audit Vault Warehouse Refresh Job

It was just a Pre-Production system that we could play around with but I noticed that there were no recent audit trail entries when I tried some tests. On closer inspection, the most recent audit record was a couple of days old, so it looked like the job which inserts new audit records into the Audit Vault Data Warehouse wasn't running properly. The first step was to connect to that database instance and check the status of the job.

SQL> select owner, job_name, elapsed_time from all_scheduler_running_jobs; 

OWNER                          JOB_NAME 
------------------------------ ------------------------------ 
ELAPSED_TIME 
--------------------------------------------------------------------------- 
AVSYS                          REFRESH_WAREHOUSE_DATA 
+002 21:06:37.19

OK, so the Warehouse Refresh job has been running for a few days (2 days, 21 hours and 6 minutes). When I looked at the session for that job, it was waiting on "enq: PS - contention" which is (trumpet salute) a parallel execution wait event.
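
ELAPSED_TIME comes back as a day-to-second interval, which is awkward to compare against a threshold if you want to script a check for a stuck refresh. A minimal sketch of converting it to whole seconds in the shell follows. The function name is mine, and it assumes the +DDD HH:MI:SS.FF layout shown above.

```shell
# Convert an interval such as "+002 21:06:37.19" to whole seconds.
# awk does the arithmetic so that leading zeros are never read as octal.
elapsed_to_seconds() {
    printf '%s\n' "$1" | awk -F'[ :]' '{
        sub(/^\+/, "", $1)
        printf "%d\n", $1 * 86400 + $2 * 3600 + $3 * 60 + int($4)
    }'
}

elapsed_to_seconds "+002 21:06:37.19"   # prints 248797
```

That figure is close to the LAST_CALL_ET of the job session, so either value would do as the basis for an alert.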

  1  select sid, username, last_call_et, event, wait_time, seconds_in_wait, state, 
            blocking_session, blocking_session_status 
  2  from v$session 
  3* where sid=201 
SQL> / 

       SID USERNAME                       LAST_CALL_ET 
---------- ------------------------------ ------------ 
EVENT                                                             WAIT_TIME 
---------------------------------------------------------------- ---------- 
SECONDS_IN_WAIT STATE               BLOCKING_SESSION BLOCKING_SE 
--------------- ------------------- ---------------- ----------- 
       201 AVSYS                                249575 
enq: PS - contention                                                      0 
          48361 WAITING                          169 VALID

It's a bit of a red herring because the job session is really waiting on the PX slave, so the real event should be in the blocking session, 169.

SQL> c/201/169 
  3* where sid=169 
SQL> / 

       SID USERNAME                       LAST_CALL_ET 
---------- ------------------------------ ------------ 
EVENT                                                             WAIT_TIME 
---------------------------------------------------------------- ---------- 
SECONDS_IN_WAIT STATE               BLOCKING_SESSION BLOCKING_SE 
--------------- ------------------- ---------------- ----------- 
       169 AVSYS                                249250 
cursor: pin X                                                             0 
              0 WAITING                              UNKNOWN
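
The approach so far is simply to follow BLOCKING_SESSION from one session to the next until you reach one that isn't blocked by anything. With more sessions in the chain that gets tedious by hand, so here is a rough sketch of the same walk in the shell. It assumes you have spooled "SID BLOCKING_SESSION" pairs to a file yourself, with "-" where there is no blocker. That format is hypothetical, not something SQL*Plus gives you out of the box.

```shell
# Follow the blocking chain from a starting SID to the session at the end of it.
# stdin: one "SID BLOCKING_SESSION" pair per line, "-" when there is no blocker.
root_blocker() {
    awk -v sid="$1" '
        NF >= 2 && $2 != "-" { blocker[$1] = $2 }
        END {
            hops = 0
            # Stop at a session with no recorded blocker; cap hops in case of a cycle.
            while ((sid in blocker) && hops++ < 100) sid = blocker[sid]
            print sid
        }'
}

printf '201 169\n169 -\n' | root_blocker 201   # prints 169
```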

Seeing that sent me off to Google, and what I found implies that I should look at bug number 5908030, which is stated as a duplicate of 5476091. Both are RDBMS bugs and both will be fixed in (gulp) 11.2.

The next job was to kill the currently running refresh job and restart it, particularly as this was just Pre-Prod and the lack of audit records in the Warehouse was holding up testing. I tried various approaches to stopping the job, ranging from the higher-level, more elegant approaches down to brute force.

SQL> exec dbms_scheduler.stop_job('AVSYS.REFRESH_WAREHOUSE_DATA'); 
BEGIN dbms_scheduler.stop_job('AVSYS.REFRESH_WAREHOUSE_DATA'); END; 

* 
ERROR at line 1: 
ORA-27365: job has been notified to stop, but failed to do so immediately 
ORA-06512: at "SYS.DBMS_ISCHED", line 164 
ORA-06512: at "SYS.DBMS_SCHEDULER", line 483 
ORA-06512: at line 1
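
For reference, there is a rung on the ladder between a plain stop_job and killing sessions: DBMS_SCHEDULER.STOP_JOB takes a force parameter, which needs the MANAGE SCHEDULER privilege. It isn't something tried in this post, and there is no guarantee it gets past a slave stuck on this bug. A small sketch that generates the SQL*Plus call follows; the function name and the job-name check are my own.

```shell
# Print a SQL*Plus call that stops a scheduler job with force => TRUE.
force_stop_sql() {
    job="$1"
    # Accept only names built from letters, digits, underscore and dot.
    case "$job" in
        ""|*[!A-Za-z0-9_.]*) echo "invalid job name: $job" >&2; return 1 ;;
    esac
    printf "exec dbms_scheduler.stop_job(job_name => '%s', force => TRUE);\n" "$job"
}

force_stop_sql AVSYS.REFRESH_WAREHOUSE_DATA
```

Paste the printed line into a SQL*Plus session connected as a suitably privileged user.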

I tried stop_job a few times and decided the job was probably well and truly stuck, particularly as the bugs referred to systemstate dumps, so I tried killing the stuck slave first.

 1  select sid, username, last_call_et, event, wait_time, seconds_in_wait, state, 
           blocking_session, blocking_session_status 
 2  from v$session 
 3* where sid=201 
SQL> / 

       SID USERNAME                       LAST_CALL_ET 
---------- ------------------------------ ------------ 
EVENT                                                             WAIT_TIME 
---------------------------------------------------------------- ---------- 
SECONDS_IN_WAIT STATE               BLOCKING_SESSION BLOCKING_SE 
--------------- ------------------- ---------------- ----------- 
       201 AVSYS                                251777 
enq: PS - contention                                                      0 
          50563 WAITING                          169 VALID 
SQL> 3 
  3* where sid=201 
SQL> c/201/169 
  3* where sid=169 
SQL> / 

       SID USERNAME                       LAST_CALL_ET 
---------- ------------------------------ ------------ 
EVENT                                                             WAIT_TIME 
---------------------------------------------------------------- ---------- 
SECONDS_IN_WAIT STATE               BLOCKING_SESSION BLOCKING_SE 
--------------- ------------------- ---------------- ----------- 
       169 AVSYS                                251446 
cursor: pin X                                                             0 
              0 WAITING                              UNKNOWN 

SQL> select sid, serial# from v$session where sid=169; 
       SID    SERIAL# 
---------- ---------- 
       169       1585 

SQL> alter system kill session '169,1585'; 
alter system kill session '169,1585' 
* 
ERROR at line 1: 
ORA-00031: session marked for kill

I waited and tried a few more times, but that slave really didn't want to die ...

SQL> l 
  1  select sid, username, last_call_et, event, wait_time, seconds_in_wait, state, 
            blocking_session, blocking_session_status 
  2  from v$session 
  3* where sid=169 
SQL> / 

       SID USERNAME                       LAST_CALL_ET 
---------- ------------------------------ ------------ 
EVENT                                                             WAIT_TIME 
---------------------------------------------------------------- ---------- 
SECONDS_IN_WAIT STATE               BLOCKING_SESSION BLOCKING_SE 
--------------- ------------------- ---------------- ----------- 
       169 AVSYS                                251725 
cursor: pin X                                                             0 
              0 WAITING                              UNKNOWN

So I tried killing the process

SQL> select sid, serial#, paddr, osuser, process, status 
  2  from v$session where sid=169; 

       SID    SERIAL# PADDR            OSUSER 
---------- ---------- ---------------- ------------------------------ 
PROCESS      STATUS 
------------ -------- 
       169       1585 070000005A383348 oracle 
962750       KILLED 

SQL> !ps -ef|grep 962750 
  oracle  962750       1   1   Jul 13      - 11:49 ora_p006_AV01TEST 
  oracle 1519634 1757348   2 12:11:27  pts/1  0:00 grep 962750 
SQL> !kill -9 962750 
SQL> !ps -ef|grep 962750 
  oracle 1519638 1757348   1 12:11:38  pts/1  0:00 grep 962750
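
A kill -9 against the wrong Oracle process (PMON, say) takes the whole instance down, so it's worth checking what the pid is before pulling the trigger. Here is a cautious sketch of the same step. The function names are mine, and it assumes PX slaves show up as ora_pNNN_<SID>, as in the ps output above.

```shell
# Return 0 only if the command name looks like a parallel execution slave
# (ora_pNNN_<SID>), so background processes such as ora_pmon_<SID> are refused.
is_px_slave() {
    case "$1" in
        ora_p[0-9][0-9][0-9]_*) return 0 ;;
        *) return 1 ;;
    esac
}

# Kill a pid only after checking what it is. Using ps -p also avoids
# matching the grep process itself.
kill_px_slave() {
    pid="$1"
    cmd=$(ps -p "$pid" -o args= 2>/dev/null) || { echo "no such process: $pid" >&2; return 1; }
    if is_px_slave "$cmd"; then
        kill -9 "$pid"
    else
        echo "refusing to kill $pid ($cmd): not a PX slave" >&2
        return 1
    fi
}
```

With that in place, the step above would become kill_px_slave 962750.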

Now that it's gone, see if the parent session for the refresh job is still there.

  1  select sid, username, last_call_et, event, wait_time, seconds_in_wait, state, 
            blocking_session, blocking_session_status 
  2  from v$session 
  3* where sid=201 
SQL> / 

no rows selected

No, so the job should be gone too

SQL> select * from all_scheduler_running_jobs; 

OWNER                          JOB_NAME 
------------------------------ ------------------------------ 
JOB_SUBNAME                    SESSION_ID SLAVE_PROCESS_ID SLAVE_OS_PRO 
------------------------------ ---------- ---------------- ------------ 
RUNNING_INSTANCE RESOURCE_CONSUMER_GROUP 
---------------- -------------------------------- 
ELAPSED_TIME 
--------------------------------------------------------------------------- 
CPU_USED 
--------------------------------------------------------------------------- 
AVSYS                          REFRESH_WAREHOUSE_DATA 
                                      122               19 692392 
               1 
+000 00:01:25.48 
+000 00:00:18.38

Well, it had restarted and been running for a minute or so. The audit trail entries started to appear in the warehouse, anyway. As I won't be working with this again for a while, I'm counting on colleagues to keep me posted about whether this keeps happening!

Thursday, 13 October 2011
