The relationship between groom and nzbackup

There are three basic functions that every Netezza DBA must perform regularly:

  1. Ensure statistics are up to date
  2. Groom your tables
  3. Backup your production databases

Let’s focus on groom and its dependency on nzbackup.  I recently ran into an issue where groom was running every day — but none of the deletes/updates were getting removed.

When any backup operation is run, a new entry is created in _t_backup_history recording the type of backup (0 – full, 1 – differential, 2 – cumulative, 4 – schema only, etc), when and for what database. Another key piece of information captured is the backup operation’s transaction id.

Quick piece of background: every record in a Netezza system has four hidden columns:

  1. createxid is the transaction id of the operation that created the record
  2. deletexid is the transaction id of the operation that deleted the record.
  3. datasliceid is the data slice (disk) that the record resides on
  4. rowid  is the unique row identifier (unique to the entire system)

Netezza logically deletes records by populating the deletexid column with the delete operation’s transaction id. This is an instruction to the FPGA to not allow these records past, eliminating visibility to them completely.

So why is the last backup operation ID important?  This information is used by subsequent differential and cumulative backups to identify newly inserted records, deletes and updates. Records logically deleted since the last backup operation are recorded, ensuring a restore of that increment will result in those records properly being deleted.

Groom also uses this information to ensure that all logical deletes were recognized by a backup before it actually physically moves the row. Consider this example:

1. A table has 5 million rows in it when it gets backed up on Sunday night
2. A user deletes all of the rows (not truncate) on Monday morning
3. A groom operation is run

In this example none of the deletes will be physically removed and the table will not shrink.   To put it simply: a logically deleted record is eligible to be physically removed only if the deletexid is less than the last backup’s operation ID.  If it isn’t then groom will leave the record in place (by default).  You can override this behavior by adding ‘reclaim backupset none’  to your GROOM command.  This instructs groom to ignore any existing backup sets and physically remove any logical deletes.

PLEASE NOTE: doing so will mean the next differential backup requested will recognize that a groom was forced and perform a full backup for any tables impacted.

So how did this come up?  Well, it turns out a full backup had been run months back to test the backup script.  No differential or cumulative backups had been run since – meaning the last backup operation hadn’t been updated since.  All of the deletes and updates that had occurred since then weren’t eligible for grooming as a result and the tables slowly grew on disk until we discovered our mistake.

I wrote a script to help identify this situation for future use. It takes a single parameter: database name.  Like anything you find on the net, test it in your environment.

####################################################################
#!/bin/sh

set -- `getopt -a -u -l database: h $*`

while [[ $1 != "--" ]] ; do
     case $1 in
          "--database" ) DATABASE=$2 ; shift ;;
     esac
     shift
done

if [ -z "${DATABASE}" ] ; then
     echo "ERROR: You must specify a database to review with -database DATABASE."
     exit 1
fi

DB_OBJID=`/nz/support/contrib/bin/nz_get_database_objid ${DATABASE}`

if (( ${DB_OBJID:=0} == 0 )) ; then
     echo "ERROR: There is not a valid database with the name ${DATABASE}."
     exit 2
fi

DB_LAST_OPID=`nzsql -t -A -c "select max(opxid) from _t_backup_history where dbname = ^$DATABASE^ and type in ( 0,1,2)"`

if (( ${DB_LAST_OPID:=0} == 0 )) ; then
     echo "ERROR: Could not find any full, diff or cumulative backups for database ${DATABASE}."
     exit 3
fi

nzsql -t -A -c "select rpad(' tablename',50),rpad(' ungroomable',15), rpad(' visible rows',15), rpad(' total rows',15)"
nzsql -t -A -c "select rpad('-',50,'-'),rpad('-',15,'-'), rpad('-',15,'-'), rpad('-',15,'-')"

nzsql ${DATABASE} -c "\dt" -t -A | awk -F \| '{ print $1 }' | while read TABLENAME ; do
     nzsql ${DATABASE} -t -A  <<eof
          \o /dev/null
          set show_deleted_records = 1;
          \o
          select rpad('${TABLENAME}',50) ,
                 lpad(nvl(sum(case when deletexid > ${DB_LAST_OPID} then 1 else 0 end),0),15),
                 lpad(nvl(sum(case when deletexid = 0 then 1 else 0 end),0),15),
                 lpad(count(1),15)
          from   ${TABLENAME}
eof
done

exit 0
####################################################################
About these ads
This entry was posted in administration, General, Performance, Uncategorized and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s