Using Netezza query history to compare performance

Sometime way back when, Netezza introduced a more advanced query history recording mechanism that captured information about sessions, queries, plans, table access and column-level stats.  The great thing about this tool was that each entry in the query history table included the query’s checksum.  This allows us to compare a query’s performance over time even when the plan isn’t around.  Even more helpful, however, is that we can use this to compare query performance between completely different systems.

This week I found myself wrapping up a Mustang to TwinFin… er I mean IBM Netezza 1000 — migration and wrote the following query to easily compare Mustang times against the new TwinFin system.  I wasn’t disappointed to learn that the overwhelming majority of queries got much faster but I did find a few that were slower.  With this information, I could zero in on the actual queries that needed attention and see what we could do to resolve that.

Here is the query (and I’m sure there are many ways to do this):

SELECT
*,
mustang_avg_runtime-twinfin_avg_runtime avg_diff
FROM
(
SELECT
tfp.queryid,
substr(querytext, 1, 6) first_word,
COUNT(1) twinfin_qrycount,
mustang_qrycount,
mustang_min_runtime,
mustang_max_runtime,
mustang_avg_runtime,
MIN(tfp.runtime_seconds) twinfin_min_runtime,
MAX(tfp.runtime_seconds) twinfin_max_runtime,
avg(tfp.runtime_seconds) twinfin_avg_runtime
FROM
(
SELECT
checksum queryid,
querytext,
EXTRACT(epoch from finishtime-submittime) runtime_seconds
FROM
twinfin_qryhist..”$hist_query_prolog_1″ qp
JOIN twinfin_qryhist..”$hist_query_epilog_1″ qe
USING(npsid, npsinstanceid, opid)
WHERE
qp.submittime::date BETWEEN ‘2012-02-27’ AND ‘2012-03-07’
) tfp,
(
SELECT
queryid,
COUNT(1) mustang_qrycount,
MIN(runtime_seconds) mustang_min_runtime,
MAX(runtime_seconds) mustang_max_runtime,
avg(runtime_seconds) mustang_avg_runtime
FROM
(
SELECT
checksum queryid,
EXTRACT(epoch from finishtime-submittime) runtime_seconds
FROM
mustang_qryhist..”$hist_query_prolog_1″ qp
JOIN mustang_qryhist..”$hist_query_epilog_1″ qe
USING(npsid, npsinstanceid, opid)
) x
GROUP BY
1
) mus
WHERE
tfp.queryid = mus.queryid
GROUP BY
1,
2,
4,
5,
6,
7
) x
WHERE
twinfin_qrycount > 5
ORDER BY
(mustang_avg_runtime-twinfin_avg_runtime) DESC ;

I used in-line views because the query history we had for Mustang was very large and aggregating each BEFORE the join resulted in a much faster query (~4s to compare TF against years of Mustang data).

Advertisements
This entry was posted in administration, General, Performance. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s