Monday, December 05, 2005

What good is Statspack really?

I remember the days of bstat/estat, and especially the Oracle Support bulletin (version 6 days) about how to interpret the numbers in the resulting reports.txt file. The bulletin was a long list of things you could look at in the file, and that was about it. From it, you could of course conclude nothing about performance problems.

Of course? Yes, looking at aggregations will not, ever, tell you the reason for a slowdown in an application or even in a whole system. The tracing of a form, a report or a job can do that. Not various summaries and averages.

The one exception is when only one thing was running :-) ... then the summary becomes just a summary of that one form, report or whatever. Otherwise, things will be mixed and hence messed up.

Dave Ensor, Graham Wood and a few others made the st7 utility, which allowed you to collect more snapshots and get a report of the time interval between any two snapshots. Still I couldn't figure out how to link the users' performance problems to these numbers.

Then came Statspack. Now, Statspack is much better technically, AND it can collect SQL statement information that will help you identify heavy statements. AND it will summarise the waits, etc. at the top.

It's still not possible for me to find anyone who can use a Statspack output to identify the reason for a form, report or job running slow.

Also, having the haviest SQL statements mentioned is fine, but if the response time changes (see one of my former posts) for an end user, its likely cause is a new and exciting execution plan for a SQL statement in the application, and that SQL statement doesn't have to be the heaviest at all.

I fail to see what I can use Statspack for at all. Please enlighten me :-).

18 Comments:

Anonymous Anonymous said...

Like any tool, estat/bstat, statspack, and AWR reports are only as good as the person using it. It doesn't interpret itself (Tools like oraperf, PTA, and ADDM(?) take it a step further). Statspack is not always the primary source of diagnostics for every performance problems nor does it have to be the only source. To paraphrase a colleague, "You might be able to build a house with a paint brush and then paint the house with a hammer, but it's sure gonna take alot longer than necessary."

I tend to ask the complainer if the problem appears to be system wide or isolated to a user or action. If it's the latter, then 10046 or sql_trace is the thing to look at first. Otherwise at the other end of the spectrum, you gather statspack, OS stats, network stats, middle-tier stats, etc. [Another colleague worked on once case where the biggest impact turned out to be the version of the browser being used on the client side.]

Now if you had your baselines [aka points of reference] :-), then you might see that the most expensive SQL was the most expensive SQL even when performance was okay. I recently helped out on an issue where the performance problem was accompanied by the sudden appearance of a new 3rd highest SQL (top 2 were always the same so we ignored it). I might write that case up in my blog someday soon.

Granted when you only have an AFTER picture then all you can do is see if the problem looks SQL related and go from there. Of course, that may be easier said than done sometimes. And there are more questions to answer than the one above.

10:40 PM  
Anonymous Anonymous said...

I have used Statspack with success for 2 different situations.

1) To get a general idea of where to look for a performance problem. Sometimes users would complain of the performance on one page, but it was someone else doing something else that was causing the problem.

2) To proactively tune. The top 10 sql during a high activity time is a great list of tuning spots. Things that are executed a lot or that take a long time are worth looking into for normal tuning. Sometimes the frequency of complaints is reduced by simply tuning a few of the top 10 statements.

7:13 AM  
Anonymous Anonymous said...

System-level tools are for system-level problems, I guess.

When your whole system appears to have some problem, whether it is caused by an initialization parameter or by a piece of code that is pervasive to many parts of the application, then statspack is a fine tool. If the problem is already isolatable to a single area through user feedback or batch job timing then tracing is more appropriate.

Tim Gorman made a presentation of "Using STATSPACK as a Performance Data Warehouse" at a recent RMOUG meeting, which I unfortunately missed, but using statspack as a long term tracking tool sounded like a pretty interesting idea.

7:15 AM  
Anonymous Anonymous said...

Personally, I could weep everytime I'm asked to use Statspack to solve somebody's performance problem. Even if they are taking frequent snapshots, as opposed to two snapshots per day (I see this all too often).

I keep trying to tell myself that Staspack is OK really, it has at least some useful information. But all the time I'm secretly planning how I can convince them to gather some 10046 trace.

10046 really isn't evil and your system won't curl up and die (provided you pay attention to the *_dump_dest parameters).

I wish I could say "We do not use Statspack".

11:28 AM  
Anonymous Anonymous said...

I use statspack, and before that bstat/estat, to establish trends. Which sort of ties statspack back to your earlier question about baselines. I have a very interesting chart created from a collection of bstat/estat reports.
I came to a client site once (actually my former employer) and was told that the system is "sluggish". Looking around I found that someone had impemented a cron job to take a bstat at 8am and estat at 5pm and there were a year's worth of reports sitting on disk. I began to plot event waits over time, beginning, per chance, with the most recent 2 weeks. The shared_pool latch wait chart showed a very distinct "inverse v" shape (^^^^...) with the valleys on weekends (actually Mondays since I excluded Sat and Sun). The valleys coincided with db restarts after a cold backup on the weekend. When I extended the chart further into the past, I found that that pattern only started 2 weeks ago. Through their change tools I found that on that date someone had increased (!!) the shared pool from 100M to 164M. That was on a 7.3 database and since the Peoplesoft system didn't (still doesn't to a great extent) bind variables the shared pool got filled with non-reusable sql and every parse now took longer - and held onto the shared_pool latch longer - to go through all those sql, find no match and then again go through the shared pool and find sql to age out to make room for the new sql. An example where more was less.
Later I used the same technique - charting events and statistics over time - to identify, in conjunction with the change logging policy, application changes that caused increased sorting, etc.

10:42 AM  
Anonymous Anonymous said...

Yup, I'm very much with Wolfgang: use it for baselining and medium to long term trend analysis.

Like you , I can't see someone solving a single form or report problem by looking at a statspack report and going: "Ah-HAH!". Highly dependent of course on how much that someone charges for the service!

Of course, there are always exceptions. Abnormal situations are always possible and it's really worth having a quick look to see if it ain't one of them. But if I can't make out anything out of the ordinary in the first 15 minutes or so, then it's not worth beating the horse: it's dead!

8:53 PM  
Anonymous Anonymous said...

Nuno,
I think I'm in that space you describe - I do take a look at statspack if it is available. And, like the rest of you, I can recall cases where I was lucky and there was a SQL statement or an insane parameter setting that needed fixing.

But more and more I'm frustrated by statspack's inability to shed any light on how a given Top 5 wait affects the application they want me to tune. Often the Top 5 Waits are irrelevant to that question. Ditto for the SQL statements it reports.

10046 gives me that positive link between the metrics and the application. And I've found it useful to "mine" the Trace Analyzer database repository when I have a collection of trace files for a problem.

I'm usually not involved at that stage where we might collect baselines. They usually ask me when they're already in trouble.

10:48 AM  
Blogger Moans Nogood said...

Good insights into how you guys use or at least view Statspack. Thank you for that.

I think I'll open another topic about system-wide slowdowns versus a form, report or job being "too slow". That, in my mind, is the next natural question to ask after "What is baselining good for?" and "What is Statspack good for?".

Maybe one day I'll see a picture. Perhaps even a big one...

1:32 PM  
Anonymous Anonymous said...

kInteresting topic this. Here's a question; I'm genuinely interested in other's thoughts on this. How much more useful does statspack become when we take a snapshot for an individual session (i_session_id parameter)?

Austin

3:05 PM  
Blogger Moans Nogood said...

I would argue that it only becomes interesting if you have information about a certain session :-).

2:23 PM  
Anonymous Anonymous said...

Mogens,
I use statspack to try to tune a system when I cannot get to the session to set 10046. I'm currently working on a system that uses weblogic to go to the db. and we have several apps running, and it's not easy to catch the session from the pool that will get used. So it's hard to 10046 the sessions. Statspack's list of sql statements gives an idea of where to look, also, if you take spanshots at level 7, the segment statistics are very usefull to see where your 'buffer busy waits' are coming from (i.e. get the object, then let the developper find, with help of the sql statements that usually access it, the culprit program).
Statspack (and AWR) are not the mother of tuning tools, but they are very useful when you cannot easily trace the sessions.

5:00 AM  
Anonymous Anonymous said...

I tend to use Statspack (where available) as a sanity check, have a quick look at the report to see if anything jumps out at me before going more detailed.

Obviously, as other's have mentioned, it's good for where you've got a system wide problem. Although, to be honest, most of the problems I see are of that type (generally someone has decided to save money by using RAID5 with slow (cheap) disks rather than RAID0+1 with fast (more expensive) disks or problems of that ilk) rather than individual forms/reports/queries.

6:19 AM  
Anonymous Anonymous said...

Statspack is essentially useless for diagnosing any problem. It might provide a good chart or view of progress over time. But it has no relevance for diagnosing or finding problems.

11:03 AM  
Blogger oakleyses said...

oakley sunglasses, michael kors outlet, louboutin outlet, prada handbags, air max, burberry outlet, louis vuitton outlet, ray ban sunglasses, tory burch outlet, cheap uggs, michael kors outlet, michael kors outlet, chanel handbags, burberry outlet, oakley sunglasses, kate spade outlet, uggs outlet, oakley sunglasses cheap, prada outlet, longchamp handbags, gucci outlet, air max, louis vuitton, ray ban sunglasses, nike shoes, jordan shoes, nike free, polo ralph lauren, michael kors outlet, polo ralph lauren outlet, ray ban sunglasses, oakley sunglasses, michael kors outlet, christian louboutin, tiffany and co, louboutin, tiffany and co, rolex watches, longchamp outlet, louis vuitton outlet, longchamp handbags, ugg boots clearance, michael kors outlet online sale, uggs, ugg outlet, louis vuitton outlet stores, louboutin shoes, replica watches, louis vuitton handbags

4:36 PM  
Blogger oakleyses said...

vanessa bruno, air max, michael kors uk, coach outlet, michael kors, sac guess, true religion outlet, nike free pas cher, nike free, kate spade handbags, timberland, lululemon outlet online, michael kors outlet, north face, abercrombie and fitch, coach purses, ralph lauren, polo lacoste, oakley pas cher, north face, new balance pas cher, sac hermes, coach outlet store online, converse pas cher, longchamp, nike tn, air max pas cher, hollister, sac burberry, air max, michael kors pas cher, nike air force, hogan outlet, ray ban pas cher, louboutin, polo ralph lauren, true religion jeans, mulberry, sac longchamp pas cher, replica handbags, true religion outlet, nike blazer, vans pas cher, ray ban sunglasses, nike roshe, true religion jeans, nike roshe run pas cher, hollister, nike air max, air jordan

4:38 PM  
Blogger yanmaneee said...

jordan shoes
jordan 13
ultra boost
nike air huarache
kyrie 5 shoes
jordan shoes
yeezy boost
fila
christian louboutin shoes
nike shoes

9:38 PM  
Anonymous Anonymous said...

jordan shoes
golden goose
birkin bags
yeezy 350
kd shoes
curry 8
hermes
off white shoes
yeezy
yeezy

8:45 PM  
Blogger johni said...

SIM registration is a fundamental requirement for using mobile services responsibly and securely. By understanding the process and its significance, individuals can ensure compliance with regulations and protect their interests in the digital realm.

5:08 AM  

Post a Comment

<< Home