Wednesday, November 23, 2005

How often should stats be collected?

If you ask Microsoft (or Oracle from version 10g), the answer is to just leave it to the database and the clever developers to figure out when new stats should be gathered.

Right.

When you collect new stats you are almost guaranteed to change some execution plans in your system. Do you want that? No, you want the same execution plans as long as they have the response time your users want. They haven't asked you for a better (or worse) plan.

Here's a response I saw on a list the other day:

-------------------------------------------------------------
Oracle suggests the following best practices for collecting statistics. http://www.oracle.com/technology/books/pdfs/book_10g_chap9_ce1.pdf

· Collect statistics for normal data dictionary objects using the same interval that you would analyze objects in your schemas. In addition, you need to analyze the dictionary objects after a sufficient amount of DDL operations have occurred.

· Use the procedures gather_database_stats or gather_schema_stats with options set to GATHER AUTO. With this feature, only the objects that need to be re-analyzed are processed every time.

· For fixed objects, the initial collection of statistics is usually sufficient. A subsequent collection is not usually needed, unless workload characteristics have changed dramatically. In the next section, we will examine the changes introduced...
-------------------------------------------------------------
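(For reference, the GATHER AUTO option they mention is invoked like this - a minimal sketch, schema name invented:)

begin
  dbms_stats.gather_schema_stats(
    ownname => 'SCOTT',         -- hypothetical schema
    options => 'GATHER AUTO');  -- re-analyze only what Oracle deems stale
end;
/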

And here's my reply:
---------------------------------------------------------------------------
I'd argue that you should be scared of so-called best practices on this and other issues.

How can people seriously write about best practices, when in reality we're talking practices that worked for a few sites that will never, ever resemble your site anyway?

Better to document more and more worst practices, as Cary Millsap suggested several years ago - that way you raise the bar instead of just placing it at a certain position.

By the way: Note the vagueness: "...the same interval" (what interval?), "...after a sufficient amount of DDL operations have occurred" (how many?). Can't be otherwise, but that doesn't make it more usable.

It's like the recommendations from Microsoft regarding SQL Server: Always run automatic stats gathering. Always set parallel to automatic. Always....

Two schools inside Oracle are interesting to study (and we're talking the real experts, not the document writers):

Benchmark guys: Collect 1% estimate after initial load, then never again.

Real World Performance group: Collect compute stats once, then never again.
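(In dbms_stats terms the two schools look roughly like this - schema name invented, and a NULL estimate_percent means compute:)

exec dbms_stats.gather_schema_stats('SCOTT', estimate_percent => 1)     -- benchmark style
exec dbms_stats.gather_schema_stats('SCOTT', estimate_percent => NULL)  -- compute style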

But there's a better way, I think:

Analyze if it's necessary!

It's only necessary if the response time for a SQL statement becomes worse. So monitor elapsed_time and plan_hash_value in V$SQL to see if something goes south. If yes, one possible reason could be stats that are no longer representative of the data (more employees in department 42, much more data due to the acquisition of a new company, whatever). THEN it's time to analyze.
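A minimal sketch of such a monitor, assuming 9i or later (the baseline table and the comparison are mine - invent your own):

-- Take a baseline: one row per child cursor in the shared pool.
create table plan_baseline as
select hash_value, plan_hash_value, elapsed_time, executions
from   v$sql;

-- Later: which statements now run with a different plan than at baseline time?
select s.hash_value,
       b.plan_hash_value as old_plan,
       s.plan_hash_value as new_plan
from   v$sql s, plan_baseline b
where  s.hash_value      = b.hash_value
and    s.plan_hash_value <> b.plan_hash_value;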

Don't do it if you don't have to. Monitor the things that matter.

The reason you can't find any useful rules is that ... you can't find any useful rules.
------------------------------------------------------------------

Plan_hash_value and elapsed_time were added to V$SQL in 9i. Before that you'd need to do your own hashing after an "explain plan for..." command in SQL*Plus and store the result. Peter Gram from Miracle has done it for a 7.3 database.
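Not Peter's actual code, but the idea goes roughly like this (the statement and the hashing scheme are purely illustrative):

explain plan set statement_id = 'S1' for
select * from emp where deptno = 42;

-- Crude plan signature: hash every plan line and add them up.
-- It ignores row order, but it's enough to detect a change.
select sum(dbms_utility.get_hash_value(
             operation || '/' || options || '/' || object_name,
             1, 1048576)) as plan_signature
from   plan_table
where  statement_id = 'S1';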

I could talk for ages about this. How customers have run splendidly for years without analyzing (or precisely because of it), and how it's always Groundhog Day tuning when something changes: A system doesn't "become slow", an application doesn't "become slow". It's one or more SQL statements that change execution plan(s) for the worse. Since we never monitor those, we do the same thing over and over again: find the SQL statement with the largest response time, tune it, back to start.

But what if the SQL statement with the longest response time was NOT the one that changed? What if the SQL statement with the longest response time had already been optimised as well as it could be? That happens more often than you'd like to know.

So instead of zeroing in on the SQL statement that changed execution plan (and therefore response time) for the worse, and returning it to its former state or finding out why it went bad, we end up tuning another statement that might not be easily tunable.

Hence my abstract for the Hotsos Symposium 2006: "No, Cary, we're STILL not tuning the right SQL statements" :-).

7 Comments:

Anonymous derkoert said...

I wrote a capacity planner a couple of years ago based on statspack, korn shell etc. I've always used it to do forecasts but, more importantly, to detect trend breaches. I've noticed that on many sites a trend breach on performance statistics happens after gathering stats.
Just this week I was thinking about adding a module to my program to monitor execution plans. So - if I know a trend breach has occurred AND I know which execution plans changed for the worse, then I *know* where to look instead of wasting my time searching for the needle in the haystack...

So *great* idea for a paper!

3:31 AM  
Blogger Moans Nogood said...

That makes sense! I discussed these issues with Jonathan Lewis this morning, and apparently he's writing a few "notes", as he likes to call his papers, along these lines.

2:24 PM  
Anonymous Anonymous said...

An excellent topic. However I just wanted to play devil's advocate on the "when to analyse" side of things.

Is there not an argument for telling the optimiser as much truth as you can and hence analysing frequently to achieve this? I agree you may occasionally cause plans to change, but you are occasionally going to experience the flip side of the problem by letting the stats stay stale and having a static plan become an inefficient one.

Either way you still need to be on top of it (using a sql plan monitor like you describe).

Just a thought.

Neil

PS Really enjoyed the Oak Table Day at UKOUG - thanks for putting that on.

1:40 AM  
Blogger Doug Burns said...

Mogens,

This is a very interesting topic and I've swung backwards and forwards (and sideways sometimes) over the years.

On the one hand, it seems sensible to give the optimizer the most accurate information that you can. It's pointless having an optimiser that can respond to changes in data distribution if you tell it not to.

On the other, I've always found it very scary that execution plans will change on a production application without anyone really understanding why. Even if the optimiser picks a better plan, the inconsistency would *still* worry me. Sure, you could find out why, but it's going to take time. Unpredicted changes in behaviour are difficult to explain to managers and business types. Well, they can be explained, but that doesn't mean they will be popular!

Statistics gathering has been one of the hot subjects at work this week because of an initial data load for a production system, followed by go-live and the subsequent population of tables that, as far as the CBO is concerned, are all empty. It seems even more likely that you will hit problems with changing execution plans in the early months of the life of a new system, just when you want good performance!

When you're a day-to-day DBA, it's amazing how often you hit the same old problems.

3:42 AM  
Blogger Norman Brightside said...

Performance is very slow.

What's changed?

Well, we did a large data load and gathered new stats.

Why?

Well, we have a nightly job to recompute stats for all objects.

Why?

Because Oracle recommends always updating stats after a large data load.

I see. Well, let's reinstate the old statistics to buy us some time while we analyse what's going on?

What exactly do you mean by 'reinstate the old stats' ?

5:14 AM  
Blogger Roderick said...

In 10gR1, the DBMS_STATS package now includes procedures to specify a stats retention history of N days. Correspondingly, there are RESTORE_*_STATS procedures to revert information to old values at different levels of granularity (object, schema, database). One can also lock stats at different levels so that they will not get overwritten if you like things the way they are and do not anticipate a need to change them. Hopefully maintaining some baseline statspack report and/or sql_trace/tkprof output of typical user interactions / batch jobs can help one pinpoint the reason why performance has changed (for better or worse) at a database or operation level.
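For example (object names purely illustrative):

begin
  -- keep 31 days of statistics history around
  dbms_stats.alter_stats_history_retention(31);

  -- revert one table's stats to yesterday's values
  dbms_stats.restore_table_stats(
    ownname         => 'SCOTT',
    tabname         => 'EMP',
    as_of_timestamp => systimestamp - interval '1' day);

  -- lock them so a scheduled gather job won't overwrite them
  dbms_stats.lock_table_stats('SCOTT', 'EMP');
end;
/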

8:14 PM  
Blogger David Aldridge said...

One exception to the "never re-gather stats" would be with rolling partitions, of course. There you'd constantly be regathering statistics on partitions, and maybe at the global level as well. On the other hand you'd stop gathering statistics at the partition level once each was full.
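Something like this for the partition still being loaded (names invented, of course):

begin
  dbms_stats.gather_table_stats(
    ownname     => 'SCOTT',
    tabname     => 'SALES',
    partname    => 'SALES_2005_11',  -- the rolling partition
    granularity => 'PARTITION');     -- partition-level stats only
end;
/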

Would it be fair to say that the aim of not re-gathering statistics is really just a workaround for getting optimizer plan stability without using the real plan stability functionality? I wonder what the trade-off is between the two approaches? What are the (dis)advantages of formal optimizer plan stability in comparison to maintaining stable statistics?
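(By "real plan stability functionality" I mean stored outlines - something like this, names invented:)

create or replace outline sales_by_region on
select * from sales where region_id = 42;

alter session set use_stored_outlines = true;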

10:23 AM  