Intersoft Development, Inc. Home Link
If you are currently using SQL-Server for your business applications, are you sure you are realizing its full potential? SQL 2000 and 2005 are quite powerful, and now SQL Server 2008 extends this power...
Read More...
Intersoft Development emphatically recommends CodeGear Delphi for custom software development! With the newest Embarcadero Delphi 2010 release, Delphi continues its tradition of providing industry-leading unmatched software development capabilities in this RAD programming language and IDE.

Delphi 2010 builds upon the already excellent features and capabilities of Delphi 2009 by adding touch/gesturing support to recent Delphi language enhancements for Generics, Closures / Anonymous-Methods, Unicode, and much more...
Read More...
CEO's Blog Topics Below
Recent Blog Topics:
Looking for affordable professional 2D/3D CAD Software? VariCAD is a full-featured comprehensive 2D/3D CAD package offering:
  • Windows / Linux support
  • 3D modeling with automatic export 2D.
  • Sizing of mechanical components.
  • Plus, predefined library of commnon parts.
  • Quick and intuitive 3D/2D GUI
  • Compatibility with DWG, DXF, IGES, STEP and STL...

FREE 3D CAD Software Trial and Sale - VariCAD

Free Software, Source Code, and Best-Practice Documentation Index from your
Cleveland Software, Database, Web Design, Consulting, and SQL / Delphi Experts

SQL Server - Optimization:
Tuning Very Large SQL-Server Databases - Part 2 (Physical Storage Optimization - Continued)

The following is the second in a series of guides to help you achieve optimal performance, speed, and efficiency from your Very Large SQL-Server Databases. These techniques will apply to Microsoft SQL Server 2000 , SQL Server 2005, and SQL Server 2008 as well as many other SQL-Server versions and/or relational database systems (RDMBS) including Oracle, DB2, Sybase, MySQL, Postgres, and others.

The series continues with a focus on optimizing physical data-storage requirements (i.e., space reduction), since it is perhaps the most important overall tuning consideration for large databases.

Untitled Document

Tuning Very Large SQL-Server Databases - Part 2
by Mike Eberhart (5/09/2007)

In Part 1 of SQL-Server Large Database Performance Optimization, I discussed the need to Reduce Physical Storage Requirements and use proper data-normalization as the first foundation-step in achieving optimal SQL-Server processing Speed. I am now going to get into the reasons why this is so important and how it can make considerable speed and performance improvements possible in SQL Server relational databases.

Your Database Server has Finite Resources

I do not care how modern or powerful your database server is, how many processors it has, or how much physical memory (RAM) or hard drive space it has. The fact is, your SQL-Server machine has finite resources and bandwidth / processing-power, and it is utterly foolish to waste that power on pushing around extra data when it is not necessary.

In my years of dealing with large databases for various clients, I repeatedly see management more willing to throw huge sums of money at a "hardware solution" to a performance problem instead of applying some common sense database tuning and optimization techniques. This attempt to fix a performance problem by just throwing ever larger and more powerful computers at it may work in some cases, but in many situations it will just act to postpone the inevitable performance implosion that is lurking in poorly optimized databases. Certainly your database server equipment needs to be up to the task, but the "task" should not include performing completely avoidable disk I/O and extraneous processing.

I don't want to get too deep into how to configure an optimal SQL-Server hardware solution here, but the general things you will need to be aware of when tuning and optimizing for very large SQL databases include:

  1. First and foremost, you will need to understand the various (interacting) limitations of Microsoft Operating Systems and Microsoft SQL Server versions. Both the Operating System (e.g., Windows 2003 Server) and the database (e.g., SQL Server 2005, SQL Server 2008) impose limits on the physical resources that can be utilized by your database application. And, even the "bit-versions" (32-bit vs. 64-bit versions) alter the limitations. There are "Standard" and "Enterprise" versions of each product, and they impose limitations on the number of processors and RAM that will be made available to your database.
  2. Processors (CPUs) - SQL Server can really benefit from multiple CPUs. At a minimum, you should have a dual-processor machine if you really want to improve performance.
  3. RAM - RAM is your friend, and SQL-Server's friend too. Get as much as you can afford, and be sure to couple the version of SQL-Server and Windows Server with the amount of RAM in your system.
  4. Disk Space - you had better have a RAID array of some sort, or an ultra high-speed SAN or such for your physical storage device. And, though you will need a large amount of space, increasing the physical number of hard-drive spindles in motion is how to really start boosting performance. I.e., if you have your database on one or two huge disks, instead of a dozen or more (probably smaller) drives in an array that in aggregate offer the space you need, chances are you'll quickly reach the limitations of the hard disk subsystem's throughput.

    Note: our main focus is to keep physical disk I/O to a minimum, and use RAM wherever possible since physical disk I/O, even on the fastest drive arrays, is many orders of magnitude slower than going to RAM for data.
  5. Network- if you'll be moving large quantities of data to and from the database server machine(s), you best have a Gigabit network in place (or better).

 

Moving Data around is Costly, and occurs within the Confines of your Hardware

I've focused on minimizing the data-storage requirements in your massive SQL-Server database for the sole purpose of increasing performance. The primary way this strategy improves performance is by reducing the amount of data that is moved around, especially to and from (slow) disk drives.

Here are some explanations for how minimizing data-storage requirements increases performance and speed within your SQL Server database solution:

  1. Just think about it: if you have less data (actual bytes) to move, regardless of what speed your system can move this data (whether in RAM or to and from disk), it will still perform the same operation in less time — and, there is a good chance the elapsed processing time will decrease by a percentage greater than the percentage of storage-space you have saved...
  2. ...this is especially true if the reduced data-size causes operations that otherwise were unable to be performed in RAM to now fit within the confines of your server's RAM. If disk-swapping has been eliminated, performance improvements will be stellar.
  3. Every SQL-Server operation, whether you are performing a SELECT, INSERT, UPDATE, or DELETE, must move data between RAM and hard-drives. Smaller data equals faster movement and higher row-throughputs per time-slice. Period.
  4. If an operation is logged, as are any operations that modify data, the amount of data that must be written to the log will also be reduced, and thus speed up your database operation. This can also help keep your transaction-log size under control.
  5. Index-widths (in bytes) should be smaller, thus reducing storage requirements, and even more so allowing much more index information to be cached in RAM at any given time.
  6. Any comparison operations will be faster. And, all sorts of operation are comparison operations. Things like JOIN operators, and ORDER BY clauses, and WHERE conditions — these all require comparisons to be performed. And, the less data to compare, the smaller the amount of resources required to perform the comparison (RAM / Disk). If you've done a great job of reducing physical data-storage requirements, these operations will all run much faster.
  7. Backup jobs will run much faster, since there is much less data to write out to tape or copy to another storage device. Likewise, data-recovery and transfer is much more efficient.


REAL-WORLD SQL TUNING EXAMPLE(S)

Now I will quickly discuss a real-world time-savings and performance comparison I have witnessed before and after optimizing data-storage requirements, so you can get a feel for the magnitude of performance increases that are possible:

Before Tuning: I was confronted with a SQL-Server 2000 database a client had, where they basically "hit a wall" with their internally designed database and stored procedures. Their (relatively new) database was growing at such a phenomenal pace, both in row-counts and physical size, that their database-server hardware was being pushed to its limits, and still couldn't finish some reporting processes in under a couple hours. There were a few main tables, each in excess of 50 million rows, that were involved in the queries which were so long running. The total database size, with indexes, was pushing past 45 Gigabytes, and growing by nearly 10 Gigabytes a month (and, Indexes were about 1/3 of the total DB size). This thing was a time-bomb ready to implode, and it was (of course) meant to be a data-warehouse critical to management-reporting in a Fortune-100 firm.


Tuning: This was one of the largest databases I had ever seen built by someone with absolutely no clue how to properly normalize data and use the minimum amount of storage space possible. Though the few core tables had relatively "narrow" (byte-width) rows (just a couple hundred characters per row), a large part of each row was used as an index (NON-Clustered), and the bulk of the columns in this index were VarChar fields where the columns were mostly filled with data on each row. There were columns like "DivisionName" and "OfficeName" and the like in this Index, each containing data like "Midwest US" and "Cleveland, Ohio Branch". I couldn't even believe it when I saw it.

So, I began by ripping those Varchars out, and replacing them with a TinyInt foreign-key to a Divisions table I made (there were only like eight total values!), and a SmallInt foreign-key to an Offices table I made (with less than a thousand offices). Needless to say, the database size took a large step downward in size. I continued with this type of work...


After Tuning: By the time I was done, the database was only a third of the size it had been to start with. And, I rewrote the reporting process that was taking hours to run to now join on the various Integer columns (vs. all the varchars) and only bring in the "lookup data" (Office Names, etc) when needed at the final output stage. As soon as I broke through the size-threshold where much more of the information required to perform complex JOIN conditions could fit in the database server's 4 Gigabytes of RAM, performance was incredible. Procedures that took hours to run before were now running in a minute or less. This all required very little coding changes, and just some logical application of normalization techniques. The biggest challenge was getting enough processing time to convert the existing production database to the new layout, since it's poor design and massive size made alterations a very slow process — the production server didn't have enough disk-space to allow the conversion to take place at first even!

More examples to come...


Continue with Part 3 of SQL-Server Database Tuning now.

 
.NET framework version: 2.0.50727.3615