Performance Whack-a-Mole by Josh Berkus

This turned out to be a good talk, chock-full of concrete practical advice.

Here’s our audience poll:
95% use DB apps
15% are consultants (like the presenter)

Clients just say: “the database is slow,” but anything in the big application stack (application, middleware, database, operating system, hardware_ can make things slow. Every layer in the stack needs do its job to scale the application. Otherwise, we are leaving significant performance gains on the table. In fact, “database” problems are usually somewhere else — like in the middleware.

You have many performance problems in your application, but most aren’t significant. Performance is usually degraded by a few problems — 10% of problems cause 90% of slowdown. (And big problems mask smaller problems.)

Apps tune differently. The presenter uses three rough categories: web, online transaction processing, data warehouse.
Web

  • Entire database can fit in RAM
  • 90% or more of database queries are simple reads
  • CPU-bound
  • Performance moles: caching, pooling, connection time

OLTP

  • Database slightly larger than RAM to 1TB
  • Many small data write queries
  • CPU or I/O bound
  • Moles: locks, cache, transactions, write speed, log

Data Warehouse

  • 100GB to 100TB database
  • Large complex reporting queris
  • Large bulk loads of data
  • Also called “business intelligence”
  • Moles: Sequential scans, resources, bad queries

Your first step should be to measure a baseline. Check basic setup (versions, configurations) for sanity. Then measure each layer.

Tools

  • OS tools are simple, easy to use, and non-invasive
    • ps, mpstat
  • Benchmark. Invasive
    • micorbenchmarkL bonnie++ for files system perf
  • DB
    • DBTrace
    • DB query log
    • Query analysis (troublehoot “bad” queries)
    • ogbench, Wisconsin, TPCB, OSDB, PolePosition
  • Application
    • app server tools
    • workload simulation and screen scraping
  • lwp and log replay tools
  • bug detectors
  • valgrind, MDB, GDB, DTrace

OK, we’ve got the baseline, now what? What are the symptoms; when do they occur?

The Performance Moles

I/O

  • Symptoms: one device saturating I/O — other resources are OK
  • heavy writes or very large DB
  • causes
    • bad I/O hardware, software or configuration
    • not enough RAM
    • too much data requested
    • bad schema

CPU

  • Symptoms: maxed-out CPU, but RAM available
  • mostly-read loads or complex calculations
  • causes
    • too many queries
    • insufficient caching or pooling
    • too much data requested
    • bad queries or schema
  • Note: most DB servers should be CPU-bound at max load

Locking Mole

  • Symptoms: Resource aren’t maxed-out
  • lots of pessimistic locking
  • causes
    • long-running transaction or stored procedures
    • longly held open cursors
    • unneeded pessimistic locking
    • poor transaction management

App Mole

  • Symptoms: DB server maxed, app server is OK
  • common in J2EE
  • causes
    • not enough app servers
    • too much data/ too many queries
    • bad caching.pooling
    • driver issues
    • ORM frameworks

Leave a Reply

You must be logged in to post a comment.

gears