2010-03-15

Database related general application performance tips

Most of the serious applications nowadays use some kind of relational database engine. Database becomes main data source and data processing module in application architecture. That's the reason why it's common application performance bottleneck and main scalability factor.

Having design and data flow requirements its possible to prepare database system at the system planning and implementation phase. Don't get to much into performance details at that stage because "premature optimization is the root of the evil". Sometimes system requirements and design meets reality and final expectations, but changes in requirements that could change application data flow are more common. The most accurate results are coming from real data and users usage patterns, so getting system responsiveness metrics is crucial even at the early prototype stages.

Having such data about performance "weak points" is a start point to optimize and improve overall system scalability. It's good to begin from top more abstract layers in application architecture before rushing to lower level database storage tweaking.

There are some tips divided by subsystem scopes and ordered by top to bottom abstraction level:

Application scope
  • discuss with project stakeholders responsiveness requirements for various application functionalities
  • analyze data usage patterns like writes vs reads, common data usage vs casual reporting, written once or changed often etc.- it gives image what could be improved, and what kind of underlying database mechanism you will need
  • remove worst bottleneck (having biggest performance impact) first to get best results
  • use cache for most used data (reads and writes if its possible)
  • design transactions smart- long transactions cause performance problems
  • at first you should use normalized data schema, but there are situations where little denormalization is crucial for good performance (f.e. copying some data to most read table to eliminate need for joining other big tables)

Database system scope
  • use indexes where it works best (every index adds performance penalty for data writes)
  • use vertical and horizontal data partitioning - move less used data into other tables or use database engine specific features
  • configure database and use its features like special table types, special indexes for your best

Operating system scope:
  • use database dedicated host or performance cluster - for large scale systems
  • check network latency and throughput for large data transmissions
  • tune underlying disks structure and file system- separate partitions or disks for database files, use low overhead file system or custom database "raw" partitions (like in Oracle DB)

No comments: