BI Decision

Friday, July 07, 2006

Space is cheap, but every byte in a fact row counts

Replacing natural fields with synthetic integer keys reduces the row length. This can really speed up queries that scan significant numbers of rows.

For example, on a table I recently worked on, switching from natural fields to integer keys reduced the average row size from 122 bytes to 50 bytes. Switching the integer keys to smallints further reduced the row length to 32 . On a 200 Million row table, that reduced the core data storage from 24.4 GB to 6.4 GB (a 74% reduction).

Some quick testing on our prototyping box, a DB2 on XP machine, showed a pretty dramatic performance difference in full table scans. A table scan on the large table took 18 minutes, on the small table, 8 minutes.

0 Comments:

Post a Comment

<< Home