BI Decision

Sunday, April 23, 2006

Is the spell checker telling me something?

MS Word replaces “denormalized” with “demoralized”.

How much does a terabyte cost?

A Terabyte is a “Big Round Number”. For a while, it was the benchmark of a big data warehouse. However, Moore’s law seems to apply to hard disks as well: the cost of disk storage keeps falling.

Just for fun, here are some quick benchmarks:

1 Terabyte of inexpensive PC disks: $344
(4 x 250 GB SATA disks, http://www.newegg.com/Product/Product.asp?Item=N82E16822148065)

1 Terabyte of iPods: $6,650
(17 x 60 GB iPods http://store.apple.com/1-800-MY-APPLE/WebObjects/AppleStore.woa/wo/1.RSLID?mco=CC4D3CBB&nclm=iPod)

1 Terabyte of RAID 5 SCSI attached to a bare-bones Xeon server: $8,148
(Dell PowerEdge 2800, 4 x 300 GB SCSI, 10,000 RPM)

1 Terabyte of RAID 5 Direct Attached Storage Server: $16,159
(Dell AX150 iSCSI SAN, 3 x 500 GB)
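
To put the four quotes on the same footing, here is a rough sketch that turns each one into dollars per gigabyte. The prices and drive counts come straight from the list above. The short labels are my own, and the capacities are raw (drive count times drive size), so for the two RAID 5 configurations the usable space would be smaller and the real cost per gigabyte higher.

```python
# Each entry: (label, price in USD, number of drives, size of each drive in GB).
# Prices and drive counts are taken from the list above; labels are made up here.
configs = [
    ("PC SATA disks", 344, 4, 250),
    ("iPods", 6650, 17, 60),
    ("Dell PowerEdge 2800 SCSI", 8148, 4, 300),
    ("Dell AX150 iSCSI", 16159, 3, 500),
]


def cost_per_gb(price_usd, drive_count, drive_gb):
    """Return dollars per raw gigabyte (assumption: no RAID parity overhead)."""
    return price_usd / (drive_count * drive_gb)


results = {name: cost_per_gb(price, n, gb) for name, price, n, gb in configs}
cheapest = min(results.values())

# Print cost per raw GB and how many times more expensive each option is
# than the cheapest one.
for name, dollars in results.items():
    print(f"{name:26s} ${dollars:6.2f}/GB  {dollars / cheapest:5.1f}x")
```

Normalizing to cost per gigabyte matters here because the four configurations do not hold exactly the same amount of data, so comparing the sticker prices alone slightly misstates the gap.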


The problem is that the cost of storage hardware is only a small part of the cost of a Terabyte of data warehouse. A Terabyte has direct costs in server hardware, backup, and supporting software. It has indirect costs in program complexity, mostly in the labor of DBAs, ETL developers, and related program staff.

Storage is cheap. As its cost goes down, substituting storage for development effort will save time and money: for example, by storing several aggregates, keeping both normalized and dimensional data, and extending retention periods. The direct costs of storage are unavoidable, but indirect costs scale with complexity.

Friday, April 21, 2006

The beginning

This blog will talk about business intelligence and data warehousing from a practitioner's perspective.

Particular interests are:
1. Applying information to business processes
2. Data modeling & system design
3. Project management