What is dimensional data?

Traditional relational databases, such as OLTP databases, are organized around a list of records. Each record contains related information that is organized into attributes (fields). The customer table of the stores_demo demonstration database, which includes fields for name, company, address, phone, and so forth, is a typical example. While this table has several fields of information, each row in the table pertains to only one customer. If you wanted to create a two-dimensional matrix with customer name and any other field, for example, phone number), you would realize that there is only a one-to-one correspondence. The following table is an example of a database table with fields that have only a one-to-one correspondence.

Table 1. A table with a one-to-one correspondences between fields
A four-column table displaying names and phone numbers.
Customer	Phone number --->
Ludwig Pauli	408-789-8075	----------------	----------------
Carole Sadler	----------------	415-822-1289	----------------
Philip Currie	----------------	----------------	414-328-4543

You could put any combination of fields from the preceding customer table in this matrix, but you would always end up with a one-to-one correspondence, which shows that this table is not multidimensional and would not be well suited for a dimensional database.

However, consider a relational table that contains more than a one-to-one correspondence between the fields of the table. Suppose you create a table that contains sales data for products sold in each region of the country. For simplicity, the company has three products that are sold in three regions. The following table shows how you might store this data in a table, using a normalized data model. This table lends itself to multidimensional representation because it has more than one product per region and more than one region per product.

Table 2. A simple table with a many-to-many correspondence
A three-column table showing sales data for products sold in each region of the country.
Product	Region	Unit Sales
Football	East	2300
Football	West	4000
Football	Central	5600
Tennis racket	East	5500
Tennis racket	West	8000
Tennis racket	Central	2300
Baseball	East	10000
Baseball	West	22000
Baseball	Central	34000

Although this data can be forced into the three-field relational table, the data fits more naturally into the two-dimensional matrix in the following table. This matrix better represents the many-to-many relationship of product and region data shown in the previous table.

Table 3. A simple two-dimensional example
A five-column table showing many-to-many relationship of product and region data.
	Region	Central	East	West
Product	Football	5600	2300	4000
	Tennis Racket	2300	5500	8000
	Baseball	34000	10000	22000

The performance advantages of the dimensional model over the normalized model can be great. A dimensional approach simplifies access to the data that you want to summarize or compare. For example, using the dimensional model to query the number of products sold in the West, the database server finds the West column and calculates the total for all row values in that column. To perform the same query on the normalized table, the database server has to search and retrieve each row where the Region column equals 'West' and then aggregate the data. In queries of this kind, the dimensional table can total all values of the West column in a fraction of the time it takes the relational table to find all the 'West' records.